Chris Lott
Senior Director, Engineering
Qualcomm Technologies, Inc.
January 31, 2023
@QCOMResearch
Solving unsolvable
combinatorial
problems with AI
Today’s agenda
• The need for combinatorial optimization
• Solving combinatorial optimization
problems with AI
• Improved chip design with AI
• Improved compilers with AI
• Future directions
• Questions?
How do you find an
optimal solution
when faced with
many choices?
Some problems have
more possible solutions
than a game of Go:
~$10^{170}$
Supply chain optimization Hardware-specific compiler
Chip design
Airline network planning
Combinatorial
optimization
problems are
all around us
Finding solutions can
provide significant benefits:
• Reducing cost
• Reducing time
• Increasing performance
Traveling Salesman
Problem (TSP)
Given a set of N cities with the travel
distance between each pair, find
the shortest tour that visits each
city exactly once and returns to the start
Example:
Using a brute force search method,
if a computer can check a solution
in a microsecond, then it would take
2 microseconds to solve 3 cities,
3.6 seconds for 11 cities, and
3857 years for 20 cities.
Exemplifies the combinatorial
optimization problem
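The brute-force timings above follow from counting (N-1)! tours at one microsecond each. A minimal sketch of that arithmetic; the function and constant names are illustrative, not from the talk:

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600
CHECK_TIME_S = 1e-6  # assumption from the slide: one tour checked per microsecond


def brute_force_seconds(n_cities: int) -> float:
    """Time to enumerate all (N-1)! tours that start from a fixed city."""
    return math.factorial(n_cities - 1) * CHECK_TIME_S


for n in (3, 11, 20):
    t = brute_force_seconds(n)
    print(f"N={n}: {t:.3g} s = {t / SECONDS_PER_YEAR:.3g} years")
```

For N = 20 this gives roughly 1.2e11 seconds, which matches the 3857 years quoted on the slide.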
How to approach solving TSP with brute force
An instance of the
traveling salesman problem
Search space
[Figure: a five-city TSP instance (cities a to e, with a distance on each edge) next to its search tree rooted at a; two of the leaves are the tours abcdea and abceda]
Each tree leaf node represents
one full tour of the cities
There are (N-1)! paths (tours) in the search
tree. Full enumeration quickly becomes
infeasible as N grows
Search space
(represented as a tree)
• Start at any city
• Choose from N-1 next cities
• Choose from N-2 next cities
• ......
• Choose last city and connect
to start city
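The enumeration described above can be sketched in a few lines. This is an illustrative sketch, not the talk's code; the 4-city distance matrix is hypothetical and is not the instance drawn on the slide:

```python
from itertools import permutations


def tour_length(tour, dist):
    """Length of a closed tour: consecutive legs plus the return to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tsp(dist):
    """Fix city 0 as the start and try all (N-1)! orderings of the other cities."""
    n = len(dist)
    best_tour, best_len = None, float("inf")
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len


# Hypothetical symmetric distances between 4 cities.
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
best_tour, best_len = brute_force_tsp(DIST)
print(best_tour, best_len)  # (0, 1, 3, 2) 80
```

Each permutation corresponds to one leaf of the search tree, which is why the loop count grows as (N-1)!.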
[Figure: the five-city TSP instance and its search tree again, together with a tree of nodes N0 to N4 annotated with costs between 25 and 53]
Brute force method
An instance of the
traveling salesman problem Search space
• Full path enumeration
• Naive
• Scales as (N-1)!
• infeasible for N>20
Heuristic methods
• The Nearest Neighbor method
• N-opt iterative path improvement
• Not guaranteed optimal
• Heuristics are problem-specific
(human-designed)
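A minimal sketch of the two heuristics named above, taking 2-opt as one concrete instance of N-opt path improvement. The distance matrix is hypothetical and chosen so that the greedy tour is clearly suboptimal:

```python
def tour_length(tour, dist):
    """Length of the closed tour, including the leg back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def nearest_neighbor(dist, start=0):
    """Greedy construction: always move to the closest unvisited city."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist[tour[-1]][c])
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(tour, dist):
    """Local search: reverse a segment whenever that shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, dist) < tour_length(tour, dist):
                    tour, improved = candidate, True
    return tour


# Hypothetical distances: the greedy tour is forced into one long closing leg.
DIST = [
    [0, 1, 2, 10],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [10, 2, 1, 0],
]
greedy = nearest_neighbor(DIST)
refined = two_opt(greedy, DIST)
print(greedy, tour_length(greedy, DIST))    # [0, 1, 2, 3] 13
print(refined, tour_length(refined, DIST))  # [0, 1, 3, 2] 6
```

Neither step carries an optimality guarantee in general; here 2-opt happens to reach the best tour, but on larger instances it can stop in a local minimum.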
Exact solver methods
• Dynamic programming
• Formulate as Integer Linear Programming
(ILP), use Branch and Cut (B&C)
• Uses branch and bound to rule out
whole solution subspaces
• Combine with problem-specific cutting planes
• Scales up to thousands of nodes,
but at high computational cost
• Problem-specific
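As an illustration of the dynamic programming option, here is a sketch of the classic Held-Karp recurrence. The slide does not say which DP formulation is meant, so treat this choice as an assumption; the distance matrix is hypothetical:

```python
from itertools import combinations


def held_karp(dist):
    """Exact TSP tour length via DP over (set of visited cities, last city).

    Runs in O(N^2 * 2^N) time: far better than (N-1)!, but still exponential.
    """
    n = len(dist)
    # best[(mask, j)]: shortest path from city 0 through the cities in mask, ending at j
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except the start (bit 0)
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))


# Hypothetical symmetric distances between 4 cities.
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
print(held_karp(DIST))  # 80
```

The table `best` stores one entry per (subset, end city) pair, so memory also grows exponentially, which is one reason ILP with branch and cut is preferred for large instances.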
[Figure: six panels of a five-node example graph (A to E) with weighted edges]
Existing TSP solutions face challenges
[Figure label: B&C ruled-out subspaces]
Existing combinatorial optimization
techniques have limitations
How can we improve this situation with AI?
Scale
Search heuristics don’t scale to larger problems in
acceptable compute time and cost, and they do not
guarantee that all constraints are satisfied, which results in
expensive manual intervention
Learning
Techniques don’t incorporate knowledge learned
from solving many problems. They start each
new problem instance from scratch.
Notable prior work: “Attention, learn to solve routing problems!”, ICLR 2019
AI addresses
challenges of
traditional combinatorial
optimization solutions
Leverages learned
problem structure
Scales to larger instances
Offers a general framework
Can achieve desired outcome
with resource, cost, and time
constraints
Optimization
metric
AI Solver
Develop an AI algorithm that can learn
correlation between design parameters
and optimization metrics from a limited
set of problem instances
parameters
For new instances, the AI algorithm uses an
existing solver more efficiently by reducing
parameter search space
Heuristics
AI
Standard process
AI process
Exploring Bayesian optimization to reduce combinatorial search space
Optimizing chip design
with AI
“Bayesian Optimization for Macro Placement”, ICML 2022
Competing combinatorial optimization objectives in chip design
Need to account for all business metrics
Chip
Area
Yield – more
complex fab process
Test
Time
Production
cost reduction
Power-performance
optimization
System-level power Chip-level power
Design
efficiency
# of tools iterations License cost
Capital
expense
(Capex)
# of compute servers Emulation platforms
Semiconductor
scaling advantage
is approaching a cliff
PPA = Power, Performance, Area
Foundation
Chips
Design
automation
Computing
Key elements
Moore’s Law
PPA scaling
Combinatorial
optimization
Cloud
servers
Disruption
50%
less gain
100x
PVT corners
10x
expense
[Figure: Area scaling over time. Scaling factor (log scale, 0.01 to 1) versus year (2012 to 2030) for digital logic (standard cells), memory (SRAM bitcells), and analog/IO, compared with the theoretical Moore’s Law rate of 0.5X per 2 years]
CO needs to compensate for diminishing technology gains and
PVT corner increase within acceptable compute expense
Macros (memories)
$O(10)$ to $O(100)$
AND
Standard cells (logic gates)
$O(10^7)$ to $O(10^9)$
AND
1000 x
Point-like
Challenges
in chip
placement:
How can we solve the chip optimization problem?
1. Placing mixed-size blocks –
standard cells and macros
(memories)
Minimize power and area while
satisfying timing constraints within
limited design resources (people
and compute servers)
2. Scale with increasing
complexity of design (# of
blocks) and constraints
(e.g., PVT corners)
Each iteration can take up to several weeks for state-of-the-art designs and technologies
Chip design
consists of
iterative macro
and standard
cell placement
A complex and very
computationally
intensive part of
chip design
Outer loop
Hours-days
Macro placement Cell placement
Inner loop
No-overlap 2D:
$(N!)^2$
Designers manually select a macro
placement and use solvers to
optimize the standard cell
placement (inner loop) and then
manually iterate (outer loop)
Goal
Find the minimum of
an expensive function
1. Fit a probabilistic surrogate model
2. Minimize a cheap acquisition function
3. Evaluate and go back to 1
[Figure: cost versus search space, showing the data points (⋆), the true function, the surrogate, and the surrogate uncertainty]
Choosing the next point is a tradeoff:
exploration of highly uncertain areas
vs
exploitation of promising current minimal areas
Bayesian optimization efficiently solves problems iteratively
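The fit / minimize-acquisition / evaluate loop can be sketched on a one-dimensional toy problem. Everything here is an assumption made for illustration: a zero-mean Gaussian process with an RBF kernel as the surrogate, a lower confidence bound as the acquisition function, a grid of candidate points, and a made-up `cost` function. It is not the method of the ICML 2022 paper.

```python
import math


def rbf(a, b, length=0.3):
    """Squared-exponential kernel: nearby inputs have correlated costs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Step 1: surrogate mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def bayes_opt(f, candidates, n_init=3, n_iter=10, kappa=2.0):
    """Return (best x, best cost, number of expensive evaluations)."""
    step = max(1, len(candidates) // n_init)
    xs = candidates[::step][:n_init]
    ys = [f(x) for x in xs]
    for _ in range(n_iter):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break

        def lcb(c):
            # Step 2: low mean (exploitation) or high uncertainty (exploration)
            mean, std = gp_posterior(xs, ys, c)
            return mean - kappa * std

        x_next = min(remaining, key=lcb)
        xs.append(x_next)      # Step 3: evaluate the expensive function
        ys.append(f(x_next))   # and loop back to refit the surrogate
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)


def cost(x):
    """Hypothetical stand-in for an expensive evaluation."""
    return (x - 0.7) ** 2 + 0.1 * math.sin(12 * x)


CANDIDATES = [i / 50 for i in range(51)]
best_x, best_y, n_evals = bayes_opt(cost, CANDIDATES)
print(best_x, best_y, n_evals)
```

The point of the sketch is the evaluation budget: the expensive function is called 13 times, while the cheap surrogate is queried at every remaining candidate in each round. `kappa` sets the exploration versus exploitation balance.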
How to apply to
macro placement?
Bayesian optimization learns a surrogate function, which maps each macro placement to a
PPA quality metric, and uses it to narrow down the search over the large macro placement space
Bayesian
optimization
for macro
layout
Inner loop optimization
incorporated into
surrogate function
Outer loop
Hours-days
Macro placement Cell placement
Inner loop
$(50!)^2 \approx 10^{128}$
PPA
Surrogate function
“Bayesian Optimization for Macro Placement”, ICML 2022
[Figure: wire length versus number of evaluations for simulated annealing and Bayesian optimization, one panel per design]
Results are for public MCNC benchmark for layouts (without standard cells)
Three different chip designs: hp, ami33, ami49
Optimization objective is to minimize HPWL (wire length)
This is a simpler objective than the Inner Loop PPA
Further research aimed at generalizing this technique for production designs, across
all PPA metrics and with the inclusion of design constraints
Bayesian
optimization can
converge faster
with better design metrics
compared to conventional
simulated annealing heuristics
for an unconstrained version
of the problem
Slow, costly,
but accurate
for sign-off and
automation
Algorithmic optimization
based on analytic
solutions needs
human guidance
Fast, cheap,
and
accurate for
optimization
Data-driven AI optimization
can aid designers with fast
evaluations and guide
algorithmic optimization
with optimal inputs
AI can improve core components of compilers, including
sequencing, scheduling, tiling, and placement
AI compilers can be
optimized with AI
“Neural topological ordering for computation graphs”, NeurIPS 2022
Qualcomm AI Stack, Qualcomm Neural Processing SDK, and Qualcomm AI Engine Direct are products of Qualcomm Technologies, Inc. and/or its subsidiaries. AIMET Model Zoo is a product of Qualcomm Innovation Center, Inc.
[Figure: Qualcomm AI Stack. Labels shown: AI Frameworks; AI Runtimes (Qualcomm® Neural Processing SDK, TF Lite, TF Lite Micro, Direct ML); Qualcomm® AI Engine Direct; Programming Languages; Virtual platforms; Core Libraries; Math Libraries; Profilers & Debuggers; Compilers; System Interface (SoC, accelerator drivers); Emulation Support; Platforms (Auto, XR, Robotics, IoT, ACPC, Smartphones, Cloud); Tools (AIMET, AIMET Model Zoo, NAS, Model analyzers, Qualcomm AI Model Studio); Infrastructure]
Deployment
Compiler
Tiling
Sequencing
Scheduling
…
Placement
Tiling and placement
Splits net blocks into efficient
code Ops and places them
on multiple compute devices
Sequencing
Determines the best compute
ordering of the nodes
Scheduling
Parallelizes across compute
engines and sets final timing
Deployment
Puts the resulting
generated code onto
the target hardware
Our example here will focus
on the sequencing problem
The AI compiler converts an input neural net into an
efficient program that runs on target hardware
Optimizing for latency, performance, and power
Neural net Sequencing
The runtime
on the target
device is
expensive
to evaluate
• It can take minutes to
set up the compiler with
all your decisions and run
one computation graph
• Quite a large
decision space
• Goal: minimizing the
runtime of the pipeline
over the space of tiling,
sequencing, and
scheduling choices
• Metrics: Double Data Rate
(DDR) and Tightly Coupled
Memories (TCM)
Minimizing
memory usage:
• Maximizes inferences/sec
by reducing waits for
external memory access
and by allowing for
compute parallelism
• Minimizes energy/inference
by reducing data movement
Sequencing has a big impact on memory usage
Option #1: Data pre-fetch
High memory utilization, peak > 4 MB
Option #2: Low memory utilization
Lowest memory utilization, peak = 3.25 MB
[Figure: for each option, memory utilization (0 to 5) over time (1 to 20) for the ops CONV_A_0/1, CONV_B_0/1, POOL_0/1, and ACT_0/1 and the weights WT_A and WT_B]
• # sequences is
# node permutations
that follow precedence
constraints
• Grows exponentially
in # nodes in
compute graph
“Neural Topological Ordering for Computation Graphs”, NeurIPS 2022
Input
A computation graph for a neural net
(up to 10k’s of nodes)
Path cost during execution
progression of improvements
Objective
Find a sequence that minimizes
the peak memory usage
Application
Reduce the inference time of AI models on chips
by minimizing access to external memory
External memory access can take 100x longer than local memory
access and compute, and is energy-intensive
Computation graph (Ops)
Sequencing computation graphs to minimize memory usage
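To make the objective concrete, the sketch below scores an ordering by its peak memory and enumerates every valid ordering of a tiny graph. The memory model (each node's output stays live until its last consumer has run) and the six-node graph with its sizes are assumptions for illustration, not data from the paper:

```python
def peak_memory(order, preds, size):
    """Peak total size of live outputs while executing nodes in this order."""
    consumers_left = {n: 0 for n in size}
    for n, ps in preds.items():
        for p in ps:
            consumers_left[p] += 1
    live, peak = 0, 0
    for n in order:
        live += size[n]             # allocate this node's output
        peak = max(peak, live)      # inputs and output coexist while n runs
        for p in preds[n]:
            consumers_left[p] -= 1
            if consumers_left[p] == 0:
                live -= size[p]     # last consumer finished: free the input
    return peak


def topological_orders(preds):
    """Yield every node ordering that respects the precedence constraints."""
    def extend(prefix, done):
        if len(prefix) == len(preds):
            yield list(prefix)
            return
        for n in preds:
            if n not in done and all(p in done for p in preds[n]):
                yield from extend(prefix + [n], done | {n})
    yield from extend([], frozenset())


# Hypothetical graph: two branches (a1 -> a2, b1 -> b2) between "in" and "out".
PREDS = {"in": [], "a1": ["in"], "a2": ["a1"],
         "b1": ["in"], "b2": ["b1"], "out": ["a2", "b2"]}
SIZE = {"in": 1, "a1": 4, "a2": 1, "b1": 4, "b2": 1, "out": 1}

breadth_first = ["in", "a1", "b1", "a2", "b2", "out"]
depth_first = ["in", "a1", "a2", "b1", "b2", "out"]
print(peak_memory(breadth_first, PREDS, SIZE))  # 9
print(peak_memory(depth_first, PREDS, SIZE))    # 6
all_orders = list(topological_orders(PREDS))
print(len(all_orders), min(peak_memory(o, PREDS, SIZE) for o in all_orders))  # 6 6
```

Even this small graph has 6 valid orderings with peaks ranging from 6 to 9. Exhaustive enumeration stops being an option long before graphs reach thousands of nodes, which is the motivation for a learned sequencer.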
End-to-end machine learning (ML) for sequencing
Formulation originally motivated by “Attention, learn to solve routing problems!”, ICLR 2019
Initial embeddings
Use node properties based
on graph structure as initial
node embeddings
Key components
Encoder
Use custom graph NN to capture
the graph topology in embeddings
while allowing for arbitrary graphs
Decoder
Generates a distribution
in the sequence space
$P_\theta(\mathrm{seq} \mid G)$ and autoregressively
generates a sequence
Objective
$\min_\theta \mathbb{E}_{\mathrm{seq} \sim P_\theta}\left[\mathrm{Cost}(\mathrm{seq})\right]$,
use RL to train encoder
decoder architecture
end-to-end
ML-based sequencer
[Figure: ML-based sequencer pipeline. Input compute graph → Embed (embedding) → Encoder (propagator) → Decoder (sequencer) → sequence probabilities → Search (sampling, beam search, greedy) → ordering. The policy net is an RL-trained agent.]
A combination of real and synthetic graphs used for training
We developed a novel approach to generating realistic synthetic graphs since real graph data is scarce
Real graphs
Synthetic graphs
We released the algorithm to make these
BS: Beam Search; DP: Dynamic Programming; BFS: Breadth-First Sequence; DFS: Depth-First Sequence
Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
[Figure: memory usage gap (%) versus sequence generation time on the real graphs test set (23 graphs)]
Our model
generalizes well
and beats baselines
comprehensively
• Dataset of 115
representative graphs
• Size: few dozen
to 1k nodes
• Our model performs
better and is much faster
• Results on Snapdragon 8
“Neural DAG Scheduling via One-Shot Priority Sampling”, NeurIPS 2022 Optimization Workshop, to appear at ICLR 2023
End-to-end ML
for scheduling
Input
A computation graph with
Op durations, set of compute devices
Output
Schedule: Define the start time
and device allocation for all nodes
Objective
Find a schedule that minimizes
the runtime (latency)
Applications
Reduce the inference time
of ML models on chips
[Figure: an example directed acyclic graph (DAG) of four nodes with durations of 1.0 or 2.0, and its final schedule]
Maximize utilization of
parallel hardware while
maintaining sequence
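For reference, a classical critical-path list scheduler can be sketched as below. This is an assumed form of a CP-style baseline, not the learned scheduler from the paper; the four-node graph and its edges are hypothetical, with durations loosely following the example figure:

```python
import heapq


def critical_path_lengths(succs, duration):
    """Longest duration-weighted path from each node to any sink."""
    memo = {}

    def cp(n):
        if n not in memo:
            memo[n] = duration[n] + max((cp(s) for s in succs[n]), default=0.0)
        return memo[n]

    return {n: cp(n) for n in succs}


def list_schedule(succs, duration, n_devices):
    """Greedy list scheduling; returns ({node: (device, start)}, makespan)."""
    priority = critical_path_lengths(succs, duration)
    preds_left = {n: 0 for n in succs}
    for n in succs:
        for s in succs[n]:
            preds_left[s] += 1
    earliest = {n: 0.0 for n in succs}            # when all inputs are available
    ready = {n for n in succs if preds_left[n] == 0}
    device_free = [(0.0, d) for d in range(n_devices)]
    heapq.heapify(device_free)
    schedule = {}
    while ready:
        free_time, dev = heapq.heappop(device_free)     # earliest free device
        node = max(ready, key=lambda n: priority[n])    # longest remaining path
        ready.remove(node)
        start = max(free_time, earliest[node])
        finish = start + duration[node]
        schedule[node] = (dev, start)
        heapq.heappush(device_free, (finish, dev))
        for s in succs[node]:
            earliest[s] = max(earliest[s], finish)
            preds_left[s] -= 1
            if preds_left[s] == 0:
                ready.add(s)
    makespan = max(start + duration[n] for n, (_, start) in schedule.items())
    return schedule, makespan


# Hypothetical DAG: 1 -> 2 -> 4 and 1 -> 3 -> 4.
SUCCS = {1: [2, 3], 2: [4], 3: [4], 4: []}
DURATION = {1: 1.0, 2: 2.0, 3: 1.0, 4: 1.0}
schedule, makespan = list_schedule(SUCCS, DURATION, n_devices=2)
print(schedule)  # {1: (0, 0.0), 2: (1, 1.0), 3: (0, 1.0), 4: (0, 3.0)}
print(makespan)  # 4.0
```

With two devices the makespan is 4.0, against 5.0 for running the four nodes back to back, because nodes 2 and 3 overlap. A greedy rule like this is not optimal in general, which is the gap a learned scheduler aims to close.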
CP: Critical Path; SPT: Shortest Processing Time; MOPNR: Most Operations Remaining
Our end-to-end ML scheduler achieves state-of-the-art results
Can be optimized for different performance metrics, such as latency
SpeedUp by graph size (number of nodes)
Algorithm               200 – 500   500 – 700   700 – 1000
Baseline schedulers
  CP                    3.17        2.80        2.74
  SPT                   3.11        2.87        2.66
  MOPNR                 3.18        2.82        2.74
Our AI scheduler
  S(256)                3.28        3.20        2.86
End-to-end ML structure
similar to previous for
sequencing, train to
minimize latency
Results for a set of compute
graphs of different sizes
Achieves better speedup
(inversely proportional to
latency, higher is better)
SpeedUp – higher is better
Broad range of research directions for AI combinatorial optimization
• AI for full end-to-end compiler: include all aspects, including compute primitives and tiling
• Self-supervised learning for CO: learn an efficient problem representation that is easier to solve
• Dynamical systems and AI: combine efficiency of dynamic models with AI learning
• Scale AI up to larger graphs: need to efficiently extract global graph structure
• Multi-core compiler: further develop AI approaches targeting many distributed cores
• AI-augmentation of solvers: combine best of solvers and AI methods
• Robustify learning to distribution shift: ensure good results for diverse inputs
Improved combinatorial
optimization techniques
offer benefits for a variety of
use cases across industries
Qualcomm AI Research has
achieved state-of-the-art results
in combinatorial optimization for
chip design and AI compilers
We are enabling combinatorial
optimization technology at scale
to address challenging problems
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
www.qualcomm.com/research/artificial-intelligence @QCOMResearch
www.qualcomm.com/news/onq
www.youtube.com/c/QualcommResearch www.slideshare.net/qualcommwirelessevolution
Connect with us
Questions
Follow us on:
For more information, visit us at:
qualcomm.com & qualcomm.com/blog
Thank you
Nothing in these materials is an offer to sell any of the components
or devices referenced herein.
©2018-2023 Qualcomm Technologies, Inc. and/or its affiliated
companies. All Rights Reserved.
Qualcomm and Snapdragon are trademarks or registered trademarks of Qualcomm
Incorporated. Other products and brand names may be trademarks
or registered trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm Incorporated,
Qualcomm Technologies, Inc., and/or other subsidiaries or business units within
the Qualcomm corporate structure, as applicable. Qualcomm Incorporated
includes our licensing business, QTL, and the vast majority of our patent portfolio.
Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates,
along with its subsidiaries, substantially all of our engineering, research and
development functions, and substantially all of our products and services businesses,
including our QCT semiconductor business.

More Related Content

What's hot

Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022
Qualcomm Research
 
5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation
Takayuki Yamazaki
 
Intelligently connecting our world in the 5G era
Intelligently connecting our world in the 5G eraIntelligently connecting our world in the 5G era
Intelligently connecting our world in the 5G era
Qualcomm Research
 
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving SchemesBeginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
3G4G
 
Device-level AI for 5G and beyond
Device-level AI for 5G and beyondDevice-level AI for 5G and beyond
Device-level AI for 5G and beyond
3G4G
 
Next-Generation Wireless Overview & Outlook Update 12/8/21
Next-Generation Wireless Overview & Outlook Update 12/8/21Next-Generation Wireless Overview & Outlook Update 12/8/21
Next-Generation Wireless Overview & Outlook Update 12/8/21
Mark Goldstein
 
6G: Potential Use Cases and Enabling Technologies
6G: Potential Use Cases and Enabling Technologies6G: Potential Use Cases and Enabling Technologies
6G: Potential Use Cases and Enabling Technologies
3G4G
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G future
Qualcomm Research
 
Introduction to 6G, prepare now training
Introduction to 6G, prepare now trainingIntroduction to 6G, prepare now training
Introduction to 6G, prepare now training
Tonex
 
Next IIoT wave: embedded digital twin for manufacturing
Next IIoT wave: embedded digital twin for manufacturing Next IIoT wave: embedded digital twin for manufacturing
Next IIoT wave: embedded digital twin for manufacturing
IRS srl
 
5G Marketting
5G  Marketting5G  Marketting
5G Marketting
ssuser220dc6
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdf
Qualcomm Research
 
5G Services Story
5G Services Story5G Services Story
5G Services Story
Ericsson
 
5G and 6G.pptx
5G and 6G.pptx5G and 6G.pptx
5G and 6G.pptx
nassmah
 
Beginners: Open RAN, White Box RAN & vRAN
Beginners: Open RAN, White Box RAN & vRANBeginners: Open RAN, White Box RAN & vRAN
Beginners: Open RAN, White Box RAN & vRAN
3G4G
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
Qualcomm Research
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18
Qualcomm Research
 
5G and IoT Security
5G and IoT Security5G and IoT Security
5G and IoT Security
NUS-ISS
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiency
Qualcomm Research
 
Huawei 5G Overview.pdf
Huawei 5G Overview.pdfHuawei 5G Overview.pdf
Huawei 5G Overview.pdf
MuthuramanElangovan
 

What's hot (20)

Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022
 
5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation
 
Intelligently connecting our world in the 5G era
Intelligently connecting our world in the 5G eraIntelligently connecting our world in the 5G era
Intelligently connecting our world in the 5G era
 
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving SchemesBeginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
Beginners: Energy Consumption in Mobile Networks - RAN Power Saving Schemes
 
Device-level AI for 5G and beyond
Device-level AI for 5G and beyondDevice-level AI for 5G and beyond
Device-level AI for 5G and beyond
 
Next-Generation Wireless Overview & Outlook Update 12/8/21
Next-Generation Wireless Overview & Outlook Update 12/8/21Next-Generation Wireless Overview & Outlook Update 12/8/21
Next-Generation Wireless Overview & Outlook Update 12/8/21
 
6G: Potential Use Cases and Enabling Technologies
6G: Potential Use Cases and Enabling Technologies6G: Potential Use Cases and Enabling Technologies
6G: Potential Use Cases and Enabling Technologies
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G future
 
Introduction to 6G, prepare now training
Introduction to 6G, prepare now trainingIntroduction to 6G, prepare now training
Introduction to 6G, prepare now training
 
Next IIoT wave: embedded digital twin for manufacturing
Next IIoT wave: embedded digital twin for manufacturing Next IIoT wave: embedded digital twin for manufacturing
Next IIoT wave: embedded digital twin for manufacturing
 
5G Marketting
5G  Marketting5G  Marketting
5G Marketting
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdf
 
5G Services Story
5G Services Story5G Services Story
5G Services Story
 
5G and 6G.pptx
5G and 6G.pptx5G and 6G.pptx
5G and 6G.pptx
 
Beginners: Open RAN, White Box RAN & vRAN
Beginners: Open RAN, White Box RAN & vRANBeginners: Open RAN, White Box RAN & vRAN
Beginners: Open RAN, White Box RAN & vRAN
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18
 
5G and IoT Security
5G and IoT Security5G and IoT Security
5G and IoT Security
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiency
 
Huawei 5G Overview.pdf
Huawei 5G Overview.pdfHuawei 5G Overview.pdf
Huawei 5G Overview.pdf
 

Similar to Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI

Constraint Programming - An Alternative Approach to Heuristics in Scheduling
Constraint Programming - An Alternative Approach to Heuristics in SchedulingConstraint Programming - An Alternative Approach to Heuristics in Scheduling
Constraint Programming - An Alternative Approach to Heuristics in Scheduling
Eray Cakici
 
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary pathISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
John Holden
 
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
enriquealbabaena6868
 
computer architecture.
computer architecture.computer architecture.
computer architecture.
Shivalik college of engineering
 
Syste O CHip Concepts for Students.ppt
Syste O CHip Concepts for Students.pptSyste O CHip Concepts for Students.ppt
Syste O CHip Concepts for Students.ppt
monzhalabs
 
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTXDecision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
SanjayKPrasad2
 
AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)
byteLAKE
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...
NECST Lab @ Politecnico di Milano
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Joachim Schlosser
 
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Codemotion
 
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
TigerGraph
 
realtime_ai_systems_academia.pptx
realtime_ai_systems_academia.pptxrealtime_ai_systems_academia.pptx
realtime_ai_systems_academia.pptx
gopikahari7
 
lec01.pdf
lec01.pdflec01.pdf
lec01.pdf
BeiYu6
 
[TMS 2018] 기술개발 / FuriosaAI 백준호 CEO, 글로벌 격전지에서 발견한 기회
[TMS 2018] 기술개발 / FuriosaAI 백준호 CEO, 글로벌 격전지에서 발견한 기회 [TMS 2018] 기술개발 / FuriosaAI 백준호 CEO, 글로벌 격전지에서 발견한 기회
[TMS 2018] 기술개발 / FuriosaAI 백준호 CEO, 글로벌 격전지에서 발견한 기회
NAVER D2 STARTUP FACTORY
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
TigerGraph
 
AMIT PATIL- Embedded OS Professional
AMIT PATIL- Embedded OS ProfessionalAMIT PATIL- Embedded OS Professional
AMIT PATIL- Embedded OS ProfessionalAmit Patil
 
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
LEGATO project
 
System On Chip
System On ChipSystem On Chip
System On Chip
A B Shinde
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscte
mbreternitz
 
Ag o product overview
Ag o product overviewAg o product overview
Ag o product overviewManoj Nagesh
 

Similar to Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI (20)

Constraint Programming - An Alternative Approach to Heuristics in Scheduling
Constraint Programming - An Alternative Approach to Heuristics in SchedulingConstraint Programming - An Alternative Approach to Heuristics in Scheduling
Constraint Programming - An Alternative Approach to Heuristics in Scheduling
 
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary pathISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
 
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
 
computer architecture.
computer architecture.computer architecture.
computer architecture.
 
Syste O CHip Concepts for Students.ppt
Syste O CHip Concepts for Students.pptSyste O CHip Concepts for Students.ppt
Syste O CHip Concepts for Students.ppt
 
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTXDecision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
Decision Optimization - CPLEX Optimization Studio - Product Overview(2).PPTX
 
AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...





Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI

  • 1. Solving unsolvable combinatorial problems with AI. Chris Lott, Senior Director, Engineering, Qualcomm Technologies, Inc. January 31, 2023. @QCOMResearch
  • 2. Today’s agenda: the need for combinatorial optimization; solving combinatorial optimization problems with AI; improved chip design with AI; improved compilers with AI; future directions; questions.
  • 3. How do you find an optimal solution when faced with many choices? Some problems have more possible solutions than a game of Go: $\sim 10^{170}$.
  • 4. Combinatorial optimization problems are all around us: supply chain optimization, hardware-specific compilers, chip design, and airline network planning. Finding solutions can provide significant benefits: reducing cost, reducing time, and increasing performance.
  • 5. Traveling Salesman Problem (TSP). Given a set of N cities with the travel distance between each pair, find the shortest path (tour) that visits each city exactly once and returns to the start city. Example: using a brute-force search method, if a computer can check one solution per microsecond, it would take 2 microseconds to solve 3 cities, 3.6 seconds for 11 cities, and 3857 years for 20 cities. The TSP exemplifies the combinatorial optimization problem.
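The timings quoted for brute-force search follow from the (N-1)! tour count. The short check below is a sketch under two assumptions that are not stated on the slide: one tour is checked per microsecond, and a year is taken as 365 days. The function name is illustrative.

```python
import math

MICROSECONDS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def brute_force_seconds(n_cities: int) -> float:
    """Seconds needed to check all (N-1)! tours at one tour per microsecond."""
    tours = math.factorial(n_cities - 1)
    return tours / MICROSECONDS_PER_SECOND


for n in (3, 11, 20):
    seconds = brute_force_seconds(n)
    print(f"N={n}: {seconds:.6g} s, {seconds / SECONDS_PER_YEAR:.6g} years")
```

For N = 3 this gives 2 microseconds, for N = 11 about 3.6 seconds, and for N = 20 roughly 3857 years, matching the figures above.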
  • 6. How to approach solving TSP with brute force. [Figure: an instance of the traveling salesman problem on five cities, a to e, with weighted edges, and its search space represented as a tree; example leaf tours include abcdea and abceda.] Search space (represented as a tree): start at any city; choose from N-1 next cities; choose from N-2 next cities; ...; choose the last city and connect to the start city. Each tree leaf node represents one full tour of the cities. There are (N-1)! paths (tours) in the search tree, so full enumeration quickly becomes infeasible as N grows.
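The tree enumeration described above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the 4-city distance matrix `DIST` is a made-up example (the edge weights of the slide's figure are not recoverable from the transcript), and the function names are hypothetical.

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for 4 cities (not the slide's instance).
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tsp(dist):
    """Fix city 0 as the start and enumerate all (N-1)! orderings of the rest."""
    best_tour, best_len = None, float("inf")
    for rest in permutations(range(1, len(dist))):
        tour = (0,) + rest
        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len


print(brute_force_tsp(DIST))
```

Fixing the start city is what reduces the count from N! to (N-1)!; every leaf of the search tree corresponds to one tuple produced by `permutations`.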
  • 7. Existing TSP solutions face challenges. Brute force method: full path enumeration; naive; scales as (N-1)!; infeasible for N > 20. Heuristic methods: the Nearest Neighbor method; N-opt iterative path improvement; not guaranteed optimal; heuristics are problem-specific (human-designed). Exact solver methods: dynamic programming; formulate as Integer Linear Programming (ILP) and use Branch and Cut (B&C), which uses branch and bound to rule out whole solution subspaces, combined with problem-specific cutting planes; scales up to 1000s of nodes, but at high computational cost; problem-specific. [Figures: the TSP instance and search tree from the previous slide with node costs annotated; a five-node example graph (A to E) repeated across several steps; subspaces ruled out by B&C.]
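The two heuristic families named above can be sketched as follows. This is an illustration under assumptions: the distance matrix `DIST` is a made-up 4-city example, 2-opt is used as the simplest member of the N-opt family, and neither function is guaranteed to return an optimal tour.

```python
# Hypothetical symmetric distance matrix for 4 cities (illustrative only).
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def nearest_neighbor_tour(dist, start=0):
    """Greedy construction: always move to the closest unvisited city."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda city: dist[here][city])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def two_opt(tour, dist):
    """Iterative improvement: reverse a segment whenever that shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, dist) < tour_length(best, dist):
                    best, improved = candidate, True
    return best


print(nearest_neighbor_tour(DIST))
print(two_opt([0, 1, 2, 3], DIST))
```

Both routines run in polynomial time, which is why they scale where enumeration does not, but the stopping point of 2-opt is only a local optimum in general.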
  • 8. Existing combinatorial optimization techniques have limitations. Scale: search heuristics don’t scale to larger problems in acceptable compute time and cost, and do not guarantee that all constraints are satisfied, resulting in expensive manual intervention. Learning: techniques don’t incorporate knowledge learned from solving many problems; they start each new problem instance from scratch. How can we improve this situation with AI?
  • 9. AI addresses challenges of traditional combinatorial optimization solutions. It leverages learned problem structure, scales to larger instances, offers a general framework, and can achieve the desired outcome within resource, cost, and time constraints. Develop an AI algorithm that can learn the correlation between design parameters and optimization metrics from a limited set of problem instances. For new instances, the AI algorithm uses an existing solver more efficiently by reducing the parameter search space. [Diagram labels: standard process (heuristics) and AI process (AI), each supplying parameters to a solver that yields the optimization metric.] Notable prior work: “Attention, learn to solve routing problems!”, ICLR 2019.
  • 10. Optimizing chip design with AI. Exploring Bayesian optimization to reduce the combinatorial search space. “Bayesian Optimization for Macro Placement”, ICML 2022.
  • 11. Competing combinatorial optimization objectives in chip design. Need to account for all business metrics. Production cost reduction: chip area; yield (more complex fab process); test time. Power-performance optimization: system-level power; chip-level power. Design efficiency: # of tool iterations; license cost. Capital expense (Capex): # of compute servers; emulation platforms.
  • 12. Semiconductor scaling advantage is approaching a cliff. Foundation, key elements, and disruption: chips, Moore’s Law PPA scaling, 50% less gain; design automation, combinatorial optimization, 100x PVT corners; computing, cloud servers, 10x expense. PPA = Power, Performance, Area.
  • 13. CO needs to compensate for diminishing technology gains and PVT corner increase within acceptable compute expense. [Figure: area scaling over time, 2012 to 2030, on a log scale from 0.01 to 1; series for standard cells (digital logic), SRAM bitcell (memory), and analog/IO, compared with the theoretical Moore’s Law rate of 0.5X per 2 years.]
  • 14. Challenges in chip placement. How can we solve the chip optimization problem? Minimize power and area while satisfying timing constraints within limited design resources (people and compute servers). 1. Placing mixed-size blocks: standard cells and macros (memories). Macros (memories): $O(10\text{ to }100)$ blocks. Standard cells (logic gates): $O(10^7\text{ to }10^9)$ blocks, point-like by comparison (figure label: 1000x). 2. Scale with increasing complexity of design (# of blocks) and constraints (e.g., PVT corners).
  • 15. Chip design comprises iterative macro and standard cell placement, a complex and very computationally intensive part of chip design. Designers manually select a macro placement and use solvers to optimize the standard cell placement (inner loop) and then manually iterate (outer loop). Each iteration can take up to several weeks for state-of-the-art designs and technologies. [Diagram: outer loop over macro placement, hours to days; inner loop over cell placement; no-overlap 2D macro placement space of size $(N!)^2$.]
  • 16. Bayesian optimization efficiently solves problems iteratively. Goal: find the minimum of an expensive function. 1. Fit a probabilistic surrogate model. 2. Minimize a cheap acquisition function to choose the next point, trading off exploration of highly uncertain areas against exploitation of promising current minimal areas. 3. Evaluate and go back to 1. [Figure: cost over the search space at successive iterations, showing data, true function, surrogate, surrogate uncertainty, and next point.] How to apply to macro placement?
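The three-step loop above can be sketched with the standard library alone. Everything specific here is an assumption made for illustration and not taken from the talk: a Gaussian-process surrogate with an RBF kernel, a lower-confidence-bound acquisition function (mean minus `kappa` times standard deviation), a 1D grid of candidates, and the toy function `expensive_cost`.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Step 1: surrogate mean and standard deviation at x_new."""
    y_mean = sum(ys) / len(ys)
    centered = [y - y_mean for y in ys]
    k_mat = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
             for i, a in enumerate(xs)]
    k_vec = [rbf(a, x_new) for a in xs]
    mean = y_mean + sum(k * w for k, w in zip(k_vec, solve(k_mat, centered)))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_vec, solve(k_mat, k_vec)))
    return mean, math.sqrt(max(var, 0.0))


def bayes_opt(cost, candidates, n_init=3, n_iter=10, kappa=2.0):
    """Run the fit / acquire / evaluate loop over a finite candidate set."""
    step = max(1, len(candidates) // n_init)
    xs = candidates[::step][:n_init]
    ys = [cost(x) for x in xs]
    for _ in range(n_iter):
        untried = [x for x in candidates if x not in xs]
        if not untried:
            break

        def lcb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean - kappa * std  # low mean = exploit, high std = explore

        x_next = min(untried, key=lcb)  # step 2: minimize the acquisition
        xs.append(x_next)               # step 3: evaluate, then refit
        ys.append(cost(x_next))
    best = min(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best]


def expensive_cost(x):
    """Toy stand-in for an expensive evaluation (hypothetical)."""
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)


grid = [i / 40 for i in range(41)]
print(bayes_opt(expensive_cost, grid))
```

The point of the design is that `expensive_cost` is called only 13 times here, while the cheap acquisition function is evaluated on every untried candidate at each iteration.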
  • 17. Bayesian optimization for macro layout. Bayesian optimization learns a surrogate function, which maps each macro placement to a PPA quality metric, and uses it to narrow down the search over the large macro placement space. Inner loop optimization is incorporated into the surrogate function. [Diagram: outer loop over macro placement, hours to days, with a search space of $(50!)^2 \approx 10^{128}$; inner loop over cell placement represented by the PPA surrogate function.] “Bayesian Optimization for Macro Placement”, ICML 2022.
  • 18. Bayesian optimization can converge faster with better design metrics compared to conventional simulated annealing heuristics for an unconstrained version of the problem. Results are for the public MCNC benchmark for layouts (without standard cells), on three different chip designs: hp, ami33, ami49. The optimization objective is to minimize HPWL (wire length); this is a simpler objective than the inner loop PPA. Further research is aimed at generalizing this technique for production designs, across all PPA metrics and with the inclusion of design constraints. [Figure: wire length versus number of evaluations for simulated annealing and Bayesian optimization, one panel per design.] “Bayesian Optimization for Macro Placement”, ICML 2022.
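HPWL (half-perimeter wire length) is the objective minimized in the benchmark above: for each net, it is the half-perimeter of the bounding box around the blocks the net connects. The sketch below assumes, for simplicity, that every pin sits at its block's center; the block names, coordinates, and nets are made up.

```python
def hpwl(nets, positions):
    """Sum over nets of (bounding-box width + bounding-box height)."""
    total = 0.0
    for net in nets:
        xs = [positions[block][0] for block in net]
        ys = [positions[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical nets connecting three macros.
nets = [("m0", "m1"), ("m0", "m1", "m2")]

spread_out = {"m0": (0, 0), "m1": (3, 4), "m2": (1, 1)}
compact = {"m0": (0, 0), "m1": (1, 0), "m2": (1, 1)}

print(hpwl(nets, spread_out), hpwl(nets, compact))
```

Because the metric needs only block coordinates and net membership, it is cheap to evaluate, which is one reason it is a simpler objective than a full PPA evaluation.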
  • 19. Algorithmic optimization based on analytic solutions is slow and costly, but accurate for sign-off and automation; it needs human guidance. Data-driven AI optimization is fast, cheap, and accurate for optimization; it can aid designers with fast evaluations and guide algorithmic optimization with optimal inputs.
  • 20. AI compilers can be optimized with AI. AI can improve core components of compilers, including sequencing, scheduling, tiling, and placement. “Neural topological ordering for computation graphs”, NeurIPS 2022.
  • 21. [Diagram: software stack. Labels: Smartphones, ACPC, IoT, Robotics, XR, Auto, Cloud Platforms; AI Frameworks; AI Runtimes; Qualcomm® Neural Processing SDK, TF Lite, TF Lite Micro, Direct ML; Qualcomm® AI Engine Direct; Tools: AIMET, AIMET Model Zoo, NAS, model analyzers, Qualcomm AI Model Studio; Infrastructure: programming languages, virtual platforms, core libraries, math libraries, profilers and debuggers, compilers, system interface, SoC and accelerator drivers, emulation support.] Qualcomm AI Stack, Qualcomm Neural Processing SDK, and Qualcomm AI Engine Direct are products of Qualcomm Technologies, Inc. and/or its subsidiaries. AIMET Model Zoo is a product of Qualcomm Innovation Center, Inc.
  • 22. The AI compiler converts an input neural net into an efficient program that runs on target hardware, optimizing for latency, performance, and power. Tiling and placement: splits net blocks into efficient code ops and places them on multiple compute devices. Sequencing: determines the best compute ordering of the nodes. Scheduling: parallelizes across compute engines and sets final timing. Deployment: puts the resulting generated code onto the target hardware. Our example here will focus on the sequencing problem. [Diagram: neural net → compiler (tiling, placement, sequencing, scheduling, …) → deployment.]
  • 23. The runtime on the target device is expensive to evaluate. It can take minutes to set up the compiler with all your decisions and run one computation graph. The decision space is quite large. Goal: minimizing the runtime of the pipeline over the space of tiling, sequencing, and scheduling choices. Metrics: Double Data Rate (DDR) and Tightly Coupled Memories (TCM).
  • 24. Sequencing has a big impact on memory usage. Minimizing memory usage maximizes inferences/sec by reducing waits for external memory access and by allowing for compute parallelism, and minimizes energy/inference by reducing data movement. [Figure: memory utilization (axis 0 to 5) over time steps 1 to 20 for operations CONV_A_0, CONV_A_1, CONV_B_0, CONV_B_1, POOL_0, POOL_1, ACT_0, ACT_1 and weights WT_A, WT_B under two orderings. Option #1, data pre-fetch: high memory utilization, peak > 4 MB. Option #2, low memory utilization: lowest memory utilization, peak = 3.25 MB.]
  • 25. Sequencing computation graphs to minimize memory usage. Input: a computation graph for a neural net (up to tens of thousands of nodes). Objective: find a sequence that minimizes the peak memory usage. Application: reduce the inference time of AI models on chips by minimizing access to external memory; external memory access can take 100x as long as local memory access and compute, and is energy-intensive. The number of sequences is the number of node permutations that follow the precedence constraints, and it grows exponentially in the number of nodes in the compute graph. [Figure: computation graph (ops); path cost during execution; progression of improvements.] “Neural Topological Ordering for Computation Graphs”, NeurIPS 2022.
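To make the sequencing objective concrete, the sketch below scores a topological order by its peak memory and brute-forces a tiny graph. The memory model is an assumption for illustration, not the one used in the cited paper: each op's output is allocated when the op runs and freed once its last consumer has run. The graph and tensor sizes are made up.

```python
from itertools import permutations

# Hypothetical graph: two branches (x1 -> y1, x2 -> y2) joined by z.
PREDS = {"x1": set(), "y1": {"x1"}, "x2": set(), "y2": {"x2"}, "z": {"y1", "y2"}}
OUT_SIZE = {"x1": 4, "y1": 1, "x2": 4, "y2": 1, "z": 1}


def is_topological(order, preds):
    """True if every node appears after all of its predecessors."""
    seen = set()
    for node in order:
        if not preds[node] <= seen:
            return False
        seen.add(node)
    return True


def peak_memory(order, preds, out_size):
    """Peak of live output sizes while executing nodes in the given order."""
    consumers_left = {n: sum(n in preds[m] for m in preds) for n in preds}
    live = peak = 0
    for node in order:
        live += out_size[node]  # output is allocated while inputs are still live
        peak = max(peak, live)
        for p in preds[node]:
            consumers_left[p] -= 1
            if consumers_left[p] == 0:  # last consumer ran: free the input
                live -= out_size[p]
    return peak


depth_first = ("x1", "y1", "x2", "y2", "z")
breadth_first = ("x1", "x2", "y1", "y2", "z")
print(peak_memory(depth_first, PREDS, OUT_SIZE))
print(peak_memory(breadth_first, PREDS, OUT_SIZE))

valid = [o for o in permutations(PREDS) if is_topological(o, PREDS)]
print(len(valid), min(peak_memory(o, PREDS, OUT_SIZE) for o in valid))
```

Even this five-node graph has several valid orders with different peaks; enumerating them stops being feasible well before the graph sizes mentioned above, which is what motivates a learned sequencer.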
  • 26. End-to-end machine learning (ML) for sequencing. Key components. Initial embeddings: use node properties based on graph structure as initial node embeddings. Encoder: use a custom graph NN to capture the graph topology in embeddings while allowing for arbitrary graphs. Decoder: generates a distribution in the sequence space, $P_\theta(\mathrm{seq} \mid G)$, and autoregressively generates a sequence. Objective: $\min_\theta \, \mathbb{E}_{\mathrm{seq} \sim P_\theta}\left[\mathrm{Cost}(\mathrm{seq})\right]$; use RL to train the encoder-decoder architecture end-to-end. [Diagram: ML-based sequencer. Input compute graph → embed → encoder (embedding propagator) → decoder (sequencer, policy net, RL-trained agent) → sequence probabilities → search (sampling, beam search, greedy) → ordering.] Formulation originally motivated by “Attention, learn to solve routing problems!”, ICLR 2019.
  • 27. A combination of real and synthetic graphs is used for training. We developed a novel approach to generating realistic synthetic graphs since real graph data is scarce. We released the algorithm to make these. [Figure: examples of real graphs and synthetic graphs.]
  • 28. Our model generalizes well and beats baselines comprehensively. Dataset of 115 representative graphs; size: a few dozen to 1k nodes; our model performs better and is much faster; results on Snapdragon 8. [Figure: memory usage gap % and sequence generation time on the real graphs test set (23 graphs).] BS: Beam Search; DP: Dynamic Programming; BFS: Breadth-First Sequence; DFS: Depth-First Sequence. Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
  • 29. End-to-end ML for scheduling. Input: a computation graph with op durations and a set of compute devices. Output: a schedule that defines the start time and device allocation for all nodes. Objective: find a schedule that minimizes the runtime (latency). Applications: reduce the inference time of ML models on chips. Maximize utilization of parallel hardware while maintaining sequence. [Figure: a four-node Directed Acyclic Graph (DAG) with durations 1.0, 1.0, 2.0, 1.0, and its final schedule.] “Neural DAG Scheduling via One-Shot Priority Sampling”, NeurIPS 2022 Optimization Workshop, to appear at ICLR 2023.
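The input, output, and objective above can be illustrated with classic list scheduling driven by critical-path priorities. This is a sketch of a conventional heuristic, not the learned scheduler from the cited paper. The node durations match the figure, but the edges of the DAG (a diamond: 1 → 2, 1 → 3, 2 → 4, 3 → 4) are an assumption, since the transcript does not preserve them.

```python
DURATIONS = {1: 1.0, 2: 1.0, 3: 2.0, 4: 1.0}
PREDS = {1: [], 2: [1], 3: [1], 4: [2, 3]}  # assumed edges (hypothetical)


def critical_path_lengths(durations, preds):
    """Longest duration-weighted path from each node to the end of the graph."""
    succs = {n: [] for n in durations}
    for node, parents in preds.items():
        for p in parents:
            succs[p].append(node)
    memo = {}

    def longest(n):
        if n not in memo:
            memo[n] = durations[n] + max((longest(s) for s in succs[n]), default=0.0)
        return memo[n]

    return {n: longest(n) for n in durations}


def list_schedule(durations, preds, priority, n_devices):
    """Repeatedly place the highest-priority ready node on the earliest device."""
    finish, schedule = {}, {}
    device_free = [0.0] * n_devices
    remaining = sorted(durations)
    while remaining:
        ready = [n for n in remaining if all(p in finish for p in preds[n])]
        node = max(ready, key=lambda n: priority[n])
        ready_time = max((finish[p] for p in preds[node]), default=0.0)
        device = min(range(n_devices), key=lambda d: max(device_free[d], ready_time))
        start = max(device_free[device], ready_time)
        finish[node] = device_free[device] = start + durations[node]
        schedule[node] = (start, device)
        remaining.remove(node)
    return schedule, max(finish.values())


priority = critical_path_lengths(DURATIONS, PREDS)
schedule, makespan = list_schedule(DURATIONS, PREDS, priority, n_devices=2)
print(schedule, makespan)
```

With two devices the sketch overlaps nodes 2 and 3 and finishes at time 4.0, versus 5.0 on a single device; the returned dictionary maps each node to its start time and device, which is the schedule format described above.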
  • 30. Our end-to-end ML scheduler achieves state-of-the-art results. Can be optimized for different performance metrics, such as latency. The end-to-end ML structure is similar to the one used for sequencing, trained to minimize latency. Results for a set of compute graphs of different sizes; SpeedUp is inversely proportional to latency (higher is better), and our scheduler achieves the best SpeedUp in every size range:
      Group               | Algorithm | 200 – 500 nodes | 500 – 700 nodes | 700 – 1000 nodes
      Baseline schedulers | CP        | 3.17            | 2.80            | 2.74
      Baseline schedulers | SPT       | 3.11            | 2.87            | 2.66
      Baseline schedulers | MOPNR     | 3.18            | 2.82            | 2.74
      Our AI scheduler    | S(256)    | 3.28            | 3.20            | 2.86
    CP: Critical Path; SPT: Shortest Processing Time; MOPNR: Most Operations Remaining.
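As a rough illustration of the Critical Path (CP) baseline and the SpeedUp metric on this slide, the sketch below computes each node's longest remaining path through the DAG, which a CP scheduler uses as its priority. The function names, the example graph, and the definition of speedup as single-device runtime divided by makespan are assumptions; the slide only states that SpeedUp is inversely proportional to latency.

```python
from functools import lru_cache

def critical_path_priority(durations, succs):
    """Longest duration-weighted path from each node to the end of the DAG."""
    @lru_cache(maxsize=None)
    def longest(node):
        # A node's own duration plus the longest path through any successor.
        return durations[node] + max((longest(s) for s in succs[node]), default=0.0)
    return {n: longest(n) for n in durations}

def speedup(durations, makespan):
    """Assumed definition: single-device (sequential) runtime divided by makespan."""
    return sum(durations.values()) / makespan

# Hypothetical DAG: node 1 feeds nodes 2 and 3, which both feed node 4.
durations = {1: 1.0, 2: 1.0, 3: 2.0, 4: 1.0}
succs = {1: [2, 3], 2: [4], 3: [4], 4: []}

cp = critical_path_priority(durations, succs)
# The longest path in the graph is a lower bound on any schedule's makespan.
makespan_lower_bound = max(cp.values())
best_possible_speedup = speedup(durations, makespan_lower_bound)
```

SPT would instead rank ready nodes by their own duration, shortest first. The S(256) entry in the table comes from the learned scheduler and is not reproduced by this sketch.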
  • 31. Broad range of research directions for AI combinatorial optimization. AI for full end-to-end compiler: include all aspects, including compute primitives and tiling. Self-supervised learning for CO: learn an efficient problem representation that makes the problem easier to solve. Dynamical systems and AI: combine the efficiency of dynamic models with AI learning. Scale AI up to larger graphs: need to efficiently extract global graph structure. Multi-core compiler: further develop AI approaches targeting many distributed cores. AI-augmentation of solvers: combine the best of solvers and AI methods. Robustify learning to distribution shift: ensure good results for diverse inputs.
  • 32. Improved combinatorial optimization techniques offer benefits for a variety of use cases across industries. Qualcomm AI Research has achieved state-of-the-art results in combinatorial optimization for chip design and AI compilers. We are enabling combinatorial optimization technology at scale to address challenging problems. Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
  • 34. Thank you. For more information, visit us at: qualcomm.com & qualcomm.com/blog. Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2018-2023 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm and Snapdragon are trademarks or registered trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.