Avesta Sasan
Associate Professor
University of California, Davis
RTL DESIGN IN ML WORLD
I am Avesta Sasan!
UC Davis
Electrical and Computer Engineering
NICE TO MEET YOU!
3
2004
Clean Room Automation, IP Technology Migration
Clean Room, Automation, and Technology Mapping
Industry Experience
4
2004 2008
Memory Design, Memory Compiler Design
Memory and Compiler Design
Industry Experience
5
2004 2008 2009
EDA Design, Physical Design, VLSI Design, Signal & Power Integrity
Physical Design, SOC Design, and EDA Design 20+ tapeouts!
Industry Experience
6
2004 2008 2009 2014
EDA Design, Timing Signoff, R&D, Yield Analysis, IR Analysis
R&D, and in-house EDA Design
Industry Experience
7
2004 2008 2009 2014
Joining Academia
2016
Academic Experience
8
2004 2008 2009 2014
Moving to Davis
2016 2021
Academic Experience
Research Focus
9
Applied, Efficient & Accelerated ML
VLSI Design and Hardware Security
10
RTL Services to Accelerate AI
Let's first talk about some problems!
11
12
AI Evolves Faster Than HW
Hardware vs AI Trend
13
Source: Ark Investments
Transformer (213M param) trained with NAS
Time to Market
Design RTL/Physical Manufacture Test/Pkg/Ing
Days to Months –
Months – to years
Hardware vs AI Trend
• Hardware improvements reduce the cost of training by 37% per year
• Model size has grown at a pace of 10X per year
• AI training cost continues to climb quickly
14
Source: Ark Investments
15
Source: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
AI is Energy Demanding
CO2 emission (x1000 lbs):
2.0: Round-trip flight NY/SF (1 passenger)
11.0: Human life (1 year)
36.2: American life (1 year)
126.0: US car (1 lifetime)
626.2: Transformer (213M param) trained with NAS
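To put the chart in proportion, the listed values can be divided by one another. The short sketch below only reuses the numbers on the slide (in thousands of lbs of CO2); the dictionary keys and the helper name `ratio_to` are made up for illustration.

```python
# CO2 emissions in thousands of lbs, copied from the slide.
# The key names are illustrative labels, not taken from the source.
emissions_klbs = {
    "round_trip_flight_ny_sf": 2.0,
    "human_life_1yr": 11.0,
    "american_life_1yr": 36.2,
    "us_car": 126.0,
    "transformer_nas": 626.2,
}


def ratio_to(reference, target="transformer_nas"):
    """How many times larger the target's emission is than the reference's."""
    return emissions_klbs[target] / emissions_klbs[reference]


if __name__ == "__main__":
    for name in emissions_klbs:
        if name != "transformer_nas":
            print(f"transformer_nas / {name}: {ratio_to(name):.1f}x")
```

The ratio against the car entry comes out just under five, which matches the "five cars" wording in the title of the cited article.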
Carbon Footprint of AI and DL
• Eye-opening study:
  - The energy of training the model for one day was computed
  - It was scaled using the data in the paper on the number of GPU-days that training took
  - The cost was computed based on the average energy cost in the US
  - This is the result for a single training run
16
Source: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
BERT carbon footprint ≈ 1,400 lb of CO2, close to a round-trip trans-America flight for one person
= round-trip trans-America flight for 416 persons
Growth in AI Energy Requirements
• Models are becoming larger
  - GPT-2 → 1.5B parameters and a few petaflop-days to train
  - GPT-3 → 175B parameters
  - PaLM → 540B parameters
  - GPT-4 → 100s of B parameters!
  - What is next?
17
Source: Compute Trends Across Three Eras of Machine Learning, Sevilla et al., arXiv 2022
Growth in AI Energy Requirements
• Datasets are becoming significantly larger
  - 3B words (training set) → training of BERT
  - 32B words → XLNet
  - 40B words → GPT-2
  - 500B words → GPT-3
18
Source: https://epochai.org/blog/trends-in-training-dataset-sizes
19
AI User/App Base is Growing!
20
Source: https://www.analyticsvidhya.com/blog/2022/03/the-carbon-footprint-of-ai-and-deep-learning/#
The increase in the number of ML publications is exponential
# researchers + entities investigating ML and its applications → exponential growth
21
What Can We Do?
Improve AI Speed and Efficiency!
Need Everyone's Contribution!
22
Model: ?
Compiler: ?
Hardware: ?
RTL/Physical Design: ?
Process Design: ?
Need Everyone's Contribution!
23
Model Architecture: Pruning, Quantization, Sparse., Efficient Architectures, Low-rank Fact. Alg., Early Stopping, Adaptive Computation, E-Constrained NAS, Dynamic NN, Energy-aware Training, …
Compiler: Fusion, Reuse, Autotuning, Dynamic Precision, Lazy eval., Operator Simplification, Loop Unrolling, Cross-Function Opt., Profile-Guided opt., Dynamic Compilation, …
Hardware Architecture: 3D Architecture, Systolic Enhancement, In-HW Pruning, Sparse Support, Aggressive DVFS, HBM/TRAM/…, In-Memory Computing, Near-memory, Neuromorphic HW …, Optical HW …, Optical Comm …, …
Process Design: 3D, High-k Metal Gate, Extreme UV Litho., C-Nanotube Trans., Silicon Photonics, Spintronic …, Memristor …, …
RTL/Physical Design: ?
24
How to Change Our Approach to RTL and Physical Design to keep AI Development Energized?
25
Use ML to Automate and Improve Physical and RTL Design
Make it faster!
Make it more efficient!
How?
ML in Electronic Design Automation (EDA)
26
Modern VLSI Layout
• Designs are getting larger, with billions of transistors on a chip
• Design flows are getting increasingly complicated
27
IBM Power 10: 18B
Apple A15: 15B
NVIDIA Ampere GA100: 54B
Cerebras Mega: 1.2T
IC Physical Design Flow
28
Learning Assisted Computer Aided Design
• Physical design is a very time-consuming process
  - Iterative and incremental (5-18 months in industry)
  - Heuristic optimization algorithms
  - Human expert
• Applied learning can help to
  - Reduce the dependence on the human expert
  - Optimize beyond heuristics
  - Reduce the design time
29
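[Figure: design maturity vs. design time toward the goal, comparing the learning-assisted outcome with the conventional design outcome. Inset: the design space and the heuristic search space inside it, with the best and local-best solutions marked.]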
Problem-Solution Opportunity Matrix
• Some problems do not get old, but our solutions do!
30
New
Problem
Old
Problem
New
Solution
Old
Solution
Waste of Time! Big Money!
Moderate Risk!
Big Potential!
High Risk
Some Potential!
Lower Risk!
Invention
Experimentation
Maintenance Innovation
Physical Design
• In each physical design step, there are 2 required processes:
  - Optimization (usually a semi-heuristic solution): placement, CTS, routing, chip finishing
  - Quality of Result (QoR) analysis: e.g., power, area, timing, DRC violations, IR, EM, SI
• For example, during placement:
  - The placement of movable objects (cells) is determined (optimization)
  - The quality of the placement is measured (QoR) in terms of:
    • Power, area, timing, potential routing congestion, etc.
• The (semi-heuristic) optimization solutions are designed to improve QoR metrics.
• In an ML-assisted physical design, we also need to develop ML solutions for both optimization and QoR analysis
• Given the approximate nature of ML, signoff-level analysis is not possible. The next best thing is prediction (which can, however, be verified by EDA tools).
  - If a prediction has high accuracy and can be done far faster, it provides an advantage → it can be used in the optimization loop (full STA may take days; an ML-based prediction may take seconds)!
• ML also allows us to forecast the outcome of future steps.
  - Hence, ML can be used for both prediction and forecasting.
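The idea of using a fast prediction inside the optimization loop and verifying with the exact tool afterwards can be sketched as follows. Everything here is a toy under stated assumptions: `exact_qor` stands in for a slow signoff-quality analysis, `predicted_qor` for a fast but slightly biased ML predictor, and the one-dimensional design parameter `x` is purely hypothetical.

```python
def exact_qor(x):
    """Stand-in for a slow, accurate analysis (hypothetical cost, lower is better)."""
    return (x - 3.0) ** 2 + 1.0


def predicted_qor(x):
    """Stand-in for a fast ML predictor: close to exact_qor, but slightly biased."""
    return (x - 3.2) ** 2 + 1.1


def screen_then_verify(candidates, n_verify=5):
    """Rank all candidates with the cheap predictor, verify only a shortlist exactly."""
    shortlist = sorted(candidates, key=predicted_qor)[:n_verify]
    best = min(shortlist, key=exact_qor)  # expensive call, made n_verify times
    return best, exact_qor(best), len(shortlist)


# 121 candidate design points between 0.0 and 6.0 (hypothetical parameter sweep).
candidates = [i * 0.05 for i in range(121)]
best, best_cost, exact_calls = screen_then_verify(candidates)

if __name__ == "__main__":
    print(f"best x = {best:.2f}, exact cost = {best_cost:.4f}, "
          f"exact evaluations = {exact_calls} of {len(candidates)}")
```

The predictor's bias moves the shortlist slightly away from the true optimum, so the final answer is close to, but not exactly, the best design. That is the trade the slide describes: far fewer exact runs, with the exact tool still used to verify the final choice.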
31
How ML could help?
32
ML Framework could be used for speedup
33
• ML framework for speedup: rewrite heuristic algorithms using a learning framework, formulate the optimization as a training problem, and enjoy GPU scaling.
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup]
Example: DREAMPlace
• VLSI Placement
34
[Diagram: gate-level netlist, standard cell library, floorplan, and constraints → VLSI Placement → legal placement]
Challenges of Nonlinear Placement:
• Low efficiency
• >3h for 10M cell design
• Today we are targeting much larger placement!
• Limited acceleration
• Limited speedup
References: Lin, Yibo, et al. "DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement." Proceedings of the 56th Annual Design Automation Conference, 2019.
Example: DREAMPlace
36
• Interestingly, the objectives of placement and training are very similar
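The analogy can be made concrete with a deliberately tiny example: cell coordinates play the role of trainable weights, a smooth wirelength plays the role of the loss, and gradient descent plays the role of training. This is only a sketch under simplifying assumptions (1-D positions, a log-sum-exp wirelength, no density or legality term, hand-written gradients in NumPy rather than a deep learning toolkit); it is not the DREAMPlace implementation, and all names are illustrative.

```python
import numpy as np


def smooth_wirelength(x, nets, gamma=1.0):
    """Log-sum-exp approximation of 1-D half-perimeter wirelength, and its gradient."""
    total = 0.0
    grad = np.zeros_like(x)
    for net in nets:
        xs = x[net]
        hi, lo = xs.max(), xs.min()
        e_max = np.exp((xs - hi) / gamma)  # weights of the smooth maximum
        e_min = np.exp((lo - xs) / gamma)  # weights of the smooth minimum
        smooth_max = hi + gamma * np.log(e_max.sum())
        smooth_min = lo - gamma * np.log(e_min.sum())
        total += smooth_max - smooth_min
        grad[net] += e_max / e_max.sum() - e_min / e_min.sum()
    return total, grad


def place(x0, nets, movable, lr=0.2, steps=2000):
    """Plain gradient descent on the positions of the movable cells ("training")."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        _, grad = smooth_wirelength(x, nets)
        x[movable] -= lr * grad[movable]
    return x


# Six cells in a chain of two-pin nets; cells 0 and 5 are fixed terminals.
nets = [[0, 1], [1, 2], [2, 3], [3, 4], [4, 5]]
x0 = np.array([0.0, 9.0, 1.0, 7.0, 3.0, 10.0])
movable = np.array([1, 2, 3, 4])

x_final = place(x0, nets, movable)
wl_before, _ = smooth_wirelength(x0, nets)
wl_after, _ = smooth_wirelength(x_final, nets)

if __name__ == "__main__":
    print("final positions:", np.round(x_final, 3))
    print(f"smooth wirelength: {wl_before:.3f} -> {wl_after:.3f}")
```

Because every step is a dense array operation followed by a gradient update, the same loop maps naturally onto a framework with automatic differentiation and GPU kernels, which is the point made on the slide.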
Example: DREAMPlace Results
• Significant speedup!
• Areas for improvement:
  - It is not congestion-aware (congestion is handled indirectly by the density constraint)
  - It is not timing-aware (average timing is optimized, not the worst case), which is good for power though!
40
• ML framework for speedup
• Develop ML for QoR prediction
  - Reduce QoR analysis time
  - See many PVT corners early in design time
  - Predict QoR in future steps (e.g., routing congestion at placement time)
ML could be used for QoR prediction/forecast
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction]
Examples:
• PBA prediction using GBA timing analysis
• MCMM STA prediction using STA runs in a limited number of corners (e.g., 3)
• IR drop prediction
• Routing congestion prediction at synthesis
• Routing congestion prediction at placement
• DRC prediction
• Yield prediction
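As a rough illustration of the MCMM item above, the sketch below fits a model that predicts path slack in many corners from the slack measured in three corners. The data is synthetic and generated from a linear relation plus noise purely so the example runs; real corner-to-corner relationships need not be linear, and the corner counts, variable names, and the choice of least squares are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_known, n_target = 500, 3, 10  # hypothetical sizes

# Synthetic slack values for the 3 analyzed corners (one row per timing path).
known = rng.normal(size=(n_paths, n_known))
X = np.hstack([known, np.ones((n_paths, 1))])  # add a bias column

# Synthetic "ground truth" for the remaining corners: linear map plus small noise.
true_w = rng.normal(size=(n_known + 1, n_target))
target = X @ true_w + 0.01 * rng.normal(size=(n_paths, n_target))

# Fit on 400 paths, evaluate on the 100 held-out paths.
train, test = slice(0, 400), slice(400, None)
W, *_ = np.linalg.lstsq(X[train], target[train], rcond=None)
pred = X[test] @ W
rmse = float(np.sqrt(np.mean((pred - target[test]) ** 2)))

if __name__ == "__main__":
    print(f"held-out RMSE across {n_target} predicted corners: {rmse:.4f}")
```

Once fitted, predicting the unseen corners for a path is one matrix product, which is why such a model can be queried inside an optimization loop while full multi-corner STA is reserved for verification.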
41
• ML framework for speedup
• Develop ML for QoR prediction
• Use representation learning to automate feature engineering
  - Learn better features → improve accuracy!
  - Remove the need for cross-domain ML and CAD experts → ease of development
  - Lower adoption bar → widespread use!
ML could be used for QoR prediction/forecast
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering]
We will cover a case study from our group for MCMM PBA prediction using automated feature engineering
42
• ML framework for speedup
• Develop ML for QoR prediction
• Use representation learning to automate feature engineering
• ML (e.g., RL) for optimization
  - Optimize beyond heuristic models.
ML Could be used for Optimization
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
Examples:
• RL, CNN, GAN for Macro Placement
• RL for CTS
• RL, CNN, GAN for Routing
• …
We will cover a case study from our group for CTS using Reinforcement Learning!
43
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
• ML framework for speedup
• Develop ML for QoR prediction
• Use representation learning to automate feature engineering
• ML (e.g., RL) for optimization
• ML for QoR prediction and optimization (ML- or framework-based) could work together in a loop.
ML-Guided ML-Optimization (ML Loop)
44
• ML framework for speedup
• Develop ML for QoR prediction
• Use representation learning to automate feature engineering
• ML (e.g., RL) for optimization
• ML for QoR prediction and optimization (ML- or framework-based) could work together in a loop.
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
Two possibilities:
1. QoR nested within optimization → ML-guided ML optimization
   • Reduced number of iterations
   • Make better optimization decisions
2. Optimization nested within QoR → genetic-based optimization
   • Analogous to ML replacing the physical designer in analyzing the result and re-running the flow!
45
• Using multiple ML-based QoR predictions in the optimization loop results in
  - Multi-objective optimization
  - Predictive optimization
• Pros:
  - Optimize for current and future QoR
  - Prevent doomed runs
  - Prevent QoR estimation (pessimism) from lowering design quality
  - Faster signoff
  - Reduce Time To Market (TTM)
  - Reduce tool and licensing cost
  - Reduce engineering cost
  - …
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
ML-Guided ML-Optimization (ML Loop)
Case Study 1
RL for Clock Tree Synthesis
46
• Problem: Reducing peak current → IR drop
• Method: Maximize the skew in the design to spread the clock arrival times
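Why spreading the clock arrival times lowers the peak can be seen with a toy model: each register draws a short current pulse when its clock arrives, and the supply sees the sum of those pulses. The triangular pulse shape, the pulse width, the number of registers, and the arrival times below are all illustrative assumptions, not values from the referenced work.

```python
import numpy as np


def total_current(arrival_times, t, width=1.0, peak=1.0):
    """Sum of triangular current pulses, one per register, starting at each clock arrival."""
    total = np.zeros_like(t)
    for arrival in arrival_times:
        center = arrival + width / 2
        # Triangle of height `peak` spanning [arrival, arrival + width].
        total += peak * np.clip(1.0 - np.abs(t - center) / (width / 2), 0.0, None)
    return total


t = np.linspace(0.0, 10.0, 2001)
aligned = np.zeros(8)                # zero skew: all 8 registers switch together
spread = np.linspace(0.0, 3.5, 8)    # arrivals staggered by 0.5 time units

peak_aligned = float(total_current(aligned, t).max())
peak_spread = float(total_current(spread, t).max())

if __name__ == "__main__":
    print(f"peak current, aligned arrivals: {peak_aligned:.2f}")
    print(f"peak current, spread arrivals:  {peak_spread:.2f}")
```

The total charge drawn is the same in both cases; only its distribution over time changes. How far the arrivals can actually be spread is limited by the setup and hold constraints of the design.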
47
RL for Peak Current Reduction
[Figure: demanded current vs. time]
Reference: "A Reinforced Learning Solution for Clock Skew Engineering to Reduce Peak Current and IR Drop." Proceedings of the 2021 Great Lakes Symposium on VLSI (GLSVLSI 2021)
• Problem: Reducing peak current → IR drop
• Method: Maximize the skew in the design to spread the clock arrival times
• Solution: Reinforcement Learning
49
RL for Peak Current Reduction
50
Reinforcement Learning
• Reinforcement learning (RL): how an intelligent agent should take actions and interact with its environment in order to maximize its reward
• RL combines exploitation (maximizing rewards) with exploration (taking risks) to learn about possible future rewards
• Applicable to problems of sequential decision making
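Two standard RL ingredients behind these bullets can be shown in a few lines: epsilon-greedy action selection (a common way to mix exploration and exploitation) and the discounted return (how much future rewards count). This is a generic textbook sketch, not the agent from the referenced paper; the function names and numbers are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon explore (random action), otherwise exploit (best action)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def discounted_return(rewards, gamma):
    """Sum of rewards where a reward k steps ahead is weighted by gamma**k."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


rewards = [0.0, 0.0, 0.0, 10.0]  # the payoff only arrives after several steps
patient = discounted_return(rewards, gamma=0.9)
impatient = discounted_return(rewards, gamma=0.5)
greedy_action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0, rng=random.Random(0))

if __name__ == "__main__":
    print(f"return with gamma=0.9: {patient:.2f}")
    print(f"return with gamma=0.5: {impatient:.2f}")
    print("greedy action:", greedy_action)
```

A discount factor close to 1 keeps a delayed reward valuable, so the agent is willing to take several unrewarded steps to reach it; a small discount factor makes the same reward nearly invisible.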
51
RL for Peak Current Reduction
Agent
52
RL for Peak Current Reduction
$T + t_{ck\text{-}c} - t_{ck\text{-}l} \ge t_{c\text{-}q} + t_{plogic} + t_{su}$
$t_{hold} + \delta \le t_{cdlogic} + t_{cdreg}$
Timing Check
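The two inequalities can be turned directly into checks that an environment could run after every action. The sketch below assumes that the first inequality is the setup check between a launching and a capturing register, the second is the hold check, and that the skew term in the hold check equals the capture clock arrival minus the launch clock arrival; that last relation is an assumption, as the slide does not define it. All numeric values are made up.

```python
def setup_ok(T, t_ck_c, t_ck_l, t_cq, t_plogic, t_su):
    """Setup check: T + t_ck_c - t_ck_l >= t_cq + t_plogic + t_su."""
    return T + t_ck_c - t_ck_l >= t_cq + t_plogic + t_su


def hold_ok(t_hold, delta, t_cdlogic, t_cdreg):
    """Hold check: t_hold + delta <= t_cdlogic + t_cdreg."""
    return t_hold + delta <= t_cdlogic + t_cdreg


# Hypothetical numbers (ns). The capture clock arrives 0.2 ns after the launch clock.
t_ck_l, t_ck_c = 0.1, 0.3
delta = t_ck_c - t_ck_l  # assumed definition of the skew term

setup_pass = setup_ok(T=1.0, t_ck_c=t_ck_c, t_ck_l=t_ck_l,
                      t_cq=0.1, t_plogic=0.8, t_su=0.05)
hold_pass = hold_ok(t_hold=0.05, delta=delta, t_cdlogic=0.1, t_cdreg=0.05)

if __name__ == "__main__":
    print("setup check passes:", setup_pass)
    print("hold check passes: ", hold_pass)
```

In this example the positive skew helps setup but breaks hold, which is the tension a skew-maximizing agent has to respect.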
Environment
• Problem: Reducing peak current → IR drop
• Method: Maximize the skew in the design to spread the clock arrival times
• Solution: Reinforcement Learning
  - Positive reward if an action widens the overall clock arrival time (CAT) distribution
  - Large negative reward if an action generates a timing violation
  - Allow aggressive exploitation with a large discount factor (allowing search for future rewards)
53
RL for Peak Current Reduction
Timing Engine
Action A =
• Insert or remove clock buffer
• Where to Move
Reward R =
• + delta skew
• - timing violation
State S = Updated design
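A reward of this shape (positive for added skew, strongly negative for a timing violation) might be written as below. The specifics are assumptions made for illustration: the spread of clock arrival times is measured with the population standard deviation, the penalty is a fixed constant, and the function and parameter names are invented; the referenced paper may define these differently.

```python
import statistics


def reward(cat_before, cat_after, timing_violation, penalty=100.0):
    """Reward for one action on the clock tree.

    cat_before / cat_after: clock arrival times before and after the action.
    timing_violation: result of the timing engine's check on the updated design.
    """
    if timing_violation:
        return -penalty  # large negative reward
    # Positive when the action widened the arrival-time distribution.
    return statistics.pstdev(cat_after) - statistics.pstdev(cat_before)


r_spread = reward([0.0, 0.0, 0.0, 0.0], [0.0, 0.1, 0.2, 0.3], timing_violation=False)
r_violation = reward([0.0, 0.0, 0.0, 0.0], [0.0, 0.5, 1.0, 1.5], timing_violation=True)

if __name__ == "__main__":
    print(f"reward for a legal, spreading action: {r_spread:.4f}")
    print(f"reward for a violating action:        {r_violation:.1f}")
```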
• Results on the Ethernet benchmark
  - Wider distribution of CAT with RL
• Measurements from Ansys RedHawk
  - 36.2% reduction in peak current
  - 41.4% improvement in IR drop
55
Reinforce-L for Peak Current Reduction
[Figure: histogram of skew (x-axis: skew, y-axis: count), comparing heuristic CTS with reinforcement learning (RL)]
Case Study 2
RepL for Static Timing Analysis
56
Representation Learning for STA Prediction
• Problem:
  - Path-based static timing analysis (PBA) is very expensive
  - Timing checks in many corners are very expensive
  - Designers resort to graph-based timing analysis (GBA) on one or a few corners to carry out the physical design
  - When near maturity, designers switch to PBA mode and check the other corners
  - Signoff requires PBA timing checks in all corners
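The gap between GBA and PBA can be illustrated on a single gate with two inputs. In graph-based analysis the worst arrival time and the worst input slew at the gate are combined, even if they come from different inputs; in path-based analysis each path is evaluated with its own slew. The linear delay model and every number below are illustrative assumptions.

```python
def gate_delay(intrinsic, input_slew, slew_sensitivity=0.5):
    """Toy delay model: a slower input transition makes the gate slower."""
    return intrinsic + slew_sensitivity * input_slew


# Two paths arrive at the inputs of the same gate (hypothetical values, ns).
inputs = {
    "A": {"arrival": 1.0, "slew": 0.1},  # late but sharp
    "B": {"arrival": 0.8, "slew": 0.4},  # early but slow
}
INTRINSIC = 0.2
REQUIRED = 1.3

# GBA: worst arrival combined with worst slew, regardless of which input each came from.
gba_arrival = (max(p["arrival"] for p in inputs.values())
               + gate_delay(INTRINSIC, max(p["slew"] for p in inputs.values())))

# PBA: each path keeps its own arrival and its own slew; report the worst path.
pba_arrival = max(p["arrival"] + gate_delay(INTRINSIC, p["slew"])
                  for p in inputs.values())

gba_slack = REQUIRED - gba_arrival
pba_slack = REQUIRED - pba_arrival

if __name__ == "__main__":
    print(f"GBA slack: {gba_slack:+.3f} ns")
    print(f"PBA slack: {pba_slack:+.3f} ns")
```

Here GBA reports a violation while PBA shows the path actually meets timing, so a tool guided by GBA alone would spend effort fixing a path that does not need it.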
57
[Figure: design maturity vs. design time toward the goal, with timing checks along the way. Curves: design outcome + GBA, with (PVT) = 1 corner, and design outcome + PBA, with (P,V,T) = 100s of corners.]
Reference: "RAPTA: A hierarchical representation learning solution for real-time prediction of path-based static timing analysis." GLSVLSI 2022
Representation Learning for STA
• But at this point, the damage is already done!
  - Design cycle: GBA is pessimistic → the tool will overfix → more design iterations
  - PPA penalty: GBA is pessimistic → PPA is traded for timing → lost optimization opportunity
  - Corner blindness: the design is only tracked in a limited number of corners → other corners may surprise!
58
Representation Learning for STA
• Objective: Prediction of Static Timing Analysis results (PBA prediction)
• Constraint: No manual feature engineering
• Approach: Representation Learning + MLP
59
[Figure: design maturity vs. design time toward the goal, with timing checks; GBA at (PVT) = 1 and PBA at (P,V,T) = 100s. A learning model predicts PBA for Corner 1, Corner 2, …, Corner 185. Curves: design outcome (old) + PBA, design outcome (now) + GBA, and PBA predicted outcome (ML).]
Comparison to Prior Art
60
[Diagram: prior art: Timing Engine (STA) → Features → Training. This work: Timing Engine (STA) → Representation Learning → Training.]
Representation Learning for STA
61
[Diagram: each gate (e.g., Gate XYZ) is described by 500 properties and each net (e.g., net YUZ) by 600 properties. A gate-net bigram (e.g., Bigram XYZ-YUZ) concatenates them into 1100 properties: 1 to 500 from the gate and 501 to 1100 from the net. Step 1: the bigrams along a data path are fed, in order, to a chain of LSTM cells (3-unit LSTM cell). Step 2: a further group of LSTM cells.]
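The bigram construction described on this slide amounts to concatenating a gate's property vector with the property vector of the net it drives, for every stage along a path. A minimal sketch follows, assuming the 500 + 600 = 1100 property split shown on the slide; the random values stand in for real properties reported by a timing engine, and the function name is invented.

```python
import numpy as np

GATE_DIM, NET_DIM = 500, 600  # property counts shown on the slide


def path_to_bigrams(gate_props, net_props):
    """Build one 1100-wide bigram per (gate, driven net) pair along a path.

    gate_props: array of shape (n_stages, 500)
    net_props:  array of shape (n_stages, 600)
    """
    if gate_props.shape[1] != GATE_DIM or net_props.shape[1] != NET_DIM:
        raise ValueError("unexpected property vector width")
    if gate_props.shape[0] != net_props.shape[0]:
        raise ValueError("each gate needs exactly one driven net")
    return np.concatenate([gate_props, net_props], axis=1)


# Placeholder values for a 5-stage data path (not real timing data).
rng = np.random.default_rng(0)
gates = rng.normal(size=(5, GATE_DIM))
nets_ = rng.normal(size=(5, NET_DIM))
bigrams = path_to_bigrams(gates, nets_)

if __name__ == "__main__":
    print("bigram sequence shape:", bigrams.shape)
```

The resulting array, one row per stage, is the kind of ordered sequence a recurrent model such as an LSTM can consume without hand-picked features.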
Representation Learning for STA
62
[Diagram: three LSTM-based representation learners with the same two-step structure: Data Representation Learning (data path), Capture Representation Learning (capture path), and Launch Representation Learning (launch path).]
Representation Learning for STA
64
[Diagram: for each input sample, the launch-path, capture-path, and data-path features feed the Launch, Capture, and Data Representation Learning blocks. Their outputs pass through a fully connected layer, dropout, and a second fully connected layer to produce the label prediction (PBA slack prediction). Three FC heads produce sub-label predictions (launch delay, capture delay, data delay); these are used in the training phase and removed in the test phase.]
Results (Route PBA from GBA)
68
[Figure: two panels, Ethernet and S38417. x-axis: Voltages (V), with ticks from 0.78 to 1.05; y-axis: Standard Deviation (ps).]
RAPTA average train and test time on GPU. The reported test number is the time needed to generate PBA predictions for 10K timing paths. Training is done only once during the design cycle.
RAPTA
Many Possibilities Ahead!
Going Beyond Heuristic
69
70
[Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
Final Word!
71
?

5G Edge Computing_Object Automation workshop
 
COE AI Lab Universities
COE AI Lab UniversitiesCOE AI Lab Universities
COE AI Lab Universities
 
Bootcamp_AIApps.pdf
Bootcamp_AIApps.pdfBootcamp_AIApps.pdf
Bootcamp_AIApps.pdf
 
Bootcamp_AIApps.pdf
Bootcamp_AIApps.pdfBootcamp_AIApps.pdf
Bootcamp_AIApps.pdf
 
Bootcamp_AIAppsUCSD.pptx
Bootcamp_AIAppsUCSD.pptxBootcamp_AIAppsUCSD.pptx
Bootcamp_AIAppsUCSD.pptx
 
Course_Object Automation.pdf
Course_Object Automation.pdfCourse_Object Automation.pdf
Course_Object Automation.pdf
 
Enterprise AI_New.pdf
Enterprise AI_New.pdfEnterprise AI_New.pdf
Enterprise AI_New.pdf
 
Super AI tools
Super AI toolsSuper AI tools
Super AI tools
 
Enterprise AI by using IBM DB2
Enterprise AI by using IBM DB2Enterprise AI by using IBM DB2
Enterprise AI by using IBM DB2
 

Recently uploaded

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Recently uploaded (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

RTL DESIGN IN ML WORLD_OBJECT AUTOMATION Inc

  • 1. Avesta Sasan, Associate Professor, University of California, Davis. RTL DESIGN IN ML WORLD
  • 2. I am Avesta Sasan! UC Davis Electrical and Computer Engineering NICE TO MEET YOU!
  • 4. Industry Experience: Memory and Compiler Design (memory design, memory compiler design). Timeline: 2004, 2008.
  • 5. Industry Experience: Physical Design, SoC Design, and EDA Design; 20+ tapeouts! (EDA design, physical design, VLSI design, signal and power integrity). Timeline: 2004, 2008, 2009.
  • 6. Industry Experience: R&D and in-house EDA Design (EDA design, timing signoff R&D, yield analysis, IR analysis). Timeline: 2004, 2008, 2009, 2014.
  • 7. Academic Experience: Joining Academia. Timeline: 2004, 2008, 2009, 2014, 2016.
  • 8. Academic Experience: Moving to Davis. Timeline: 2004, 2008, 2009, 2014, 2016, 2021.
  • 9. Research Focus: Applied, Efficient & Accelerated ML; VLSI Design and Hardware Security
  • 11. Let’s first talk about some problems!
  • 13. Hardware vs AI Trend: Time to market for hardware spans Design, RTL/Physical, Manufacture, and Test/Pkg/Ing. Days to months vs. months to years. Chart label: Transformer (213M param) trained with NAS. Source: Ark Investments
  • 14. Hardware vs AI Trend: HW improvements reduce the cost of training by 37% per year. Model size has grown at a pace of 10X per year. AI training cost continues to climb quickly. Source: Ark Investments
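The two rates on slide 14 can be combined into a rough net trend. The sketch below is only an illustration: it assumes (this is not stated on the slide) that training cost scales linearly with model size, and the function name `net_cost_multiplier` is hypothetical.

```python
# Rough arithmetic for slide 14, under a loud assumption:
# training cost is taken to scale linearly with model size.
HW_COST_REDUCTION_PER_YEAR = 0.37   # from the slide: cost drops 37% per year
MODEL_GROWTH_PER_YEAR = 10.0        # from the slide: model size grows 10X per year


def net_cost_multiplier(years: int) -> float:
    """Net change in training cost after `years`, under the linear-cost assumption."""
    per_year = MODEL_GROWTH_PER_YEAR * (1.0 - HW_COST_REDUCTION_PER_YEAR)
    return per_year ** years


if __name__ == "__main__":
    for y in (1, 2, 3):
        print(f"after {y} year(s): cost x{net_cost_multiplier(y):.1f}")
```

Under that assumption the hardware gain offsets only a small part of the model growth, which is consistent with the slide's claim that training cost keeps climbing.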
  • 15. AI is Energy Demanding. CO2 emission (x1000 lbs): 2.0, round-trip flight NY/SF (1 passenger); 11.0, human life (1 year); 36.2, American life (1 year); 126.0, US car (1 lifetime); 626.2, Transformer (213M param) trained with NAS. Source: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
  • 16. Carbon Footprint of AI and DL: An eye-opening study. The energy of training the model for 1 day was computed, then scaled using the data in the paper on the number of GPU-days training took. The cost was computed based on the average energy cost in the US. This is the result for a single training run. BERT carbon footprint ≈ 1,400 lb of CO2, close to a round-trip trans-America flight for one person. = Round-trip trans-America flight for 416 persons. Source: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
  • 17. Growth in AI Energy Requirements: Models are becoming larger. GPT-2 → 1.5B parameters and a few petaflop-days to train; GPT-3 → 175B parameters; PaLM → 540B parameters; GPT-4 → 100s of B parameters! What is next? Source: Sevilla et al., "Compute Trends Across Three Eras of Machine Learning", arXiv 2022
  • 18. Growth in AI Energy Requirements: Datasets are becoming significantly larger. 3B words (training set) → training of BERT; 32B words → XLNet; 40B words → GPT-2; 500B words → GPT-3. Source: https://epochai.org/blog/trends-in-training-dataset-sizes
  • 19. AI User/App Base is Growing!
  • 20. The increase in the number of ML publications is exponential. Number of researchers and entities investigating ML and its applications → exponential growth. Source: https://www.analyticsvidhya.com/blog/2022/03/the-carbon-footprint-of-ai-and-deep-learning/#
  • 21. What Can We Do? Improve AI Speed and Efficiency!
  • 22. Need Everyone's Contribution! Model: ? Compiler: ? Hardware: ? RTL/Physical Design: ? Process Design: ?
  • 23. Need Everyone's Contribution! Model Architecture: Pruning, Quantization, Sparse. Efficient Architectures, Low-rank Fact. Alg., Early Stopping, Adaptive Computation, E-Constrained NAS, Dynamic NN, Energy-aware Training, … Compiler: Fusion, Reuse, Autotuning, Dynamic Precision, Lazy eval., Operator Simplification, Loop Unrolling, Cross-Function Opt., Profile-Guided opt., Dynamic Compilation, … Hardware Architecture: 3D Architecture, Systolic Enhancement, In-HW Pruning, Sparse Support, Aggressive DVFS, HBM/TRAM/…, In-Memory Computing, Near-memory, Neuromorphic HW, Optical HW, Optical Comm, … Process Design: 3D, High-k Metal Gate, Extreme UV Litho., C-Nanotube Trans., Silicon Photonics, Spintronic, Memristor, … RTL/Physical Design: ?
  • 24. How to Change Our Approach to RTL and Physical Design to Keep AI Development Energized?
  • 25. Use ML to Automate and Improve Physical and RTL Design. Make it faster! Make it more efficient! How?
  • 26. ML in Electronic Design Automation (EDA)
  • 27. Modern VLSI Layout: Designs are getting larger, with billions of transistors on chip. Design flows are getting increasingly complicated. Transistor counts: IBM Power 10, 18B; Apple A15, 15B; NVIDIA Ampere GA100, 54B; Cerebras Mega, 1.2T
  • 29. Learning Assisted Computer Aided Design: Physical design is a very time-consuming process. It is iterative and incremental (5-18 months in industry), driven by heuristic optimization algorithms, and dependent on human experts. Applied learning can help to reduce the dependence on human experts, optimize beyond heuristics, and reduce design time. [Figure: design maturity vs. design time, comparing the conventional design outcome and the learning-assisted outcome against the goal; design space vs. heuristic search space, best vs. local best]
  • 30. Problem-Solution Opportunity Matrix: Some problems do not get old, but our solutions do! Axes: new vs. old problem, new vs. old solution. Quadrants: Invention, Experimentation, Maintenance, Innovation. Quadrant labels: Waste of Time!; Big Money! Moderate Risk!; Big Potential! High Risk; Some Potential! Lower Risk!
  • 31. Physical Design: Each physical design step requires two processes: (1) an optimization solution, usually semi-heuristic (placement, CTS, routing, chip finishing), and (2) Quality of Result (QoR) analysis (power, area, timing, DRC violations, IR, EM, SI). For example, during placement the locations of movable objects (cells) are determined (optimization), and the quality of the placement is measured (QoR) in terms of power, area, timing, potential routing congestion, etc. The semi-heuristic optimization solutions are designed to improve the QoR metrics. In ML-assisted physical design, we also need to develop ML solutions for both optimization and QoR analysis. Given the approximate nature of ML, signoff-level analysis is not possible; the next best thing is prediction (which could, however, be verified by EDA tools). If prediction has high accuracy and can be done far faster, it provides an advantage → it could be used in the optimization loop (full STA may take days; ML-based prediction may take seconds)! ML also allows us to forecast the outcome of future steps. Hence, ML can be used for both prediction and forecasting.
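The idea on slide 31 of using a fast predictor inside the optimization loop, with verification by the slower exact analysis, might be sketched as follows. Everything here is hypothetical and illustrative: `predict_qor` stands in for an ML predictor, `exact_qor` for a slow signoff-style analysis, and the toy cost functions are invented for the example.

```python
# A minimal sketch: rank many candidates with a cheap (approximate) predictor,
# then spend the expensive exact analysis only on a short list.
from typing import Callable, Iterable, List


def optimize_with_predictor(
    candidates: Iterable[int],
    predict_qor: Callable[[int], float],   # fast, approximate (lower is better)
    exact_qor: Callable[[int], float],     # slow, accurate (lower is better)
    top_k: int = 2,
) -> int:
    """Return the best candidate among the top_k ranked by the predictor."""
    shortlist: List[int] = sorted(candidates, key=predict_qor)[:top_k]
    return min(shortlist, key=exact_qor)


# Toy stand-ins (assumptions, not real tools): the exact optimum is at 6,
# while the predictor is slightly biased and believes it is at 6.4.
exact_calls = []


def toy_exact(x: int) -> float:
    exact_calls.append(x)          # count how often the slow analysis runs
    return (x - 6) ** 2


def toy_predict(x: int) -> float:
    return (x - 6.4) ** 2


best = optimize_with_predictor(range(10), toy_predict, toy_exact, top_k=2)
print(best, len(exact_calls))
```

In this toy setting the slow analysis runs on two candidates instead of ten, which is the kind of saving the slide points to when it contrasts days of full STA with seconds of prediction.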
  • 32. How could ML help?
  • 33. ML Framework could be used for speedup: rewrite heuristic algorithms using a learning framework, formulate the optimization as a training problem, and enjoy GPU scaling. [Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup]
  • 34. Example: DREAMPlace. VLSI placement takes a gate-level netlist, a standard cell library, a floorplan, and constraints, and produces a legal placement. Challenges of nonlinear placement: low efficiency (>3h for a 10M-cell design, and today we are targeting much larger placements!), limited acceleration, limited speedup. Reference: Lin, Yibo, et al. "DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement." Proceedings of the 56th Annual Design Automation Conference 2019.
  • 35. Example: DREAMPlace. Interestingly, the objectives of placement and training are very similar.
  • 36. Example: DREAMPlace. Interestingly, the objectives of placement and training are very similar.
  • 37. Example: DREAMPlace Results. Significant speedup! Areas for improvement: it is not congestion aware (congestion is handled indirectly by the density constraint), and it is not timing aware (average timing is optimized, not the worst case), which is good for power though!
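The observation that placement resembles training can be illustrated with a toy example in which cell positions play the role of trainable weights and wirelength plays the role of the loss. This is not DREAMPlace's formulation: the sketch below swaps in a plain quadratic wirelength on a 1-D row, has no density term, and uses hand-written gradients; all names are hypothetical.

```python
# Toy "placement as training": minimize quadratic wirelength by gradient descent.
from typing import Dict, List, Tuple

Net = Tuple[str, str]  # a two-pin net connecting two named objects


def wirelength(pos: Dict[str, float], nets: List[Net]) -> float:
    """Quadratic wirelength, the 'loss' of this toy problem."""
    return sum((pos[a] - pos[b]) ** 2 for a, b in nets)


def gradient(pos: Dict[str, float], nets: List[Net], movable: List[str]) -> Dict[str, float]:
    """d(wirelength)/d(position) for each movable cell; pads stay fixed."""
    grad = {name: 0.0 for name in movable}
    for a, b in nets:
        d = 2.0 * (pos[a] - pos[b])
        if a in grad:
            grad[a] += d
        if b in grad:
            grad[b] -= d
    return grad


def place(pos: Dict[str, float], nets: List[Net], movable: List[str],
          lr: float = 0.1, steps: int = 500) -> Dict[str, float]:
    """Plain gradient descent, the same update rule used to train model weights."""
    pos = dict(pos)
    for _ in range(steps):
        g = gradient(pos, nets, movable)
        for name in movable:
            pos[name] -= lr * g[name]
    return pos


# Two fixed pads at x=0 and x=10, two movable cells chained between them.
start = {"pad_l": 0.0, "pad_r": 10.0, "c0": 5.0, "c1": 5.0}
nets = [("pad_l", "c0"), ("c0", "c1"), ("c1", "pad_r")]
placed = place(start, nets, movable=["c0", "c1"])
print(round(placed["c0"], 3), round(placed["c1"], 3))
```

Because the update is ordinary gradient descent on a differentiable objective, the same loop could in principle be expressed in a deep learning framework and run on a GPU, which is the speedup argument made on the slides.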
  • 38. ML Framework could be used for speedup: rewrite heuristic algorithms using a learning framework, formulate the optimization as a training problem, and enjoy GPU scaling. [Diagram: Learning Assisted Physical Design (LAPD): ML Framework Speedup]
  • 39. ML could be used for QoR prediction/forecast: ML framework for speedup. Develop ML for QoR prediction: reduce QoR analysis time; see many PVT corners early in design time; predict QoR of future steps (e.g., routing congestion at placement time). Examples: PBA prediction using GBA timing analysis; MCMM STA prediction using STA runs in a limited number of (e.g., 3) corners; IR drop prediction; routing congestion prediction at synthesis; routing congestion prediction at placement; DRC prediction; yield prediction. [Diagram: LAPD: ML Framework Speedup, ML for QoR Prediction]
  • 40. ML could be used for QoR prediction/forecast: ML framework for speedup. Develop ML for QoR prediction. Use representation learning to automate feature engineering: learn better features → improve accuracy!; remove the need for cross-domain ML and CAD experts → ease of development; lower the adoption bar → widespread use! We will cover a case study from our group for MCMM PBA prediction using automated feature engineering. [Diagram: LAPD: ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering]
  • 41. ML could be used for Optimization: ML framework for speedup. Develop ML for QoR prediction. Use representation learning to automate feature engineering. ML (e.g., RL) for optimization: optimize beyond heuristic models. Examples: RL, CNN, GAN for macro placement; RL for CTS; RL, CNN, GAN for routing; … We will cover a case study from our group for CTS using Reinforcement Learning! [Diagram: LAPD: ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
  • 42. ML-Guided ML-Optimization (ML Loop): ML framework for speedup. Develop ML for QoR prediction. Use representation learning to automate feature engineering. ML (e.g., RL) for optimization. The ML for QoR prediction and the optimization (ML-based or framework-based) could work together in a loop. [Diagram: LAPD: ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization]
  • 43. ML-Guided ML-Optimization (ML Loop): ML framework for speedup. Develop ML for QoR prediction. Use representation learning to automate feature engineering. ML (e.g., RL) for optimization. The ML for QoR prediction and the optimization (ML-based or framework-based) could work together in a loop. Two possibilities: 1. QoR nested within optimization → ML-guided ML optimization: reduced number of iterations, better optimization decisions. 2. Optimization nested within QoR → genetic-based optimization: analogous to ML replacing the physical designer in analyzing the result and re-running the flow!
  • 44. ML-Guided ML-Optimization (ML Loop): Using multiple ML-based QoR predictions in the optimization loop results in multi-objective optimization and predictive optimization. Pros: optimize for current and future QoR; prevent doomed runs; prevent QoR estimation (pessimism) from lowering design quality; faster signoff; reduced Time To Market (TTM); reduced tool and licensing cost; reduced engineering cost; …
  • 45. Case Study 1: RL for Clock Tree Synthesis
  • 46. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. [Figure: demanded current vs. time] Reference: "A Reinforced Learning Solution for Clock Skew Engineering to Reduce Peak Current and IR Drop." Proceedings of GLSVLSI 2021
  • 47. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. [Figure: demanded current vs. time]
  • 48. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. Solution: Reinforcement Learning
  • 49. Reinforcement Learning: Reinforcement learning (RL) studies how an intelligent agent should take actions and interact with its environment in order to maximize its reward. RL combines exploitation (maximizing rewards) with exploration (taking risks) to learn about possible future rewards. It is applicable to problems of sequential decision making.
  • 50. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. Solution: Reinforcement Learning. [Diagram: Agent]
  • 51. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. Solution: Reinforcement Learning. Environment: timing check, with setup constraint $T + t_{ck\text{-}c} - t_{ck\text{-}l} \ge t_{c\text{-}q} + t_{plogic} + t_{su}$ and hold constraint $t_{hold} + \delta \le t_{cdlogic} + t_{cdreg}$.
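The two timing checks that form the environment on slide 51 can be written as slack computations. The sketch below is a hedged illustration: the function names, the example delay values, and the choice of picoseconds are all assumptions, and the skew term of the hold check is simply passed in as `delta` rather than derived.

```python
# Setup and hold checks from the slide, expressed as slacks (>= 0 means the check passes).
# All numeric values below are made-up illustrative delays in picoseconds.

def setup_slack(T: int, t_ck_c: int, t_ck_l: int,
                t_cq: int, t_plogic: int, t_su: int) -> int:
    """Setup check: T + t_ck_c - t_ck_l >= t_cq + t_plogic + t_su."""
    return (T + t_ck_c - t_ck_l) - (t_cq + t_plogic + t_su)


def hold_slack(t_hold: int, delta: int, t_cdlogic: int, t_cdreg: int) -> int:
    """Hold check: t_hold + delta <= t_cdlogic + t_cdreg."""
    return (t_cdlogic + t_cdreg) - (t_hold + delta)


def timing_ok(setup: int, hold: int) -> bool:
    """A path is clean only if both slacks are non-negative."""
    return setup >= 0 and hold >= 0


# Hypothetical path: capture clock arrives 20 ps later than launch clock.
s = setup_slack(T=1000, t_ck_c=120, t_ck_l=100, t_cq=50, t_plogic=800, t_su=40)
h = hold_slack(t_hold=30, delta=20, t_cdlogic=60, t_cdreg=20)
print(s, h, timing_ok(s, h))
```

Note the trade-off the example exposes: a later capture clock helps the setup check but eats into the hold margin, which is why skew cannot be increased without re-checking timing.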
  • 52. RL for Peak Current Reduction. Problem: Reducing peak current → IR drop. Method: Maximize the skew in the design to spread the clock arrival times. Solution: Reinforcement Learning, with a positive reward if an action widens the overall clock arrival time (CAT) distribution, a large negative reward if an action generates a timing violation, and aggressive exploitation allowed through a large discount factor (allowing search for future rewards). Environment: timing engine. Action A = insert or remove a clock buffer; where to move. Reward R = + delta skew, − timing violation. State S = updated design.
  • 53. Reinforce-L for Peak Current Reduction. Results on the Ethernet benchmark: wider distribution of CAT with RL. Measurements from Ansys Redhawk: 36.2% reduction in peak current, 41.4% improvement in IR drop. [Figure: histogram of skew (count vs. skew), heuristic CTS vs. reinforcement learning CTS]
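The reward described on slide 52 (positive for a wider clock arrival time distribution, strongly negative for a timing violation) might be sketched as below. This is not the paper's reward function: measuring the spread with the population standard deviation and using a fixed penalty of 100 are assumptions made purely for illustration.

```python
# Hedged sketch of a skew-spreading reward.
# Assumptions: spread = population standard deviation of clock arrival times (CAT),
# and VIOLATION_PENALTY is an arbitrary illustrative constant.
from statistics import pstdev
from typing import Sequence

VIOLATION_PENALTY = 100.0


def reward(cat_before: Sequence[float], cat_after: Sequence[float],
           timing_violation: bool) -> float:
    """Positive when the action widens the CAT distribution, large negative on a violation."""
    if timing_violation:
        return -VIOLATION_PENALTY
    return pstdev(cat_after) - pstdev(cat_before)


# Four flops that initially all see the clock at the same time.
before = [0.0, 0.0, 0.0, 0.0]
after = [0.0, 10.0, 0.0, 10.0]   # an action delayed the clock to two of them
print(reward(before, after, timing_violation=False))
print(reward(before, after, timing_violation=True))
```

In an actual flow the `timing_violation` flag would come from the timing engine acting as the environment, as the slide describes.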
  • 54. Case Study 2: RepL for Static Timing Analysis
  • 55. Representation Learning for STA Prediction. Problem: Path-based static timing analysis (PBA) is very expensive. Timing checks in many corners are very expensive. Designers resort to graph-based timing analysis (GBA) on one or a few corners to carry out the physical design. When near maturity, designers switch to PBA mode and check the other corners. Signoff requires a PBA timing check in all corners. [Figure: design maturity vs. design time toward the goal; GBA with (P,V,T) = 1 corner vs. PBA with (P,V,T) = 100s of corners; design outcome with GBA vs. with PBA at the timing check] Reference: "RAPTA: A hierarchical representation learning solution for real-time prediction of path-based static timing analysis." GLSVLSI 2022
  • 56. Representation Learning for STA: But at this point, the damage is already done! Design cycle: GBA is pessimistic → the tool will overfix → more design iterations. PPA penalty: GBA is pessimistic → PPA is traded for timing → lost optimization opportunity. Corner blindness: the design is only tracked in a limited set of corners → other corners may surprise!
  • 57. Representation Learning for STA. Objective: prediction of static timing analysis results (PBA prediction). Constraint: no manual feature engineering. Approach: representation learning + MLP. [Figure: a learning model takes the old design outcome with PBA (PBA Corner 1, PBA Corner 2, …, PBA Corner 185) and the current design outcome with GBA, and produces the ML-predicted PBA outcome]
  • 58. Comparison to Prior Art: Prior art: Timing Engine (STA) → Features → Training. This work: Timing Engine (STA) → Representation Learning → Training.
  • 59. Representation Learning for STA: Each gate (e.g., gate XYZ) is described by 500 property/value pairs and each net (e.g., net YUZ) by 600. A bigram (e.g., XYZ-YUZ) concatenates the two into 1100 properties. The sequence of bigrams along a data path is fed to stacked LSTM layers (3-unit LSTM cell).
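The bigram construction on slide 59 can be sketched as a simple concatenation. The property counts (500 per gate, 600 per net, 1100 per bigram) come from the slide; the function names and the placeholder values are hypothetical, and the LSTM that would consume the sequence is deliberately left out.

```python
# Hedged sketch of building the bigram sequence for one timing path.
from typing import List, Sequence, Tuple

GATE_PROPS = 500   # properties per gate, from the slide
NET_PROPS = 600    # properties per net, from the slide


def make_bigram(gate_props: Sequence[float], net_props: Sequence[float]) -> List[float]:
    """Concatenate a gate's properties with those of the net it drives."""
    if len(gate_props) != GATE_PROPS or len(net_props) != NET_PROPS:
        raise ValueError("unexpected number of properties")
    return list(gate_props) + list(net_props)


def path_to_sequence(stages: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> List[List[float]]:
    """Turn a path (ordered gate/net pairs) into the sequence an LSTM would read."""
    return [make_bigram(gate, net) for gate, net in stages]


# A made-up three-stage path with placeholder property values.
path = [([float(i)] * GATE_PROPS, [float(-i)] * NET_PROPS) for i in range(3)]
sequence = path_to_sequence(path)
print(len(sequence), len(sequence[0]))
```

Feeding raw tool-reported properties in this way, rather than hand-picked features, is what lets the model satisfy the "no manual feature engineering" constraint stated on slide 57.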
  • 60. Representation Learning for STA: [Figure: separate stacked LSTM blocks for the data path, capture path, and launch path, labeled Data Representation Learning, Capture Representation Learning, and Launch Representation Learning]
  • 61. Representation Learning for STA: [Figure: Data, Capture, and Launch Representation Learning blocks, each built from stacked LSTM layers]
  • 62. Representation Learning for STA: The input sample consists of launch path features, capture path features, and data path features, which feed the Launch, Capture, and Data Representation Learning blocks. Their outputs pass through a fully connected layer, dropout, and a fully connected layer to produce the label prediction (PBA slack prediction). Sub-label predictions (launch delay, capture delay, data delay) through FC heads are used in the training phase and removed in the test phase.
  • 63. Comparison to Prior Art: Prior art: Timing Engine (STA) → Features → Training. This work: Timing Engine (STA) → Representation Learning → Training.
  • 64. Results (Route PBA from GBA): [Figure: standard deviation (ps) vs. voltage (V) for the Ethernet and S38417 benchmarks] RAPTA average train and test time on GPU: the reported number for test is the time needed to generate PBA predictions for 10K timing paths. Training is done only once during the design cycle.
  • 65. Many Possibilities Ahead! Going Beyond Heuristic
  • 66. Final Word! Learning Assisted Physical Design (LAPD): ML Framework Speedup, ML for QoR Prediction, Auto Feature Engineering, ML for Optimization
  • 67. ?