© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thomas Delteil, Applied Scientist @ AWS Deep Engine
APJCTech Summit 2018, Macau
Debugging MXNet Gluon models
And other performance tricks
Remote debugging with PyCharm
Visualizing deep learning
Performance tricks and gotchas
Apache History
• 2015: started as a CMU project by PhD students and the Distributed
Machine Learning Community (DMLC)
• 2017: the MXNet Gluon imperative API is released
Tianqi Chen (UW)
Mu Li (Amazon AI)
Yutian Li (Stanford)
Min Lin (MILA)
Naiyan Wang (TuSimple)
Minjie Wang (NYU CS)
Tianjun Xiao (Tesla)
Bing Xu (Apple AI)
Chiyuan Zhang (Google Brain)
Zheng Zhang (MSR Asia)
Imperative vs Symbolic computational graphs
Symbolic: define, compile, run
Imperative: define-by-run in the host language
Inception model
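The difference between the two styles can be illustrated with a toy example in plain Python. This is only a sketch of the idea, not MXNet code: the `Var` class and its `compile` method are hypothetical names made up for illustration.

```python
# Imperative (define-by-run): each line executes immediately, so any
# intermediate value can be printed or inspected in a debugger.
def imperative(x):
    hidden = x * 2
    return hidden + 1


# Symbolic (define, compile, run): operations are first recorded on a
# placeholder, then turned into a callable, and only then run on data.
class Var:
    def __init__(self, fn=lambda x: x):
        self.fn = fn

    def __mul__(self, k):
        return Var(lambda x: self.fn(x) * k)

    def __add__(self, k):
        return Var(lambda x: self.fn(x) + k)

    def compile(self):
        return self.fn


graph = Var() * 2 + 1   # define: nothing is computed here
run = graph.compile()   # compile
print(imperative(3), run(3))  # run: both paths give the same result
```

In the symbolic version there is no concrete value to look at until `run` is called, which is why the imperative style is easier to debug.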
Imperative > Symbolic
Debuggable
Fast to prototype
Hybridizable
Interactive Debugging
Shapes
Values
Gradients
YouTube tutorial
Visualizing Deep Learning
Network Architecture
MXNet native code (#1) print(net)
MXNet native code (#2) mx.viz.plot_network(sym)
MXNet native code (#3) mx.viz.print_summary(sym)
Netron (online tool)
MXBoard sw.add_graph(net)
System performance
GPU: gpu_monitor (github)
CPU / RAM: > top i
Training metrics
MXBoard
MXBoard Scalars
MXBoard Images
Console
Performance
Tips and tricks
130 samples/sec → 1.25x → 2.41x → 2.46x → 2.53x → 3.84x
GPU utilization
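If each multiplier on this slide is measured against the 130 samples/sec starting point (an assumption; the slide does not say so), the absolute throughput at each step can be worked out as below. The `throughput` helper is a hypothetical name used only for this arithmetic.

```python
BASELINE = 130.0  # samples/sec, as printed on the slide

# Speedup factors as printed on the slide.
# Assumption: each factor is relative to the baseline, not to the previous step.
speedups = [1.25, 2.41, 2.46, 2.53, 3.84]


def throughput(baseline, factor):
    """Absolute throughput implied by a speedup factor over the baseline."""
    return round(baseline * factor, 1)


for factor in speedups:
    print(f"{factor:.2f}x -> {throughput(BASELINE, factor)} samples/sec")
```

Under that assumption the final 3.84x step corresponds to roughly 500 samples/sec.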
Environment
mxnet-mkl (32x) vs mxnet
I/O Bound
→ GPU Starvation
#1 Asynchronously pre-fetching data (low CPU) (1.25x)
DataLoader(num_workers=CPU_COUNT-3)
#2 Offline preprocessing (full CPU)
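The idea behind asynchronous pre-fetching can be sketched in plain Python. This is not how Gluon's `DataLoader` is implemented (it uses worker processes); the sketch uses a single background thread as a stand-in, and `prefetch` and `worker_count` are hypothetical helper names.

```python
import queue
import threading


def worker_count(cpu_count, reserved=3):
    """Mirror the slide's CPU_COUNT-3 rule, but never go below zero."""
    return max(0, cpu_count - reserved)


def prefetch(batches, buffer_size=2):
    """Yield items from `batches`, loading up to `buffer_size` ahead in a background thread."""
    buf = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for batch in batches:
            buf.put(batch)  # blocks while the buffer is full
        buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()  # blocks only if the producer has fallen behind
        if item is done:
            return
        yield item


# The consumer (the training loop) sees batches in the original order.
for batch in prefetch(iter(range(3))):
    print("training on batch", batch)
```

While the consumer works on one batch, the producer is already loading the next ones, which is what keeps the GPU from starving on I/O.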
GPU → CPU memcopy synchronization idling
#3 Smart synchronization calls (2.46x)
→ Small networks
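One plausible reading of "smart synchronization calls" is to queue the next batch's work before performing the blocking read of the previous batch's result, so the device is never idle while the host waits. The sketch below illustrates that ordering in plain Python and is an assumption about what the slide means, not code from the talk: a single-worker thread pool stands in for an asynchronous execution engine, and `train`, `step`, and `on_result` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def train(batches, step, on_result):
    """Run `step` on every batch, reading each result one iteration late."""
    # A single worker stands in for an asynchronous execution engine.
    with ThreadPoolExecutor(max_workers=1) as engine:
        pending = None
        for batch in batches:
            future = engine.submit(step, batch)  # enqueue the next work first
            if pending is not None:
                on_result(pending.result())      # blocking read of the previous step
            pending = future
        if pending is not None:
            on_result(pending.result())          # drain the last step


losses = []
train([1, 2, 3], lambda b: b * b, losses.append)
print(losses)
```

The results still arrive in batch order; only the point at which the host blocks has moved.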
[Timeline diagram: Copy to GPU → Forward/Backward → Metric, repeated per batch]
Execution engine
Imperative → Symbolic (2.41x)
net.hybridize()
Hyperparameters
Batch size (2.56x)
Optimizer
Performance:
Time to accuracy
Mixed precision training
float32 → float16 (3.84x)
net.cast("float16")
Profiling
profiler.set_state('run')
…
profiler.set_state('stop')
Conclusion
- Use Gluon to debug and iterate quickly
- Hybridize and optimize for speed
- Know your model: Visualize performance
Thank you!
Follow-up:
tdelteil@
Github.com/thomasdelteil
AWS Deep Engine, Vancouver

Editor's Notes

  1. Data loading issue; NaN values; loss exploding suddenly
  2. Explain ssh tunnel and tensorboard
  3. 22M$ GPU