Optimizing BERT and Natural Language Models with SigOpt Experiment Management

  1. Why is Experiment Management important for NLP?
  2. First, a quick overview of the problem
  3. Two main questions: Can we understand the trade-offs made during model compression? Can we find a model architecture that fits our needs?
  4. The Data: SQuAD 2.0
  5. Distilling BERT for Question Answering: a BERT model pre-trained for language modeling is fine-tuned on SQuAD 2.0 to serve as the teacher; the student model is then trained on SQuAD 2.0 with a soft target loss against the teacher's outputs and a hard target loss against the gold answers, yielding the trained student model. For more on distillation: Hinton et al. 2015, DistilBERT
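A minimal sketch of that combined objective, assuming a standard Hinton-style distillation loss in PyTorch (the temperature T and mixing weight alpha are illustrative, not values from the talk):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft target loss (match the teacher) plus hard target loss (match the gold labels)."""
    # Soft targets: KL divergence between the temperature-softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the gold labels
    # (for SQuAD 2.0, this would be applied to the answer start and end positions).
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```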
  6. Multimetric Bayesian Optimization: optimizing for two competing metrics with SigOpt's Multimetric Optimization
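One way to set this up with SigOpt's Python client is a multimetric experiment that maximizes SQuAD 2.0 exact match while minimizing model size; the parameter names, bounds, and budget below are illustrative, not the configuration used in the talk:

```python
from sigopt import Connection

conn = Connection(client_token="YOUR_API_TOKEN")  # hypothetical token
experiment = conn.experiments().create(
    name="DistilBERT SQuAD 2.0 distillation",
    parameters=[
        dict(name="learning_rate", type="double", bounds=dict(min=1e-6, max=1e-3)),
        dict(name="n_layers", type="int", bounds=dict(min=2, max=8)),
        dict(name="dropout", type="double", bounds=dict(min=0.0, max=0.5)),
    ],
    # Two competing metrics: answer quality up, model size down.
    metrics=[
        dict(name="exact_match", objective="maximize"),
        dict(name="model_size_mb", objective="minimize"),
    ],
    observation_budget=200,
)
```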
  7. Experiment Management was critical throughout this process
  8. Experiment management was critical for: Model Development, Understanding the Problem Space, and Monitoring Long Cycles
  9. Model Development
  10. Establishing a Baseline: the same distillation pipeline (BERT teacher fine-tuned for SQuAD 2.0, student model, soft and hard target losses), with several design choices still open
  11. Establishing a Baseline: Training from scratch: the student is a DistilBERT architecture trained from scratch on SQuAD 2.0 with the standard soft and hard target losses
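For the from-scratch baseline, the student can be a randomly initialized DistilBERT-style question-answering model. A hedged sketch with Hugging Face Transformers (the default DistilBERT configuration stands in for whatever architecture was actually used):

```python
from transformers import DistilBertConfig, DistilBertForQuestionAnswering

# Baseline #1: same DistilBERT architecture, weights initialized from scratch (no pretraining).
config = DistilBertConfig()
student = DistilBertForQuestionAnswering(config)
```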
  12. Baseline #1: Trained from scratch (dashboard link)
  13. Establishing a Baseline: Warm starting the model: the student starts from DistilBERT weights pre-trained for language modeling, then is distilled on SQuAD 2.0 with the standard soft and hard target losses
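For the warm-started baseline, the student instead loads the pretrained DistilBERT language-model checkpoint before distillation, so only the question-answering head starts untrained. A sketch with Hugging Face Transformers:

```python
from transformers import DistilBertForQuestionAnswering, DistilBertTokenizerFast

# Baseline #2: warm start from DistilBERT pretrained for language modeling.
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
student = DistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```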
  14. Baseline #2: Pretrained weights (dashboard link)
  15. Understanding the problem space
  16. Running HPO to understand the problem space
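The HPO loop itself asks SigOpt for a suggestion, trains and evaluates a student with those assignments, and reports both metrics back. A sketch that continues the experiment created above (train_and_evaluate is a hypothetical helper standing in for one full distillation run):

```python
for _ in range(experiment.observation_budget):
    suggestion = conn.experiments(experiment.id).suggestions().create()
    # Hypothetical helper: trains a student with the suggested hyperparameters and
    # returns (exact match on the SQuAD 2.0 dev set, model size in MB).
    exact_match, model_size_mb = train_and_evaluate(suggestion.assignments)
    conn.experiments(experiment.id).observations().create(
        suggestion=suggestion.id,
        values=[
            dict(name="exact_match", value=exact_match),
            dict(name="model_size_mb", value=model_size_mb),
        ],
    )
```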
  17. Let's take a look at the experiment dashboard
  18. Correlations in the parameter space
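The same parameter-metric correlations shown on the dashboard can be recomputed offline from an export of the runs; a rough sketch with pandas (the CSV file and column names are hypothetical):

```python
import pandas as pd

# Hypothetical export of run assignments and metric values from the runs dashboard.
runs = pd.read_csv("experiment_runs.csv")
print(runs[["learning_rate", "n_layers", "dropout", "exact_match", "model_size_mb"]].corr())
```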
  19. Exploring specific parameter areas (runs dashboard)
  20. Taking data properties into account
  21. Providing feedback to the optimizer
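Feedback to the optimizer is not limited to metric values: runs that crash or exceed resource limits can be reported as failed observations so that region of the parameter space is discounted. A sketch using the same SigOpt client as above:

```python
# If a suggested configuration fails (e.g. runs out of GPU memory), report the
# failure instead of silently dropping the run.
conn.experiments(experiment.id).observations().create(
    suggestion=suggestion.id,
    failed=True,
)
```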
  22. Monitoring the full experiment
  23. Monitoring the full experiment (run dashboard)
  24. SigOpt found dozens of viable models (plot of the results against the baseline exact match, baseline size, and the metric threshold)
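The metric threshold marked on that plot can also be encoded in the experiment itself, so "viable" models are those clearing a minimum quality bar. A sketch reusing the client from the earlier example (the threshold value is illustrative):

```python
experiment = conn.experiments().create(
    name="DistilBERT SQuAD 2.0 distillation (with quality bar)",
    parameters=[
        dict(name="learning_rate", type="double", bounds=dict(min=1e-6, max=1e-3)),
    ],
    metrics=[
        # Only models clearing a minimum exact-match score count toward the frontier of viable models.
        dict(name="exact_match", objective="maximize", threshold=60.0),
        dict(name="model_size_mb", objective="minimize"),
    ],
    observation_budget=200,
)
```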
  25. How did experiment management help throughout my process?
  26. Model Development (experiment validation), Understanding the Problem Space (experiment design and exploring the parameter space), Monitoring Long Cycles (tracking and debugging)
  27. So why Experiment Management?
  28. Learn more about SigOpt: check out our YouTube channel for more videos, read our research and product blog, sign up to try out SigOpt for free, join the Experiment Management beta, and read the full work on NVIDIA's dev blog
  29. Thank you!