
DSD-INT 2019 Predicting overtopping - Wilms


Presentation by Fine Wilms, Deltares, at the Data Science Symposium, during Delft Software Days - Edition 2019. Thursday, 14 November 2019, Delft.



  1. Predicting overtopping (Josefine Wilms)
  2. What is Overtopping? Data Science Symposium 2019
  3. Prediction Methods: analytical, numerical, and empirical. Numerical prediction uses a CFD solver (physics-based; complicated model setup), while empirical prediction uses machine learning (data-driven).
  4. Prediction Methods: Machine Learning. Neural networks vs. decision-tree-based models (example tree: split on F1 < 2, with leaf values 0.5 and 0.2).
  5. Gradient boosting with decision trees: calculate residuals → construct tree i → construct tree i+1 to improve high-error samples → combine trees
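The residual-fitting loop on this slide can be sketched in Python with scikit-learn's `DecisionTreeRegressor` as the base learner. This is a minimal illustration on synthetic 1-D data, not the speaker's actual setup (the talk trains on the CLASH database and uses XGBoost/SKLearn implementations):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression data (hypothetical; the talk uses the CLASH database).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
pred = np.full_like(y, y.mean())   # start the ensemble from the mean
trees = []
for i in range(100):
    residuals = y - pred                       # 1. calculate residuals
    tree = DecisionTreeRegressor(max_depth=3)  # 2. construct tree i
    tree.fit(X, residuals)                     # 3. fit the high-error samples
    pred += learning_rate * tree.predict(X)    # 4. combine: add tree i's correction
    trees.append(tree)

rmse_before = np.sqrt(np.mean((y - y.mean()) ** 2))
rmse_after = np.sqrt(np.mean((y - pred) ** 2))
```

Each new tree is fit to the residuals of the ensemble so far, so samples the current model gets most wrong dominate the next tree's fit; the learning rate shrinks each tree's contribution.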
  6. CLASH data
  7. Available samples: ~13,000
  8. Viable samples: <13,000
  9. Viable samples
  10. Effect of small changes?
  11. Bootstrapping, a.k.a. bagging: from the full set {1, 2, 3, 4}, draw 3 training sets with replacement, e.g. {1, 2, 2, 4}, {1, 2, 1, 2}, {3, 3, 2, 4}
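Drawing bootstrap training sets, as in the slide's {1, 2, 3, 4} example, is one line with NumPy: sample with replacement at the original size, so individual samples may repeat or be left out.

```python
import numpy as np

# Bootstrap resampling: each training set has the same size as the
# original data but is drawn *with replacement* (seed is arbitrary).
rng = np.random.default_rng(42)
data = np.array([1, 2, 3, 4])

bootstrap_sets = [rng.choice(data, size=len(data), replace=True)
                  for _ in range(3)]
```

Every element of every bootstrap set comes from the original data, but duplicates are expected, which is what makes the resulting models differ from one another.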
  12. Create 500 sub-samples
  13. Workflow: split into train/test → create subsamples → train 500 models → test the 500 models → 500 predictions for each target → do some stats: mean, min, max
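The workflow above can be sketched end to end. This is a scaled-down stand-in: synthetic features instead of CLASH data, 50 shallow trees instead of 500 gradient-boosted models, but the structure (split, bootstrap, train, predict, aggregate per target) matches the slide:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical stand-in for the CLASH features/targets.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = X @ rng.normal(size=5) + rng.normal(0, 0.1, size=1000)

# 1. Split into train/test.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_models = 50  # the talk uses 500
preds = []
for i in range(n_models):
    # 2. Create a bootstrap subsample of the training set.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # 3. Train model i on its subsample.
    model = DecisionTreeRegressor(max_depth=4, random_state=i)
    model.fit(X_train[idx], y_train[idx])
    # 4. Test: predict on the common test set.
    preds.append(model.predict(X_test))

preds = np.array(preds)          # shape: (n_models, n_test_targets)
# 5. "Do some stats" per target across the ensemble.
mean_pred = preds.mean(axis=0)
min_pred = preds.min(axis=0)
max_pred = preds.max(axis=0)
```

The per-target min/max band is the "model spread" examined later in the talk: a wide band means the prediction is sensitive to which samples happened to land in the training subsample.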
  15. Is the model any good? Compare against neural-network overtopping predictors: Overtopping 2.04 and UNIBO
  16. Comparison: prediction vs. target (experiment 3)
  17. Comparison: RMSE
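The comparison metric is root-mean-square error between predicted and measured overtopping; a minimal implementation (the slide does not show code, so the function name is ours):

```python
import numpy as np

def rmse(predictions, targets):
    """Root-mean-square error: sqrt(mean((prediction - target)^2))."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.sqrt(np.mean((predictions - targets) ** 2))

perfect = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # → 0.0
off_by_one = rmse([1.0, 1.0], [0.0, 0.0])          # → 1.0
```

Because errors are squared before averaging, RMSE penalizes large misses more heavily than mean absolute error, which matters for rare but severe overtopping events.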
  18. Model spread for UNIBO
  19. Model spread for Overtopping 2.04
  20. Model spread for XGBoost
  21. Conclusions:
      • Gradient boosting outperforms neural nets (in both model spread and RMSE)
      • SKLearn gradient boosting is not as good as XGBoost
      • Multicollinear features should not be removed for XGBoost
      • XGBoost is exchangeable
  22. Questions: fine.wilms@deltares.nl, +31 65 008 7305
