
LeBauer: Data Driven Model Development


  1. Data Driven Model Development. David LeBauer, Mike Dietze, Deepak Jaiswal, Rob Kooper, Stephen P. Long, Shawn Serbin, Dan Wang.
  2. Objective: Useful Predictions (information, precision, accuracy). Clark et al. 2001. Ecological Forecasts: An Emerging Imperative. Science.
  3. Sources of Uncertainty. Schlesinger et al. 1979. Terminology for Model Credibility. Simulation.
  4. [Windows crash screen] An error has occurred. To continue: Press Enter to return to Windows, or press CTRL+ALT+DEL to restart your computer. If you do this, you will lose any unsaved information in all open applications. Error: 0E : 016F : BFF9B3D4. Press any key to continue _
  5. Technical Uncertainty: A Cautionary Tale (slides 5–10). [Plot of predicted vs. observed yield, built up one step at a time: Priors; + Trait Data; + Flux Data; Annual Merge; + Latest Version]
  11. Best Practices: write programs for people, not computers; automate repetitive tasks; use the computer to record history; make incremental changes; use version control; don't repeat yourself (or others); plan for mistakes; optimize software only after it works correctly; document the design and purpose of code; conduct code reviews. Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3.
  12. Best Practices (same list as slide 11). Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3.
  13. Best Practices 1: Automation (same list, highlighting automation of repetitive tasks). Altintas et al. 2004. Kepler: An Extensible System for Design and Execution of Scientific Workflows. Proc. 16th ICSSDM.
  14. Parameter Uncertainty: Test Case. Single analysis: contribution of parameter uncertainty to uncertainty in a switchgrass yield prediction. LeBauer, Wang, Richter, Davidson, and Dietze. 2013. Facilitating Feedbacks Between Field Measurements and Ecosystem Models. Ecological Monographs.
  15. Parameter Uncertainty: Automated. 17 plant functional types, 6 biomes, 8 scientists, 6 months. [Figure: % of SD explained; contribution of parameter uncertainty to model uncertainty.] Dietze, Serbin, LeBauer, Davidson, Desai, Feng, Kelly, Kooper, Mantooth, McHenry, and Wang (submitted). A Quantitative Assessment of a Terrestrial Biosphere Model's Data Needs Across North American Biomes. JGR. (A minimal sketch of this kind of variance decomposition follows the slide list.)
  16. Best Practices 2: Iteration with Testing (same list as slide 11). Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3.
  17. Case Study: C4 Crop → Coppice Willow (C3 photosynthesis, perennial stem, leaf senescence).
  18. Benchmark Data: Aboveground Biomass. 23 calibration sites, 72 observations. [Plot: observed biomass, Mg/ha]
  19. Results (slides 19–24). [Bar charts of Standard Deviation, Correlation, and RMSE, each scaled so that sd_data = 1, built up one model change at a time: Start (C4 Grass); + C3 Photosynthesis; + Perennial Stem; + Fixed Respiration; + Leaf Senescence] (A minimal sketch of these benchmark metrics follows the slide list.)
  25. [Scatter plot: predicted vs. observed aboveground biomass, Mg/ha]
  26. Conclusions: * Best practices lead to more effective and efficient modeling * Integration tests were applied to support model development * Controlling technical error produces more robust and accurate inference
  27. Future Directions: * Track benchmark metrics for specific model runs * Maintain the ability to reproduce published results * Automated testing with each code commit or major release (a minimal sketch of such a check follows the slide list) * Metrics to define the limits of model credibility
  28. More Information. Email: dlebauer@illinois.edu. Web: pecanproject.org. Development: github.com/pecanproject
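
Slides 14 and 15 quantify how much parameter uncertainty contributes to uncertainty in a model prediction. Below is a minimal Monte Carlo sketch of that idea: draw each parameter from its prior, run a model over the sample, then see how much output variance disappears when one parameter at a time is held at its median. The priors, the `toy_yield_model` function, and the parameter names are invented for illustration; this is not PEcAn's variance decomposition.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Invented priors for two traits of a toy yield model.
priors = {
    "sla": lambda size: rng.lognormal(mean=3.0, sigma=0.3, size=size),   # specific leaf area
    "vcmax": lambda size: rng.normal(loc=45.0, scale=10.0, size=size),   # photosynthetic capacity
}

def toy_yield_model(sla, vcmax):
    """Stand-in for a process model: yield responds to both traits."""
    return 0.05 * sla + 0.2 * vcmax + 0.001 * sla * vcmax

samples = {name: draw(n) for name, draw in priors.items()}
total_var = toy_yield_model(**samples).var()

# Contribution of each parameter: the share of prediction variance that
# disappears when that parameter is held at its median while the others vary.
for name in priors:
    fixed = dict(samples)
    fixed[name] = np.full(n, np.median(samples[name]))
    reduced_var = toy_yield_model(**fixed).var()
    share = 1.0 - reduced_var / total_var
    print(f"{name}: ~{100 * share:.0f}% of prediction variance")
```

With interactions between parameters the shares need not sum exactly to 100%; the point is only to rank which priors most need constraining with data.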
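Slides 18 through 24 benchmark predicted aboveground biomass against 72 observations, reporting standard deviation, correlation, and RMSE scaled to the standard deviation of the data. A minimal sketch of how such metrics might be computed is below; the function name and the example numbers are made up for illustration and are not taken from the PEcAn code base.

```python
import numpy as np

def benchmark_metrics(observed, predicted):
    """Compare model predictions to observations.

    Returns the standard deviation of the predictions and the RMSE,
    both scaled by the standard deviation of the observations
    (so sd_data = 1, as on the results slides), plus the Pearson correlation.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sd_data = observed.std(ddof=1)

    scaled_sd = predicted.std(ddof=1) / sd_data
    correlation = np.corrcoef(observed, predicted)[0, 1]
    scaled_rmse = np.sqrt(np.mean((predicted - observed) ** 2)) / sd_data
    return {"sd": scaled_sd, "correlation": correlation, "rmse": scaled_rmse}

# Example with made-up biomass values (Mg/ha):
obs = [12.0, 25.0, 8.0, 40.0, 18.0]
pred = [10.0, 28.0, 11.0, 35.0, 20.0]
print(benchmark_metrics(obs, pred))
```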
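Slide 27 proposes tracking benchmark metrics for specific model runs and testing automatically with each commit or release. One way such a check could look is sketched below: recompute the metrics, compare them to stored reference values, and fail if anything drifts past a tolerance. The file path, the tolerance, and the commented-out `run_model` hook are assumptions made for this sketch.

```python
import json
import subprocess

# Reference metrics saved from a published or accepted model run
# (file name and contents are hypothetical).
REFERENCE_FILE = "benchmarks/willow_biomass_reference.json"
TOLERANCE = 0.05  # allowed absolute drift in each metric

def current_commit():
    """Identify the model version being tested."""
    return subprocess.check_output(
        ["git", "rev-parse", "--short", "HEAD"], text=True
    ).strip()

def check_against_reference(new_metrics):
    """Fail loudly if any benchmark metric drifts past the tolerance."""
    with open(REFERENCE_FILE) as f:
        reference = json.load(f)
    for name, ref_value in reference.items():
        drift = abs(new_metrics[name] - ref_value)
        assert drift <= TOLERANCE, (
            f"{name} changed by {drift:.3f} at commit {current_commit()}"
        )

# Usage (assuming run_model and benchmark_metrics from the sketch above):
# new_metrics = benchmark_metrics(observed, run_model())
# check_against_reference(new_metrics)
```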
