
- 1. Machine Learning for Understanding and Managing Ecosystems Tom Dietterich Oregon State University In collaboration with Postdocs: Dan Sheldon (now at UMass, Amherst), Mark Crowley (now at U. Waterloo) Graduate Students: Majid Taleghan, Kim Hall, Liping Liu, Akshat Kumar, Tao Sun, Rachel Houtman, Sean McGregor, Hailey Buckingham Economists: H. Jo Albers, Claire Montgomery Cornell Lab of Ornithology: Steve Kelling, Daniel Fink, Andrew Farnsworth, Wes Hochachka, Benjamin Van Doren, Kevin Webb 1 IBM Cognitive Computing
- 2. The World Faces Many Sustainability Challenges: species extinctions; invasive species; effects of climate change on these.
- 3. Computational Sustainability: the study of computational methods that can contribute to the sustainable management of the earth's ecosystems. Diagram labels: Data, Models, Policies; Data Integration, Data Interpretation, Model Fitting, Policy Optimization, Data Acquisition, Policy Execution.
- 4. Outline: Three Projects at Oregon State. Models of Bird Migration (Collective Graphical Models). Policy Optimization (Controlling Invasive Species; Managing Wildland Fire). Diagram labels: Data Integration, Data Interpretation, Model Fitting, Policy Optimization, Data Acquisition, Policy Execution.
- 5. BirdCast Project: Understanding Bird Migration. Goal: develop a scientific model of bird migration. Produce 24- and 48-hour bird migration forecasts. Understand bird decision making: absolute timing (e.g., based on day length), temperature, wind speed and direction, relative humidity, food availability.
- 6. Data (1): www.ebird.org. Volunteer bird watchers submit stationary counts and travelling counts, recording time, place, duration, distance travelled, and a checklist of species seen. 8,000-12,000 checklists are uploaded per day.
- 7. Data (2): Doppler Weather Radar. Radar detects weather (remove); smoke, dust, and insects (remove); birds and bats.
- 8. Data (3): Acoustic monitoring. Night flight calls: people can identify species or species groups from these calls.
- 9. Modeling Goal: Spatial Hidden Markov Model. Define a grid over the US and consider a single bird. We say the bird is in state $i$ on day $t$ if it is located inside cell $i$ on that day. Let $P_t(i \to j)$ be the probability that the bird will fly from cell $i$ to cell $j$ on the night from day $t$ to day $t+1$. We will represent this probability in terms of variables such as wind speed and direction, distance from $i$ to $j$, air temperature, relative humidity, day of the year, etc. Let $\Theta$ be the coefficients of the probability model.
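
The slide does not give the functional form of $P_t(i \to j)$. One common choice, assumed here purely for illustration, is a log-linear (softmax) model over per-destination covariates. The function name, the feature layout, and the toy numbers below are all hypothetical.

```python
import math

def transition_probs(features, theta):
    """Return P_t(i -> j) for every destination cell j from one origin cell i.

    features[j] is the covariate vector for the move i -> j on night t
    (hypothetical layout); theta is the coefficient vector Theta.
    ASSUMPTION: a softmax form, which the slides do not specify.
    """
    scores = [sum(w * x for w, x in zip(theta, f)) for f in features]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Toy example: three destination cells, two made-up covariates
# (distance to the cell, tailwind toward the cell).
features = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
theta = [-1.0, 0.5]
probs = transition_probs(features, theta)
```

With these toy coefficients the penalty on distance outweighs the tailwind bonus, so nearer cells receive higher probability.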
- 10. Simulating the Migration of a Single Bird. Assume we know the value of $\Theta$. The bird starts in cell 4 at time $t = 1$, so $n_1(4) = 1$. Simulate the first night by drawing a cell $j$ according to $P_t(4 \to j)$ ("rolling a die"). Repeat this for $T$ time steps. If we had enough bird watchers, we could map out the trajectory of the bird. Then we could match that against our simulated trajectory and adjust $\Theta$ until the simulations matched the observed behavior.
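
A minimal sketch of the single-bird simulation, assuming the transition probabilities are available as a function of the night and the current cell. The six-cell strip and the `toy_probs` model are invented stand-ins, not the BirdCast model.

```python
import random

NUM_CELLS = 6  # hypothetical 1-D strip of cells, south (0) to north (5)

def toy_probs(t, i):
    """Hypothetical P_t(i -> j): stay with 0.3, move one cell north with 0.7."""
    probs = [0.0] * NUM_CELLS
    if i == NUM_CELLS - 1:
        probs[i] = 1.0  # the northernmost cell is absorbing in this toy model
    else:
        probs[i], probs[i + 1] = 0.3, 0.7
    return probs

def simulate_bird(start_cell, num_steps, transition_probs, rng):
    """Sample one trajectory by 'rolling a die' once per night."""
    trajectory = [start_cell]
    for t in range(1, num_steps + 1):
        probs = transition_probs(t, trajectory[-1])
        cells = list(range(len(probs)))
        trajectory.append(rng.choices(cells, weights=probs, k=1)[0])
    return trajectory

trajectory = simulate_bird(start_cell=4, num_steps=5,
                           transition_probs=toy_probs, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps each simulated trajectory reproducible, which helps when simulations are compared against observations for different values of $\Theta$.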
- 13. Population of Birds. Consider a population of $M$ birds. The state of this population is a vector $\mathbf{n}_t$ such that $\mathbf{n}_t(i)$ is the number of birds in cell $i$ on day $t$. We can simulate each of these birds moving simultaneously: each bird "rolls a die" every night to decide where to go. If we have enough bird watchers, we can get a good estimate of $\mathbf{n}_t$ every day. We can compare our simulations against the observations and adjust $\Theta$ until they match.
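
The population version can be sketched by tracking only the count vector $\mathbf{n}_t$ and drawing one destination per bird each night. The toy transition model is again a hypothetical stand-in, and this brute-force simulation is exactly the slow approach that the CGM inference methods on the following slides are meant to avoid.

```python
import random

NUM_CELLS = 6  # hypothetical 1-D strip of cells

def toy_probs(t, i):
    """Hypothetical P_t(i -> j): stay with 0.3, move one cell north with 0.7."""
    probs = [0.0] * NUM_CELLS
    if i == NUM_CELLS - 1:
        probs[i] = 1.0
    else:
        probs[i], probs[i + 1] = 0.3, 0.7
    return probs

def step_population(counts, transition_probs, t, rng):
    """Advance the count vector n_t by one night, one die roll per bird."""
    new_counts = [0] * len(counts)
    cells = list(range(len(counts)))
    for i, n_birds in enumerate(counts):
        if n_birds == 0:
            continue
        probs = transition_probs(t, i)
        for j in rng.choices(cells, weights=probs, k=n_birds):
            new_counts[j] += 1
    return new_counts

def simulate_population(initial_counts, num_steps, transition_probs, rng):
    """Return the list [n_1, n_2, ..., n_{T+1}] of count vectors."""
    history = [list(initial_counts)]
    for t in range(1, num_steps + 1):
        history.append(step_population(history[-1], transition_probs, t, rng))
    return history

# M = 100 birds, all starting in the southernmost cell.
history = simulate_population([100, 0, 0, 0, 0, 0], 4, toy_probs, random.Random(0))
```

The total number of birds is conserved at every step, so each vector in `history` sums to $M$.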
- 14. This is very slow. Computer science to the rescue: formulate the problem mathematically (the formalism is called the "Collective Graphical Model", CGM); develop algorithms for probabilistic inference; use these algorithms to fit the model to the observations.
- 20. Probabilistic Inference for CGMs (figure panels: 16 grid cells; 49 grid cells). Gibbs sampler + Markov basis [Sheldon, Dietterich, NIPS 2011]; convex optimization [Sheldon, Sun, Kumar, ICML 2013]; asymptotic Gaussian approximation [Liu, Sheldon, Dietterich, ICML 2014]; non-linear belief propagation [Sun, Sheldon, Kumar, ICML 2015]; proximal algorithm [Vilnis, Belanger, Sheldon, McCallum, UAI 2015].
- 21. Initial Results: Ruby-throated Hummingbird
- 22. Need to Constrain the Model. Problem: the migration model tends to "store" birds in Canada. There are no observations there, so the model is not constrained by the data. Solution: constrain the model by specifying the times and places where the CGM is allowed to have birds.
- 23. Constrained Results: Ruby-throated Hummingbird
- 24. Fitted Transition Parameters $\Theta$. Distance and direction traveled: northness: −0.4808; distance: 0.1895; stayput: 3.5058; time: 0.5217; temperature: −0.1556; wind profit: 0.2754.
- 25. Next Steps: Integrating Multiple Data Sources. [Figure: graphical model with node groups labeled birds, eBird, acoustic, and radar. Variables shown: $\mathbf{n}_t^s$ and $\mathbf{n}_{t,t+1}^s$ for $s = 1, \dots, S$; $\mathbf{x}_t$ and $\mathbf{x}_{t,t+1}$; $e_t^s(i, o)$ for $i = 1, \dots, L$ and $o = 1, \dots, O(i, t)$; $m_{t,t+1}^s(k)$ and $y_{t,t+1}^s(k)$ for $k = 1, \dots, K$; $r_{t,t+1}(v)$ and $z_{t,t+1}(v)$ for $v = 1, \dots, V$.]
- 27. Invasive Species Management in River Networks. Tamarisk: an invasive tree from the Middle East that out-competes native vegetation for water and reduces biodiversity. What is the best way to manage a spatially-spreading organism?
- 28. Mathematical Model. Tree-structured river network. Each segment $e$ has $H$ "sites" where a tree can grow. Each site can be {empty, occupied by native, occupied by invasive}. Management actions for each segment: {do nothing, eradicate, restore, eradicate+restore}. [Figure: river network with edges $e_1, \dots, e_5$.]
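
One way to encode this model, sketched with hypothetical names throughout: the five-edge topology below is invented for illustration (the slide's figure shows edges $e_1$ to $e_5$ but the transcript does not record how they connect). Counting states and joint actions, assuming sites are distinguishable, shows how quickly the problem grows.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class Site(Enum):
    EMPTY = 0
    NATIVE = 1
    INVASIVE = 2

class Action(Enum):
    DO_NOTHING = 0
    ERADICATE = 1
    RESTORE = 2
    ERADICATE_AND_RESTORE = 3

@dataclass
class Segment:
    name: str
    downstream: Optional[str]  # None marks the outlet (root of the tree)
    sites: List[Site] = field(default_factory=list)

# HYPOTHETICAL topology with H = 2 sites per segment.
network = {
    "e1": Segment("e1", "e3", [Site.NATIVE, Site.EMPTY]),
    "e2": Segment("e2", "e3", [Site.EMPTY, Site.EMPTY]),
    "e3": Segment("e3", "e5", [Site.INVASIVE, Site.NATIVE]),
    "e4": Segment("e4", "e5", [Site.NATIVE, Site.NATIVE]),
    "e5": Segment("e5", None, [Site.INVASIVE, Site.EMPTY]),
}

def num_states(num_segments, sites_per_segment):
    """Each site takes one of len(Site) values independently."""
    return len(Site) ** (num_segments * sites_per_segment)

def num_joint_actions(num_segments):
    """One of len(Action) management actions per segment."""
    return len(Action) ** num_segments
```

Even this tiny network has `num_states(5, 2)` = 59,049 states and `num_joint_actions(5)` = 1,024 joint actions, before any budget constraint prunes the action set.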
- 35. Dynamics and Objective. Dynamics, in each time period: natural death; seed production; seed dispersal (preferentially downstream); seed competition to become established. Spatial spread couples all edges, so inference is intractable. Objective: minimize expected discounted costs (the sum of the cost of invasion plus the cost of management), subject to an annual budget constraint. [Figure: river network with edges $e_1, \dots, e_5$.]
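
The four dynamics stages can be sketched as one simulator step. Everything numeric here is an assumption made for illustration: the topology, the death probability, the number of seeds per tree, the dispersal probabilities, and the rule that one landed seed is chosen uniformly at random to occupy an empty site. Sites are coded as `"."` (empty), `"n"` (native), and `"t"` (tamarisk).

```python
import random

# HYPOTHETICAL topology: each segment maps to its downstream neighbor.
DOWNSTREAM = {"e1": "e3", "e2": "e3", "e3": "e5", "e4": "e5", "e5": None}

def step(state, rng, death_prob=0.2, seeds_per_tree=3, p_down=0.6, p_stay=0.3):
    """One time period; all parameter values are illustrative assumptions."""
    upstream = {seg: [u for u, d in DOWNSTREAM.items() if d == seg]
                for seg in DOWNSTREAM}
    # 1. Natural death: each tree dies with probability death_prob.
    survivors = {
        seg: [s if s == "." or rng.random() >= death_prob else "." for s in sites]
        for seg, sites in state.items()
    }
    # 2-3. Seed production and dispersal (preferentially downstream).
    landed = {seg: [[] for _ in sites] for seg, sites in survivors.items()}
    for seg, sites in survivors.items():
        for s in sites:
            if s == ".":
                continue
            for _ in range(seeds_per_tree):
                u = rng.random()
                target = seg  # default: the seed stays in its own segment
                if u < p_down and DOWNSTREAM[seg] is not None:
                    target = DOWNSTREAM[seg]
                elif u >= p_down + p_stay and upstream[seg]:
                    target = rng.choice(upstream[seg])
                site = rng.randrange(len(landed[target]))
                landed[target][site].append(s)
    # 4. Competition: an empty site is claimed by one randomly chosen seed.
    return {
        seg: [s if s != "." or not landed[seg][k] else rng.choice(landed[seg][k])
              for k, s in enumerate(sites)]
        for seg, sites in survivors.items()
    }

state = {seg: ["t", ".", "n"] for seg in DOWNSTREAM}
next_state = step(state, random.Random(0))
```

Because seeds cross segment boundaries, the next state of one edge depends on its neighbors, which is the coupling the slide refers to.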
- 36. Finding the Optimal Management Policy. Formalize as a Markov Decision Process and solve by Stochastic Dynamic Programming (SDP). SDP requires the transition matrix $T[i, j, a] = P(j \mid i, a)$, but we don't know $T$. Solution: write a simulator and draw Monte Carlo samples from it to estimate $T[i, j, a]$.
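
Estimating $T[i, j, a]$ from simulator samples amounts to counting outcomes and normalizing. The two-state simulator below is a made-up stand-in for the tamarisk simulator, and sampling every $(i, a)$ pair equally often is a simplifying assumption.

```python
import random
from collections import Counter, defaultdict

def toy_simulator(i, a, rng):
    """Hypothetical two-state simulator; returns the next state j."""
    if i == "clean":
        p_invade = 0.1 if a == "eradicate" else 0.3
        return "invaded" if rng.random() < p_invade else "clean"
    p_clear = 0.8 if a == "eradicate" else 0.05
    return "clean" if rng.random() < p_clear else "invaded"

def estimate_transitions(simulator, states, actions, samples_per_pair, rng):
    """Return T_hat[(i, a)][j], the empirical frequency of j after (i, a)."""
    counts = defaultdict(Counter)
    for i in states:
        for a in actions:
            for _ in range(samples_per_pair):
                counts[(i, a)][simulator(i, a, rng)] += 1
    return {
        pair: {j: n / samples_per_pair for j, n in outcomes.items()}
        for pair, outcomes in counts.items()
    }

T_hat = estimate_transitions(toy_simulator, ["clean", "invaded"],
                             ["nothing", "eradicate"], 2000, random.Random(0))
```

Each row of `T_hat` sums to 1 by construction, and its accuracy improves with the number of samples drawn for that pair.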
- 37. Solving the Tamarisk MDP using Monte Carlo Samples. Repeat: use the current policy to choose a state $i$ and management action $a$; invoke the simulator, $(i, a) \to (j, c)$, where $j$ is the resulting state and $c$ is the cost of the action and the resulting state; update our model of $T$; apply stochastic dynamic programming to compute an improved policy. Until the policy has converged. Key question: what $(i, a)$ should we choose? Our answer: the DDV heuristic.
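
The loop above can be sketched as follows. This is not the DDV heuristic: the transcript does not describe how DDV picks $(i, a)$, so the sketch samples every pair equally in each round. The two-state simulator, its costs, and the discount factor are all invented for illustration.

```python
import random
from collections import Counter, defaultdict

STATES = ["clean", "invaded"]
ACTIONS = ["nothing", "eradicate"]

def simulator(i, a, rng):
    """Hypothetical stand-in: returns (next state j, cost c)."""
    if i == "clean":
        j = "invaded" if rng.random() < 0.3 else "clean"
    else:
        p_clear = 0.8 if a == "eradicate" else 0.05
        j = "clean" if rng.random() < p_clear else "invaded"
    cost = (1.0 if a == "eradicate" else 0.0) + (5.0 if j == "invaded" else 0.0)
    return j, cost

def plan(counts, cost_sums, gamma=0.9, sweeps=200):
    """Value iteration on the estimated model; returns a greedy policy."""
    value = {s: 0.0 for s in STATES}

    def q(i, a):
        n = sum(counts[(i, a)].values())
        expected_next = sum(k / n * value[j] for j, k in counts[(i, a)].items())
        return cost_sums[(i, a)] / n + gamma * expected_next

    for _ in range(sweeps):
        value = {i: min(q(i, a) for a in ACTIONS) for i in STATES}
    return {i: min(ACTIONS, key=lambda a: q(i, a)) for i in STATES}

def solve(rng, batch=200, max_rounds=50):
    counts = defaultdict(Counter)   # counts[(i, a)][j]: times j followed (i, a)
    cost_sums = defaultdict(float)  # total observed cost for each (i, a)
    policy = None
    for _ in range(max_rounds):
        # ASSUMPTION: uniform sampling of (i, a); DDV would choose these adaptively.
        for i in STATES:
            for a in ACTIONS:
                for _ in range(batch):
                    j, c = simulator(i, a, rng)
                    counts[(i, a)][j] += 1
                    cost_sums[(i, a)] += c
        new_policy = plan(counts, cost_sums)
        if new_policy == policy:
            break  # the policy did not change between rounds
        policy = new_policy
    return policy

policy = solve(random.Random(0))
```

Uniform sampling spends simulator calls on pairs that cannot change the policy, which is the inefficiency an adaptive choice of $(i, a)$ is meant to reduce.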
- 38. Comparison against best previous Monte Carlo MDP planning method. [Figure: number of samples (axis from 1.E+05 to 1.E+07) by MDP, comparing DDV and Fiechter.]
- 39. Published Rule of Thumb Policies for Invasive Species Management. Triage policy: treat the most-invaded edge first; break ties by treating upstream first. Leading edge: eradicate along the leading edge of invasion. Chades, et al.: treat the most-upstream invaded edge first; break ties by amount of invasion. DDV: our PAC solution.
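
Two of these rules translate directly into sort keys. The sketch assumes each edge has an invasion level and a `depth` (number of edges between it and the outlet, so larger means further upstream); both dictionaries are hypothetical. It also assumes the Chades tie-break favors the more-invaded edge, which the slide does not state. The leading-edge rule is omitted because the slide does not define the leading edge precisely.

```python
def triage(invasion, depth):
    """Most-invaded edge first; ties broken by treating upstream first."""
    invaded = [e for e in invasion if invasion[e] > 0]
    if not invaded:
        return None  # nothing to treat
    return max(invaded, key=lambda e: (invasion[e], depth[e]))

def chades(invasion, depth):
    """Most-upstream invaded edge first; ties broken by amount of invasion.

    ASSUMPTION: the tie-break prefers the more-invaded edge.
    """
    invaded = [e for e in invasion if invasion[e] > 0]
    if not invaded:
        return None
    return max(invaded, key=lambda e: (depth[e], invasion[e]))

# Hypothetical five-edge network: invaded sites per edge and distance to outlet.
invasion = {"e1": 1, "e2": 0, "e3": 3, "e4": 2, "e5": 3}
depth = {"e1": 2, "e2": 2, "e3": 1, "e4": 1, "e5": 0}

first_by_triage = triage(invasion, depth)
first_by_chades = chades(invasion, depth)
```

On this toy network the two rules disagree: triage picks `e3` (tied with `e5` on invasion, but further upstream), while the Chades rule picks `e1` (the only invaded edge at the greatest depth).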
- 40. Cost Comparisons: Rule of Thumb Policies vs. DDV. [Figure: total costs (axis from 0 to 450) for the Triage, Leading Edge, Chades, and DDV policies, with the optimal cost marked; scenario label: "Large pop, up to down".]
- 42. Managing Wildfire in Eastern Oregon. Natural state: large Ponderosa Pine trees with open understory; frequent "ground fires" that remove understory plants (grasses, shrubs) but do not damage trees. Fires have been suppressed since the 1920s: heavy accumulation of fuels in the understory; large catastrophic fires that kill all trees and damage soils; huge firefighting costs and lives lost.
- 43. Study Area: Deschutes National Forest. Goal: return the landscape to its "natural" fire regime. Management question (LET-BURN): when lightning ignites a fire, should we let it burn?
- 44. Formulating LETBURN as a Markov Decision Process $\langle S, A, R, T, \gamma \rangle$. State space $S$: 4000 management units, each in one of 25 local states; weather; ignition site. Action space $A$: at fire ignition time $t$, $a_t \in \{\text{LETBURN}, \text{SUPPRESS}\}$. Reward function $R(s, \ell, a)$: cost of lost timber value; cost of lost species habitat; cost of fire suppression. [Figure: state $s_t$ (ignition), action $a_t$, fire outcome $\ell_t$ from the fire simulator, reward $r_t$, and next state $s_{t+1}$ (new ignition) from the lightning simulator.]
- 45. The Simulator is Very Expensive. Simulating one fire can take from 5 to 60 minutes (depending on the size of the fire). Components: FARSITE; Forest Vegetation Simulator (FVS); lightning strike model; weather simulator. Monte Carlo methods require at least $10^6$ simulator calls. What can we do?
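
A quick back-of-the-envelope calculation shows why this is prohibitive. It assumes the calls run one after another on a single machine and a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600 minutes

def serial_years(num_calls, minutes_per_call):
    """Wall-clock years needed if simulator calls run strictly in sequence."""
    return num_calls * minutes_per_call / MINUTES_PER_YEAR

low = serial_years(10**6, 5)    # roughly 9.5 years at 5 minutes per fire
high = serial_years(10**6, 60)  # roughly 114 years at 60 minutes per fire
```

Even with heavy parallelism, a budget of $10^6$ calls is out of reach, which motivates methods that reuse a small set of simulated trajectories.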
- 46. Current Strategy: Policy Search using a Surrogate Model. Define a parameterized space of policies: $\pi_\theta(s) = a$. Simulate an initial set of 100-year trajectories under a variety of policies. Apply Bayesian Optimization (SMAC; Hutter, et al., 2011) to find the optimal value of $\theta$. To simulate $\pi_{\theta'}$ for some new $\theta'$, apply the Model-Free Monte Carlo algorithm (Fonteneau, et al., 2013).
- 47. A Simpler Problem: LETBURN one year. Is there any benefit to allowing fires to burn for just one year? Year 1: LETBURN. Years 2-100: SUPPRESS ALL. Evaluate via Monte Carlo trials.
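
The evaluation can be sketched as paired Monte Carlo trials: run each trial twice with the same random seed, once suppressing everything and once letting year-1 fires burn, and record the cost difference. The cost model below is entirely made up (arbitrary units) and stands in for the fire, vegetation, lightning, and weather simulators; its numbers are unrelated to the study's results.

```python
import random
import statistics

def toy_total_cost(first_year_action, seed):
    """Hypothetical 100-year cost (arbitrary units) for one trial."""
    rng = random.Random(seed)
    baseline = rng.gauss(100.0, 10.0)  # cost when every fire is suppressed
    if first_year_action == "SUPPRESS":
        return baseline
    # Letting year-1 fires burn shifts the cost by a random amount,
    # which can be a loss as well as a gain.
    return baseline - rng.uniform(-1.0, 6.0)

def letburn_benefit(total_cost, num_trials):
    """Benefit per trial: cost under SUPPRESS minus cost under LETBURN."""
    benefits = []
    for seed in range(num_trials):
        # The shared seed gives both policies the same simulated future.
        benefits.append(total_cost("SUPPRESS", seed) - total_cost("LETBURN", seed))
    return benefits

benefits = letburn_benefit(toy_total_cost, 500)
mean_benefit = statistics.mean(benefits)
median_benefit = statistics.median(benefits)
```

Sharing the seed between the two runs removes the variation they have in common, so fewer trials are needed to see a difference between the policies.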
- 48. Expected Benefit of LETBURN (suppress all fires after year 1). [Figure: histogram of frequency against expected benefit (x $100,000); mean = $2.47 million, median = $2.74 million.] [Houtman, Montgomery, Gagnon, Calkin, Dietterich, McGregor, Crowley 2013]
- 49. Summary. Models of Bird Migration (Collective Graphical Models). Policy Optimization (Controlling Invasive Species; Managing Wildland Fire). Diagram labels: Data Integration, Data Interpretation, Model Fitting, Policy Optimization, Data Acquisition, Policy Execution.
- 50. Common Threads. Spatially-spreading processes: bird migration; invasive species; fire spread. Dynamical model: CGM (spatial HMM with clever inference); simulator of seed spread; simulator of fire spread. Computational challenges: efficient probabilistic inference; minimize calls to expensive simulators (value of information heuristics + PAC guarantees; Bayesian optimization).
- 51. Thank you. Dan Sheldon, Akshat Kumar, Tao Sun: Collective Graphical Models. Steve Kelling, Andrew Farnsworth, Wes Hochachka, Daniel Fink: BirdCast. H. Jo Albers, Kim Hall, Majid Taleghan, Mark Crowley: Tamarisk. Claire Montgomery, Sean McGregor, Mark Crowley, Rachel Houtman. Carla Gomes for spearheading the Institute for Computational Sustainability. National Science Foundation Grants 0832804 (CompSust), 1331932 (CyberSEES), 1125228 (Birdcast), 1521687 (CompSustNet).
