Vasily Demyanov – Heriot–Watt Institute, Edinburgh (U.K.)

Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009)


- 1. UNCERTAINTY QUANTIFICATION OF GEOSCIENCE PREDICTION MODELS BASED ON SUPPORT VECTOR REGRESSION V. Demyanov1, A. Pozdnoukhov2, M. Kanevski3, M. Christie1 1 Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh, UK vasily.demyanov@pet.hw.ac.uk 2 National Centre for Geocomputation, National University of Ireland, Maynooth. 3 Institute of Geomatics and Risk Analysis, University of Lausanne
- 2. Outline • Geoscience modelling under uncertainty • Machine learning based geomodels • Semi-supervised SVR reservoir model – Case study – Robustness to noise – Predictions with uncertainty • Conclusions
- 3. Outline • Geoscience modelling under uncertainty • Machine learning based geomodels • Semi-supervised SVR reservoir model – Case study – Robustness to noise – Predictions with uncertainty • Conclusions
- 4. Uncertainty Quantification (UQ) Framework [Diagram: Natural System → Observed Data; Mathematical Model (parameters, pde) → Computer Simulation (discretisation, timestep; computationally expensive) → Simulated vs Data → Model MISMATCH → Model parameters → Forecast Uncertainty. Plot axes: time (days).]
- 5. Adaptive Stochastic Optimisation for UQ Sampling prior iteration distribution Evaluation: Model 1 Model 2 Model New Model 3 Ranking Reproduction simulation population ………… Mismatch Model n calculation Ensemble of Models Sampling algorithms: • Genetic algorithms • Particle swarm optimisation • Ant Colony optimisation Inferred Ensemble of • Neighbourhood Models for prediction Inference approximation
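The sample, evaluate, rank and reproduce loop on slide 5 can be sketched in a few lines. This is a minimal illustration only: the `misfit` function is a hypothetical stand-in for a forward simulation plus mismatch calculation, the two-parameter unit-square prior is an assumption, and the simple "perturb the best models" reproduction step is a generic evolutionary scheme, not any of the specific samplers listed on the slide.

```python
import random


def misfit(params):
    # Hypothetical stand-in for "run the simulator, compare with data".
    target = (0.3, 0.7)
    return sum((p - t) ** 2 for p, t in zip(params, target))


def adaptive_sampling(n_iter=30, pop_size=20, n_keep=5, step=0.1, seed=0):
    rng = random.Random(seed)
    # Sample the first population from a uniform prior on [0, 1]^2 (assumed).
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    ensemble = []
    for _ in range(n_iter):
        # Evaluation and ranking: lowest misfit first.
        scored = sorted((misfit(p), p) for p in population)
        ensemble.extend(scored)
        parents = [p for _, p in scored[:n_keep]]
        # Reproduction: perturb the best models, clipped to the prior bounds.
        population = [
            tuple(min(1.0, max(0.0, x + rng.gauss(0.0, step)))
                  for x in rng.choice(parents))
            for _ in range(pop_size)
        ]
    return ensemble


ensemble = adaptive_sampling()
best_misfit, best_params = min(ensemble)
```

Every evaluated model is kept in `ensemble`, since the inference step on the slide works on the whole ensemble rather than on the single best model.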
- 6. Search for Matching Models Challenge • FW simulation of multiple models generated for different combinations of parameter values is computationally expensive • High-dimensional parameter space remains fairly empty and poorly described despite thousands of generated models Number of parameters Region of computational efficiency 100-10,000 FW runs Number of points per axis
- 7. UQ Framework with fast ML approximation [Diagram: Natural System → Observed Data; Mathematical Model (parameters, pde) → Computer Simulation (discretisation, timestep) → Simulated vs Data → Model MISMATCH → Model parameters → Forecast Uncertainty, with a Machine Learning block added to the loop as a fast approximation. Plot axes: time (days).]
- 8. Challenges in Geomodelling • Improve representation of the reality with geologically realistic models based on identifiable parameters. • More effective use of information from various sources by incorporating prior geological and expert knowledge with associated uncertainty • Uncertainty propagation from data into the model without “freezing” assumptions and predefined model dependencies.
- 9. Aims Uncertainty quantification with a geomodel which is able to improve geological realism by more effective use of prior information • Model petrophysical properties in a fluvial reservoir using a robust machine learning approach – semi-supervised Support Vector Regression (SVR) • Reproduce realistic geological structures and inherent uncertainty of the geomodel • Integrate additional spatial data that are non-linearly correlated with reservoir properties.
- 10. Outline • Geoscience modelling under uncertainty • Machine learning based geomodels • Semi-supervised SVR reservoir model – Case study – Robustness to noise – Predictions with uncertainty • Conclusions
- 11. Support Vector Regression (SVR) • Linear regression in hyperspace: $f(x) = w \cdot x + b$ • Complexity control with training errors: $\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} \xi_i$ • SVR is formulated in terms of dot products of input data: $(x \cdot x') \to K(x, x')$, where $K(x, x_i)$ is a symmetric, positive-definite kernel function. • The kernel trick projects data into a sufficiently high-dimensional space: $f(x) = \sum_{i=1}^{L} y_i \alpha_i K(x, x_i) + b$, where the $x_i$ with non-zero $\alpha_i$ are the support vectors.
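The kernel expansion on slide 11 can be evaluated directly once the support vectors and their coefficients are known. The sketch below assumes a Gaussian kernel (the kernel named later in the deck) and uses made-up support vectors and coefficients; `coeffs` stands for the products $y_i \alpha_i$ from the slide, and no training step is shown.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)); symmetric and positive definite.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_predict(x, support_vectors, coeffs, b, sigma=1.0):
    # f(x) = sum_i coeff_i * K(x, x_i) + b, with coeff_i playing the role of y_i * alpha_i.
    return sum(c * gaussian_kernel(x, sv, sigma)
               for c, sv in zip(coeffs, support_vectors)) + b


# Illustrative values only; they are not taken from the case study.
support_vectors = [(0.0, 0.0), (1.0, 0.0)]
coeffs = [0.8, -0.3]
bias = 0.5
value_at_origin = kernel_predict((0.0, 0.0), support_vectors, coeffs, bias)
```

Far from every support vector the kernel terms vanish and the prediction falls back to the bias `b`, which is one way to read the kernel width σ as a spatial correlation size.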
- 12. Semi-supervised Learning Concept • Supervised learning with a tutor – Learn from known input and output (e.g. multi-layer perceptron neural network) • Unsupervised learning without a tutor – Learn from known inputs only, no outputs are available (e.g. Kohonen classification maps) • Semi-supervised learning – Learn from a combination of data: • Labelled with both known input and output • Unlabelled with only input available (manifold)
- 13. Kernel Methods on Geo-manifolds • Data-driven models incorporate prior knowledge on the domain of the problem using graph models of natural manifolds • Kernel function enforces continuity along the graph model – manifold – obtained from the prior information Spiral manifold Conventional regression Semi-supervised represented by estimate based on regression estimation unlabelled points (+) labelled data only (●) follows the smoothness along the graph
- 14. Semi-supervised Approach • Manifold assumption: data actually lie on the low-dimensional manifold in the input space • Geometry of the manifold can be estimated with unlabelled data: – incorporate natural similarities in data – enforce smoothness on the manifold • Manifold carries physical information and incorporates prior physical knowledge • Geo-manifold can reflect stochastic nature of the inherent model uncertainty
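To make the "smoothness on the manifold" idea concrete, here is a small graph-based sketch. It is a swapped-in technique (harmonic label propagation on a neighbourhood graph), not the kernel construction used in the talk, and the five-node chain graph is a hypothetical toy manifold: two nodes are labelled, three are unlabelled, and each unlabelled node is repeatedly set to the mean of its graph neighbours.

```python
def propagate(neighbours, labels, n_iter=500):
    # Labelled nodes keep their values; unlabelled nodes start at zero.
    values = {node: labels.get(node, 0.0) for node in neighbours}
    for _ in range(n_iter):
        for node, nbrs in neighbours.items():
            if node not in labels:
                # Enforce smoothness along the graph: average over neighbours.
                values[node] = sum(values[n] for n in nbrs) / len(nbrs)
    return values


# Toy manifold: a chain 0 - 1 - 2 - 3 - 4 with labels only at the two ends.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
known = {0: 0.0, 4: 1.0}
estimate = propagate(chain, known)
```

The unlabelled nodes carry no output values, yet they determine the path along which the two labels are interpolated, which is the role the slides give to unlabelled points.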
- 15. Sources of Geo-manifold for Reservoir Models Geo-manifold for reservoir model can be elicited from prior information: – on-site spatial data (seismic, well logs) – other relevant data (outcrops, modern analogues, lab experiments) – expert knowledge in a non-parametric form – parametric geological models (object shapes, process models) – training image based models
- 16. Semi-supervised SVR Geomodel Prior information SVR Learning Seismic data Machine + geo-manifold unlabelled data Stanford VI synthetic case study Semi-supervised (SVR) • poro&perm labelled data from wells
- 17. Outline • Geoscience modelling under uncertainty • Machine learning based geomodels • Semi-supervised SVR reservoir model – Case study – Robustness to noise – Predictions with uncertainty • Conclusions
- 18. Case Study Stanford VI: a realistic synthetic reservoir data set • Fluvial clastic reservoir: - sinuous channels - meandering channels - delta front • Geomodel: - multi-points statistics models - sedimentation process model • “Hard” poro/perm data from wells •Synthetic seismic data: - 6 attributes: AI, EI, λ, μ, Sw, Poisson ratio S. Castro, J. Caers and T. Mukerji
- 19. Variability in Facies Modelling Multi-point simulation realisations Training Image Hard well data Soft probabilistic data based on seismic
- 20. Case Study 2D layer slices from different geological section: porosity truth case • sinuous channels • delta front SVR geomodel (tuneable or fixed parameters): • Spatial correlation size – Gaussian kernel width σ • Continuity strength – Impact of unlabelled data of the manifold • Smoothness along the manifold – Number of unlabelled points in the manifold – Number of neighbours in kernel regression • Prior belief level for seismic data – Weight of additional seismic input (scaling parameter) • Trade-off between goodness of fit and complexity – Regularisation term C determines balance between training error and margin max – Classification error
- 21. Stochastic Sampling for Matched Models • 640 models generated in 8D parameter space • 40 good fitting models with misfit < 250 • Misfit minimisation: generated models home in on the regions of good fit [Plots: misfit colour scale from 170 to 5000; axes: channel porosity, channel permeability, shale porosity, shale permeability]
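Selecting the "good fitting" models amounts to computing a misfit for each generated model and keeping those below a threshold. The slides do not state the misfit definition, so the least-squares form below (squared residuals scaled by an assumed data noise level `sigma`) is an assumption, as are the function names and the toy numbers; only the threshold of 250 comes from the slide.

```python
def least_squares_misfit(simulated, observed, sigma):
    # Assumed definition: M = sum_t (sim_t - obs_t)^2 / (2 sigma^2).
    return sum(((s - o) / sigma) ** 2
               for s, o in zip(simulated, observed)) / 2.0


def select_matched(models, observed, sigma, threshold=250.0):
    # Keep (name, misfit) pairs for models whose misfit is below the threshold.
    scored = [(name, least_squares_misfit(sim, observed, sigma))
              for name, sim in models.items()]
    return [(name, m) for name, m in scored if m < threshold]


# Toy production histories (hypothetical numbers).
observed = [100.0, 200.0, 300.0]
models = {
    "close": [110.0, 190.0, 300.0],
    "far": [400.0, 500.0, 600.0],
}
matched = select_matched(models, observed, sigma=10.0)
```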
- 22. Fitted Model: Property Distribution Realistic reproduction of geological structures detected from the prior data: – fluvial channels – thin mud channel boundaries – point-bars porosity truth case
- 23. Fitted Model Forecast: Fluvial Channels case Oil and water production from 7 largest producing wells: ● History data (truth case + noise) ○ Validation truth case forecast data Matched model
- 24. Variability of Uncertain Model Properties • Correlation – kernel size σ (for channel, sands, shale) • Smoothness along the manifold – number of unlabelled points N (for channel, sands, shale) • Impact of additional data (seismic) on the predicted variables – scaling for porosity, scaling for permeability • Seismic interpretation uncertainty – amplitude threshold for channel/shale boundary
- 25. Non-uniqueness of Semi-supervised SVR Stochastic realisations, based on geo-manifolds generated with different random seeds, represent inherent non-uniqueness of the model with the given combination of the parameter values Realisation 1 Realisation 2 Truth case
- 26. Impact of Noise in Seismic Data Original seismic data with injected noise N(0,σ) ● unlabelled data Semi-SVM porosity Truth case porosity Semi-SVM porosity for N(0,2σ) added noise
- 27. Production: Stochastic Realisations Realisations of a single fitted model with unique set of parameters Oil production profiles for 10 stochastic realisations for 6 wells: ● History data (truth case + noise) ○ Validation truth case forecast data Oil production profiles for semi-SVR model realisations
- 28. Multiple matching models vs Truth case porosity Multiple good fitting φ models Truth case φ The river delta front structure is very similar for different models due to the very clean synthetic seismic with no noise.
- 29. Fitted Model Forecast: Delta Front case Oil and water production from 7 largest producing wells: ● History data (truth case + noise) Fitted model Truth case
- 30. Fitted Model Forecast: Delta Front case Oil production from 7 largest producing wells: ● History data (truth case + noise) Fitted model Truth case
- 31. Forecast with Uncertainty Confidence P10/P90 interval for production forecast based on multiple models: Total oil and water production profiles: ● History data (truth case + noise) ○ Validation truth case forecast data P10/P90 production forecast confidence bounds
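A P10/P90 envelope like the one on slide 31 can be obtained by taking percentiles across the ensemble at each time step. The sketch below assumes equally weighted models and linear interpolation between order statistics; the talk derives its bounds from posterior probabilities, so a weighted version would be closer to it. The profiles are synthetic placeholders.

```python
def percentile(values, q):
    # Percentile with linear interpolation between sorted values, q in [0, 100].
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def forecast_bounds(profiles, low=10, high=90):
    # profiles: one production profile (list over time steps) per model.
    by_time = list(zip(*profiles))
    lower = [percentile(v, low) for v in by_time]
    upper = [percentile(v, high) for v in by_time]
    return lower, upper


# Eleven synthetic profiles: model m produces m + t at time step t.
profiles = [[float(m + t) for t in range(3)] for m in range(11)]
p10, p90 = forecast_bounds(profiles)
```

Here `p10` and `p90` simply denote the 10th and 90th percentiles of the ensemble; which of the two is labelled the optimistic case depends on the reporting convention in use.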
- 32. Uncertainty of Model Parameters Posterior probability distribution of the geomodel parameters: • Kernel width – correlation – for poro & perm in sand or shale • Continuity in sand and shale bodies – by N unlab • Impact of seismic data to poro & perm – weight
- 33. Outline • Geoscience modelling under uncertainty • Machine learning based geomodels • Semi-supervised SVR reservoir model – Case study – Robustness to noise – Predictions with uncertainty • Conclusions
- 34. Conclusions • A novel learning based model of petroleum reservoir based on capturing complex dependencies from data. • Semi-supervised SVR geomodel takes into account natural similarities in space and data relations: – Reproduction of geological structures and anisotropy of fluvial systems in a realistic way based on prior information on geo-manifold represented by unlabelled data – Robustness to noise and flexible control of signal/noise levels in data to detect geologically interpretable information – Stochastic non-uniqueness inherent to the model is represented by the distribution of unlabelled data • Multiple fitted models match both production history and the validation data in the forecast • Uncertainty of the SVR model is quantified by inference of the multiple generated models, which provide an uncertainty forecast envelope based on posterior probability
- 35. Further work • Extension to 3D case by adding one more input to the SVR model • Integrate other relevant data from outcrops and lab experiments • Apply SVR modelling approach with Bayesian UQ framework to application in different fields: environmental and climate modelling, epidemiology, etc. • 2 PhD positions in the Uncertainty Quantification project: – Geologist, data integration – Uncertainty modelling with machine learning Apply to vasily.demyanov@pet.hw.ac.uk
- 36. Acknowledgments • J. Caers and S. Castro of Stanford University for providing Stanford VI case study • UK EPSRC grant (GR/T24838/01) • Swiss National Science Foundation for funding “GeoKernels: kernel-based methods for geo- and environmental sciences” • Sponsors of Heriot-Watt Uncertainty Quantification project:
- 37. Research Summary • Developed a novel model for petroleum reservoir based on capturing complex dependencies from data with learning methods. • The novel model provides multiple HM models for different fluvial reservoirs: sinuous channels, delta front – both production history and the validation data in the forecast are matched • Benefits of the novel data driven geomodelling approach: – Reproduces realistic geological structure and anisotropy of property distribution. – Robust to noise in prior data – Relates to identifiable properties: continuity, correlation, prior belief in data, etc. • Model uncertainty is described by the inference of multiple models – Posterior confidence interval describes the forecast uncertainty – Uncertainty of the model parameters is quantified by posterior probability distributions
- 38. Multiple good fitting φ models Labelled (●) & unlabelled (+) data Seismic data Prior information Learning Machine (SVR)
- 39. Next Steps • Production uncertainty forecasting based on the inference of the generated HM models. • Extension to 3D case by adding one more input to the SVR model • Integrate other relevant data from outcrops and lab experiments
- 40. Aims Uncertainty quantification with a geomodel which is able to improve geological realism by more effective use of prior information • Explore robustness of semi-supervised SVR geomodel to noisy data • Develop a way to reproduce inherent uncertainty of the semi-supervised SVR geomodel by stochastic realisations • Integrate semi-supervised SVR geomodel into the Bayesian uncertainty quantification framework
- 41. Content • Motivation and Aims • Semi-supervised learning concept – Support Vector Machine (SVM) recap • Machine learning based geomodel – Noise pollution experiment – Inherent non-uniqueness of SVR-based model – SVR geomodel in Bayesian sampling framework • Conclusions
- 42. Impact of Noise in Seismic Data In a real case additional data (seismic) are usually noisier than in our synthetic case Seismic is processed through a low pass filter to build a manifold of unlabelled points: Elastic impedance Filtering low frequency Channel geo-manifold component from seismic defined by unlabelled points
- 43. Seismic Data Polluted with Noise Gaussian noise with zero mean and 3 different std.dev σ is added. N(0, σ) N(0, 2σ) N(0, 3σ) Truth case
- 44. Filtering Only a low frequency component is left after filtering N(0, σ) N(0, 2σ) N(0, 3σ) Truth case
- 45. Geo-manifold Unlabelled points are generated only in the cells below the threshold N(0, σ) N(0, 2σ) N(0, 3σ) Truth case
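Slides 42 to 45 describe a three-step pipeline: add Gaussian noise to the seismic attribute, keep only its low-frequency component, and place unlabelled points in the cells below a threshold. A one-dimensional sketch of that pipeline follows. The moving-average filter, the window size, the threshold value and the synthetic "channel" signal are all assumptions; the slides do not specify the filter actually used.

```python
import random


def moving_average(signal, half_width):
    # Simple low-pass filter: mean over a window centred on each cell.
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half_width): i + half_width + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def unlabelled_cells(signal, threshold):
    # Indices of the cells below the threshold, where unlabelled points go.
    return [i for i, v in enumerate(signal) if v < threshold]


rng = random.Random(1)
# Synthetic attribute: low values (0.0) mark a channel between cells 40 and 59.
clean = [0.0 if 40 <= i < 60 else 1.0 for i in range(100)]
noisy = [v + rng.gauss(0.0, 0.5) for v in clean]  # N(0, sigma) noise, sigma = 0.5
smooth = moving_average(noisy, half_width=4)
cells = unlabelled_cells(smooth, threshold=0.5)
```

Raising the noise level or narrowing the window makes the selected cells scatter more widely, which mirrors the eroding channel described on the later slides.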
- 46. Porosity SVR Estimates for Noisy Data Noise level: 1 σ Noise level: 2 σ Noise level: 3 σ Geo-manifold becomes less concentrated and the channel “erodes” as the noise level increases Truth case
- 47. Prediction with a Large Noise Level Noise level: 3σ Even with large noise levels the channel continuity can be traced in SVR prediction although it is barely visible in the input data Truth case
- 48. Impact of Inherent Non-uniqueness Stochastic realisations of water production from 6 largest producing wells
- 49. NA Sampling: Misfit Distribution Misfit of models generated by NA Lowest misfit = 188
- 50. NA Sampling: Parameter Distributions Histogram of parameter values for the generated models Models generated by NA home in the regions of good fit
- 51. Support Vector Machine (SVM) Linear separation problem • Margin hyperplanes: $w \cdot x + b = 1$ and $w \cdot x + b = -1$ • $\alpha_i = 0$: normal samples • $0 < \alpha_i < C$: support vectors (SV) • $\alpha_i = C$: support vectors that are untypical or noisy • Soft margin: $\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} \xi_i$, with slack variables $\xi_i \geq 0$ that allow noisy samples & outliers to lie inside or on the outer side of the margin • Trade-off between margin maximisation & training error minimisation • Increase space dimension to solve separation problem linearly
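The soft-margin objective on slide 51 can be evaluated for any candidate hyperplane using the standard hinge form of the slack, $\xi_i = \max(0, 1 - y_i (w \cdot x_i + b))$. The sketch below only scores a given `(w, b)`; it does not solve the optimisation problem, and the four sample points are made up for illustration.

```python
def slack(w, b, x, y):
    # xi = max(0, 1 - y * (w . x + b)); zero for samples on or outside the margin.
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def soft_margin_objective(w, b, samples, C):
    # 0.5 * ||w||^2 + C * sum of slacks: margin width traded against training error.
    norm_term = 0.5 * sum(wi * wi for wi in w)
    return norm_term + C * sum(slack(w, b, x, y) for x, y in samples)


# Illustrative samples (x, y) with labels y in {-1, +1}.
samples = [
    ((2.0, 0.0), 1),    # outside the margin: slack 0
    ((-2.0, 0.0), -1),  # outside the margin: slack 0
    ((0.5, 0.0), 1),    # inside the margin: slack 0.5
    ((0.5, 0.0), -1),   # on the wrong side: slack 1.5
]
w, b = (1.0, 0.0), 0.0
objective = soft_margin_objective(w, b, samples, C=10.0)
```

A larger `C` penalises the two violating samples more heavily, pushing the optimum toward fewer training errors at the cost of a narrower margin.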
