E&P organizations are turning more attention to accumulated data to enhance operating efficiency, safety, and recovery. The computing paradigm is shifting, the O&G paradigm is shifting, and the rise of the machine learning paradigm requires careful attention to top-down integrated systems engineering. A system approach will be presented to stimulate out-of-the-box thinking to address the machine learning paradigm.
Machine Learning encompasses data acquisition, transmission, retention, analysis, and reduction. The expected outgrowth of 24x7 data systems and operations centers is Knowledge Engineering and Data Intensive Analytics AKA Machine Learning. This presentation will develop and apply Machine Learning concepts to the Upstream O&G industry. Specific focus will be given to the fundamental concepts and definitions of Machine Learning along with the application of Machine Learning.
• Efficient O&G does not suffice in an industry downturn; effective investment of time and effort is required to rise above the pack.
• Production analysis need not be mystical, but it should not be rote.
• Nuance and subtle variations provide leading indicators of impending production issues.
• Decline curves, certainly crucial, must be analyzed in context (a minimal fitting sketch follows below).
• Case-based analysis, topological analysis, rule inference, and curve-plotting solutions are common, but fall short.
• Application of nuance analysis within an environment of Data-Intensive Scientific Discovery.
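To make the decline-curve point concrete, here is a minimal sketch that fits a hyperbolic Arps decline model to monthly production with SciPy. The rates, noise level, and parameter bounds are illustrative assumptions, not data from this presentation; the point is that the fitted curve gives a baseline against which nuance (residual behavior) can be examined.

```python
# Minimal sketch: fitting an Arps hyperbolic decline curve to monthly production.
# The production values below are synthetic placeholders, not real well data.
import numpy as np
from scipy.optimize import curve_fit

def arps_hyperbolic(t, qi, di, b):
    """Arps hyperbolic decline: q(t) = qi / (1 + b*di*t)^(1/b)."""
    return qi / np.power(1.0 + b * di * t, 1.0 / b)

t = np.arange(24)                                  # months on production
q = 1000.0 / np.power(1.0 + 0.6 * 0.15 * t, 1.0 / 0.6)
q = q * (1.0 + 0.05 * np.random.default_rng(0).standard_normal(t.size))  # noisy "observed" rates

params, _ = curve_fit(
    arps_hyperbolic, t, q,
    p0=[q[0], 0.1, 0.5],                           # initial guesses for qi, di, b
    bounds=([0.0, 1e-4, 0.01], [np.inf, 2.0, 2.0]),
)
qi, di, b = params
print(f"qi={qi:.1f}, di={di:.3f}/month, b={b:.2f}")

# Residuals against the fitted curve are one place to look for the subtle,
# leading indicators of impending production issues mentioned above.
residuals = q - arps_hyperbolic(t, *params)
print("max |residual|:", np.abs(residuals).max())
```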
High Performance Data Analytics and a Java Grande Run Time, by Geoffrey Fox
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development.
However, the same is not so true for data-intensive computing, even though commercial clouds devote many more resources to data analytics than supercomputers devote to simulations.
Here we use a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures.
We propose a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks.
Our analysis builds on the Apache software stack that is well used in modern cloud computing.
We give some examples including clustering, deep-learning and multi-dimensional scaling.
One suggestion from this work is the value of a high-performance Java (Grande) runtime that supports both simulations and big data.
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C..., by Geoffrey Fox
Describes the relations between Big Data and Big Simulation applications and how these can guide a Big Data - Exascale (Big Simulation) convergence (as in the National Strategic Computing Initiative) and lead to a "complete" set of benchmarks. The basic idea is to view use cases as "Data" + "Model".
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ..., by Geoffrey Fox
Keynote at the Sixth International Workshop on Cloud Data Management (CloudDB 2014), Chicago, March 31, 2014.
Abstract: We introduce the NIST collection of 51 use cases and describe their scope over industry, government and research areas. We look at their structure from several points of view or facets covering problem architecture, analytics kernels, micro-system usage such as flops/bytes, application class (GIS, expectation maximization) and very importantly data source.
We then propose that in many cases it is wise to combine the well known commodity best practice (often Apache) Big Data Stack (with ~120 software subsystems) with high performance computing technologies.
We describe this and give early results based on clustering running with different paradigms.
We identify key layers where HPC Apache integration is particularly important: File systems, Cluster resource management, File and object data management, Inter process and thread communication, Analytics libraries, Workflow and Monitoring.
See
[1] A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures, Shantenu Jha, Judy Qiu, Andre Luckow, Pradeep Mantha and Geoffrey Fox, accepted in IEEE BigData 2014, available at: http://arxiv.org/abs/1403.1528
[2] High Performance High Functionality Big Data Software Stack, G Fox, J Qiu and S Jha, in Big Data and Extreme-scale Computing (BDEC), 2014. Fukuoka, Japan. http://grids.ucs.indiana.edu/ptliupages/publications/HPCandApacheBigDataFinal.pdf
Big Data HPC Convergence and a bunch of other things, by Geoffrey Fox
This talk supports the Ph.D. in Computational & Data Enabled Science & Engineering at Jackson State University. It describes related educational activities at Indiana University, the Big Data phenomena, jobs and HPC and Big Data computations. It then describes how HPC and Big Data can be converged into a single theme.
An introduction to data science: from the very beginning of the data science idea, to the latest designs, changing trends, and technologies, through to applications already in real-world use today.
Classification of Big Data Use Cases by different Facets, by Geoffrey Fox
Ogres classify Big Data applications by multiple facets, each with several exemplars and features. This gives a guide to the breadth and depth of Big Data and allows one to examine which Ogres a particular architecture/software stack supports.
Adding Open Data Value to 'Closed Data' Problems, by Simon Price
Drawing on cutting edge examples from the University of Bristol and the City of Bristol, Simon will discuss innovative applications of data science that derive business value from open data through enriching and integrating with confidential 'closed data'. He also highlights recent technological advances that are enabling open data science on highly sensitive closed data.
Predictive Analytics: Context and Use Cases
Historical context for the successful implementation of predictive analytic techniques, with examples of successful use cases.
Multiple regression, covid mobility, and Covid-19 policy recommendation, by Kan Yuenyong
Multiple regression analysis and Covid-19 policy is the contemporary agenda. It demonstrates how to use Python for data wrangling and R for statistical analysis, in a form suitable for publication in a standard academic journal. The model examines whether lockdown policy is relevant to controlling the Covid-19 outbreak.
Artificial Intelligence, Machine Learning and Deep Learning, by Sujit Pal
Slides for a talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group in San Francisco. My part of the talk is covered in slides 19-34.
Machine Learning in Oil and Gas - April 18-19, 2018, by Mark Reynolds
The New IT Paradigm in Data Driven Energy
Three years ago, Machine Learning, the 4th Scientific Paradigm, and eScience were seldom discussed in O&G. Today, these topics and the role of data, data knowledge, and artificial intelligence are found in planning, engineering, and improvement. The New IT Paradigm and the pragmatic realities of Data Driven Energy are explored in this presentation.
Azure Machine Learning is a cloud predictive analytics service that makes it possible to quickly create and deploy predictive models as analytics solutions. Getting started is easy: the first working prototype is an easy evening project. But Azure Machine Learning will scale to extremely complex projects. This session will demonstrate initial projects utilizing multiple data science principles.
Comparing Big Data and Simulation Applications and Implications for Software ..., by Geoffrey Fox
At eScience in the Cloud 2014, Redmond WA, April 30 2014
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development.
However, the same is not so true for data-intensive computing, even though commercial clouds devote many more resources to data analytics than supercomputers devote to simulations.
We look at a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures.
We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks.
Our analysis builds on combining HPC and the Apache software stack that is well used in modern cloud computing.
Initial results on Azure and HPC Clusters are presented
How Data Science Can Grow Your Business? by Noam Cohen
What is data science?
How is it used in the industry?
DS methodology and life cycle
Who are the Data-team members?
Limitations and caveats
(**Google slides upload didn't go well)
DataOps @ Scale: A Modern Framework for Data Management in the Public Sector, by TamrMarketing
Within the last 6 months, U.S. agencies have begun defining a “Data Science Occupational Series”.
This means adding the term “(Data Scientist)” at the end of a job title to increase the odds of finding a candidate that understands data.
Watch the full presentation: https://resources.tamr.com/govdataops
Introduction to Data Mining (Chapter 1): Data Mining Concepts and Techniques, by R. Deepa (IT), Batch (2016-2019), published Oct 13, 2018, NS College of Arts and Science, Theni.
Data mining and Machine learning explained in jargon free & lucid language, by q-Maxim
Data mining and machine learning explained in jargon-free and lucid language. By reading it, one can get some intuition about what data mining and machine learning are all about and apply it in their own work.
Big data is data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, and information privacy.
Hadoop Training in Chennai from BigDataTraining.IN: BigDataTraining.IN is a leading global talent development corporation, building a skilled manpower pool for global industry requirements. BigDataTraining.IN has today grown to be among the world's leading talent development companies, offering learning solutions to individuals, institutions & corporate clients.
Adjusting primitives for graph: SHORT REPORT / NOTES, by Subhajit Sahu
Graph algorithms, like PageRank. Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23..., by John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptx, by Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
2. 1
Introduction to Southwestern Energy
Southwestern Energy Company (NYSE: SWN) is a leading natural gas and oil company with operations predominantly in the United States, engaged in exploration, development and production activities, including related natural gas gathering and marketing.
Source: http://www.swn.com/
3. 2
Machine Learning, Deep Learning, AI
Roadmap to Constructing a Top-Down Machine Learning Paradigm
E&P organizations are turning more attention to accumulated data to enhance operating efficiency, safety, and recovery. The computing paradigm is shifting, the O&G paradigm is shifting, and the rise of the machine learning paradigm requires careful attention to top-down integrated systems engineering. A system approach will be presented to stimulate out-of-the-box thinking to address the machine learning paradigm.
4. 3
The Shifting O&G Paradigm
Past Paradigm Shifts
• Seismic
• Horizontal Drilling
• Off Shore
• Factory Drilling
Paradigm Shifts in Process
• Big Crew Change
• Mobility (anytime, anywhere)
• Big Data
• Machine Learning
Source: Mark Reynolds, compilation
5. 4
Paradigms We Are Discussing Today
Changing Paradigms
• Computing Paradigm (4th Paradigm / eScience)
• O&G Paradigm (Shale 2.0)
New Paradigms
• Machine Learning Paradigm
6. 5
The Structure of Scientific Revolutions
• Normal Science
– Equilibrium, harmony
• Model Drift
– Outliers cease to be outliers
– Ripples turn to discontinuity
• Model Crisis
– Alternate methods permitted
– Out-of-the-box reconsidered
• Model Revolution
– New model becomes the new-normal
• Paradigm Change
– (Textbooks play catch-up)
Source: Thomas Kuhn, (1962) The Structure of Scientific Revolutions. University of Chicago Press
Mark Reynolds, compilation
(Figure: the Kuhn Cycle - Normal Science, Model Drift (Anomaly), Model Crisis, Model Revolution, Paradigm Change)
7. 6
The Shifting Computing Paradigm
(Figure: the progression from Traditional Science to eScience)
• Descriptive and Formulaic
• Hypothetical and Investigative
• Expertise Driven Models and Cases
• Multivariant Differential Modelling
Source: Mark Reynolds, compilation
8. 7
The Shifting Computing Paradigms
• Empirical: O&G is where we found it
• Theoretical: O&G is where we expect it
• Computational: O&G is where we estimate it
• Data Exploration: O&G is where we infer it
Source: Mark Reynolds, compilation
9. 8
The Machine Learning Paradigm
“A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.” ~Tom Mitchell
Source: Mitchell, T. (1997). Machine Learning. McGraw Hill; Mark Reynolds, compilation
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
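As a minimal, hedged illustration of Mitchell's (T, E, P) definition, the sketch below frames an invented prediction task in those terms and shows performance P improving as experience E grows; the features, data, and model choice are assumptions for illustration, not part of this presentation.

```python
# Mitchell's (T, E, P), illustrated with invented data:
# T = predict a target from three inputs, E = training rows seen so far, P = mean absolute error.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(7)
X = rng.uniform(size=(300, 3))
y = 40 * X[:, 0] - 15 * X[:, 1] + 5 * X[:, 2] + rng.normal(scale=1.0, size=300)
X_hold, y_hold = X[200:], y[200:]                 # fixed evaluation set for P

for n in (10, 50, 200):                           # growing experience E
    model = LinearRegression().fit(X[:n], y[:n])
    p = mean_absolute_error(y_hold, model.predict(X_hold))
    print(f"E = {n:3d} examples -> P (MAE) = {p:.2f}")
# Performance at task T, measured by P, improves as experience E grows.
```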
10. 9
Machine Learning in the 4th Paradigm
The Catalyst
• Data captured by instruments
• Data generated by simulations
• Data acquired by sensor networks
The Destination
• Solutions from data analysis
• Solutions from data mining
• Solutions from visualization
• Solutions from drill down
• Solutions for bottom line
• Solutions using eScience
Source: Mark Reynolds, compilation
eScience and the Fourth Paradigm: Data-Intensive Scientific Discovery and Digital Preservation, Tony Hey, Microsoft Research
http://www.alliancepermanentaccess.org/wp-content/uploads/2011/12/apa2011/15_%28Nov11%29TonyHey-APA%20Meeting.pdf
“eScience is the set of tools and technologies to support data federation and collaboration” ~Jim Gray
11. 10
Precursors to Machine Learning
Predictive Analytics
• Focuses on Prediction
– Based on Known Properties
– Learned from Training Data
Data Mining
• Focuses on Discovery
– Unknown Properties in Data
– The Analysis Phase of Knowledge Discovery
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data” ~Mark Reynolds
Source: Mark Reynolds, compilation
12. 11
The Machine Learning Paradigm
Unsupervised Learning
Supervised Learning
Semi-Supervised Learning
Reinforcement Learning
(Diagram: 24/7 - Predictive Analytics, Data Mining, Machine Learning, AI)
Source: Mark Reynolds, compilation
13. 12
Principal Concepts in Machine Learning
• Unsupervised Learning
– Data is unlabeled
• Supervised Learning
– Teach and train with data that is well labeled with a defined output
• Reinforcement Learning
– The validity of data alignment serves as feedback
• Semi-Supervised Learning
– Some of the data is labeled, some is unlabeled
Source: Mark Reynolds, compilation
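To make the contrast between these learning modes tangible, here is a minimal sketch, assuming synthetic blob data and conventional scikit-learn estimators (none of which are prescribed by the presentation): supervised learning uses the labels, unsupervised learning ignores them.

```python
# Supervised vs. unsupervised learning on the same synthetic data (illustrative only).
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: the labels y are part of the training signal.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: only X is used; the algorithm discovers its own grouping.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments for first 5 points:", km.labels_[:5])

# Semi-supervised and reinforcement learning fall between/beyond these:
# partial labels, or a reward signal serving as feedback, respectively.
```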
15. 14
The Bridge Into Machine Learning
(Diagram: Integrated Systems Engineering as the bridge from Today to Tomorrow)
16. 15
Integrated Systems Engineering
(Diagram: the Systems & Knowledge Engineer connects O&G Systems, Control Systems, Remote Systems, Information Systems, Embedded Systems, Robotic Systems, Data Fusion, Real-Time Systems, Look-Back Analysis, and Look-Ahead Systems, spanning Land and Regulatory, Geology, Geophysics, Drilling Engineering, Completion Engineering, Production Engineering, Reservoir Engineering, and Systems Engineering)
Source: Mark Reynolds, compilation
17. 16
Integrated Engineering – Top-Down
• Engineering the Source
– Signals, content, and characterizations
• Engineering the Data
– Address errant data
– Address valid spurious data
– Address data quality
• Engineering the Store
– Repository
– Recall and Reporting
– Representations
Data Acquisition
Data Transmission
Data Retention
Data Analysis
Data Reduction
Source: Mark Reynolds, compilation
18. 17
Integrated Engineering – Top-Down
• Engineering the Store
– Data distribution
– Data staging
• Engineering the Recall
– Simple query
– Cube v Matrix
• Engineering the Use Case
– Destination: human
– Destination: machine
Classification
Regression
Clustering
Density Estimation
Dimensional Reduction
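As a hedged reference for the five task families listed on this slide, the sketch below pairs each with one conventional scikit-learn estimator; the pairings and comments are illustrative assumptions, not choices made by this presentation.

```python
# One conventional scikit-learn estimator per task family (illustrative pairings).
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans
from sklearn.neighbors import KernelDensity
from sklearn.decomposition import PCA

task_families = {
    "Classification":        LogisticRegression(),   # discrete labels (e.g. event / no event)
    "Regression":            LinearRegression(),     # continuous targets (e.g. rate, pressure)
    "Clustering":            KMeans(n_clusters=3),   # grouping unlabeled observations
    "Density Estimation":    KernelDensity(),        # modeling the distribution of the data
    "Dimensional Reduction": PCA(n_components=2),    # compressing many sensors to a few factors
}

for family, estimator in task_families.items():
    print(f"{family:22s} -> {type(estimator).__name__}")
```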
19. 18
Integrated Engineering – System Flow
Acquire -> Analyze -> Annunciate -> Archive -> Analyze -> Anticipate -> Apply
(Diagram labels: Data, Information, Visualization, Knowledge, Forensics, Understanding, Analysis & Mining, Wisdom, Anticipating, Application)
Creating Informational Accessibility and Transparency
Discovering Experiential Performance Improvements
Segmenting Processes and Process Results
Replacing Human Decision w/ Automated Algorithms
Innovating New Models, Products, Services
Source: Mark Reynolds, compilation
20. 19
Integrated Engineering – Top-Down
(Diagram: Source Capture and Utilization, Data Modeling, Proactive & Closed-Loop Systems, Mining and Analytics, Forensics, Control, Visualization and Observation)
• Intelligence during operations (Observation and Anticipation)
• Intelligence reviewing operations (Forensic)
• Intelligence planning operations (Historical and Analytical)
Source: Mark Reynolds, compilation
(Diagram: Well Plan, RT Prod, RT Drill, Geo-steer, RT Frac, Daily Rpts, AFE)
21. 20
Applied Machine Learning 101
Learning (Phase 1): Training Data -> Pre-Processing -> Learning -> Error Analysis -> Model
Prediction (Phase 2): New Data -> Model -> Predictable Result
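A minimal sketch of the two phases on this slide, under assumed synthetic data and an arbitrary model choice: Phase 1 trains a preprocessed model and inspects its errors on held-out rows, Phase 2 applies the resulting model to new data.

```python
# Applied ML 101 as code (illustrative, synthetic data): Phase 1 learn, Phase 2 predict.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                       # training data (placeholder features)
y = X[:, 0] * 3 - X[:, 1] + rng.normal(scale=0.5, size=500)

# Phase 1: Training Data -> Pre-Processing -> Learning -> Error Analysis -> Model
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), RandomForestRegressor(random_state=0))
model.fit(X_train, y_train)
errors = y_test - model.predict(X_test)             # error analysis on held-out data
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))

# Phase 2: New Data -> Model -> Predictable Result
new_data = rng.normal(size=(3, 4))
print("predictions:", model.predict(new_data))
```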
22. 21
Representative Algorithms
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
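As one concrete instance of the first item in this list, here is a minimal sketch that fits a small decision tree mapping observations to conclusions; the features and labels are invented placeholders, not anything from the deck.

```python
# Decision tree learning: mapping observations to conclusions (invented placeholder data).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
# Print the learned if/then rules, i.e. the observation-to-conclusion mapping.
print(export_text(tree, feature_names=[f"feature_{i}" for i in range(4)]))
```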
23. 22
Machine Learning: Data Diversity
• Macro (or field-level)
– Spatial
– Temporal
• Pad (or offset)
– Spatial
– Temporal
• Well (or wellbore)
– Spatial
– Temporal
• External
– Uploads
– Political, Climate, etc
• The 3 Cs of Data Quality
– Consistency
– Correctness
– Completeness
– [#4] Currency
– [#5] Conformity
Source: Mark Reynolds, compilation
Data Diversity - Spatial, Temporal, Referential
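A hedged pandas sketch of the data-quality Cs on a toy sensor table follows; the column names, values, and thresholds are assumptions for illustration only.

```python
# Toy data-quality checks for Completeness, Correctness, Consistency, and Currency.
# Column names and the valid range are assumed for illustration.
import pandas as pd

df = pd.DataFrame({
    "well":      ["A-1", "A-1", "B-2", "B-2"],
    "timestamp": pd.to_datetime(["2018-04-01", "2018-04-02", "2018-04-01", "2018-04-01"]),
    "rate_mcfd": [1200.0, None, 950.0, -10.0],
})

completeness = df["rate_mcfd"].notna().mean()                     # share of non-missing values
correctness  = df["rate_mcfd"].between(0, 50_000).mean()          # share inside a plausible range
consistency  = 1 - df.duplicated(["well", "timestamp"]).mean()    # share of non-duplicate keys
currency     = (pd.Timestamp("2018-04-03") - df["timestamp"].max()).days

print(f"completeness={completeness:.2f}, correctness={correctness:.2f}, "
      f"consistency={consistency:.2f}, days since latest record={currency}")
```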
24. 23
The Fast Data ecosystem in O&G
(Diagram: Land, Drilling, Reservoir, Completion, Water, Production, Steering, Regulatory, Midstream)
Source: Assorted web images
25. 24
Algorithmic Approaches (revisited)
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
26. 25
Keep Your Eye on the Prize
Data -> Information -> Knowledge -> Understanding -> Wisdom -> Application
The question is NOT “How can we … ?” but instead “What is the objective?” (or “Why?”)
27. 26
Mark Reynolds
Mark Reynolds Vitae
• Southwestern Energy
• Lone Star College
• Intent Driven Designs
• Scan Systems
• Sikorsky Aircraft
• General Dynamics
• Southwestern Energy Email
– Mark_Reynolds@swn.com