AI in Geosciences - Mineral Exploration and Beyond

1. THE TECHNOLOGIES
2. Neural Networks
3. PRACTICAL APPLICATIONS
4. Current & Future Applications
5. FINAL THOUGHTS
Max Howarth - IBM Canada 2018
PURPOSE:
PROVIDE A PRACTICAL OVERVIEW OF THE MOST COMMON AI TECHNOLOGY

1. THE CLASSIFICATION PROBLEM
2. REGRESSION-LIKE PROBLEM

NEURAL
NETWORK
IT’S A CAT

NEURAL
NETWORK
THERE’S 30
STRUCTURAL
MEASUREMENTS

NEURAL
NETWORK
YOU’LL SELL 70
HOT DOGS
Date        Temperature  P.O.P  Foot Traffic
21/09/2018  23           20%    100
22/09/2019  27           10%    321
23/09/2020  19           60%    125

NEURAL
NETWORK
14 DAY SALINITY
FORECAST:
680 uS/cm
Date       FLOW @ ST.1  FLOW @ ST.2  FLOW @ ST.3
1/09/2018  30           45           12
2/09/2019  22           43           15
3/09/2020  19           41           18

LOTS OF EMPIRICAL DATA
NOT A LOT OF WAYS TO TIE IT TOGETHER

THE GOOD
• Great for:
  • Pattern recognition
  • Nonlinear modelling
  • Classification
  • Association
  • Control
• Fast
• Don’t have to explicitly define an equation
• Learn from the past, adapt in the future

THE BAD
• Viewed as a panacea - but isn’t
• Often highly situational
• Often requires intense compute resources
• Can be hard to evaluate success

THE UGLY
• Intense data cleansing and preparation requirements
• “Grey box” solution
• Lack of standardized approaches
• Hockey stick learning curve

INPUTS → NEURON → OUTPUT
Inputs: X1, X2, X3
Weights: W1, W2, W3
Output: f(∑ Xn × Wn)
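The neuron on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the slide does not say what the activation f is, so a sigmoid is assumed here, and the function names are hypothetical. The example weights are the starting weights used later in the deck.

```python
import math

def sigmoid(z):
    """Assumed activation f; the slide does not name one."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, activation=sigmoid):
    """Compute f(sum of Xn * Wn) for a single neuron."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Inputs X1..X3 and weights W1..W3 (values taken from the training slides)
output = neuron_output([0, 0, 1], [0.8, -0.2, 0.7])
print(round(output, 3))
```

Swapping `activation` for another function (for example the identity) changes only the last step; the weighted sum is the same.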

Training
Inputs (X1 X2 X3)   Output
0 0 1               0
1 1 1               1
1 0 1               1
0 1 1               0
1 0 0               1
TRAIN / TEST

1. Set random initial starting weights
W1 W2 W3
0.8 -0.2 0.7

2. Calculate output using a training set
W1 W2 W3 Output
0.8 -0.2 0.7 0.7

3. Quantify error between calculated output and expected output
Error: -0.7
Error Weighted Derivative (EWD): -0.14814

4. Adjust training weights based on EWD times initial input value.
W1 W2 W3
0.8 -0.2 0.551
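The numbers on these slides can be checked with a short sketch of one weight update. It assumes the activation is a sigmoid, which the slides do not state; under that assumption the 0.7 shown as the output is the weighted sum before activation, and the error weighted derivative and adjusted W3 come out close to the slide values. Variable names are illustrative.

```python
import math

def sigmoid(z):
    """Assumed activation; not named on the slides."""
    return 1.0 / (1.0 + math.exp(-z))

inputs = [0, 0, 1]            # first training row
expected = 0                  # its expected output
weights = [0.8, -0.2, 0.7]    # step 1: starting weights

# Step 2: calculate the output
weighted_sum = sum(x * w for x, w in zip(inputs, weights))   # 0.7
output = sigmoid(weighted_sum)                               # about 0.668

# Step 3: quantify the error and weight it by the sigmoid slope
error = expected - output                                    # about -0.668
ewd = error * output * (1.0 - output)                        # about -0.148

# Step 4: adjust each weight by EWD times its input value
new_weights = [w + ewd * x for w, x in zip(weights, inputs)]
print(new_weights)
```

Only W3 moves, because X1 and X2 are zero for this training row.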

5. Repeat
W1 W2 W3
0.85 -0.15 0.56
×100
W1 W2 W3
0.99 0 0
OPTIMIZE
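Repeating the update over the whole training table might look like the sketch below. Several things here are assumptions rather than details from the slides: a sigmoid activation, the first four table rows used for training with the last row held back as the test, adjustments summed over all rows before being applied, and 10,000 repetitions. The sketch makes no attempt to reproduce the illustrative final weights shown on the slide.

```python
import math

def sigmoid(z):
    """Assumed activation; not named on the slides."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(inputs, weights):
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)))

def train(samples, weights, epochs):
    """Repeat steps 2 to 4 for every sample, `epochs` times."""
    weights = list(weights)
    for _ in range(epochs):
        adjustments = [0.0] * len(weights)
        for inputs, expected in samples:
            output = predict(inputs, weights)
            ewd = (expected - output) * output * (1.0 - output)
            for i, x in enumerate(inputs):
                adjustments[i] += ewd * x
        weights = [w + a for w, a in zip(weights, adjustments)]
    return weights

# Assumed split: first four rows train, last row tests
train_set = [([0, 0, 1], 0), ([1, 1, 1], 1), ([1, 0, 1], 1), ([0, 1, 1], 0)]
test_input = [1, 0, 0]

weights = train(train_set, [0.8, -0.2, 0.7], epochs=10000)
prediction = predict(test_input, weights)
print(weights, prediction)
```

In the training rows the expected output always equals X1, so a trained neuron should put most of its weight on W1 and predict a value near 1 for the test row.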

Try it out: ibm.biz/sabcs2018

Groundwater Model Approximation with Artificial Neural Network for Selecting Optimum Pumping Strategy for Plume Removal
Shreedhar Maskey, Yonas B. Dibike, Andreja Jonoski and Dimitri Solomatine
What is the optimal pumping rate to minimize time to cleanup?
P1 P2 P3
40 20 30

MODFLOW/MODPATH
G.O. TECHNIQUE
MINIMUM TIME: 3000 DAYS

MINUTES? HOURS?
THOUSANDS? MILLIONS?

NEURAL NETWORK
G.O. TECHNIQUE
TRAINING EXAMPLES FROM MODFLOW/MODPATH
TRAINING EXAMPLES << SIMULATION ITERATIONS
< SECOND
ORDER OF MAGNITUDE DECREASE IN TIME TO COMPLETE

WHAT WILL THE SALINITY LEVELS BE IN
14 DAYS?
RIVER MURRAY SYSTEM
ARTIFICIAL NEURAL NETWORKS IN HYDROLOGY. I: PRELIMINARY CONCEPTS
ARTIFICIAL NEURAL NETWORKS IN HYDROLOGY. II: HYDROLOGIC APPLICATIONS
ASCE Task Committee on Application of Artificial Neural Networks in Hydrology

USE FLOW, WATER LEVEL,
DISCHARGE, AND TEMPERATURE
DATA FROM HERE
TO PREDICT SALINITY
HERE

17 STATIONS
3 MEASUREMENT TYPES/STATION
~51 INPUT NODES
2 X 10 NEURON HIDDEN LAYERS
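The arithmetic behind this architecture is easy to check. The sketch below counts input nodes and trainable weights, assuming fully connected layers with one bias per neuron and a single output node for the 14 day salinity value; the slide gives only the input count and the two hidden layers, so the rest is an assumption, and the function name is hypothetical.

```python
def dense_parameter_count(layer_sizes, use_bias=True):
    """Count weights (and optional biases) in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection
        if use_bias:
            total += n_out         # one bias per neuron
    return total

stations = 17
measurement_types = 3
input_nodes = stations * measurement_types    # 17 x 3 = 51

# 51 inputs, two hidden layers of 10 neurons, 1 assumed output node
layer_sizes = [input_nodes, 10, 10, 1]
n_params = dense_parameter_count(layer_sizes)
print(input_nodes, n_params)
```

Under these assumptions the network has a few hundred trainable values, which is small by modern standards.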

1. GET THE DATA
2. PREPARE THE DATA
TARGET FORMAT
M1 M2 M3 … M50
0.8 -0.2 0.7 0.7 0.7

• Different stations collect different measurements
• Some measurements are blank…?
• How far do I go back…??
• How do I transform the data…???
• Stations started collecting on different dates…!
• Data needs to be normalized…!
TARGET FORMAT
M1 M2 M3 … M50
0.8 -0.2 0.7 0.7 0.7
Model Building: 35%
Data Preparation: 65%
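Two of the preparation questions above, blank measurements and normalization, can be illustrated with a small sketch. The choices here are assumptions, not the presenter's method: blanks are filled with the most recent earlier reading (forward fill) and values are rescaled to the range 0 to 1 (min-max scaling). The flow readings are made up for the example.

```python
def fill_blanks(series):
    """Replace None with the most recent earlier value (forward fill).
    Leading blanks stay None because there is nothing to copy from."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def min_max_normalize(series):
    """Rescale known values to the range 0..1; None is passed through."""
    known = [v for v in series if v is not None]
    lo, hi = min(known), max(known)
    if hi == lo:
        return [0.0 if v is not None else None for v in series]
    return [(v - lo) / (hi - lo) if v is not None else None for v in series]

# Hypothetical flow readings from one station, with two blanks
flow = [30, None, 22, 19, None, 41]
normalized = min_max_normalize(fill_blanks(flow))
print(normalized)
```

Other fill strategies (interpolation, dropping rows) and other scalings are equally plausible; which one suits a given station record is part of the preparation effort the slide describes.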

Dataset and code are available for you to try on Box: ibm.biz/sabcs2018

Did it work?
25000 EPOCHS
SOME CONSIDERATIONS
• Network architecture was largely arbitrary.
• Data cleansing was quick, and not thorough.
• 1st iteration results - typically do 10s to 100s of iterations
• Very promising capabilities.

A 3-dimensional
convolutional neural
network designed to work
with categorical and
continuous variables
PREDICTIONS

© 2018 IBM Corporation IBM Services
THE RESULTS
REDACTED

• Team made up of:
  • Engineers (geological, materials, mining, etc.)
  • Scientists (geophysicists, astrophysicists, biologists, etc.)
  • Developers
  • Data Scientists
• Like a startup backed by the power of IBM
• Always start with a Proof of Concept (PoC)
• Close working teams for successful outcomes
10X RETURN ON INVESTMENT

1. BE MINDFUL OF YOUR DATA
2. BUILD A USE CASE FOR THE DATA YOU HAVE
3. REPLACE LEGACY CONVENTIONAL MODELLING TECHNIQUES
4. START SMALL
5. DEFINE SUCCESS
6. GET A DATA SCIENTIST

WHY THIS USE CASE?
• Handwritten documents have more variability.
• Handwritten documents are typically lower quality.
• Handwritten documents do not have a consistent orientation.
• Maps are coloured in.
• Maps are done underground, so documents are often damaged.

USE CASE OBJECTIVES
1. Find a symbol
2. Classify a symbol
3. Find associated dip measurement
4. Read dip measurement
5. Determine strike angle
6. Georeference symbol
7. Read metadata

THE TECHNOLOGY: IMAGE CLASSIFIERS
IMAGE
CLASSIFIER
[cat, 0.98]
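A pair like [cat, 0.98] is a label plus a confidence. One common way to get there, assumed here since the slide does not say how the classifier produces it, is to turn raw class scores into probabilities with a softmax and report the most likely class. The scores and labels below are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return [label, confidence] for the highest-probability class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return [labels[best], round(probs[best], 2)]

# Hypothetical raw scores from the last layer of a cat / not-cat classifier
result = classify([4.0, 0.1], ["cat", "not cat"])
print(result)   # ['cat', 0.98]
```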

IMAGE CLASSIFIERS: TRAINING
IMAGE CLASSIFIER
These are all cats.
These are all not cats.
10K of each

CONVOLUTION LAYER:
Learning complex patterns from the input patterns.
POOLING LAYER:
Reduce the spatial size of the representation to reduce the number of parameters and the amount of computation in the network.
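The two layer types can be shown on a tiny image in plain Python. This is an illustrative sketch only: the 5×5 image, the hand-written vertical-edge kernel and the 2×2 pooling window are assumptions chosen for the example, whereas a real network learns its kernels during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum
    the element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A 5x5 image containing a bright vertical stripe
image = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
vertical_edge = [[-1, 1], [-1, 1]]             # responds to dark-to-bright steps

features = convolve2d(image, vertical_edge)    # 4x4 feature map
pooled = max_pool2d(features)                  # 2x2 after pooling
print(features[0], pooled)
```

The convolution highlights where the stripe begins, and pooling keeps that response while quartering the number of values passed to the next layer.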

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. Lee, Grosse, Ranganath, Ng

CHALLENGES
• Symbols are very small
• Everyone has different handwriting
• Orientation varies
• Colouring obscures symbol clarity
• Lots of unrelated symbols
• Huge variation in quality and condition of images

PROCESSING PIPELINE (diagram)
Input: Image
Output: Information (symbol objects, geo-reference coordinates, metadata)
Branches:
• Symbol detection (Objectives 1, 2, 3, 4)
• Coordinates detection (Objective 6)
• Metadata extraction (Objective 7)
Components shown:
• Faster RCNN (main object detection)
• Faster RCNN (digit detection)
• CNN (detecting digits/alphabets per contour)
• CNN (detecting angles for each digit to get orientation)
• CNN (detect orientation of objects)
• OCR (detecting digits/letters per contour)
• Image preprocessing (denoising, enhancing, etc.)
• Image manipulation (cropping, etc.)

SOME MORE EXAMPLES

THE PROBLEM:
Target identification and
prioritization is expensive,
time-consuming, and risky.

THE PROMISE:
Predictive models use large
volumes of historical data to
determine the likelihood of
future outcomes.

THE PROMISE:
Predictive models use large
volumes of geological data to
determine the likelihood of
mineralization.

WHY PREDICTIVE MODELLING?
• LARGE VOLUMES OF DATA
• DATA QUALITY
• SUBJECTIVITY

MINERAL EXPLORATION CONDENSED
DATA DRIVEN: What does my survey data tell me?
KNOWLEDGE DRIVEN: What do I know about the geological setting?
• More data than a human can reasonably consume
• Can be affected by human bias
• Requires extensive experience & education
• Changes from person to person
SWEET SPOT
• Largely data driven models
• Tribal knowledge embedded in data representation
• Domain knowledge represented in model construction

MINERAL EXPLORATION PROCESS
SEARCH
• Data is usually disparate and siloed - geologists have
to aggregate it from multiple sources.
PREPARE
MODEL
• Geologists examine the data, interpolate, and create
3D models to inform further exploration and mining
activities.

UP TO 70% OF A GEOLOGIST’S TIME

PREDICTIVE MODELLING FRAMEWORK
PREREQUISITES FOR AI
• Large quantity of data
• Data is cleaned
• Data is structured and organized
• Business objectives are understood

70% OF A DATA SCIENTIST’S TIME

AI HELPS GEOLOGISTS
WORK BETTER HERE
SO MORE TIME CAN BE
SPENT HERE

AI INFORMS THE MODELLING PROCESS BY
ALLOWING EXPERIMENTS TO BE TESTED

SPEND LESS TIME PREPARING DATA
3D GIS PLATFORM FOR
MODELLING
• Data is aggregated, correlated, and stored
in a manner that is conducive to both
geological and predictive modelling.
NON-INVASIVE
• Continue to use your existing tools and
software to collect data - no breaking of
business processes.
NECESSARY
• Clean, organized data is a requirement for
modelling - maximize the value of your
preparation activities.

PREDICTIVE MODELLING REQUIREMENTS
Criteria
1. Consume large amounts of data.
2. Can use geospatial information (i.e. work in 3D).
3. Can use categorical variables.
4. Low requirement for knowledge engineering.
5. Can be trained on a specific area (e.g. brownfield).
6. Can be trained on a non-specific area (e.g. greenfield).

CONVOLUTIONAL NEURAL NETWORKS
Criteria
1. Consume large amounts of data.
2. Can use geospatial information (i.e. work in 3D).
3. Can use categorical variables.
4. Low requirement for knowledge engineering.
5. Can be trained on a specific area (e.g. brownfield).
6. Can be trained on a non-specific area (e.g. greenfield).

WATSON FOR GEOLOGY PREDICTIVE MODELS
A 3-dimensional convolutional neural
network designed to work with
categorical and continuous variables
Network architecture designed with
recognition of high level geological
features in mind - training the model
adds further context.
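The slide does not say how categorical and continuous variables are combined, so the following is only one plausible sketch: each voxel of a 3D block is given one channel per rock type (one-hot encoding) plus one channel for a scaled continuous measurement. The rock types, the grade values and all function names are hypothetical.

```python
ROCK_TYPES = ["basalt", "granite", "shale"]   # hypothetical categories

def one_hot(category, categories):
    """Encode a categorical value as a list with a single 1.0."""
    return [1.0 if category == c else 0.0 for c in categories]

def encode_voxel(rock_type, grade, lo, hi):
    """One channel per rock type plus one min-max scaled continuous channel."""
    scaled = (grade - lo) / (hi - lo)
    return one_hot(rock_type, ROCK_TYPES) + [scaled]

def encode_block(block, lo, hi):
    """block[z][y][x] = (rock_type, grade) -> nested list [z][y][x][channel]."""
    return [
        [[encode_voxel(rock, grade, lo, hi) for rock, grade in row]
         for row in layer]
        for layer in block
    ]

# A hypothetical 1 x 2 x 2 block of (rock type, grade) voxels
block = [[
    [("basalt", 0.0), ("granite", 2.0)],
    [("shale", 4.0), ("basalt", 1.0)],
]]
encoded = encode_block(block, lo=0.0, hi=4.0)
print(encoded[0][0][1])   # [0.0, 1.0, 0.0, 0.5]
```

A grid encoded this way has the shape a 3D convolutional layer expects: three spatial axes plus a channel axis.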

FEATURE ENGINEERING & DATA EXTRACTION
Can we teach the model about ternary
diagrams?
Can we leverage even more data?
DRILL LOG COMMENT COMPREHENSION
MAP ANALYSIS

NEXT STEPS
GREENFIELDS BROWNFIELDS RESOURCES EXPANSION
MODEL TESTED Q1 2019
