Climate Informatics
Brian Reich, NCSU
SAMSI, 9/19/2017
Brian Reich, NC State Climate Informatics 1 / 34
Outline
Conceptual views of statistics
Definitions
Example of unsupervised learning: PCA
Example of supervised learning: Deep neural networks
Data challenge!
Conceptual view of parametric statistics
Last week, Prof Haran discussed an ice sheet study that
was a great example of parametric modeling
They assumed that the entire glacial system was known up
to a few parameters
The role of statistics is then to compare the observed data
with the model output to refine our understanding of these
parameters
This delivers new scientific insights, at least within the
current paradigm
Conceptual view of linear regression
For more complex systems we will not know the system up
to a few parameters
We might then use a first-order approximation via linear
regression
If the approximation is obviously wrong, we might try to
patch it up with, e.g., quadratic terms
This is now an approximate system that is known up to a
few parameters (slopes, variances, etc.)
If we have a decent statistical representation of the system
we can carefully interpret the parameters scientifically
Conceptual view of inferential statistics
Rather than model the entire system, we can conduct
experiments to learn about specific relationships
For example, it is impossible to model the entire biological
system that leads to cancer under different treatments
Instead, we might conduct a randomized clinical trial to
compare treatments
This doesn’t necessarily add to our biological
understanding of cancer, but it is surely useful
Conceptual view of machine learning
The premise of machine learning is that we can train an
algorithm to mimic how a complex system operates without
understanding the fundamental science
Last week you called this a statistical emulator
Often this requires a huge amount of training data
The algorithm that mimics reality may be a black box that is
not a function of parameters or equations we can interpret
However, in many applications, prediction even without
scientific understanding is powerful
Example: Short-term weather forecasting
Scientist scientist: study physics, chemistry, etc. and
encode this in a mathematical model that takes the current
state and projects it forward
Data scientist: Dump 100M observations of current met
variables into a deep learning algorithm and make
statistical predictions
What are the advantages and disadvantages of each
approach?
When to use parametric stats vs machine learning
A made-up example
https://satelliteliaisonblog.com/2017/03/03/using-goes-16-to-detect-wildfires/
I made up some fake data in the R workspace trainingdata
It is meant to represent taking a bunch of subregions and
recording for each snapshot i = 1, ..., 10K
$Y_i$: a measure of fire in the region
$X_{ij}$: the gray scale of pixel $j = 1, ..., 100$
Goal: Predict the response given the image predictor
A good spot for machine learning?
Flow chart of machine learning algorithms (from SAS)
Definitions - Unsupervised learning
The data consists of several variables, but none are the
“response”
That is, we have X but not Y
Usually unsupervised methods try to identify the main
patterns in the variables
Clustering: put the n observations into L < n clusters
Principal components analysis (PCA): explain the
correlations between variables concisely
Definitions - Supervised learning
The data consists of both independent variables X and
dependent variables Y
The goal is to study the effects of X on Y and/or predict Y|X
Regression is the obvious example, where we estimate
E(Y|X) = f(X)
Examples: Linear regression, trees, nets
Classification is another example, where the data are from
Q unordered classes, Y ∈ {1, ..., Q}
The goal is to assign Y to a class based on X
Examples: Logistic regression, support vector machines
Principal components analysis (PCA)
Say we have a collection of variables $X = (X_1, ..., X_p)^T$
For simplicity, assume all p variables are centered (mean
zero) and scaled (variance one)
We observe n samples from their joint distribution X1, ..., Xn
Linear relationships are summarized by the p × p sample
correlation matrix,
S ≈ Cor(X)
If p = 1000, S is huge and hard to interpret
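As a minimal sketch of how the sample correlation matrix on this slide could be computed, the following pure-Python code centers and scales each column and then averages the cross-products. The function names and the toy data are made up for illustration and are not from the lecture.

```python
import math

def standardize(rows):
    """Center each column to mean zero and scale it to sample variance one."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(rows):
    """Return the p x p sample correlation matrix S of an n x p data set."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    return [[sum(r[j] * r[k] for r in z) / (n - 1) for k in range(p)]
            for j in range(p)]

# Toy data (made up): column 2 is exactly twice column 1, column 3 is unrelated.
data = [[1.0, 2.0, 5.0], [2.0, 4.0, 3.0], [3.0, 6.0, 6.0], [4.0, 8.0, 2.0]]
S = correlation_matrix(data)
```

With p variables the matrix has p(p + 1)/2 distinct entries, which is why it becomes hard to read directly once p reaches the thousands.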
PCA
The eigen decomposition of a matrix can be used to
approximate the full matrix with a few vectors
This dimension reduction highlights the most important
trends
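Denote the eigen decomposition as $S = \Gamma \Lambda \Gamma^T$
$\Gamma$’s columns, $\gamma_1, ..., \gamma_p$, are the $p$ orthonormal eigenvectors
That is, $\gamma_j^T \gamma_k = 0$ for $j \neq k$ and thus $\mathrm{Cor}(\gamma_j^T X, \gamma_k^T X) = 0$
$\Lambda$ is diagonal with eigenvalues $\lambda_1 \geq ... \geq \lambda_p$ on the
diagonal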
PCA
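The full eigen decomposition is
$$S = \sum_{l=1}^{p} \lambda_l \gamma_l \gamma_l^T$$
The $\lambda_l$ are decreasing, and so is each term’s importance
The “best” approximation with $L < p$ terms is
$$S \approx \sum_{l=1}^{L} \lambda_l \gamma_l \gamma_l^T$$
The proportion of $S$’s variance explained by $L$ terms is
$$\frac{\sum_{l=1}^{L} \lambda_l}{\sum_{l=1}^{p} \lambda_l}$$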
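The truncated sum and the proportion of variance explained can be checked numerically. The sketch below is one way to do so in pure Python: it finds the eigenpairs by power iteration with deflation (a choice made here for self-containment, not necessarily what the lecture used), and the 3 x 3 matrix is a made-up correlation matrix.

```python
import math

def mat_vec(a, v):
    """Multiply the square matrix a by the vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def leading_eigenpair(a, iters=1000):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    v = [1.0 / (j + 1) for j in range(len(a))]  # arbitrary non-zero start
    for _ in range(iters):
        w = mat_vec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(x * y for x, y in zip(v, mat_vec(a, v)))  # Rayleigh quotient
    return lam, v

def eigen_decomposition(s):
    """Return (lambda_l, gamma_l) pairs, largest eigenvalue first."""
    a = [row[:] for row in s]
    p = len(s)
    pairs = []
    for _ in range(p):
        lam, g = leading_eigenpair(a)
        pairs.append((lam, g))
        # Deflate: subtract the term lambda * gamma gamma^T just found.
        a = [[a[j][k] - lam * g[j] * g[k] for k in range(p)] for j in range(p)]
    return pairs

def rank_l_approx(pairs, L):
    """Sum of the first L terms lambda_l * gamma_l gamma_l^T."""
    p = len(pairs)
    return [[sum(lam * g[j] * g[k] for lam, g in pairs[:L]) for k in range(p)]
            for j in range(p)]

def proportion_explained(pairs, L):
    """Ratio of the first L eigenvalues to the sum of all eigenvalues."""
    total = sum(lam for lam, _ in pairs)
    return sum(lam for lam, _ in pairs[:L]) / total

# Made-up correlation matrix with distinct, positive eigenvalues.
S = [[1.0, 0.8, 0.3], [0.8, 1.0, 0.2], [0.3, 0.2, 1.0]]
pairs = eigen_decomposition(S)
```

Using all p terms reproduces S, and because S is a correlation matrix its eigenvalues sum to p, so the denominator of the proportion is simply 3 here.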
Using EOFs to study spatiotemporal trends
PCA is referred to as empirical orthogonal functions
(EOFs) in the climate literature
As an example, say the p variables are the daily values of
precipitation at p spatial locations
The eigenvectors $\gamma_l$ are spatial surfaces
They give uncorrelated weighted averages of the data
$$Z_{lt} = \gamma_l^T X_t$$
A plot of $Z_{1t}$ against $t$ reveals spatiotemporal trends
http://www4.stat.ncsu.edu/~reich/SpatialStats/code/EOF.html
PCA
How would you find the first eigenvector if p = 1,000,000?
http://www4.stat.ncsu.edu/~reich/SAMSI/PCA.html
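One standard answer is power iteration applied to the data directly, so that the p x p matrix S is never stored. The sketch below assumes the columns of X are already centered and scaled and uses the identity S v = X^T (X v) / (n - 1). The linked page may take a different route, and the function name and toy data are made up.

```python
import math
import random

def first_eigenvector(x, iters=200, seed=0):
    """Leading eigenpair of S = X^T X / (n - 1) without ever forming S.

    x is an n x p list of rows whose columns are centered and scaled.
    Each iteration costs O(n * p) instead of the O(p * p) needed to store S.
    """
    n, p = len(x), len(x[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]  # random starting direction
    for _ in range(iters):
        scores = [sum(xij * vj for xij, vj in zip(row, v)) for row in x]  # X v
        w = [sum(x[i][j] * scores[i] for i in range(n)) / (n - 1)
             for j in range(p)]                                           # S v
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(xij * vj for xij, vj in zip(row, v)) for row in x]
    lam = sum(s * s for s in scores) / (n - 1)  # v^T S v for unit-length v
    return lam, v

# Toy data (made up): three standardized columns, the third the negative of the first two.
x = [[-1.0, -1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, -1.0]]
lam, v = first_eigenvector(x)
```

The eigenvector is only defined up to sign, so v and -v are equally valid answers.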
PC regression (PCR)
PCA leads to an efficient high-dimensional regression
model
Linear regression
$$E(Y|X) = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$$
High-dimensional regression has $p > n$
A unique least squares solution doesn’t exist!
Need some form of dimension reduction
LASSO is one way, PCR is another
PC regression (PCR)
In PCA, we reduced the dimension from $p$ correlated
variables to $L \ll p$ uncorrelated variables
$$Z_l = \gamma_l^T X$$
PCR uses the covariates $Z_1, ..., Z_L$ as predictors
$$E(Y|X) = b + \sum_{l=1}^{L} Z_l w_l$$
where the $w_l$ are fit via least squares
This is still linear in $X$,
$$E(Y|X) = b + \sum_{l=1}^{L} \sum_{j=1}^{p} w_l \gamma_{lj} X_j,$$
but because we use the correlation of $X$, we only have to
estimate $L < p$ parameters
Nonparametric regression
General set-up: $Y = f(X) + \varepsilon$
$Y$ is the continuous response
$f$ is the response surface or mean function
$\varepsilon$ are iid errors
In linear regression, $f$ is assumed to be linear in $X$
Life isn’t linear
If we want an algorithm to mimic life it can’t be linear
NP regression tries to model f non-linearly
Nonparametric regression
The response surface f is a function
1D: f(X) is a curve
2D: f(X) is a surface
3D: f(X) is complicated
What properties do we want in an algorithm to estimate f?
We want it to be able to approximate any continuous
function f that takes a p-dimensional input and returns a
univariate output
If this is the case, then as long as the sample size goes to
infinity we will be able to estimate any system
Types of nonparametric regression
Polynomial regression
Generalized additive models (GAM)
Gaussian process regression
Regression trees
Random forests
...
Neural networks
Single-layer neural network (NN)
Construct covariates (similar to PCR)
$$z_l = b^0_l + \sum_{j=1}^{p} w^0_{lj} X_j$$
PCR: weights $w^0_{lj}$ determined by the sample covariance of
$X$ and the mean is linear,
$$f(X) = \alpha + z_1 \beta_1 + ... + z_L \beta_L$$
NN: weights $w^0_{lj}$ (and biases $b^0_l$) are fit using least squares
and the mean is non-linear,
$$f(X) = b^1 + \sigma(z_1) w^1_1 + ... + \sigma(z_L) w^1_L$$
The activation function $\sigma$ is chosen by the user, e.g.,
$\sigma(x) = \exp(x)/[1 + \exp(x)]$ or $\sigma(x) = x_+ = \max(0, x)$
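The forward pass of this single-layer network is short enough to write out. The following is a minimal sketch in pure Python with both activation functions from the slide. The argument names mirror the slide's symbols (b0, w0 for the hidden layer, b1, w1 for the output), and the numeric values are made up.

```python
import math

def logistic(z):
    """sigma(z) = exp(z) / [1 + exp(z)], written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def relu(z):
    """sigma(z) = z_+ = max(0, z)."""
    return max(0.0, z)

def single_layer_nn(x, b0, w0, b1, w1, sigma=logistic):
    """f(X) = b1 + sum_l sigma(z_l) * w1[l], where z_l = b0[l] + sum_j w0[l][j] * x[j]."""
    z = [b0_l + sum(w * xj for w, xj in zip(w0_l, x))
         for b0_l, w0_l in zip(b0, w0)]
    return b1 + sum(sigma(z_l) * w1_l for z_l, w1_l in zip(z, w1))

# Made-up example with p = 2 inputs and L = 2 neurons.
x = [1.0, 2.0]
b0 = [0.0, -1.0]
w0 = [[1.0, -0.5], [0.5, 0.5]]
b1, w1 = 0.5, [2.0, -1.0]
f_logistic = single_layer_nn(x, b0, w0, b1, w1)
f_relu = single_layer_nn(x, b0, w0, b1, w1, sigma=relu)
```

Here z = (0, 0.5), so the two activations give different predictions from identical weights, which is the sense in which the choice of σ is left to the user.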
Analogy with brain activity
$$f(X) = b^1 + \sigma(z_1) w^1_1 + ... + \sigma(z_L) w^1_L$$
$$z_l = b^0_l + \sum_{j=1}^{p} w^0_{lj} X_j$$
The response is the sum of activity from $L$ neurons
$z_l$ measures the intensity on neuron $l$
When the intensity gets high enough, it fires, and $\sigma(z_l) = 1$
When neuron $l$ fires it adds $w^1_l$ to the response
Let’s do a proof!
Say there is a single covariate (p = 1) and
$$\sigma(x) = \begin{cases} 1 & x > 0 \\ 0 & x \leq 0 \end{cases}$$
Prove that any smooth curve $f(X)$ for $X \in [0, 1]$ can be
approximated by a function of the form
$$f(X) \approx b^1 + \sigma(z_1) w^1_1 + ... + \sigma(z_L) w^1_L$$
$$z_l = b^0_l + w^0_l X$$
Argue without math that your proof will extend to p = 2
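One possible construction, offered as a numerical illustration rather than the intended proof, places L equally spaced knots on [0, 1], lets neuron l fire once X passes knot l / L, and sets its output weight to the change in f across the next interval. The result is a staircase that matches f at the right end of every interval, so its error is at most max|f'| / L. The function names and the sine test curve are assumptions.

```python
import math

def step(z):
    """The slide's activation: 1 if z > 0, else 0."""
    return 1.0 if z > 0 else 0.0

def step_network(f, L):
    """Biases and weights of an L-neuron staircase approximation to f on [0, 1]."""
    knots = [l / L for l in range(L + 1)]
    b1 = f(knots[0])                      # output bias: start at f(0)
    b0 = [-knots[l] for l in range(L)]    # neuron l fires once X > l / L
    w0 = [1.0] * L                        # so z_l = X - l / L
    w1 = [f(knots[l + 1]) - f(knots[l]) for l in range(L)]  # jump added on firing
    return b1, b0, w0, w1

def predict(x, b1, b0, w0, w1):
    """f(X) ~ b1 + sum_l step(b0[l] + w0[l] * X) * w1[l]."""
    return b1 + sum(step(b + w * x) * w_out for b, w, w_out in zip(b0, w0, w1))

def max_error(f, L, grid=1000):
    """Largest absolute error of the staircase over an evenly spaced grid."""
    params = step_network(f, L)
    return max(abs(f(i / grid) - predict(i / grid, *params))
               for i in range(grid + 1))

def curve(x):
    """A smooth test curve (made up for this sketch)."""
    return math.sin(2.0 * math.pi * x)

err_coarse = max_error(curve, 20)
err_fine = max_error(curve, 200)
```

Increasing L shrinks every stair, which is the heart of the argument. For p = 2 the same idea tiles the unit square with small cells instead of small intervals.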
Optimization
To fit a NN we need to estimate all the biases b and
weights w to minimize
$$\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} [Y_i - f(X_i)]^2$$
It’s common to use gradient descent
The gradients with respect to b and w have nice forms
They have a recursive structure which leads to the
back-propagation algorithm
Recursive gradients for back-propagation
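Writing $e_i = Y_i - f(X_i)$, with $z_l$ and $X_j$ evaluated at observation $i$:
$$\frac{\partial e_i^2}{\partial b^1} = -2 e_i$$
$$\frac{\partial e_i^2}{\partial w^1_l} = \frac{\partial e_i^2}{\partial b^1} \sigma(z_l)$$
$$\frac{\partial e_i^2}{\partial b^0_l} = \frac{\partial e_i^2}{\partial b^1} w^1_l \sigma'(z_l)$$
$$\frac{\partial e_i^2}{\partial w^0_{lj}} = \frac{\partial e_i^2}{\partial b^0_l} X_j$$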
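The recursion can be checked against finite differences. The sketch below does this for one observation of the single-layer network, treating the gradients as derivatives of the squared error e_i^2 and assuming the logistic activation, for which σ'(z) = σ(z)[1 - σ(z)]. The function names, the learning rate, and the numeric values are made up.

```python
import math

def logistic(z):
    """sigma(z) = exp(z) / [1 + exp(z)], written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def forward(x, b0, w0, b1, w1):
    """Return f(x) and the neuron outputs sigma(z_l) for one observation."""
    z = [b0_l + sum(w * xj for w, xj in zip(w0_l, x))
         for b0_l, w0_l in zip(b0, w0)]
    a = [logistic(z_l) for z_l in z]
    return b1 + sum(a_l * w1_l for a_l, w1_l in zip(a, w1)), a

def gradients(x, y, b0, w0, b1, w1):
    """Back-propagated gradients of e^2 = (y - f(x))^2."""
    f, a = forward(x, b0, w0, b1, w1)
    d_b1 = -2.0 * (y - f)                           # output bias
    d_w1 = [d_b1 * a_l for a_l in a]                # reuses d_b1
    d_b0 = [d_b1 * w1_l * a_l * (1.0 - a_l)         # reuses d_b1, needs sigma'
            for w1_l, a_l in zip(w1, a)]
    d_w0 = [[d_b0_l * xj for xj in x] for d_b0_l in d_b0]  # reuses d_b0
    return d_b1, d_w1, d_b0, d_w0

def gradient_step(x, y, b0, w0, b1, w1, lr=0.01):
    """One gradient descent update; lr is a hypothetical learning rate."""
    d_b1, d_w1, d_b0, d_w0 = gradients(x, y, b0, w0, b1, w1)
    new_b0 = [b - lr * d for b, d in zip(b0, d_b0)]
    new_w0 = [[w - lr * d for w, d in zip(w_row, d_row)]
              for w_row, d_row in zip(w0, d_w0)]
    new_w1 = [w - lr * d for w, d in zip(w1, d_w1)]
    return new_b0, new_w0, b1 - lr * d_b1, new_w1

def squared_error(x, y, b0, w0, b1, w1):
    return (y - forward(x, b0, w0, b1, w1)[0]) ** 2

# Made-up observation and starting values (p = 2 inputs, L = 2 neurons).
x, y = [0.5, -1.0], 2.0
b0, w0 = [0.1, -0.2], [[0.3, -0.4], [0.5, 0.6]]
b1, w1 = 0.05, [0.7, -0.8]
d_b1, d_w1, d_b0, d_w0 = gradients(x, y, b0, w0, b1, w1)
```

Summing these per-observation gradients over i gives the gradient of the SSE, and each quantity is computed from one already available, which is what makes the pass cheap.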
Deep learning
Deep learning just adds more layers
Here is a two-layer NN
$$f(X) = b^2 + \sigma(z^2_1) w^2_1 + ... + \sigma(z^2_L) w^2_L$$
$$z^2_l = b^1_l + \sigma(z^1_1) w^1_{l1} + ... + \sigma(z^1_M) w^1_{lM}$$
$$z^1_k = b^0_k + w^0_{k1} X_1 + ... + w^0_{kp} X_p$$
There are $M$ neurons in the first layer and $L$ in the second
The mean is a non-linear function of non-linear functions of
the inputs
DNN in our fire example
$X_j$ is the intensity of pixel $j = 1, ..., p$
$X = (X_1, ..., X_p)^T$ is the image
Truth: f(X) = 10 if at least two regions have high intensity
and f(X) = 0 otherwise
This is a weird non-linear function!
How to pick the biases and weights to get this function?
DNN in our fire example
Let $\sigma(x) = 1$ if $x > 0$ and $\sigma(x) = 0$ otherwise
Set $L = 1$ and $M = p$
Let $\sum_{j=1}^{p} w^0_{kj} X_j$ be the average intensity around pixel $k$
Pick $b^0_k$ so that $\sigma(z^1_k) = 1$ indicates that the intensity around
pixel $k$ is high
Pick $w^1_{lk} = 1$ so $w^1_{11} \sigma(z^1_1) + ... + w^1_{1M} \sigma(z^1_M)$ is the number of
high intensity regions
Pick $b^1_1$ so that $\sigma(z^2_1) = 1$ indicates the number of high
intensity regions exceeds 1
Finally, set $b^2 = 0$ and $w^2_1 = 10$
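The hand-picked weights can be written out directly. The sketch below makes several assumptions that the slides do not specify: the image is a one-dimensional strip of p pixels with gray scale in [0, 1], "high" means an average above a cutoff of 0.5, and the neighborhood around pixel k has a half-width that defaults to 0 (the pixel itself). The name fire_network and these defaults are hypothetical.

```python
def step(z):
    """The slide's activation: 1 if z > 0, else 0."""
    return 1.0 if z > 0 else 0.0

def fire_network(x, cutoff=0.5, half_width=0):
    """Two-layer network returning 10 if at least two regions are bright, else 0."""
    p = len(x)
    # Layer 1 (M = p neurons): w0[k] averages the pixels around pixel k,
    # and b0_k = -cutoff makes neuron k fire when that average exceeds cutoff.
    layer1 = []
    for k in range(p):
        window = range(max(0, k - half_width), min(p, k + half_width + 1))
        w0_k = [1.0 / len(window) if j in window else 0.0 for j in range(p)]
        z1_k = -cutoff + sum(w * xj for w, xj in zip(w0_k, x))
        layer1.append(step(z1_k))
    # Layer 2 (L = 1 neuron): weights w1_1k = 1 count the firing neurons,
    # and b1_1 = -1 makes the neuron fire when the count exceeds 1.
    z2_1 = -1.0 + sum(1.0 * a for a in layer1)
    # Output layer: b2 = 0 and w2_1 = 10.
    return 0.0 + 10.0 * step(z2_1)

# Made-up images with p = 10 pixels.
blank = [0.0] * 10
one_hot = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
two_hot = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0]
results = [fire_network(img) for img in (blank, one_hot, two_hot)]
```

With a wider neighborhood the windows overlap, so a single bright patch can make several neighboring neurons fire. What counts as a separate "region" then depends on the window and cutoff chosen.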
DNN (from www.edureka.co)
Worked example
http://www4.stat.ncsu.edu/~reich/SAMSI/neural_net_example.html
Black magic
We need to pick the number of layers and number of
neurons in each layer
For large datasets computing the gradients is slow and
stochastic gradient descent is a common remedy
How many mini-batches?
Drop-out rate?
Regularization?
Transformations?
More to come!
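To make the mini-batch idea concrete, here is a minimal sketch of stochastic gradient descent in pure Python. For brevity it fits a straight line rather than a neural network: each update uses the gradient from one random mini-batch instead of the full data set. The batch size, learning rate, number of epochs, and noise-free toy data are all made-up choices.

```python
import random

def sgd_line_fit(xs, ys, batch_size=20, epochs=200, lr=0.2, seed=1):
    """Fit y = a + b * x by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)  # new random mini-batches every epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Gradient of the batch mean squared error with respect to a and b.
            grad_a = sum(-2.0 * (ys[i] - a - b * xs[i]) for i in batch) / len(batch)
            grad_b = sum(-2.0 * (ys[i] - a - b * xs[i]) * xs[i]
                         for i in batch) / len(batch)
            a -= lr * grad_a
            b -= lr * grad_b
    return a, b

# Toy data (made up): 200 noise-free points on the line y = 1 + 2x.
xs = [i / 199 for i in range(200)]
ys = [1.0 + 2.0 * x for x in xs]
a_hat, b_hat = sgd_line_fit(xs, ys)
```

Each pass over the data makes 10 cheap updates here instead of one expensive one. With noisy data the batch size and learning rate trade off speed against the jitter of the estimates, which is part of the tuning the slide alludes to.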
Data challenge!
Get in small groups and load the R workspace trainingdata
Fit a NN model to the training data
The setup is the same as the worked example, but the
true response surface is different
Soon I will email everyone the test data
Evaluate your predictions on the test data
Winner take all! No prisoners!

Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

Climate Informatics: Machine Learning and Data Science Approaches

  • 1. Climate Informatics Brian Reich, NCSU SAMSI, 9/19/2017 Brian Reich, NC State Climate Informatics 1 / 34
  • 2. Outline. Conceptual views of statistics. Definitions. Example of unsupervised learning: PCA. Example of supervised learning: deep neural networks. Data challenge!
  • 3. Conceptual view of parametric statistics. Last week, Prof Haran discussed an ice sheet study that was a great example of parametric modeling. They assumed that the entire glacial system was known up to a few parameters. The role of statistics is then to compare the observed data with the model output to refine our understanding of these parameters. This delivers new scientific insights, at least within the current paradigm.
  • 4. Conceptual view of linear regression. For more complex systems we will not know the system up to a few parameters. We might then use a first-order approximation via linear regression. If the approximation is obviously wrong, we might try to patch it up with, e.g., quadratic terms. This is now an approximate system that is known up to a few parameters (slopes, variances, etc). If we have a decent statistical representation of the system, we can carefully interpret the parameters scientifically.
  • 5. Conceptual view of inferential statistics. Rather than model the entire system, we can conduct experiments to learn about specific relationships. For example, it is impossible to model the entire biological system that leads to cancer under different treatments. Instead, we might conduct a randomized clinical trial to compare treatments. This doesn’t necessarily add to our biological understanding of cancer, but it is surely useful.
  • 6. Conceptual view of machine learning. The premise of machine learning is that we can train an algorithm to mimic how a complex system operates without understanding the fundamental science. Last week you called this a statistical emulator. Often this requires a huge amount of training data. The algorithm that mimics reality may be a black box that is not a function of parameters or equations we can interpret. However, in many applications, prediction even without scientific understanding is powerful.
  • 7. Example: Short-term weather forecasting. Scientist scientist: study physics, chemistry, etc and encode this in a mathematical model that takes the current state and projects it forward. Data scientist: dump 100M observations of current meteorological variables into a deep learning algorithm and make statistical predictions. What are the advantages and disadvantages of each approach?
  • 8. When to use parametric stats vs machine learning
  • 9. A made-up example. https://satelliteliaisonblog.com/2017/03/03/using-goes-16-to-detect-wildfires/ I made up some fake data in the R workspace trainingdata. It is meant to represent taking a bunch of subregions and recording, for each snapshot $i = 1, ..., 10\text{K}$: $Y_i$, a measure of fire in the region, and $X_{ij}$, the gray scale of pixel $j = 1, ..., 100$. Goal: predict the response given the image predictor. A good spot for machine learning?
  • 10. Flow chart of machine learning algorithms (from SAS)
  • 11. Definitions - Unsupervised learning. The data consists of several variables, but none are the “response”. That is, we have $X$ but not $Y$. Usually unsupervised methods try to identify the main patterns in the variables. Clustering: put the $n$ observations into $L < n$ clusters. Principal components analysis (PCA): explain the correlations between variables concisely.
  • 12. Definitions - Supervised learning. The data consists of both independent variables $X$ and dependent variables $Y$. The goal is to study the effects of $X$ on $Y$ and/or predict $Y|X$. Regression is the obvious example, where we estimate $E(Y|X) = f(X)$. Examples: linear regression, trees, nets. Classification is another example, where the data are from $Q$ unordered classes, $Y \in \{1, ..., Q\}$. The goal is to assign $Y$ to a class based on $X$. Examples: logistic regression, support vector machines.
  • 13. Principal components analysis (PCA). Say we have a collection of variables $X = (X_1, ..., X_p)^T$. For simplicity, assume all $p$ variables are centered (mean zero) and scaled (variance one). We observe $n$ samples from their joint distribution, $\mathbf{X}_1, ..., \mathbf{X}_n$. Linear relationships are summarized by the $p \times p$ sample correlation matrix, $S \approx \mathrm{Cor}(X)$. If $p = 1000$, $S$ is huge and hard to interpret.
  • 14. PCA. The eigen decomposition of a matrix can be used to approximate the full matrix with a few vectors. This dimension reduction highlights the most important trends. Denote the eigen decomposition as $S = \Gamma \Lambda \Gamma^T$. $\Gamma$’s columns, $\gamma_1, ..., \gamma_p$, are the $p$ orthonormal eigenvectors. That is, $\gamma_j^T \gamma_k = 0$ for $j \neq k$ and thus $\mathrm{Cor}(\gamma_j, \gamma_k) = 0$. $\Lambda$ is diagonal with eigenvalues $\lambda_1 \geq ... \geq \lambda_p$ on the diagonal.
  • 15. PCA. The full eigen decomposition is $S = \sum_{l=1}^{p} \lambda_l \gamma_l \gamma_l^T$. The $\lambda_l$ are decreasing, and so is the importance of the terms. The “best” approximation with $L < p$ terms is $S \approx \sum_{l=1}^{L} \lambda_l \gamma_l \gamma_l^T$. The proportion of $S$’s variance explained by $L$ terms is $\sum_{l=1}^{L} \lambda_l \big/ \sum_{l=1}^{p} \lambda_l$.
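The variance-explained ratio on this slide is easy to compute once the eigenvalues are known. The sketch below is only an illustration: the eigenvalues are hypothetical numbers chosen for a 4 × 4 correlation matrix (so they sum to p = 4), and the function name is not from the original deck.

```python
def variance_explained(eigenvalues, L):
    """Proportion of total variance captured by the L largest eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)  # lambda_1 >= ... >= lambda_p
    return sum(lam[:L]) / sum(lam)

# Hypothetical eigenvalues of a 4 x 4 correlation matrix (they sum to p = 4).
lam = [2.0, 1.0, 0.5, 0.5]

# Cumulative proportion explained as L grows from 1 to p.
props = [variance_explained(lam, L) for L in range(1, len(lam) + 1)]
print(props)
```

A common use is to pick the smallest L whose cumulative proportion passes some chosen cut-off; the cut-off itself is a judgment call.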
  • 16. Using EOFs to study spatiotemporal trends. PCA is referred to as empirical orthogonal functions (EOF) in the climate literature. As an example, say the $p$ variables are the daily values of precipitation at $p$ spatial locations. The eigenvectors $\gamma_l$ are spatial surfaces. They give uncorrelated weighted averages of the data, $Z_{lt} = \gamma_l^T X_t$. A plot of $Z_{1t}$ by $t$ reveals spatiotemporal trends. http://www4.stat.ncsu.edu/~reich/SpatialStats/code/EOF.html
  • 17. PCA. How would you find the first eigenvector if $p = 1{,}000{,}000$? http://www4.stat.ncsu.edu/~reich/SAMSI/PCA.html
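One standard answer to this question is power iteration, which needs only repeated matrix-vector products. The slide does not say which method the linked page uses, so treat the following as a sketch of one possible approach; the 3 × 3 matrix and the function names are hypothetical.

```python
import math

def matvec(S, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def first_eigenvector(S, iters=200):
    """Power iteration: repeatedly apply S and renormalize."""
    p = len(S)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iters):
        w = matvec(S, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' S v estimates the leading eigenvalue.
    lam = sum(a * b for a, b in zip(v, matvec(S, v)))
    return lam, v

# Hypothetical correlation matrix: variables 1 and 2 are correlated at 0.8.
S = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
lam1, gamma1 = first_eigenvector(S)
print(lam1, gamma1)
```

When p is very large, one would typically avoid forming the p × p matrix at all and compute each product from the n × p data matrix instead, since S v is proportional to X^T (X v).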
  • 18. PC regression (PCR). PCA leads to an efficient high-dimensional regression model. Linear regression: $E(Y|X) = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$. High-dimensional regression has $p > n$. The least squares solution doesn’t exist! Need some form of dimension reduction. LASSO is one way, PCR is another.
  • 19. PC regression (PCR). In PCA, we reduced the dimension from $p$ correlated variables to $L \ll p$ uncorrelated variables $Z_l = \gamma_l^T X$. PCR uses the covariates $Z_1, ..., Z_L$ as predictors: $E(Y|X) = b + \sum_{l=1}^{L} Z_l w_l$, where the $w_l$ are fit via least squares. This is still linear in $X$, $E(Y|X) = b + \sum_{l=1}^{L} \sum_{j=1}^{p} w_l \gamma_{lj} X_j$, but because we use the correlation of $X$, we only have to estimate $L < p$ parameters.
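The PCR recipe can be sketched for the simplest case, L = 1. Everything specific here is an assumption made for illustration: the tiny data set, the unit-length vector standing in for the first eigenvector, and the helper names. The response is generated exactly from the score so that the fit is easy to check.

```python
def pc_scores(X, gamma):
    """Project each observation onto the eigenvector: Z_i = gamma' X_i."""
    return [sum(g * x for g, x in zip(gamma, row)) for row in X]

def simple_least_squares(z, y):
    """Least squares fit of y = b + w * z with a single predictor."""
    n = len(z)
    zbar, ybar = sum(z) / n, sum(y) / n
    sxy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    sxx = sum((zi - zbar) ** 2 for zi in z)
    w = sxy / sxx
    return ybar - w * zbar, w

# Hypothetical data: 5 observations of p = 2 covariates.
X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 1.0], [-2.0, -1.0], [0.5, -0.5]]
gamma = [0.6, 0.8]  # hypothetical unit-length first eigenvector

z = pc_scores(X, gamma)
y = [1.0 + 2.0 * zi for zi in z]  # noise-free response, for checking only

b, w = simple_least_squares(z, y)
beta = [w * g for g in gamma]  # implied coefficients on the original X
print(b, w, beta)
```

Only one slope is estimated, yet the implied model still assigns a coefficient to every original covariate through `beta`.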
  • 20. Nonparametric regression. General set-up: $Y = f(X) + \varepsilon$. $Y$ is the continuous response. $f$ is the response surface or mean function. $\varepsilon$ are iid errors. In linear regression, $f$ is assumed to be linear in $X$. Life isn’t linear. If we want an algorithm to mimic life it can’t be linear. NP regression tries to model $f$ non-linearly.
  • 21. Nonparametric regression. The response surface $f$ is a function. 1D: $f(X)$ is a curve. 2D: $f(X)$ is a surface. 3D: $f(X)$ is complicated. What properties do we want in an algorithm to estimate $f$? We want it to be able to approximate any continuous function $f$ that takes a $p$-dimensional input and returns a univariate output. If this is the case, then as long as the sample size goes to infinity we will be able to estimate any system.
  • 22. Types of nonparametric regression. Polynomial regression. Generalized additive models (GAM). Gaussian process regression. Regression trees. Random forests. ... Neural networks.
  • 23. Single-layer neural network (NN). Construct covariates (similar to PCR): $z_j = b_j^0 + \sum_{l=1}^{p} w_{jl}^0 X_l$. PCR: the weights $w_{jl}^0$ are determined by the sample covariance of $X$ and the mean is linear, $f(X) = \alpha + z_1 \beta_1 + ... + z_L \beta_L$. NN: the weights $w_{jl}^0$ (and biases $b_j^0$) are fit using least squares and the mean is non-linear, $f(X) = b^1 + \sigma(z_1) w_1^1 + ... + \sigma(z_L) w_L^1$. The activation function $\sigma$ is chosen by the user, e.g., $\sigma(x) = \exp(x)/[1 + \exp(x)]$ or $\sigma(x) = x_+ = \max(0, x)$.
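The single-layer mean function on this slide can be written as a short forward pass. This is a minimal sketch, not the code used in the course: the input, weights, and biases are hypothetical values picked so the arithmetic is easy to follow, and the two activation functions are the ones named on the slide.

```python
import math

def logistic(x):
    """sigma(x) = exp(x) / (1 + exp(x)), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """sigma(x) = x_+ = max(0, x)."""
    return max(0.0, x)

def nn_forward(X, b0, W0, b1, w1, sigma=logistic):
    """f(X) = b1 + sum_l sigma(z_l) * w1[l], with z_l = b0[l] + sum_j W0[l][j] * X[j]."""
    z = [b0[l] + sum(W0[l][j] * X[j] for j in range(len(X)))
         for l in range(len(b0))]
    return b1 + sum(w1[l] * sigma(z[l]) for l in range(len(z)))

# Hypothetical input (p = 2) and parameters (L = 2 neurons).
X = [0.5, -1.0]
b0 = [0.0, 0.0]
W0 = [[2.0, 1.0], [1.0, 0.5]]
b1 = 1.0
w1 = [3.0, -1.0]

f_logistic = nn_forward(X, b0, W0, b1, w1, sigma=logistic)
f_relu = nn_forward(X, b0, W0, b1, w1, sigma=relu)
print(f_logistic, f_relu)
```

With these numbers both neurons have z = 0, so the logistic activation contributes 0.5 per neuron while the rectified activation contributes nothing.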
  • 24. Analogy with brain activity. $f(X) = b^1 + \sigma(z_1) w_1^1 + ... + \sigma(z_L) w_L^1$, with $z_l = b_l^0 + \sum_{j=1}^{p} w_{lj}^0 X_j$. The response is the sum of activity from $L$ neurons. $z_l$ measures the intensity on neuron $l$. When the intensity gets high enough, it fires, and $\sigma(z_l) = 1$. When neuron $l$ fires it adds $w_l^1$ to the response.
  • 25. Let’s do a proof! Say there is a single covariate ($p = 1$) and $\sigma(x) = 1$ if $x > 0$ and $\sigma(x) = 0$ if $x \leq 0$. Prove that any smooth curve $f(X)$ for $X \in [0, 1]$ can be approximated by a function of the form $f(X) \approx b^1 + \sigma(z_1) w_1^1 + ... + \sigma(z_L) w_L^1$, with $z_l = b_l^0 + w_l^0 X$. Argue without math that your proof will extend to $p = 2$.
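One way to see why the claim is plausible is to build the staircase explicitly: place L equally spaced knots in (0, 1), let neuron l switch on once X passes knot l, and let its weight be the jump in f between consecutive knots. This construction is an assumption about how the proof might go, not the argument given in the talk, and the target curve and names below are hypothetical. The sketch measures the worst-case error on a grid for increasing L.

```python
import math

def step(x):
    """sigma(x) = 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0

def build_step_network(f, L):
    """Return (b1, b0, w0, w1) so the network is a staircase through f at L knots."""
    knots = [l / (L + 1) for l in range(L + 1)]   # t_0 = 0, t_1, ..., t_L
    b1 = f(knots[0])                              # value before any neuron fires
    b0 = [-t for t in knots[1:]]                  # neuron l fires when X > t_l
    w0 = [1.0] * L
    w1 = [f(knots[l]) - f(knots[l - 1]) for l in range(1, L + 1)]
    return b1, b0, w0, w1

def predict(x, net):
    b1, b0, w0, w1 = net
    return b1 + sum(w * step(b + a * x) for b, a, w in zip(b0, w0, w1))

def max_error(f, L, grid=1000):
    """Largest absolute error over an evenly spaced grid on [0, 1]."""
    net = build_step_network(f, L)
    return max(abs(f(i / grid) - predict(i / grid, net)) for i in range(grid + 1))

# Hypothetical smooth target curve on [0, 1].
target = lambda x: math.sin(2 * math.pi * x)
errors = {L: max_error(target, L) for L in (10, 100, 1000)}
print(errors)
```

The error is bounded by the largest slope of f divided by L + 1, so it shrinks as neurons are added.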
  • 26. Optimization. To fit a NN we need to estimate all the biases $b$ and weights $w$ to minimize $SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} [Y_i - f(X_i)]^2$. It’s common to use gradient descent. The gradients with respect to $b$ and $w$ have nice forms. They have a recursive structure which leads to the back-propagation algorithm.
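  • 27. Recursive gradients for back-propagation. $\partial e_i^2/\partial b^1 = -2 e_i$; $\partial e_i^2/\partial w_l^1 = (\partial e_i^2/\partial b^1)\, \sigma(z_l)$; $\partial e_i^2/\partial b_l^0 = (\partial e_i^2/\partial b^1)\, w_l^1 \sigma'(z_l)$; $\partial e_i^2/\partial w_{lj}^0 = (\partial e_i^2/\partial b_l^0)\, X_j$.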
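The recursion can be checked numerically. The sketch below reads the slide's derivatives as derivatives of the squared error $e_i^2$ for one observation (an assumption about the notation, since $-2e_i$ is the derivative of $e_i^2$ with respect to $b^1$), uses the logistic activation, and compares one back-propagated gradient against a central finite difference. All parameter values are hypothetical.

```python
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigma_prime(x):
    s = sigma(x)
    return s * (1.0 - s)

def forward(X, b0, W0, b1, w1):
    z = [b0[l] + sum(W0[l][j] * X[j] for j in range(len(X)))
         for l in range(len(b0))]
    return z, b1 + sum(w1[l] * sigma(z[l]) for l in range(len(z)))

def squared_error(X, Y, b0, W0, b1, w1):
    _, f = forward(X, b0, W0, b1, w1)
    return (Y - f) ** 2

def gradients(X, Y, b0, W0, b1, w1):
    """Gradients of e^2 = (Y - f(X))^2, each built from the previous one."""
    z, f = forward(X, b0, W0, b1, w1)
    e = Y - f
    d_b1 = -2.0 * e
    d_w1 = [d_b1 * sigma(z[l]) for l in range(len(z))]
    d_b0 = [d_b1 * w1[l] * sigma_prime(z[l]) for l in range(len(z))]
    d_W0 = [[d_b0[l] * X[j] for j in range(len(X))] for l in range(len(z))]
    return d_b1, d_w1, d_b0, d_W0

# Hypothetical observation and parameters (p = 2 inputs, L = 2 neurons).
X, Y = [0.5, -1.0], 2.0
b0 = [0.1, -0.2]
W0 = [[0.3, -0.4], [0.5, 0.6]]
b1 = 0.7
w1 = [1.5, -2.0]

d_b1, d_w1, d_b0, d_W0 = gradients(X, Y, b0, W0, b1, w1)

# Central finite difference for the weight W0[0][1].
h = 1e-6
W0_plus = [row[:] for row in W0]
W0_plus[0][1] += h
W0_minus = [row[:] for row in W0]
W0_minus[0][1] -= h
numeric = (squared_error(X, Y, b0, W0_plus, b1, w1)
           - squared_error(X, Y, b0, W0_minus, b1, w1)) / (2 * h)
print(d_W0[0][1], numeric)
```

Summing these per-observation gradients over i gives the gradient of the SSE that gradient descent would use.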
  • 28. Deep learning. Deep learning just adds more layers. Here is a two-layer NN: $f(X) = b^2 + \sigma(z_1^2) w_1^2 + ... + \sigma(z_L^2) w_L^2$, $z_l^2 = b_l^1 + \sigma(z_1^1) w_{l1}^1 + ... + \sigma(z_M^1) w_{lM}^1$, $z_k^1 = b_k^0 + w_{k1}^0 X_1 + ... + w_{kp}^0 X_p$. There are $L$ neurons in the layer $z^2$ and $M$ neurons in the layer $z^1$. The mean is a non-linear function of non-linear functions of the inputs.
  • 29. DNN in our fire example. $X_j$ is the intensity for pixel $j = 1, ..., p$. $X = (X_1, ..., X_p)^T$ is the image. Truth: $f(X) = 10$ if at least two regions have high intensity and $f(X) = 0$ otherwise. This is a weird non-linear function! How to pick the biases and weights to get this function?
  • 30. DNN in our fire example. Let $\sigma(x) = 1$ if $x > 0$ and $\sigma(x) = 0$ otherwise. Set $L = 1$ and $M = p$. Let $\sum_{j=1}^{p} w_{kj}^0 X_j$ be the average intensity around voxel $k$. Pick $b_k^0$ so that $\sigma(z_k^1) = 1$ indicates that the intensity around voxel $k$ is high. Pick $w_{lk}^1 = 1$ so $w_{11}^1 \sigma(z_1^1) + ... + w_{1M}^1 \sigma(z_M^1)$ is the number of high intensity regions. Pick $b_1^1$ so that $\sigma(z_1^2) = 1$ indicates the number of high intensity regions exceeds 1. Finally, set $b^2 = 0$ and $w_1^2 = 10$.
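The hand-picked network can be written out for a toy one-dimensional image. Several details below are assumptions made for illustration and are not in the original slides: the neighborhood "around voxel k" is taken to be the pixel and its immediate neighbors, "high" means a local average above 0.5, and the bias $b_1^1 = -1.5$ makes the top neuron fire when the count exceeds 1.

```python
def step(x):
    """sigma(x) = 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0

def fire_network(X, threshold=0.5, radius=1):
    """Two-layer network with L = 1 and M = p, using hand-picked weights."""
    p = len(X)
    fired = []
    for k in range(p):
        # First layer: equal weights over the window around pixel k give the local average.
        window = range(max(0, k - radius), min(p, k + radius + 1))
        w0 = 1.0 / len(window)
        z1_k = -threshold + sum(w0 * X[j] for j in window)  # b0_k = -threshold
        fired.append(step(z1_k))
    # Second layer: all weights equal 1, so the sum counts high-intensity windows.
    z2 = -1.5 + sum(fired)          # b1_1 = -1.5: fires when the count exceeds 1
    return 0.0 + 10.0 * step(z2)    # b2 = 0 and w2_1 = 10

# Hypothetical images with p = 10 pixels and intensities in [0, 1].
dark = [0.0] * 10
one_spot = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
two_blobs = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

print(fire_network(dark), fire_network(one_spot), fire_network(two_blobs))
```

Because the windows overlap, a single large bright patch can switch on more than one first-layer neuron; whether that should count as one region or several depends on how "region" is defined, which the slides leave open.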
  • 31. DNN (from www.edureka.co)
  • 33. Black magic. We need to pick the number of layers and number of neurons in each layer. For large datasets computing the gradients is slow, and stochastic gradient descent is a common remedy. How many mini-batches? Drop-out rate? Regularization? Transformations? More to come!
  • 34. Data challenge! Get in small groups and load the R workspace trainingdata. Fit a NN model to the training data. The set-up is the same as the worked example, but the true response surface is different. Soon I will email everyone the test data. Evaluate your predictions on the test data. Winner take all! No prisoners!