SlideShare a Scribd company logo
1 of 107
Download to read offline
Unsupervised Machine Learning Methods for Feature Extraction
Velimir V. Vesselinov (monty) (vvv@lanl.gov)
Earth and Environmental Sciences Division, Los Alamos National Laboratory, NM, USA
http://tensors.lanl.gov
LA-UR-19-21450
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why unsupervised Machine Learning (ML)?
Supervised ML: requires prior categorization (knowledge) of the processed data
random forest, neural networks, active, reinforcement learning, ...
Example: Recognize images of cats and dogs after extensive training; but cannot
recognize horses if not trained
Cannot find something that we do not already know
Unsupervised ML: extracts hidden features (signals) in the processed data without any
prior information (exploratory analysis for data-driven science)
Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses,
etc.); without prior information or training
AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently
pursued by many
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why not supervised Machine Learning (ML)
Supervised ML
can introduce subjectivity (through the labeling process)
does not provide insights why horses are different than dogs / cats
cannot make predictions
requires huge training (labeled) datasets
is impacted by “adversarial examples”
⇒ major limitations of the supervised methods
for data-analytics and data-driven science applications
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Supervised Machine Learning (xkcd)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Model Analytics/Diagnostics: Identify processes (features) in model outputs
Physics-Informed Machine Learning: Coupled Data/Model Analytics (fusion)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Data Analytics: Identify signals (features) in datasets
Feature extraction (FE):
Blind source separation (BSS)
Detection of disruptions / anomalies
Image recognition
Guide development of physics / reduced-order models representing the data
Discover unknown dependencies and phenomena
Develop reduced-order/surrogate models
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Model Analytics/Diagnostics: Identify processes (features) in model outputs
Separate processes (inseparable during modeling)
Model reduction (develop reduced-order/surrogate models)
Identify dependencies between model inputs and outputs
Discover unknown dependencies and phenomena
Make predictions
Optimize data acquisition
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Unsupervised Machine Learning Applications
Physics-Informed Machine Learning: Coupled Data/Model Analytics (fusion)
Simultaneous analyses of data and model outputs
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Nonnegative Matrix/Tensor Factorization
We have developed a series of novel unsupervised Machine Learning (ML) methods
and computational techniques
Our methods are based in matrix/tensor factorization coupled with custom k-means
clustering and nonnegativity/sparsity constraints:
NMFk: Nonnegative Matrix Factorization
NTFk: Nonnegative Tensor Factorization
NMFk/NTFk are capable to efficiently process large datasets (GB/TB’s) utilizing GPU’s
& TPU’s (TensorFlow, PyTorch, MXNet)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why nonnegativity?
Nonnegativity constraints provide meaningful and interpretable results (+sparsity)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why tensors?
Tensors (multi-dimensional/multi-modal/multi-way datasets) are everywhere:
monitoring data are typically a 5-D tensor (x,y,z,t,attributes)
model outputs are typically a 5-D tensor (x,y,z,t,attributes)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Why tensors-based unsupervised machine learning?
SVD cannot be applied on multi-way (tensor datasets)
Tensor-based unsupervised machine learning can be applied to
resolve interactions between dimensions (modes)
identify how controls (e.g., pressure, temperature, volume, etc.) are
impacting outcomes (e.g. oil production)
identify how observables (e.g., pressure, temperature, volume, etc.) are
impacting other observables (e.g. concentration)
identify how model inputs (e.g., conductivity, storage, etc.) are impacting
model outputs (e.g. pressure)
simultaneous analyses of data and model outputs (data/model fusion; PIML)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Physics-Informed Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Factorization (3D case): Tucker-3
≈
X
[K, M, N]
G
[k, m, n]
H
[k, K]
W
[n, N]
V
[m, M]
X ≈ G ⊗ H ⊗ W ⊗ V
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Feature extraction
≈
X
[K, M, N]
G
[k, m, n]
H
[k, K]
W
[n, N]
V
[m, M]
X ≈ G ⊗ H ⊗ W ⊗ V
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Feature extraction
≈
t
y
x
t
y
x
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Tucker-2 (three possible alternatives)
≈
X
[K, M, N]
G
[k, M, n]
H
[k, K]
W
[n, N]
X ≈ G ⊗ H ⊗ W
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Tucker-1 (three possible alternatives)
≈
X
[K, M, N]
G
[k, M, N]
H
[k, K]
X ≈ G ⊗ H
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Factorization (3D case): Tucker-3
≈
X
[K, M, N]
G
[k, m, n]
H
[k, K]
W
[n, N]
V
[m, M]
X ≈ G ⊗ H ⊗ W ⊗ V
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Feature extraction
Tucker decomposition is achieved through minimization
Nonnegativity and sparsity constraints help the feature extraction
Optimal number of features [k, m, n] is estimated through k-means clustering of
a series minimization solutions with random initial guesses
≈
X
[K, M, N]
G
[k, m, n]
H
[k, K]
W
[n, N]
V
[m, M]
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
V =


t
t2
tanh(t − 10) + 1


W =




x
x3
ex
sin(x) + 1




H =






y
y4
ln(y)
sin(2y) + 1
cos(y) + 1






(12 elements)
X = G ⊗ H ⊗ W ⊗ V
= xyt + xy4
t + xtln(y)+
xt(sin(2y) + 1) + x3
yt+
xt(cos(y) + 1) + ytex
+
yt(sin(x) + 1)+
xyt2
+ xy(1 + tanh(t − 10))+
t2
ex
(sin(2y) + 1)+
(1 + tanh(t − 10))(sin(x) + 1)
(cos(y) + 1)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
← Truth vs Predictions →
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
← Truth vs Predictions →
V =


t
t2
tanh(t − 10) + 1


W =




x
x3
ex
sin(x) + 1




H =






y
y4
ln(y)
sin(2y) + 1
cos(y) + 1






Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
M = G ⊗ H ⊗ W
M1 = xy + xy4
+ xlog(y + 1)+
x(sin(2y) + 1) + yex
x(cos(y) + 1) + x3
y+
y(sin(x) + 1)
M2 = xy+
ex
(cos(z) + 1)
M3 = xy+
(sin(x) + 1)(cos(y) + 1)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example II
(50 × 50 × 50) tensor
50 columns in x
50 rows in y
50 time frames
‘ones‘ swimming in a sea of ‘zeros‘
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example II
Factorizing all 3 dimensions (50 × 50 × 50) → (6 × 44 × 48)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Tucker Tensor Decomposition: Example II
6 groups of swimmers (x); 44 lanes occupied (y); 48 time frames (first/last empty)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NMFk / NTFk Challenges
Identifying the number of unknown features:
resolved using custom k-means clustering and sparsity constraints on the core
tensor
number of features identified based on the reconstruction quality (e.g., Frobenius
norm) and cluster Silhouettes
Solving a non-unique optimization problem:
addressed through multistarts, regularization and nonnegativity constraints
applying diverse optimization techniques (Multiplicative/Alternating Least
Squares algorithms, NLopt, Ipopt, Gurobi, MOSEK, GLPK, Clp, Cbc, ...)
Processing Big Data:
GPU’s / TPU’s / Distributed computing
Account for data sparsity and structure
Nonnegative Tensor Trains
Dealing with Noisy Data:
Random noise impacts accuracy but its accountable
Systematic noise is identified as separate signals
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NMFk / NTFk Performance
4GB Tensor (1000 × 1000 × 1000)
Framework Execution time (seconds)
MATLAB 2634
NumPy 881
MXNet 644
PyTorch 121
TensorFlow 119
Julia 109
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NMFk / NTFk Scalability
108 109 1010 1011
Tensor Size (bytes)
101
102
103
104
time(s)
3D Tensor
perfect scaling
256 CPUs
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NMFk / NTFk Analyses
Field Data:
Characterization of groundwater
contaminant sources
US Climate data
Geothermal data
Seismic data
Lab Data:
X-ray Spectroscopy
UV Fluorescence Spectroscopy
Microbial population analyses
Operational Data:
LANSCE: Los Alamos Neutron
Accelerator
Oil/gas production
Model Data:
Reactive mixing A + B → C
Phase separation of co-polymers
Molecular Dynamics of proteins
EU Climate modeling (Helmholtz
Institute, Germany)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NMFk / NTFk Publications
Stanev, Vesselinov, Kusne, Antoszewski, Takeuchi, Alexandrov, Unsupervised Phase
Mapping of X-ray Diffraction Data by Nonnegative Matrix Factorization Integrated with
Custom Clustering, Nature Computational Materials, 2018.
Vesselinov, Munuduru, Karra, O’Maley, Alexandrov, Unsupervised Machine Learning
Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing, Journal of
Computational Physics, (in review), 2018.
Vesselinov, O’Malley, Alexandrov, Nonnegative Tensor Factorization for Contaminant
Source Identification, Journal of Contaminant Hydrology, (accepted), 2018.
O’Malley, Vesselinov, Alexandrov, Alexandrov, Nonnegative/binary matrix factorization
with a D-Wave quantum annealer, PLOS ONE, (accepted), 2018.
Vesselinov, O’Malley, Alexandrov, Contaminant source identification using
semi-supervised machine learning, Journal of Contaminant Hydrology,
10.1016/j.jconhyd.2017.11.002, 2017.
Alexandrov, Vesselinov, Blind source separation for groundwater level analysis based
on nonnegative matrix factorization, WRR, 10.1002/2013WR015037, 2014.
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: air temperature
fluctuations in the air temperature [◦
C]
(424 × 412 × 365)
(columns × rows × days)
NTFk extracts spatial and temporal
footprints of dominant features:
storm signal
winter seasonal signal
summer seasonal signal
....
Data compression: ∼ 4GB → ∼ 0.5MB
Compression ratio: ∼ 8000
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: air temperature fluctuations represented by 3 signals
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: air temperature fluctuations represented by 3 signals
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: water-table depth
fluctuations in the water-table depth [m]
(424 × 412 × 365)
(columns × rows × days)
NTFk extracts spatial and temporal
footprints of dominant signals
spring snowmelt signal
summer rainfall signal
seasonal signal
....
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: water-table fluctuations represented by 3 signals
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: water-table fluctuations represented by 3 signals
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Climate model of Europe: analyze all model outputs (>40) simultaneously
Find interconnections among model outputs
Evaluate impacts of different model setups
Find dominant processes impacting model predictions
(e.g., climate impacts on groundwater resources, impacts of subsurface processes on
atmospheric conditions)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
ML for extracting contaminant plumes
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
ML for extracting contaminant plumes
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
ML for extracting contaminant plumes
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Geochemistry: LANL hydrogeochemical dataset
R-44#1
R-28 R-42
(18 × 8 × 12) tensor (wells × species × years)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Geochemistry: Nonnegative Tensor Factorization based on Tucker-1 decomposition
X: data tensor
W: source (groundwater type) matrix
(unknown)
G: mixing tensor (unknown)
M: number of observation points (wells)
N: number of observation times (e.g,
2001, 2002, ..., 2017)
K: number of geochemical species
observed (e.g., Cr6+
, SO2+
4 , NO−
3 , etc.)
k: number of unknown groundwater
types mixed at each well
Constraints:
all tensor/matrix elements ≥ 0
k
i=1
Gi,j,t = 1 ∀j, t
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk analysis estimated 7 groundwater types
Sources Cr Cl−
ClO4
3
H NO3 Ca Mg SO4
(µg/L) (mg/L) (µg/L) (pCi/L) (mg/L) (mg/L) (mg/L) (mg/L)
S1 2970.00 63.00 0.00 0.00 14.00 73.00 25.00 170.00
S5 21.00 51.00 0.00 950.00 2.40 67.00 15.00 50.00
S6 1.50 64.00 0.00 0.00 2.80 51.00 10.00 68.00
S2 0.79 0.35 14.00 0.00 0.50 5.30 1.70 0.60
S4 0.50 0.14 0.00 0.00 10.00 21.00 5.00 10.00
S3 (B) 0.25 3.60 0.00 0.00 0.01 41.00 11.00 0.06
S7 (B) 0.10 0.03 0.00 0.00 0.01 0.40 0.80 0.90
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk estimated concentrations at various wells
R-28 R-42 R-44#1 R-45#1 R-11 R-1 R-50#1
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk estimated time-dependent mixing of 7 groundwater types at various wells
R-28
R-42
R-44#1
R-45#1
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2016
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2005
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2006
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2007
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2008
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2009
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2010
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2011
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2012
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2013
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2014
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2015
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk identified sources (groundwater types) Jan-Dec 2016
Source 7: (background)
Source 1: Cr, Cl
−
, NO3, Ca, Mg, and SO4
Source 3: Cl
−
, Ca, Mg (background)
Source 2: ClO4
Source 6: Cl
−
, Ca, Mg, and SO4
Source 4: NO3
Source 5:
3
H, Cr, Cl
−
, Ca, Mg, and SO4
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
NTFk analysis of LANL hydrogeochemical datasets
(18 × 8 × 12) tensor
(wells × species × years)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity
32,251 seismic events from 1989 to
2017
Tensor: total energy of events over a
discretized domain
(118 × 97 × 520)
(columns × rows × weeks)
NTFk applied to extract dominant
hidden (latent) features based on spatial
footprints and temporal characteristics
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: reconstruction by 4 features (signals)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: extracted signals vs. injected volumes
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: extracted signals vs. majors seismic events
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: 4 seismic features
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity
volume pressure seismicity
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: 5 volume/pressure/seismicity features
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Oklahoma seismicity: 5 volume/pressure/seismicity features
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: Problem
Numerous independent variables ... knobs controlling the accelerator performance
15 provided / analyzed
Numerous dependent variables ... observations representing the accelerator
performance
8 provided / analyzed
312,326 * 23 = 7,183,498 ≈ 100 MB (per month)
Full monthly dataset ≈ 2 GB; For 30 years ≈ 1 TB;
Goal #1: ML a reduced-order model to simulate the performance (physics model does
not exist and cannot be built)
Goal #2: Use ML to control the knobs to optimize the performance
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: Independent variables = 15
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: Dependent variables = 8
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: NMFk estimated signal #1 based on the dependent variables
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: NMFk estimated signals #2 based on the dependent variables
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: NMFk Estimated signal #3 based on the dependent variables
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: NMFk reconstruction of one of the dependent variables
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: NMFk estimated signal #3 vs one of the independent variables
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
LANSCE: Reduced-order (SVR) Model
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Reactive mixing
A BBarrier
A + B → C
> 2000 simulations of C concentrations in time/space with varying
model inputs representing reactive mixing (5 input model
parameters)
NTFk identifies physics processes impacting C concentrations and
their relationship to model inputs
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Reactive mixing: NTFk results
NTFk extracts the dominant time/space features (processes /
vortices) and compresses the model outputs
Compression: > 200GB → ∼ 70MB (ratio ∼ 3000
Here, (1000 × 81 × 81) → (3 × 12 × 13) (time × rows × columns)
Advection Dispersion Diffusion
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Reactive mixing: NTFk results
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Reactive mixing: NTFk results (κf L impacts)
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Summary
Developed novel unsupervised ML methods and
computational tools based on Nonnegative
Factorization (Matrices/Tensors)
Our ML methods have been used to solve various
real-world problems (brought breakthrough
discoveries related to human cancer research)
Our ML work already contributes in many
research areas
Modeldata
Sensor data Experimentaldata
Robust Unsupervised
Machine Learning
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
Machine Learning (ML) Algorithms / Codes developed by our team
Codes:
NMFk MADS NTFk
Examples:
http://madsjulia.github.io/Mads.jl/Examples/blind_source_separation
http://tensors.lanl.gov
http://tensordecompositions.github.io
Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary

More Related Content

Similar to Unsupervised machine learning methods for feature extraction

machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
PranavPatil822557
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
Hchethankumar
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
Hchethankumar
 
2.17Mb ppt
2.17Mb ppt2.17Mb ppt
2.17Mb ppt
butest
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
butest
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013
Philip Zheng
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inference
butest
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
butest
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
butest
 

Similar to Unsupervised machine learning methods for feature extraction (20)

module 6 (1).ppt
module 6 (1).pptmodule 6 (1).ppt
module 6 (1).ppt
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
 
Binary Search Algorithm
Binary Search Algorithm Binary Search Algorithm
Binary Search Algorithm
 
2.17Mb ppt
2.17Mb ppt2.17Mb ppt
2.17Mb ppt
 
Artificial inteligence
Artificial inteligence Artificial inteligence
Artificial inteligence
 
ML Basic Concepts.pdf
ML Basic Concepts.pdfML Basic Concepts.pdf
ML Basic Concepts.pdf
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
GDSC ML-1.pptx
GDSC ML-1.pptxGDSC ML-1.pptx
GDSC ML-1.pptx
 
Launching into machine learning
Launching into machine learningLaunching into machine learning
Launching into machine learning
 
Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.
 
Machine learning
Machine learningMachine learning
Machine learning
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inference
 
Artificial Intelligence IA at the service of Laboratories
Artificial Intelligence IA at the service of LaboratoriesArtificial Intelligence IA at the service of Laboratories
Artificial Intelligence IA at the service of Laboratories
 
Conférence Y. GervaiseEN1st Green Analytical Y. Gervaise.pdf
Conférence Y. GervaiseEN1st Green Analytical Y. Gervaise.pdfConférence Y. GervaiseEN1st Green Analytical Y. Gervaise.pdf
Conférence Y. GervaiseEN1st Green Analytical Y. Gervaise.pdf
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

More from Velimir (monty) Vesselinov

More from Velimir (monty) Vesselinov (14)

Physics-Informed Machine Learning Methods for Data Analytics and Model Diagno...
Physics-Informed Machine Learning Methods for Data Analytics and Model Diagno...Physics-Informed Machine Learning Methods for Data Analytics and Model Diagno...
Physics-Informed Machine Learning Methods for Data Analytics and Model Diagno...
 
Data and Model-Driven Decision Support for Environmental Management of a Chro...
Data and Model-Driven Decision Support for Environmental Management of a Chro...Data and Model-Driven Decision Support for Environmental Management of a Chro...
Data and Model-Driven Decision Support for Environmental Management of a Chro...
 
Environmental Management Modeling Activities at Los Alamos National Laborator...
Environmental Management Modeling Activities at Los Alamos National Laborator...Environmental Management Modeling Activities at Los Alamos National Laborator...
Environmental Management Modeling Activities at Los Alamos National Laborator...
 
GNI: Coupling Model Analysis Tools and High-Performance Subsurface Flow and T...
GNI: Coupling Model Analysis Tools and High-Performance Subsurface Flow and T...GNI: Coupling Model Analysis Tools and High-Performance Subsurface Flow and T...
GNI: Coupling Model Analysis Tools and High-Performance Subsurface Flow and T...
 
Tomographic inverse estimation of aquifer properties based on pressure varia...
Tomographic inverse estimation of aquifer properties based on  pressure varia...Tomographic inverse estimation of aquifer properties based on  pressure varia...
Tomographic inverse estimation of aquifer properties based on pressure varia...
 
Uncertainties in Transient Capture-Zone Estimates
Uncertainties in Transient Capture-Zone EstimatesUncertainties in Transient Capture-Zone Estimates
Uncertainties in Transient Capture-Zone Estimates
 
Model-driven decision support for monitoring network design based on analysis...
Model-driven decision support for monitoring network design based on analysis...Model-driven decision support for monitoring network design based on analysis...
Model-driven decision support for monitoring network design based on analysis...
 
Decision Analyses Related to a Chromium Plume at Los Alamos National Laboratory
Decision Analyses Related to a Chromium Plume at Los Alamos National LaboratoryDecision Analyses Related to a Chromium Plume at Los Alamos National Laboratory
Decision Analyses Related to a Chromium Plume at Los Alamos National Laboratory
 
Model-free Source Identification
Model-free Source IdentificationModel-free Source Identification
Model-free Source Identification
 
Decision Support for Environmental Management of a Chromium Plume at Los Alam...
Decision Support for Environmental Management of a Chromium Plume at Los Alam...Decision Support for Environmental Management of a Chromium Plume at Los Alam...
Decision Support for Environmental Management of a Chromium Plume at Los Alam...
 
Model Analyses of Complex Systems Behavior using MADS,
Model Analyses of Complex Systems Behavior using MADS,Model Analyses of Complex Systems Behavior using MADS,
Model Analyses of Complex Systems Behavior using MADS,
 
Reduced Order Models for Decision Analysis and Upscaling of Aquifer Heterogen...
Reduced Order Models for Decision Analysis and Upscaling of Aquifer Heterogen...Reduced Order Models for Decision Analysis and Upscaling of Aquifer Heterogen...
Reduced Order Models for Decision Analysis and Upscaling of Aquifer Heterogen...
 
ZEM: Integrated Framework for Real-Time Data and Model Analyses for Robust En...
ZEM: Integrated Framework for Real-Time Data and Model Analyses for Robust En...ZEM: Integrated Framework for Real-Time Data and Model Analyses for Robust En...
ZEM: Integrated Framework for Real-Time Data and Model Analyses for Robust En...
 
Decision Analyses for Groundwater Remediation
Decision Analyses for Groundwater RemediationDecision Analyses for Groundwater Remediation
Decision Analyses for Groundwater Remediation
 

Recently uploaded

Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 

Unsupervised machine learning methods for feature extraction

  • 1. Unsupervised Machine Learning Methods for Feature Extraction Velimir V. Vesselinov (monty) (vvv@lanl.gov) Earth and Environmental Sciences Division, Los Alamos National Laboratory, NM, USA http://tensors.lanl.gov LA-UR-19-21450 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 2. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 3. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 4. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 5. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 6. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 7. Why unsupervised Machine Learning (ML)? Supervised ML: requires prior categorization (knowledge) of the processed data random forest, neural networks, active, reinforcement learning, ... Example: Recognize images of cats and dogs after extensive training; but cannot recognize horses if not trained Cannot find something that we do not already know Unsupervised ML: extracts hidden features (signals) in the processed data without any prior information (exploratory analysis for data-driven science) Example: Identify features that distinguish images of animals (e.g., cats, dogs, horses, etc.); without prior information or training AI + Unsupervised ML: ... coupled AI and unsupervised techniques ... currently pursued by many Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 8. Why not supervised Machine Learning (ML) Supervised ML can introduce subjectivity (through the labeling process) does not provide insights why horses are different than dogs / cats cannot make predictions requires huge training (labeled) datasets is impacted by “adversarial examples” ⇒ major limitations of the supervised methods for data-analytics and data-driven science applications Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 9. Supervised Machine Learning (xkcd) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 10. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Model Analytics/Diagnostics: Identify processes (features) in model outputs Physics-Informed Machine Learning: Coupled Data/Model Analytics (fusion) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 11. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 12. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 13. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 14. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 15. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 16. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 17. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 18. Unsupervised Machine Learning Applications Data Analytics: Identify signals (features) in datasets Feature extraction (FE): Blind source separation (BSS) Detection of disruptions / anomalies Image recognition Guide development of physics / reduced-order models representing the data Discover unknown dependencies and phenomena Develop reduced-order/surrogate models Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 19. Unsupervised Machine Learning Applications Model Analytics/Diagnostics: Identify processes (features) in model outputs Separate processes (inseparable during modeling) Model reduction (develop reduced-order/surrogate models) Identify dependencies between model inputs and outputs Discover unknown dependencies and phenomena Make predictions Optimize data acquisition Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 20. Unsupervised Machine Learning Applications Physics-Informed Machine Learning: Coupled Data/Model Analytics (fusion) Simultaneous analyses of data and model outputs Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 21. Nonnegative Matrix/Tensor Factorization We have developed a series of novel unsupervised Machine Learning (ML) methods and computational techniques Our methods are based in matrix/tensor factorization coupled with custom k-means clustering and nonnegativity/sparsity constraints: NMFk: Nonnegative Matrix Factorization NTFk: Nonnegative Tensor Factorization NMFk/NTFk are capable to efficiently process large datasets (GB/TB’s) utilizing GPU’s & TPU’s (TensorFlow, PyTorch, MXNet) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 22. Why nonnegativity? Nonnegativity constraints provide meaningful and interpretable results (+sparsity) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 23. Why tensors? Tensors (multi-dimensional/multi-modal/multi-way datasets) are everywhere: monitoring data are typically a 5-D tensor (x,y,z,t,attributes) model outputs are typically a 5-D tensor (x,y,z,t,attributes) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 24. Why tensors-based unsupervised machine learning? SVD cannot be applied on multi-way (tensor datasets) Tensor-based unsupervised machine learning can be applied to resolve interactions between dimensions (modes) identify how controls (e.g., pressure, temperature, volume, etc.) are impacting outcomes (e.g. oil production) identify how observables (e.g., pressure, temperature, volume, etc.) are impacting other observables (e.g. concentration) identify how model inputs (e.g., conductivity, storage, etc.) are impacting model outputs (e.g. pressure) simultaneous analyses of data and model outputs (data/model fusion; PIML) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 25. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 26. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 27. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 28. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 29. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 30. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 31. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 32. Physics-Informed Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 33. Tucker Tensor Factorization (3D case): Tucker-3 ≈ X [K, M, N] G [k, m, n] H [k, K] W [n, N] V [m, M] X ≈ G ⊗ H ⊗ W ⊗ V Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 34. Tucker Tensor Decomposition: Feature extraction ≈ X [K, M, N] G [k, m, n] H [k, K] W [n, N] V [m, M] X ≈ G ⊗ H ⊗ W ⊗ V Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 35. Tucker Tensor Decomposition: Feature extraction ≈ t y x t y x Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 36. Tucker Tensor Decomposition: Tucker-2 (three possible alternatives) ≈ X [K, M, N] G [k, M, n] H [k, K] W [n, N] X ≈ G ⊗ H ⊗ W Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 37. Tucker Tensor Decomposition: Tucker-1 (three possible alternatives) ≈ X [K, M, N] G [k, M, N] H [k, K] X ≈ G ⊗ H Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 38. Tucker Tensor Factorization (3D case): Tucker-3 ≈ X [K, M, N] G [k, m, n] H [k, K] W [n, N] V [m, M] X ≈ G ⊗ H ⊗ W ⊗ V Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 39. Tucker Tensor Decomposition: Feature extraction Tucker decomposition is achieved through minimization Nonnegativity and sparsity constraints help the feature extraction Optimal number of features [k, m, n] is estimated through k-means clustering of a series minimization solutions with random initial guesses ≈ X [K, M, N] G [k, m, n] H [k, K] W [n, N] V [m, M] Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 40. Tucker Tensor Decomposition: Example V =   t t2 tanh(t − 10) + 1   W =     x x3 ex sin(x) + 1     H =       y y4 ln(y) sin(2y) + 1 cos(y) + 1       (12 elements) X = G ⊗ H ⊗ W ⊗ V = xyt + xy4 t + xtln(y)+ xt(sin(2y) + 1) + x3 yt+ xt(cos(y) + 1) + ytex + yt(sin(x) + 1)+ xyt2 + xy(1 + tanh(t − 10))+ t2 ex (sin(2y) + 1)+ (1 + tanh(t − 10))(sin(x) + 1) (cos(y) + 1) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 41. Tucker Tensor Decomposition: Example Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 42. Tucker Tensor Decomposition: Example ← Truth vs Predictions → Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 43. Tucker Tensor Decomposition: Example ← Truth vs Predictions → V =   t t2 tanh(t − 10) + 1   W =     x x3 ex sin(x) + 1     H =       y y4 ln(y) sin(2y) + 1 cos(y) + 1       Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 44. Tucker Tensor Decomposition: Example Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 45. Tucker Tensor Decomposition: Example M = G ⊗ H ⊗ W M1 = xy + xy4 + xlog(y + 1)+ x(sin(2y) + 1) + yex x(cos(y) + 1) + x3 y+ y(sin(x) + 1) M2 = xy+ ex (cos(z) + 1) M3 = xy+ (sin(x) + 1)(cos(y) + 1) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 46. Tucker Tensor Decomposition: Example Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 47. Tucker Tensor Decomposition: Example II (50 × 50 × 50) tensor 50 columns in x 50 rows in y 50 time frames ‘ones‘ swimming in a sea of ‘zeros‘ Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 48. Tucker Tensor Decomposition: Example II Factorizing all 3 dimensions (50 × 50 × 50) → (6 × 44 × 48) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 49. Tucker Tensor Decomposition: Example II 6 groups of swimmers (x); 44 lanes occupied (y); 48 time frames (first/last empty) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 50. NMFk / NTFk Challenges Identifying the number of unknown features: resolved using custom k-means clustering and sparsity constraints on the core tensor number of features identified based on the reconstruction quality (e.g., Frobenius norm) and cluster Silhouettes Solving a non-unique optimization problem: addressed through multistarts, regularization and nonnegativity constraints applying diverse optimization techniques (Multiplicative/Alternating Least Squares algorithms, NLopt, Ipopt, Gurobi, MOSEK, GLPK, Clp, Cbc, ...) Processing Big Data: GPU’s / TPU’s / Distributed computing Account for data sparsity and structure Nonnegative Tensor Trains Dealing with Noisy Data: Random noise impacts accuracy but its accountable Systematic noise is identified as separate signals Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 51. NMFk / NTFk Performance 4GB Tensor (1000 × 1000 × 1000) Framework Execution time (seconds) MATLAB 2634 NumPy 881 MXNet 644 PyTorch 121 TensorFlow 119 Julia 109 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 52. NMFk / NTFk Scalability 108 109 1010 1011 Tensor Size (bytes) 101 102 103 104 time(s) 3D Tensor perfect scaling 256 CPUs Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 53. NMFk / NTFk Analyses Field Data: Characterization of groundwater contaminant sources US Climate data Geothermal data Seismic data Lab Data: X-ray Spectroscopy UV Fluorescence Spectroscopy Microbial population analyses Operational Data: LANSCE: Los Alamos Neutron Accelerator Oil/gas production Model Data: Reactive mixing A + B → C Phase separation of co-polymers Molecular Dynamics of proteins EU Climate modeling (Helmholtz Institute, Germany) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 54. NMFk / NTFk Publications Stanev, Vesselinov, Kusne, Antoszewski, Takeuchi, Alexandrov, Unsupervised Phase Mapping of X-ray Diffraction Data by Nonnegative Matrix Factorization Integrated with Custom Clustering, Nature Computational Materials, 2018. Vesselinov, Munuduru, Karra, O’Maley, Alexandrov, Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing, Journal of Computational Physics, (in review), 2018. Vesselinov, O’Malley, Alexandrov, Nonnegative Tensor Factorization for Contaminant Source Identification, Journal of Contaminant Hydrology, (accepted), 2018. O’Malley, Vesselinov, Alexandrov, Alexandrov, Nonnegative/binary matrix factorization with a D-Wave quantum annealer, PLOS ONE, (accepted), 2018. Vesselinov, O’Malley, Alexandrov, Contaminant source identification using semi-supervised machine learning, Journal of Contaminant Hydrology, 10.1016/j.jconhyd.2017.11.002, 2017. Alexandrov, Vesselinov, Blind source separation for groundwater level analysis based on nonnegative matrix factorization, WRR, 10.1002/2013WR015037, 2014. Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 55. Climate model of Europe: air temperature fluctuations in the air temperature [◦ C] (424 × 412 × 365) (columns × rows × days) NTFk extracts spatial and temporal footprints of dominant features: storm signal winter seasonal signal summer seasonal signal .... Data compression: ∼ 4GB → ∼ 0.5MB Compression ratio: ∼ 8000 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 56. Climate model of Europe: air temperature fluctuations represented by 3 signals Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 57. Climate model of Europe: air temperature fluctuations represented by 3 signals Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 58. Climate model of Europe: water-table depth fluctuations in the water-table depth [m] (424 × 412 × 365) (columns × rows × days) NTFk extracts spatial and temporal footprints of dominant signals spring snowmelt signal summer rainfall signal seasonal signal .... Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 59. Climate model of Europe: water-table fluctuations represented by 3 signals Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 60. Climate model of Europe: water-table fluctuations represented by 3 signals Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 61. Climate model of Europe: analyze all model outputs (>40) simultaneously Find interconnections among model outputs Evaluate impacts of different model setups Find dominant processes impacting model predictions (e.g., climate impacts on groundwater resources, impacts of subsurface processes on atmospheric conditions) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 62. ML for extracting contaminant plumes Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 63. ML for extracting contaminant plumes Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 64. ML for extracting contaminant plumes Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 65. Geochemistry: LANL hydrogeochemical dataset R-44#1 R-28 R-42 (18 × 8 × 12) tensor (wells × species × years) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 66. Geochemistry: Nonnegative Tensor Factorization based on Tucker-1 decomposition X: data tensor W: source (groundwater type) matrix (unknown) G: mixing tensor (unknown) M: number of observation points (wells) N: number of observation times (e.g, 2001, 2002, ..., 2017) K: number of geochemical species observed (e.g., Cr6+ , SO2+ 4 , NO− 3 , etc.) k: number of unknown groundwater types mixed at each well Constraints: all tensor/matrix elements ≥ 0 k i=1 Gi,j,t = 1 ∀j, t Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 67. NTFk analysis estimated 7 groundwater types Sources Cr Cl− ClO4 3 H NO3 Ca Mg SO4 (µg/L) (mg/L) (µg/L) (pCi/L) (mg/L) (mg/L) (mg/L) (mg/L) S1 2970.00 63.00 0.00 0.00 14.00 73.00 25.00 170.00 S5 21.00 51.00 0.00 950.00 2.40 67.00 15.00 50.00 S6 1.50 64.00 0.00 0.00 2.80 51.00 10.00 68.00 S2 0.79 0.35 14.00 0.00 0.50 5.30 1.70 0.60 S4 0.50 0.14 0.00 0.00 10.00 21.00 5.00 10.00 S3 (B) 0.25 3.60 0.00 0.00 0.01 41.00 11.00 0.06 S7 (B) 0.10 0.03 0.00 0.00 0.01 0.40 0.80 0.90 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 68. NTFk estimated concentrations at various wells R-28 R-42 R-44#1 R-45#1 R-11 R-1 R-50#1 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 69. NTFk estimated time-dependent mixing of 7 groundwater types at various wells R-28 R-42 R-44#1 R-45#1 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 70. NTFk identified sources (groundwater types) Jan-Dec 2016 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 71. NTFk identified sources (groundwater types) Jan-Dec 2005 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 72. NTFk identified sources (groundwater types) Jan-Dec 2006 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 73. NTFk identified sources (groundwater types) Jan-Dec 2007 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 74. NTFk identified sources (groundwater types) Jan-Dec 2008 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 75. NTFk identified sources (groundwater types) Jan-Dec 2009 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 76. NTFk identified sources (groundwater types) Jan-Dec 2010 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 77. NTFk identified sources (groundwater types) Jan-Dec 2011 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 78. NTFk identified sources (groundwater types) Jan-Dec 2012 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 79. NTFk identified sources (groundwater types) Jan-Dec 2013 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 80. NTFk identified sources (groundwater types) Jan-Dec 2014 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 81. NTFk identified sources (groundwater types) Jan-Dec 2015 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 82. NTFk identified sources (groundwater types) Jan-Dec 2016 Source 7: (background) Source 1: Cr, Cl − , NO3, Ca, Mg, and SO4 Source 3: Cl − , Ca, Mg (background) Source 2: ClO4 Source 6: Cl − , Ca, Mg, and SO4 Source 4: NO3 Source 5: 3 H, Cr, Cl − , Ca, Mg, and SO4 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 83. NTFk analysis of LANL hydrogeochemical datasets (18 × 8 × 12) tensor (wells × species × years) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 84. Oklahoma seismicity 32,251 seismic events from 1989 to 2017 Tensor: total energy of events over a discretized domain (118 × 97 × 520) (columns × rows × weeks) NTFk applied to extract dominant hidden (latent) features based on spatial footprints and temporal characteristics Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 85. Oklahoma seismicity: reconstruction by 4 features (signals) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 86. Oklahoma seismicity: extracted signals vs. injected volumes Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 87. Oklahoma seismicity: extracted signals vs. majors seismic events Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 88. Oklahoma seismicity: 4 seismic features Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 89. Oklahoma seismicity volume pressure seismicity Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 90. Oklahoma seismicity: 5 volume/pressure/seismicity features Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 91. Oklahoma seismicity: 5 volume/pressure/seismicity features Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 92. LANSCE Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 93. LANSCE: Problem Numerous independent variables ... knobs controlling the accelerator performance 15 provided / analyzed Numerous dependent variables ... observations representing the accelerator performance 8 provided / analyzed 312,326 * 23 = 7,183,498 ≈ 100 MB (per month) Full monthly dataset ≈ 2 GB; For 30 years ≈ 1 TB; Goal #1: ML a reduced-order model to simulate the performance (physics model does not exist and cannot be built) Goal #2: Use ML to control the knobs to optimize the performance Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 94. LANSCE: Independent variables = 15 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 95. LANSCE: Dependent variables = 8 Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 96. LANSCE: NMFk estimated signal #1 based on the dependent variables Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 97. LANSCE: NMFk estimated signals #2 based on the dependent variables Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 98. LANSCE: NMFk Estimated signal #3 based on the dependent variables Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 99. LANSCE: NMFk reconstruction of one of the dependent variables Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 100. LANSCE: NMFk estimated signal #3 vs one of the independent variables Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 101. LANSCE: Reduced-order (SVR) Model Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 102. Reactive mixing A BBarrier A + B → C > 2000 simulations of C concentrations in time/space with varying model inputs representing reactive mixing (5 input model parameters) NTFk identifies physics processes impacting C concentrations and their relationship to model inputs Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 103. Reactive mixing: NTFk results NTFk extracts the dominant time/space features (processes / vortices) and compresses the model outputs Compression: > 200GB → ∼ 70MB (ratio ∼ 3000 Here, (1000 × 81 × 81) → (3 × 12 × 13) (time × rows × columns) Advection Dispersion Diffusion Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 104. Reactive mixing: NTFk results Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 105. Reactive mixing: NTFk results (κf L impacts) Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 106. Summary Developed novel unsupervised ML methods and computational tools based on Nonnegative Factorization (Matrices/Tensors) Our ML methods have been used to solve various real-world problems (brought breakthrough discoveries related to human cancer research) Our ML work already contributes in many research areas Modeldata Sensor data Experimentaldata Robust Unsupervised Machine Learning Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary
  • 107. Machine Learning (ML) Algorithms / Codes developed by our team Codes: NMFk MADS NTFk Examples: http://madsjulia.github.io/Mads.jl/Examples/blind_source_separation http://tensors.lanl.gov http://tensordecompositions.github.io Unsupervised ML Tucker Studies Climate Geochem Seimic LANSCE Mixing Summary