SlideShare a Scribd company logo
1 of 34
Download to read offline
IllustrationbyChrisBrigman
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly
owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
MLconf: The Machine Learning Conference
San Francisco, CA, Nov 10, 2017
A Tensor is an d-Way Array
11/10/2017 Kolda @ MLconf 2
Vector
d = 1
Matrix
d = 2
3rd-Order Tensor
d = 3
4th-Order Tensor
d = 4
5th-Order Tensor
d = 5
From Matrices to Tensors: Two Points of View
11/10/2017 Kolda @ MLconf 3
Singular value decomposition (SVD),
eigendecomposition (EVD), nonnegative matrix
factorization (NMF), sparse SVD, CUR, etc.
Viewpoint 1: Sum of outer products,
useful for interpretation
Tucker Model: Project onto high-variance
subspaces to reduce dimensionality
CP Model: Sum of d-way outer products,
useful for interpretation
CANDECOMP, PARAFAC, Canonical Polyadic, CP
HO-SVD, Best Rank-(𝑹1,𝑹2,…,𝑹d) decomposition
Other models for compression include
hierarchical Tucker and tensor train.
Viewpoint 2: High-variance subspaces,
useful for compression
Matrix Factorization
11/10/2017 Kolda @ MLconf 4
ModelData
≈
Standard Matrix Formulation
CP Tensor Factorization (3-way)
11/10/2017 Kolda @ MLconf 5
ModelData
≈
CP = CANDECOMP/PARAFAC or Canonical Polyadic
Hitchcock 1927, Harshman 1970, Carroll & Chang 1970
CP Tensor Factorization (𝒅-way)
11/10/2017 Kolda @ MLconf 6
ModelData
≈
Hitchcock 1927, Harshman 1970, Carroll & Chang 1970
11/10/2017 Kolda @ MLconf 7
Amino Acids Fluorescence Dataset
▪ Fluorescence
measurements of 5
samples containing 3
amino acids
▪ Tryptophan
▪ Tyrosine
▪ Phenylalanine
▪ Tensor of size 5 x 51 x 201
▪ 5 samples
▪ 51 excitations
▪ 201 emissions
11/10/2017 Kolda @ MLconf 8
Unknown mixture of
three amino acids
samples
excitation
R. Bro, PARAFAC: Tutorial and Applications, Chemometrics and Intelligent Laboratory Systems, 38:149-171, 1997
Rank-3 CP Factorization of Amino Acids Data
11/10/2017 Kolda @ MLconf 9
𝐀 (5 × 3) 𝐁 (201 × 3) 𝐂 (51 × 3)
Bro 1997
samples
excitation
11/10/2017 Kolda @ MLconf 10
Motivating Example: Neuron Activity in Learning
11/10/2017 Kolda @ MLconf 11
Thanks to Schnitzer Group @ Stanford
Mark Schnitzer, Fori Wang, Tony Kim
mouse
in “maze” neural activity
× 120 time bins
× 600 trials (over 5 days)
Microscope by
Inscopix
One Column
of Neuron x
Time Matrix
300 neurons
One Trial
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
Trials Vary Start Position and Strategies
11/10/2017 Kolda @ MLconf 12
• 600 Trials over 5 Days
• Start West or East
• Conditions Swap Twice
❖ Always Turn South
❖ Always Turn Right
❖ Always Turn South
wall
W
E
N
S
note different patterns on curtains
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
CP for Simultaneous Analysis of Neurons, Time,
and Trial
11/10/2017 Kolda @ MLconf 13
Prior tensor work in neuroscience for fMRI and EEG: Andersen and Rayens (2004),
Mørup et al. (2004), Acar et al. (2007), De Vos et al. (2007), and more
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
8-Component CP Decomposition of Mouse
Neuron Data
11/10/2017 Kolda @ MLconf 14
Interpretation of Mouse Neuron Data
11/10/2017 Kolda @ MLconf 15
11/10/2017 Kolda @ MLconf 16
Tensor Factorization (3-way)
11/10/2017 Kolda @ MLconf 17
ModelData
≈
We can rewrite this as a matrix equation in 𝐀, 𝐁, or 𝐂.
CP-ALS: Fitting CP Model via Alternating Least
Squares
11/10/2017 Kolda @ MLconf 18
Harshman, 1970; Carroll & Chang, 1970
▪ Rank (R) NP-Hard: Even best
low-rank solution may not exist
(Håstad 1990, Silva & Lim 2006,
Hillar & Lim 2009)
▪ Not nested: Best rank-(R-1)
factorization may not be part
of best rank-R factorization
(Kolda 2001)
▪ Nonconvex: But convex linear
least squares problems
▪ Not orthogonal: Factor
matrices are not orthogonal
and may even have linearly
dependent columns
▪ Essentially Unique: Under
modest conditions, CP is
unique up to permutation and
scaling unique (Kruskal 1977)
Repeat until convergence:
Step 1:
Step 2:
Step 3:
CP-ALS Least Squares Problem
11/10/2017 Kolda @ MLconf 19
𝐗(1) −
Khatri-Rao Product
“right hand sides” “matrix”
𝐀 (𝐂 ⊙ 𝐁)′
𝑛 × 𝑛 𝑑−1 𝑛 × 𝑟 𝑟 × 𝑛 𝑑−1
𝑛 × 𝑛2 𝑛 × 𝑟 𝑟 × 𝑛2
CP Least Squares Problem
11/10/2017 Kolda @ MLconf 20
−
−
How to randomize this?
𝑛 × 𝑛 𝑑−1 𝑛 × 𝑟 𝑟 × 𝑛 𝑑−1
Aside: Sketching for Standard Least Squares
11/10/2017 Kolda @ MLconf 21
𝐀 𝐛𝐱
−
ො𝑛
𝑛
Backslash causes MATLAB to
automatically call the best
solver (cholesky, qr, etc.)
𝒪(ො𝑛𝑛2) Sarlós 2006, Woodruff 2014
Sampled Least Squares
11/10/2017 Kolda @ MLconf 22
𝐀 𝐛𝐱
−
Choose 𝑞 rows, uniformly at random
𝐒𝐀 𝐒𝐛𝐱
−
𝑞
𝑛
𝒪(𝑞𝑛2)
approximate
Sampling only guaranteed to
“work” if the 𝐀 is incoherent.
𝐒
𝒪(ො𝑛𝑛2
)
ො𝑛
𝑛
Sarlós 2006, Woodruff 2014
CP-ALS-RAND
11/10/2017 Kolda @ MLconf 23
−
−
−
Battaglino, Ballard, & Kolda 2017
Randomizing the Convergence Check
11/10/2017 Kolda @ MLconf 24
Estimate convergence of
function values using small
random subset of elements
in function evaluation
(use Chernoff-Hoeffding to
bound accuracy)
16000 samples < 1% of full data
Battaglino, Ballard, & Kolda 2017
Speed Advantage: Analysis of Hazardous Gas
Data
11/10/2017 Kolda @ MLconf 25
Data from Vergara et al. 2013; see also Vervliet and De Lathauwer (2016)
This mode scaled by component size Color-coded by gas type
900 experiments (with three different gas types) x 72 sensors x 25,900 time steps (13 GB)
Battaglino, Ballard, & Kolda 2017
Globalization Advantage?
Amino Acids Data
11/10/2017 Kolda @ MLconf 26
Benefits are not as clear without mixing.
Fit = 0.92
Fit = 0.97
11/10/2017 Kolda @ MLconf 27
Generalizing the Goodness-of-Fit Criteria
11/10/2017 Kolda @ MLconf 28
Anderson-Bergman, Duersch, Hong, Kolda 2017
Similar ideas have been proposed in matrix world,
e.g., Collins, Dasgupta, Schapire 2002
“Standard” CP
11/10/2017 Kolda @ MLconf 29
Typically: Consider data to be
low-rank plus “white noise”
Equivalently, Gaussian with mean 𝑚𝑖𝑗𝑘
Gaussian Probability
Density Function (PDF)
Minimize negative log likelihood:
Results in the “standard” objective:
Link:
Anderson-Bergman, Duersch, Hong, Kolda 2017
“Boolean CP”: Odds Link
11/10/2017 Kolda @ MLconf 30
Consider data to be Bernoulli distributed
with probability 𝑝𝑖jk
Equivalent to minimizing negative log likelihood:
Probability Mass Function (PMF):
𝑝𝑖𝑗𝑘 =
𝑚𝑖𝑗𝑘
1 + 𝑚𝑖𝑗𝑘
⇔ 𝑚𝑖𝑗𝑘 =
𝑝𝑖𝑗𝑘
1 − 𝑝𝑖𝑗𝑘
Convert from probability to odds:
𝑝 𝑥 1 − 𝑝 1−𝑥
Anderson-Bergman, Duersch, Hong, Kolda 2017
Generalized CP
11/10/2017 Kolda @ MLconf 31
“Standard” CP uses:
“Poisson” CP (Chi-Kolda 2012) uses:
“Boolean-Odds” CP uses:
Apply favorite optimization method (including SGD) to compute the solution.
Anderson-Bergman, Duersch, Hong, Kolda 2017
A Sparse Dataset
▪ UC Irvine Chat Network
▪ 4-way binary tensor
▪ Sender (205)
▪ Receiver (210)
▪ Hour of Day (24)
▪ Day (194)
▪ 14,953 nonzeros (very sparse)
▪ Goodness-of-fit (odds):
𝑓 𝑥, 𝑚 = log 𝑚 + 1 − 𝑥 log 𝑚
▪ Use GCP to compute rank-12
decomposition
11/10/2017 Kolda @ MLconf 32
Opsahl, T., Panzarasa, P., 2009. Clustering in weighted networks. Social Networks 31 (2), 155-163, doi: 10.1016/j.socnet.2009.02.002
Binary Chat Data using Boolean CP
11/10/2017 Kolda @ MLconf 33
Anderson-Bergman, Duersch, Hong, Kolda 2017
Tensors & Data Analysis
▪ CP tensor decomposition is effective for unsupervised data analysis
▪ Latent factor analysis
▪ Dimension reduction
▪ CP can be generalized to alternative fit functions
▪ Boolean data, count data, etc.
▪ Randomized techniques are open new doorways to larger datasets
and more robust solutions
▪ Matrix sketching
▪ Stochastic gradient descent
▪ Other on-going & future work
▪ Parallel CP and GCP implementations
(https://gitlab.com/tensors/genten)
▪ Parallel Tucker for compression
(https://gitlab.com/tensors/TuckerMPI)
▪ Randomized ST-HOSVD (Tucker)
▪ Functional tensor factorization as surrogate for expensive functions
▪ Extensions to many more applications (binary data, signals, etc.)
11/10/2017 Kolda @ MLconf 34
Acknowledgements
▪ Cliff Anderson-
Bergman (Sandia)
▪ Grey Ballard
(Wake Forrest)
▪ Casey Battaglino
(Georgia Tech)
▪ Jed Duersch
(Sandia)
▪ David Hong
(U. Michigan)
▪ Alex Williams
(Stanford)
Kolda and Bader,
Tensor Decompositions
and Applications, SIAM
Review ‘09
Tensor Toolbox for MATLAB:
www.tensortoolbox.org
Bader, Kolda, Acar, Dunlavy,
and othersContact: Tammy Kolda, www.kolda.net, tgkolda@sandia.gov

More Related Content

What's hot

tensor-decomposition
tensor-decompositiontensor-decomposition
tensor-decomposition
Kenta Oono
 
Presentation on Probability Genrating Function
Presentation on Probability Genrating FunctionPresentation on Probability Genrating Function
Presentation on Probability Genrating Function
Md Riaz Ahmed Khan
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
nextlib
 

What's hot (20)

Wasserstein GAN 수학 이해하기 I
Wasserstein GAN 수학 이해하기 IWasserstein GAN 수학 이해하기 I
Wasserstein GAN 수학 이해하기 I
 
tensor-decomposition
tensor-decompositiontensor-decomposition
tensor-decomposition
 
presentation
presentationpresentation
presentation
 
strong slot and filler
strong slot and fillerstrong slot and filler
strong slot and filler
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
Artificial Bee Colony algorithm
Artificial Bee Colony algorithmArtificial Bee Colony algorithm
Artificial Bee Colony algorithm
 
ppt
pptppt
ppt
 
Machine learning interview questions and answers
Machine learning interview questions and answersMachine learning interview questions and answers
Machine learning interview questions and answers
 
A Walk in the GAN Zoo
A Walk in the GAN ZooA Walk in the GAN Zoo
A Walk in the GAN Zoo
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
 
Presentation on Probability Genrating Function
Presentation on Probability Genrating FunctionPresentation on Probability Genrating Function
Presentation on Probability Genrating Function
 
AlphaGo
AlphaGoAlphaGo
AlphaGo
 
K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
 
QMIX: monotonic value function factorization paper review
QMIX: monotonic value function factorization paper reviewQMIX: monotonic value function factorization paper review
QMIX: monotonic value function factorization paper review
 
Iterative deepening search
Iterative deepening searchIterative deepening search
Iterative deepening search
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade off
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and Regularization
 
Reinforcement Learning Q-Learning
Reinforcement Learning   Q-Learning Reinforcement Learning   Q-Learning
Reinforcement Learning Q-Learning
 

Viewers also liked

Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
MLconf
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
MLconf
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
MLconf
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
MLconf
 
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
MLconf
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
MLconf
 
Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, Pepperdata
MLconf
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
MLconf
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
MLconf
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
MLconf
 
Lukas Biewald, MLconf
Lukas Biewald, MLconf Lukas Biewald, MLconf
Lukas Biewald, MLconf
MLconf
 

Viewers also liked (20)

Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
 
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
 
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
 
Jonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAIJonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAI
 
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
 
Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, Pepperdata
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
 
Lukas Biewald, MLconf
Lukas Biewald, MLconf Lukas Biewald, MLconf
Lukas Biewald, MLconf
 
Matei zaharia, spark presentation m lconf 2013
Matei zaharia, spark presentation m lconf 2013Matei zaharia, spark presentation m lconf 2013
Matei zaharia, spark presentation m lconf 2013
 
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
 
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
 
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
 

Similar to Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories at MLconf SF 2017

End of Sprint 5
End of Sprint 5End of Sprint 5
End of Sprint 5
dm_work
 
EOS5 Demo
EOS5 DemoEOS5 Demo
EOS5 Demo
dm_work
 

Similar to Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories at MLconf SF 2017 (20)

Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
 
Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
 
Enhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsEnhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER Models
 
End of Sprint 5
End of Sprint 5End of Sprint 5
End of Sprint 5
 
EOS5 Demo
EOS5 DemoEOS5 Demo
EOS5 Demo
 
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
 
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
 
Regression and Classification: An Artificial Neural Network Approach
Regression and Classification: An Artificial Neural Network ApproachRegression and Classification: An Artificial Neural Network Approach
Regression and Classification: An Artificial Neural Network Approach
 
Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing...
Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing...Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing...
Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing...
 
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
 
Machine Learning Tools and Particle Swarm Optimization for Content-Based Sear...
Machine Learning Tools and Particle Swarm Optimization for Content-Based Sear...Machine Learning Tools and Particle Swarm Optimization for Content-Based Sear...
Machine Learning Tools and Particle Swarm Optimization for Content-Based Sear...
 
Ch03 Mining Massive Data Sets stanford
Ch03 Mining Massive Data Sets  stanfordCh03 Mining Massive Data Sets  stanford
Ch03 Mining Massive Data Sets stanford
 
Large Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewLarge Scale Data Clustering: an overview
Large Scale Data Clustering: an overview
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
 
Entity Summarization with User Feedback (ESWC 2020)
Entity Summarization with User Feedback (ESWC 2020)Entity Summarization with User Feedback (ESWC 2020)
Entity Summarization with User Feedback (ESWC 2020)
 
modeling.ppt
modeling.pptmodeling.ppt
modeling.ppt
 
Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network Analysis
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
 

More from MLconf

Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
MLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
MLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
MLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

Recently uploaded (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 

Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories at MLconf SF 2017

  • 1. IllustrationbyChrisBrigman Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. MLconf: The Machine Learning Conference San Francisco, CA, Nov 10, 2017
  • 2. A Tensor is an d-Way Array 11/10/2017 Kolda @ MLconf 2 Vector d = 1 Matrix d = 2 3rd-Order Tensor d = 3 4th-Order Tensor d = 4 5th-Order Tensor d = 5
  • 3. From Matrices to Tensors: Two Points of View 11/10/2017 Kolda @ MLconf 3 Singular value decomposition (SVD), eigendecomposition (EVD), nonnegative matrix factorization (NMF), sparse SVD, CUR, etc. Viewpoint 1: Sum of outer products, useful for interpretation Tucker Model: Project onto high-variance subspaces to reduce dimensionality CP Model: Sum of d-way outer products, useful for interpretation CANDECOMP, PARAFAC, Canonical Polyadic, CP HO-SVD, Best Rank-(𝑹1,𝑹2,…,𝑹d) decomposition Other models for compression include hierarchical Tucker and tensor train. Viewpoint 2: High-variance subspaces, useful for compression
  • 4. Matrix Factorization 11/10/2017 Kolda @ MLconf 4 ModelData ≈ Standard Matrix Formulation
  • 5. CP Tensor Factorization (3-way) 11/10/2017 Kolda @ MLconf 5 ModelData ≈ CP = CANDECOMP/PARAFAC or Canonical Polyadic Hitchcock 1927, Harshman 1970, Carroll & Chang 1970
  • 6. CP Tensor Factorization (𝒅-way) 11/10/2017 Kolda @ MLconf 6 ModelData ≈ Hitchcock 1927, Harshman 1970, Carroll & Chang 1970
  • 8. Amino Acids Fluorescence Dataset ▪ Fluorescence measurements of 5 samples containing 3 amino acids ▪ Tryptophan ▪ Tyrosine ▪ Phenylalanine ▪ Tensor of size 5 x 51 x 201 ▪ 5 samples ▪ 51 excitations ▪ 201 emissions 11/10/2017 Kolda @ MLconf 8 Unknown mixture of three amino acids samples excitation R. Bro, PARAFAC: Tutorial and Applications, Chemometrics and Intelligent Laboratory Systems, 38:149-171, 1997
  • 9. Rank-3 CP Factorization of Amino Acids Data 11/10/2017 Kolda @ MLconf 9 𝐀 (5 × 3) 𝐁 (201 × 3) 𝐂 (51 × 3) Bro 1997 samples excitation
  • 10. 11/10/2017 Kolda @ MLconf 10
  • 11. Motivating Example: Neuron Activity in Learning 11/10/2017 Kolda @ MLconf 11 Thanks to Schnitzer Group @ Stanford Mark Schnitzer, Fori Wang, Tony Kim mouse in “maze” neural activity × 120 time bins × 600 trials (over 5 days) Microscope by Inscopix One Column of Neuron x Time Matrix 300 neurons One Trial Williams et al., bioRxiv, 2017, DOI:10.1101/211128
  • 12. Trials Vary Start Position and Strategies 11/10/2017 Kolda @ MLconf 12 • 600 Trials over 5 Days • Start West or East • Conditions Swap Twice ❖ Always Turn South ❖ Always Turn Right ❖ Always Turn South wall W E N S note different patterns on curtains Williams et al., bioRxiv, 2017, DOI:10.1101/211128
  • 13. CP for Simultaneous Analysis of Neurons, Time, and Trial 11/10/2017 Kolda @ MLconf 13 Prior tensor work in neuroscience for fMRI and EEG: Andersen and Rayens (2004), Mørup et al. (2004), Acar et al. (2007), De Vos et al. (2007), and more Williams et al., bioRxiv, 2017, DOI:10.1101/211128
  • 14. 8-Component CP Decomposition of Mouse Neuron Data 11/10/2017 Kolda @ MLconf 14
  • 15. Interpretation of Mouse Neuron Data 11/10/2017 Kolda @ MLconf 15
  • 16. 11/10/2017 Kolda @ MLconf 16
  • 17. Tensor Factorization (3-way) 11/10/2017 Kolda @ MLconf 17 ModelData ≈ We can rewrite this as a matrix equation in 𝐀, 𝐁, or 𝐂.
  • 18. CP-ALS: Fitting CP Model via Alternating Least Squares 11/10/2017 Kolda @ MLconf 18 Harshman, 1970; Carroll & Chang, 1970 ▪ Rank (R) NP-Hard: Even best low-rank solution may not exist (Håstad 1990, Silva & Lim 2006, Hillar & Lim 2009) ▪ Not nested: Best rank-(R-1) factorization may not be part of best rank-R factorization (Kolda 2001) ▪ Nonconvex: But convex linear least squares problems ▪ Not orthogonal: Factor matrices are not orthogonal and may even have linearly dependent columns ▪ Essentially Unique: Under modest conditions, CP is unique up to permutation and scaling unique (Kruskal 1977) Repeat until convergence: Step 1: Step 2: Step 3:
  • 19. CP-ALS Least Squares Problem 11/10/2017 Kolda @ MLconf 19 𝐗(1) − Khatri-Rao Product “right hand sides” “matrix” 𝐀 (𝐂 ⊙ 𝐁)′ 𝑛 × 𝑛 𝑑−1 𝑛 × 𝑟 𝑟 × 𝑛 𝑑−1 𝑛 × 𝑛2 𝑛 × 𝑟 𝑟 × 𝑛2
  • 20. CP Least Squares Problem 11/10/2017 Kolda @ MLconf 20 − − How to randomize this? 𝑛 × 𝑛 𝑑−1 𝑛 × 𝑟 𝑟 × 𝑛 𝑑−1
  • 21. Aside: Sketching for Standard Least Squares 11/10/2017 Kolda @ MLconf 21 𝐀 𝐛𝐱 − ො𝑛 𝑛 Backslash causes MATLAB to automatically call the best solver (cholesky, qr, etc.) 𝒪(ො𝑛𝑛2) Sarlós 2006, Woodruff 2014
  • 22. Sampled Least Squares 11/10/2017 Kolda @ MLconf 22 𝐀 𝐛𝐱 − Choose 𝑞 rows, uniformly at random 𝐒𝐀 𝐒𝐛𝐱 − 𝑞 𝑛 𝒪(𝑞𝑛2) approximate Sampling only guaranteed to “work” if the 𝐀 is incoherent. 𝐒 𝒪(ො𝑛𝑛2 ) ො𝑛 𝑛 Sarlós 2006, Woodruff 2014
  • 23. CP-ALS-RAND 11/10/2017 Kolda @ MLconf 23 − − − Battaglino, Ballard, & Kolda 2017
  • 24. Randomizing the Convergence Check 11/10/2017 Kolda @ MLconf 24 Estimate convergence of function values using small random subset of elements in function evaluation (use Chernoff-Hoeffding to bound accuracy) 16000 samples < 1% of full data Battaglino, Ballard, & Kolda 2017
  • 25. Speed Advantage: Analysis of Hazardous Gas Data 11/10/2017 Kolda @ MLconf 25 Data from Vergara et al. 2013; see also Vervliet and De Lathauwer (2016) This mode scaled by component size Color-coded by gas type 900 experiments (with three different gas types) x 72 sensors x 25,900 time steps (13 GB) Battaglino, Ballard, & Kolda 2017
  • 26. Globalization Advantage? Amino Acids Data 11/10/2017 Kolda @ MLconf 26 Benefits are not as clear without mixing. Fit = 0.92 Fit = 0.97
  • 27. 11/10/2017 Kolda @ MLconf 27
  • 28. Generalizing the Goodness-of-Fit Criteria 11/10/2017 Kolda @ MLconf 28 Anderson-Bergman, Duersch, Hong, Kolda 2017 Similar ideas have been proposed in matrix world, e.g., Collins, Dasgupta, Schapire 2002
  • 29. “Standard” CP 11/10/2017 Kolda @ MLconf 29 Typically: Consider data to be low-rank plus “white noise” Equivalently, Gaussian with mean 𝑚𝑖𝑗𝑘 Gaussian Probability Density Function (PDF) Minimize negative log likelihood: Results in the “standard” objective: Link: Anderson-Bergman, Duersch, Hong, Kolda 2017
  • 30. “Boolean CP”: Odds Link 11/10/2017 Kolda @ MLconf 30 Consider data to be Bernoulli distributed with probability 𝑝𝑖jk Equivalent to minimizing negative log likelihood: Probability Mass Function (PMF): 𝑝𝑖𝑗𝑘 = 𝑚𝑖𝑗𝑘 1 + 𝑚𝑖𝑗𝑘 ⇔ 𝑚𝑖𝑗𝑘 = 𝑝𝑖𝑗𝑘 1 − 𝑝𝑖𝑗𝑘 Convert from probability to odds: 𝑝 𝑥 1 − 𝑝 1−𝑥 Anderson-Bergman, Duersch, Hong, Kolda 2017
  • 31. Generalized CP 11/10/2017 Kolda @ MLconf 31 “Standard” CP uses: “Poisson” CP (Chi-Kolda 2012) uses: “Boolean-Odds” CP uses: Apply favorite optimization method (including SGD) to compute the solution. Anderson-Bergman, Duersch, Hong, Kolda 2017
  • 32. A Sparse Dataset ▪ UC Irvine Chat Network ▪ 4-way binary tensor ▪ Sender (205) ▪ Receiver (210) ▪ Hour of Day (24) ▪ Day (194) ▪ 14,953 nonzeros (very sparse) ▪ Goodness-of-fit (odds): 𝑓 𝑥, 𝑚 = log 𝑚 + 1 − 𝑥 log 𝑚 ▪ Use GCP to compute rank-12 decomposition 11/10/2017 Kolda @ MLconf 32 Opsahl, T., Panzarasa, P., 2009. Clustering in weighted networks. Social Networks 31 (2), 155-163, doi: 10.1016/j.socnet.2009.02.002
  • 33. Binary Chat Data using Boolean CP 11/10/2017 Kolda @ MLconf 33 Anderson-Bergman, Duersch, Hong, Kolda 2017
  • 34. Tensors & Data Analysis ▪ CP tensor decomposition is effective for unsupervised data analysis ▪ Latent factor analysis ▪ Dimension reduction ▪ CP can be generalized to alternative fit functions ▪ Boolean data, count data, etc. ▪ Randomized techniques are open new doorways to larger datasets and more robust solutions ▪ Matrix sketching ▪ Stochastic gradient descent ▪ Other on-going & future work ▪ Parallel CP and GCP implementations (https://gitlab.com/tensors/genten) ▪ Parallel Tucker for compression (https://gitlab.com/tensors/TuckerMPI) ▪ Randomized ST-HOSVD (Tucker) ▪ Functional tensor factorization as surrogate for expensive functions ▪ Extensions to many more applications (binary data, signals, etc.) 11/10/2017 Kolda @ MLconf 34 Acknowledgements ▪ Cliff Anderson- Bergman (Sandia) ▪ Grey Ballard (Wake Forrest) ▪ Casey Battaglino (Georgia Tech) ▪ Jed Duersch (Sandia) ▪ David Hong (U. Michigan) ▪ Alex Williams (Stanford) Kolda and Bader, Tensor Decompositions and Applications, SIAM Review ‘09 Tensor Toolbox for MATLAB: www.tensortoolbox.org Bader, Kolda, Acar, Dunlavy, and othersContact: Tammy Kolda, www.kolda.net, tgkolda@sandia.gov