Estimation of Inverse Covariance Matrix in
Compositional Data
Aditya Mishra
Flatiron Institute, Simons Foundation
Operator Splitting Methods in Data Analysis, SAMSI
Raleigh, NC
March 22, 2018
Aditya Mishra Flatiron Institute, Simons Foundation (Operator Splitting Methods in Data Analysis, SAMSI RPrecision Matrix Estimation March 22, 2018 1 / 16
Motivation: Human Microbiome Project
Microbial Ecology and Human
Generation: Compositional Data
Compositional Data of OTU
OTUs are given by index sets: $g_i = \{\text{index set of the } i\text{th OTU}\}$, with cardinality $p_i$.
The absolute abundances of the components are unknown:
\[
W = \begin{bmatrix}
w_{1g_1} & w_{1g_2} & w_{1g_3} & \cdots & w_{1g_k} \\
w_{2g_1} & w_{2g_2} & w_{2g_3} & \cdots & w_{2g_k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
w_{ng_1} & w_{ng_2} & w_{ng_3} & \cdots & w_{ng_k}
\end{bmatrix}_{n \times p}, \qquad p = p_1 + \dots + p_k,
\]
where, for each operational taxonomic unit (OTU),
\[
w_{ig_j} = [w_{ig_j}(1), w_{ig_j}(2), \dots, w_{ig_j}(p_j)].
\]
Let $W$ be the observations of the random variable $w = [w_{g_1}, \dots, w_{g_k}]$.
Define $y_{ig_j} = \log w_{ig_j}$ and the matrix $Y = (y_{ig_j})_{ij}$. Corresponding to $Y$, we have the random variable $y = [y_{g_1}, \dots, y_{g_k}]$.
We are interested in finding the inverse of the covariance matrix $\Sigma_y$ of the random variable $y$ [Aitchison, 1982].
Desirable Properties of CDA Methods [Aitchison, 1982]
Scale invariance
Permutation invariance
Subcompositional coherence: the same results are obtained for a
subcomposition, regardless of whether we analyze only that
subcomposition or a larger composition containing other parts.
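The three properties can be illustrated on a toy abundance vector. This is a minimal sketch: the helper `closure` and the numbers are hypothetical, and subcompositional coherence is checked only for one log-ratio, which is an assumption about the kind of statistic being analyzed.

```python
import numpy as np

def closure(w):
    """Normalize positive abundances so that the parts sum to one."""
    w = np.asarray(w, dtype=float)
    return w / w.sum()

w = np.array([10.0, 30.0, 60.0, 100.0])  # hypothetical absolute abundances

# Scale invariance: rescaling all parts leaves the composition unchanged.
scale_ok = np.allclose(closure(w), closure(7.5 * w))

# Permutation invariance: reordering the parts only reorders the composition.
perm = np.array([2, 0, 3, 1])
perm_ok = np.allclose(closure(w)[perm], closure(w[perm]))

# Subcompositional coherence for a log-ratio: the same value is obtained
# from the full composition and from the subcomposition of the first 3 parts.
full, sub = closure(w), closure(w[:3])
coherent = np.isclose(np.log(full[0] / full[1]), np.log(sub[0] / sub[1]))

print(scale_ok, perm_ok, coherent)
```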
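Compositional Data of OTU
The absolute abundance of each component is unknown:
\[
W = \begin{bmatrix}
w_{1g_1} & w_{1g_2} & w_{1g_3} & \cdots & w_{1g_k} \\
w_{2g_1} & w_{2g_2} & w_{2g_3} & \cdots & w_{2g_k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
w_{ng_1} & w_{ng_2} & w_{ng_3} & \cdots & w_{ng_k}
\end{bmatrix}_{n \times p}, \qquad p = p_1 + \dots + p_k.
\]
Define the sub-composition matrix:
\[
C^{T} = \begin{bmatrix} c_1 & c_2 & c_3 & \cdots & c_k \end{bmatrix}^{T}
= \begin{bmatrix}
\mathbf{1}_{p_1}^{T} & 0 & \cdots & 0 \\
0 & \mathbf{1}_{p_2}^{T} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \mathbf{1}_{p_k}^{T}
\end{bmatrix}_{k \times p} \qquad (1)
\]
where $\mathbf{1}_{p_j}$ is the all-ones vector of size $p_j$.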
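The block structure in (1) can be sketched in a few lines of NumPy. The function name `subcomposition_matrix` and the group sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def subcomposition_matrix(group_sizes):
    """Return the p x k matrix C whose transpose is the block matrix in (1)."""
    p, k = sum(group_sizes), len(group_sizes)
    C = np.zeros((p, k))
    start = 0
    for j, p_j in enumerate(group_sizes):
        C[start:start + p_j, j] = 1.0  # column c_j is 1 on the indices in g_j
        start += p_j
    return C

group_sizes = [2, 3, 1]  # hypothetical p_1, p_2, p_3
C = subcomposition_matrix(group_sizes)
print(C.T)  # k x p matrix with one all-ones row block per group
```

Since the columns of C have disjoint supports, `C.T @ C` is the diagonal matrix of group sizes, which keeps the later projection onto the column space of C cheap to form.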
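Compositional Data of OTU
Based on the count data $W$, define
\[
\bar{x}_{ig_j} = \frac{w_{ig_j}}{\bar{w}_{ij}}, \qquad \text{where } \bar{w}_{ij} = \sum_{m=1}^{p_j} w_{ig_j}(m).
\]
Unknown relative abundance data:
\[
\bar{X} = \begin{bmatrix}
\bar{x}_{1g_1} & \bar{x}_{1g_2} & \bar{x}_{1g_3} & \cdots & \bar{x}_{1g_k} \\
\bar{x}_{2g_1} & \bar{x}_{2g_2} & \bar{x}_{2g_3} & \cdots & \bar{x}_{2g_k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\bar{x}_{ng_1} & \bar{x}_{ng_2} & \bar{x}_{ng_3} & \cdots & \bar{x}_{ng_k}
\end{bmatrix}_{n \times p}, \qquad p = p_1 + \dots + p_k,
\]
where, for each OTU, $\bar{x}_{ig_j} = [\bar{x}_{ig_j}(1), \bar{x}_{ig_j}(2), \dots, \bar{x}_{ig_j}(p_j)]$.
Corresponding to $\bar{X}$, we have the random variable $\bar{x} = [\bar{x}_{g_1}, \dots, \bar{x}_{g_k}]$
with $\bar{x}_{g_j} = [\bar{x}_{g_j}(1), \dots, \bar{x}_{g_j}(p_j)]$.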
Each sub-composition sums to one, so $C^{T}\bar{x}^{T} = \mathbf{1}_k$.
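The group-wise normalization can be sketched as follows, assuming strictly positive counts. The data and group sizes are simulated and hypothetical. By construction, the entries of each row within a group sum to one.

```python
import numpy as np

rng = np.random.default_rng(0)
group_sizes = [2, 3, 1]                     # hypothetical p_1, p_2, p_3
k, p, n = len(group_sizes), sum(group_sizes), 4

# p x k indicator matrix: row a has a 1 in the column of the group containing a
C = np.eye(k)[np.repeat(np.arange(k), group_sizes)]

W = rng.integers(1, 50, size=(n, p)).astype(float)  # simulated positive counts

W_bar = W @ C                    # n x k matrix of group totals w_bar_ij
X_bar = W / (W_bar @ C.T)        # divide each entry by the total of its own group

print(X_bar @ C)                 # row i holds the group sums of sample i
```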
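Covariance in Relative Abundance Data
In terms of the absolute count random variable $w = [w_{g_1}, \dots, w_{g_k}]$, we
can write an element of the relative count random variable $\bar{x}$ as
\[
\bar{x}_{g_i}(k) = \frac{w_{g_i}(k)}{\bar{w}_i}, \qquad \text{where } \bar{w}_i = \sum_{m=1}^{p_i} w_{g_i}(m).
\]
Let $\bar{w} = [\bar{w}_1, \dots, \bar{w}_k]$ be the vector of subgroup sums of the random variable $w$.
Since $\log \bar{x}_{g_i}(k) = \log w_{g_i}(k) - \log \bar{w}_i$, for any $(i, j, k, l)$ we get:
\[
\begin{aligned}
\operatorname{cov}(\log \bar{x}_{g_i}(k), \log \bar{x}_{g_j}(l)) ={}& \operatorname{cov}(\log w_{g_i}(k), \log w_{g_j}(l)) \\
&- \operatorname{cov}(\log w_{g_i}(k), \log \bar{w}_j) \\
&- \operatorname{cov}(\log w_{g_j}(l), \log \bar{w}_i) \\
&+ \operatorname{cov}(\log \bar{w}_i, \log \bar{w}_j).
\end{aligned}
\]
Writing this for the full covariance matrix of $\log \bar{x}$, with the logarithm applied elementwise to $\bar{w}$, we get
\[
\operatorname{cov}(\log \bar{x}, \log \bar{x}) = \operatorname{cov}(y, y) - \operatorname{cov}(C \log \bar{w}, y) - [\operatorname{cov}(C \log \bar{w}, y)]^{T} + \operatorname{cov}(C \log \bar{w}, C \log \bar{w}). \qquad (2)
\]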
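Equation (2) follows from the bilinearity of covariance, so it also holds exactly for sample covariances. The sketch below checks this numerically on simulated log-normal abundances. The data, the helper `cross_cov`, and the reading of the group-sum term as $C \log \bar{w}$ (logarithm of the group totals) are assumptions made for illustration.

```python
import numpy as np

def cross_cov(A, B):
    """Sample cross-covariance between the columns of A and the columns of B."""
    A_c = A - A.mean(axis=0)
    B_c = B - B.mean(axis=0)
    return A_c.T @ B_c / (A.shape[0] - 1)

rng = np.random.default_rng(1)
group_sizes = [2, 3, 2]                       # hypothetical group sizes
k, p, n = len(group_sizes), sum(group_sizes), 200
C = np.eye(k)[np.repeat(np.arange(k), group_sizes)]   # p x k indicator matrix

W = np.exp(rng.normal(size=(n, p)))           # simulated positive abundances
Y = np.log(W)                                 # samples of y
Z = np.log(W @ C) @ C.T                       # row i is (C log w_bar_i)^T
log_X_bar = Y - Z                             # log relative abundances

lhs = cross_cov(log_X_bar, log_X_bar)
rhs = cross_cov(Y, Y) - cross_cov(Z, Y) - cross_cov(Z, Y).T + cross_cov(Z, Z)
identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```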
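Observed Relative Abundance Data of OTU
Based on the observed abundance data available:
\[
\tilde{X} = \begin{bmatrix}
\tilde{x}_{1g_1} & \tilde{x}_{1g_2} & \tilde{x}_{1g_3} & \cdots & \tilde{x}_{1g_k} \\
\tilde{x}_{2g_1} & \tilde{x}_{2g_2} & \tilde{x}_{2g_3} & \cdots & \tilde{x}_{2g_k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\tilde{x}_{ng_1} & \tilde{x}_{ng_2} & \tilde{x}_{ng_3} & \cdots & \tilde{x}_{ng_k}
\end{bmatrix}_{n \times p}, \qquad p = p_1 + \dots + p_k,
\]
where, for each operational taxonomic unit (OTU),
\[
\tilde{x}_{ig_j} = [\tilde{x}_{ig_j}(1), \tilde{x}_{ig_j}(2), \dots, \tilde{x}_{ig_j}(p_j)].
\]
Define
\[
x_{ig_j} = \frac{\tilde{x}_{ig_j}}{\sum_{m=1}^{p_j} \tilde{x}_{ig_j}(m)}.
\]
Using $x_{ig_j}$, we have the matrix of observed relative abundances $X = (x_{ig_j})_{ij}$.
Let the observations $X$ correspond to the random variable $x$.
Using $X$, we can estimate
\[
\operatorname{cov}(\log x, \log x) = \Sigma_x = \operatorname{cov}(\log \bar{x}, \log \bar{x}).
\]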
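One simple way to estimate $\Sigma_x$ from observed abundances is the sample covariance of the log relative abundances. The sketch below assumes strictly positive observations (zero counts would need separate handling, which the slides do not discuss). The function name and the simulated data are hypothetical.

```python
import numpy as np

def log_relative_covariance(X_tilde, group_sizes):
    """Sample covariance of log relative abundances, normalized within each group."""
    k = len(group_sizes)
    C = np.eye(k)[np.repeat(np.arange(k), group_sizes)]  # p x k indicator matrix
    X = X_tilde / ((X_tilde @ C) @ C.T)                  # observed relative abundances
    return np.cov(np.log(X), rowvar=False)               # p x p estimate of Sigma_x

rng = np.random.default_rng(2)
X_tilde = rng.integers(1, 100, size=(50, 5)).astype(float)  # simulated positive counts
Sigma_x_hat = log_relative_covariance(X_tilde, [2, 3])
print(Sigma_x_hat.shape)
```

Because every sample is normalized within its groups, rescaling a sample (for example, a different sequencing depth) leaves the estimate unchanged.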
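Covariance Estimation in CDA
From the result in equation (2), with the logarithm applied elementwise to the vector of group sums $\bar{w}$:
\[
\Sigma_x = \Sigma_y - \operatorname{cov}(C \log \bar{w}, y) - [\operatorname{cov}(C \log \bar{w}, y)]^{T} + \operatorname{cov}(C \log \bar{w}, C \log \bar{w}).
\]
Consider the transformation matrix $F = I - P_c$, where
$P_c = C(C^{T}C)^{-1}C^{T}$.
Since $FC = 0$ and $C^{T}F = 0$, every term containing $C$ vanishes when $\Sigma_x$ is multiplied by $F$ on both sides, so
\[
F \Sigma_x F = F \Sigma_y F.
\]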
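The projection argument can be checked numerically. The sketch below uses simulated abundances and sample covariances in place of the population quantities. All data and group sizes are hypothetical.

```python
import numpy as np

group_sizes = [2, 3, 2]                       # hypothetical group sizes
k, p = len(group_sizes), sum(group_sizes)
C = np.eye(k)[np.repeat(np.arange(k), group_sizes)]   # p x k indicator matrix

P_c = C @ np.linalg.inv(C.T @ C) @ C.T        # projection onto the column space of C
F = np.eye(p) - P_c                           # projection onto its orthogonal complement

rng = np.random.default_rng(3)
W = np.exp(rng.normal(size=(300, p)))         # simulated positive abundances
Y = np.log(W)                                 # samples of y
log_X = Y - np.log(W @ C) @ C.T               # log relative abundances

Sigma_y_hat = np.cov(Y, rowvar=False)
Sigma_x_hat = np.cov(log_X, rowvar=False)

annihilates_C = np.allclose(F @ C, 0.0)
projected_equal = np.allclose(F @ Sigma_x_hat @ F, F @ Sigma_y_hat @ F)
print(annihilates_C, projected_equal)
```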
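Existing Approach in Unconstrained Setting
Graphical lasso formulation [Friedman et al., 2008]:
\[
\min_{\Omega} \; -\log\det\Omega + \operatorname{tr}(\Sigma_y \Omega) + \lambda_n \|\Omega\|_1
\]
Consider that $\Sigma_y$ is known. The CLIME estimator [Cai et al., 2011]
for its inverse $\Omega$ is given by
\[
\min \|\Omega\|_1 \quad \text{s.t.} \quad \|\Omega\Sigma_y - I\|_\infty \le \lambda_n
\]
It can also be formulated as:
\[
\min \|\Omega\|_1 \quad \text{s.t.} \quad \|\Omega^{-1} - \Sigma_y\|_\infty \le \lambda_n
\]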
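To make the two formulations concrete, the sketch below only evaluates the graphical lasso objective and the CLIME constraint for a given candidate $\Omega$. It is not a solver. The function names are hypothetical, and the $\ell_1$ norm is assumed to be the elementwise norm including the diagonal.

```python
import numpy as np

def glasso_objective(Omega, Sigma, lam):
    """-log det(Omega) + tr(Sigma Omega) + lam * ||Omega||_1 (elementwise norm)."""
    if np.linalg.eigvalsh(Omega).min() <= 0:
        return np.inf                          # objective is defined for positive definite Omega
    _, logdet = np.linalg.slogdet(Omega)
    return -logdet + np.trace(Sigma @ Omega) + lam * np.abs(Omega).sum()

def clime_residual(Omega, Sigma):
    """Elementwise sup-norm of Omega Sigma - I, the quantity CLIME bounds by lam."""
    return np.abs(Omega @ Sigma - np.eye(Sigma.shape[0])).max()

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])     # hypothetical covariance matrix
Omega = np.linalg.inv(Sigma)                   # exact inverse as the candidate
lam = 0.1

print(glasso_objective(Omega, Sigma, lam))
print(clime_residual(Omega, Sigma) <= lam)     # the exact inverse is CLIME-feasible
```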
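Open Problem
On relaxing the nearness condition $\|\Omega^{-1} - \Sigma_y\|_\infty \le \lambda_n$ for the
case of compositional data, we have
\[
\|F\Omega^{-1}F - F\Sigma_y F\|_\infty \le \lambda_n.
\]
Given that $F\Sigma_x F = F\Sigma_y F$, can we formulate the estimation of
the sparse precision matrix as
\[
\min \|\Omega\|_1 \quad \text{s.t.} \quad \|F\Omega^{-1}F - F\Sigma_x F\|_\infty \le \lambda_n \; ?
\]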
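As a sanity check on the proposed constraint, the sketch below evaluates the projected residual for the true precision matrix when $\Sigma_x$ has the structure of equation (2). It does not solve the optimization problem. The matrices `G` and `H` are arbitrary stand-ins for the covariance terms involving the group sums, and all names and values are hypothetical.

```python
import numpy as np

def projected_residual(Omega, Sigma_x, F):
    """Elementwise sup-norm of F Omega^{-1} F - F Sigma_x F."""
    return np.abs(F @ np.linalg.inv(Omega) @ F - F @ Sigma_x @ F).max()

group_sizes = [2, 2]                           # hypothetical group sizes
k, p = len(group_sizes), sum(group_sizes)
C = np.eye(k)[np.repeat(np.arange(k), group_sizes)]
F = np.eye(p) - C @ np.linalg.inv(C.T @ C) @ C.T

rng = np.random.default_rng(4)
A = rng.normal(size=(p, p))
Sigma_y = A @ A.T + p * np.eye(p)              # positive definite covariance of y
G = rng.normal(size=(k, p))                    # stand-in for cov(log w_bar, y)
B = rng.normal(size=(k, k))
H = B @ B.T                                    # stand-in for cov(log w_bar, log w_bar)
Sigma_x = Sigma_y - C @ G - (C @ G).T + C @ H @ C.T

Omega_true = np.linalg.inv(Sigma_y)
residual = projected_residual(Omega_true, Sigma_x, F)
print(residual)                                # zero up to floating-point error
```

Under these assumptions the true precision matrix satisfies the constraint for any $\lambda_n \ge 0$, while the unprojected residual $\|\Omega^{-1} - \Sigma_x\|_\infty$ is generally nonzero.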
References
John Aitchison. The statistical analysis of compositional data. Journal
of the Royal Statistical Society. Series B (Methodological), pages
139–177, 1982.
Tony Cai, Weidong Liu, and Xi Luo. A constrained ℓ1 minimization
approach to sparse precision matrix estimation. Journal of the
American Statistical Association, 106(494):594–607, 2011.
Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse
inverse covariance estimation with the graphical lasso. Biostatistics,
9(3):432–441, 2008.
Thank You
Aditya Mishra Flatiron Institute, Simons Foundation (Operator Splitting Methods in Data Analysis, SAMSI RPrecision Matrix Estimation March 22, 2018 16 / 16

More Related Content

What's hot

Large-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
Large-Scale Nonparametric Estimation of Vehicle Travel Time DistributionsLarge-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
Large-Scale Nonparametric Estimation of Vehicle Travel Time DistributionsRikiya Takahashi
 
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-Model
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-ModelA-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-Model
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-ModelProf Dr S.M.Aqil Burney
 
Genetic Algorithm for solving Dynamic Supply Chain Problem
Genetic Algorithm for solving Dynamic Supply Chain Problem  Genetic Algorithm for solving Dynamic Supply Chain Problem
Genetic Algorithm for solving Dynamic Supply Chain Problem AI Publications
 
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILIN
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILINRESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILIN
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILINWireilla
 
2013추계학술대회 인쇄용
2013추계학술대회 인쇄용2013추계학술대회 인쇄용
2013추계학술대회 인쇄용Byung Kook Ha
 
Metaheuristic Optimization: Algorithm Analysis and Open Problems
Metaheuristic Optimization: Algorithm Analysis and Open ProblemsMetaheuristic Optimization: Algorithm Analysis and Open Problems
Metaheuristic Optimization: Algorithm Analysis and Open ProblemsXin-She Yang
 
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...IJRES Journal
 
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control System
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control SystemChaos Suppression and Stabilization of Generalized Liu Chaotic Control System
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control Systemijtsrd
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishtuxette
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology tuxette
 
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...Xin-She Yang
 
'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysistuxette
 
Lagrangian Fluid Simulation with Continuous Convolutions
Lagrangian Fluid Simulation with Continuous ConvolutionsLagrangian Fluid Simulation with Continuous Convolutions
Lagrangian Fluid Simulation with Continuous Convolutionsfarukcankaya
 
A Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of InterestA Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of InterestNational Cheng Kung University
 
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...AM Publications
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biologytuxette
 

What's hot (17)

GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
 
Large-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
Large-Scale Nonparametric Estimation of Vehicle Travel Time DistributionsLarge-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
Large-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
 
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-Model
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-ModelA-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-Model
A-New-Quantile-Based-Fuzzy-Time-Series-Forecasting-Model
 
Genetic Algorithm for solving Dynamic Supply Chain Problem
Genetic Algorithm for solving Dynamic Supply Chain Problem  Genetic Algorithm for solving Dynamic Supply Chain Problem
Genetic Algorithm for solving Dynamic Supply Chain Problem
 
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILIN
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILINRESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILIN
RESEARCH ON WIND ENERGY INVESTMENT DECISION MAKING: A CASE STUDY IN JILIN
 
2013추계학술대회 인쇄용
2013추계학술대회 인쇄용2013추계학술대회 인쇄용
2013추계학술대회 인쇄용
 
Metaheuristic Optimization: Algorithm Analysis and Open Problems
Metaheuristic Optimization: Algorithm Analysis and Open ProblemsMetaheuristic Optimization: Algorithm Analysis and Open Problems
Metaheuristic Optimization: Algorithm Analysis and Open Problems
 
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
 
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control System
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control SystemChaos Suppression and Stabilization of Generalized Liu Chaotic Control System
Chaos Suppression and Stabilization of Generalized Liu Chaotic Control System
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfish
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...
A Discrete Firefly Algorithm for the Multi-Objective Hybrid Flowshop Scheduli...
 
'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis
 
Lagrangian Fluid Simulation with Continuous Convolutions
Lagrangian Fluid Simulation with Continuous ConvolutionsLagrangian Fluid Simulation with Continuous Convolutions
Lagrangian Fluid Simulation with Continuous Convolutions
 
A Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of InterestA Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of Interest
 
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...
COGNITIVE MODELING OF 1D CONDUCTIVE THERMAL TRANSFER IN A MONO-WALL PLAN-WALL...
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 

Similar to QMC: Operator Splitting Workshop, Estimation of Inverse Covariance Matrix in Compositional Data - Aditya Mishra, Mar 21, 2018

Enhancing Partition Crossover with Articulation Points Analysis
Enhancing Partition Crossover with Articulation Points AnalysisEnhancing Partition Crossover with Articulation Points Analysis
Enhancing Partition Crossover with Articulation Points Analysisjfrchicanog
 
Stochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithmsStochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithmsSeonho Park
 
A Preference Model on Adaptive Affinity Propagation
A Preference Model on Adaptive Affinity PropagationA Preference Model on Adaptive Affinity Propagation
A Preference Model on Adaptive Affinity PropagationIJECEIAES
 
IRJET- Performance Analysis of Optimization Techniques by using Clustering
IRJET- Performance Analysis of Optimization Techniques by using ClusteringIRJET- Performance Analysis of Optimization Techniques by using Clustering
IRJET- Performance Analysis of Optimization Techniques by using ClusteringIRJET Journal
 
One Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical ComputationOne Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical ComputationWork-Bench
 
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...Waqas Tariq
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksTomaso Aste
 

Similar to QMC: Operator Splitting Workshop, Estimation of Inverse Covariance Matrix in Compositional Data - Aditya Mishra, Mar 21, 2018 (20)

2018 Modern Math Workshop - Foundations of Statistical Learning Theory: Quint...
2018 Modern Math Workshop - Foundations of Statistical Learning Theory: Quint...2018 Modern Math Workshop - Foundations of Statistical Learning Theory: Quint...
2018 Modern Math Workshop - Foundations of Statistical Learning Theory: Quint...
 
Enhancing Partition Crossover with Articulation Points Analysis
Enhancing Partition Crossover with Articulation Points AnalysisEnhancing Partition Crossover with Articulation Points Analysis
Enhancing Partition Crossover with Articulation Points Analysis
 
QMC: Undergraduate Workshop, Monte Carlo Techniques in Earth Science - Amit A...
QMC: Undergraduate Workshop, Monte Carlo Techniques in Earth Science - Amit A...QMC: Undergraduate Workshop, Monte Carlo Techniques in Earth Science - Amit A...
QMC: Undergraduate Workshop, Monte Carlo Techniques in Earth Science - Amit A...
 
Stochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithmsStochastic optimization from mirror descent to recent algorithms
Stochastic optimization from mirror descent to recent algorithms
 
Lecture12 xing
Lecture12 xingLecture12 xing
Lecture12 xing
 
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
 
Side 2019 #9
Side 2019 #9Side 2019 #9
Side 2019 #9
 
SASA 2016
SASA 2016SASA 2016
SASA 2016
 
A Preference Model on Adaptive Affinity Propagation
A Preference Model on Adaptive Affinity PropagationA Preference Model on Adaptive Affinity Propagation
A Preference Model on Adaptive Affinity Propagation
 
IRJET- Performance Analysis of Optimization Techniques by using Clustering
IRJET- Performance Analysis of Optimization Techniques by using ClusteringIRJET- Performance Analysis of Optimization Techniques by using Clustering
IRJET- Performance Analysis of Optimization Techniques by using Clustering
 
Classification
ClassificationClassification
Classification
 
Ica group 3[1]
Ica group 3[1]Ica group 3[1]
Ica group 3[1]
 
AINL 2016: Strijov
AINL 2016: StrijovAINL 2016: Strijov
AINL 2016: Strijov
 
Clustering-beamer.pdf
Clustering-beamer.pdfClustering-beamer.pdf
Clustering-beamer.pdf
 
Bayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdfBayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdf
 
One Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical ComputationOne Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical Computation
 
Slides ub-7
Slides ub-7Slides ub-7
Slides ub-7
 
Scikit-learn1
Scikit-learn1Scikit-learn1
Scikit-learn1
 
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...
Simultaneous State and Actuator Fault Estimation With Fuzzy Descriptor PMID a...
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
 

More from The Statistical and Applied Mathematical Sciences Institute

More from The Statistical and Applied Mathematical Sciences Institute (20)

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

QMC: Operator Splitting Workshop, Estimation of Inverse Covariance Matrix in Compositional Data - Aditya Mishra, Mar 21, 2018

  • 1. Estimation of Inverse Covariance Matrix in Compositional Data. Aditya Mishra, Flatiron Institute, Simons Foundation. Operator Splitting Methods in Data Analysis, SAMSI, Raleigh, NC, March 22, 2018.
  • 2. Motivation: Human Microbiome Project
  • 3. Microbial Ecology and Human
  • 4. Generation: Compositional Data
  • 5. Generation: Compositional Data
  • 6-8. Compositional Data of OTU
    Each operational taxonomic unit (OTU) is given by an index set: $g_i$ = {index set of the $i$th OTU}, with cardinality $p_i$.
    The absolute abundances of the components are unknown:
    $$W = \begin{bmatrix} w_{1g_1} & w_{1g_2} & w_{1g_3} & \cdots & w_{1g_k} \\ w_{2g_1} & w_{2g_2} & w_{2g_3} & \cdots & w_{2g_k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{ng_1} & w_{ng_2} & w_{ng_3} & \cdots & w_{ng_k} \end{bmatrix}_{n \times p}, \qquad p = p_1 + \cdots + p_k,$$
    where, for each OTU, $w_{ig_j} = [w_{ig_j}(1), w_{ig_j}(2), \ldots, w_{ig_j}(p_j)]$.
    Let $W$ be an observation of the random variable $w = [w_{g_1}, \ldots, w_{g_k}]$. Define $y_{ig_j} = \log w_{ig_j}$ and the matrix $Y = (y_{ig_j})_{ij}$; the random variable corresponding to $Y$ is $y = [y_{g_1}, \ldots, y_{g_k}]$.
    We are interested in finding the inverse of the covariance matrix $\Sigma_y$ of the random variable $y$ [Aitchison, 1982].
  • 9. Desirable Properties of CDA Methods [Aitchison, 1982]
    - Scale invariance
    - Permutation invariance
    - Subcompositional coherence: the results for a subcomposition are the same whether we analyze only that subcomposition or a larger composition containing other parts.
  • 10. Compositional Data of OTU
    The absolute abundance of each component is unknown:
    $$W = \begin{bmatrix} w_{1g_1} & w_{1g_2} & w_{1g_3} & \cdots & w_{1g_k} \\ w_{2g_1} & w_{2g_2} & w_{2g_3} & \cdots & w_{2g_k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{ng_1} & w_{ng_2} & w_{ng_3} & \cdots & w_{ng_k} \end{bmatrix}_{n \times p}, \qquad p = p_1 + \cdots + p_k.$$
    Define the sub-composition matrix:
    $$C^T = \begin{bmatrix} c_1 & c_2 & c_3 & \cdots & c_k \end{bmatrix}^T = \begin{bmatrix} \mathbf{1}_{p_1}^T & 0 & \cdots & 0 \\ 0 & \mathbf{1}_{p_2}^T & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{1}_{p_k}^T \end{bmatrix}_{k \times p} \qquad (1)$$
    where $\mathbf{1}_{p_k}$ is the all-ones vector of size $p_k$.
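The sub-composition matrix in equation (1) can be built directly from the group sizes. The following is a minimal sketch in NumPy, not the author's code; the function name `subcomposition_matrix` and the example group sizes are assumptions made for illustration.

```python
import numpy as np

def subcomposition_matrix(group_sizes):
    """Return the p x k matrix C whose transpose is block-diagonal in all-ones rows.

    Column j of C is the indicator vector of the index set g_j
    (hypothetical helper, named here only for illustration).
    """
    k = len(group_sizes)
    p = sum(group_sizes)
    C = np.zeros((p, k))
    start = 0
    for j, p_j in enumerate(group_sizes):
        C[start:start + p_j, j] = 1.0  # rows belonging to group g_j
        start += p_j
    return C

# Illustrative group sizes p_1, p_2, p_3 (assumed, not from the slides).
group_sizes = [2, 3, 1]
C = subcomposition_matrix(group_sizes)
print(C.T)  # k x p matrix, as in equation (1)
```

With this layout, `C.T @ v` returns the within-group sums of a length-`p` vector `v`, and `C.T @ C` is the diagonal matrix of group sizes.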
  • 11-13. Compositional Data of OTU
    Based on the count data $W$, define
    $$\bar{x}_{ig_j} = \frac{w_{ig_j}}{\bar{w}_{ij}}, \qquad \text{where } \bar{w}_{ij} = \sum_{k=1}^{p_j} w_{ig_j}(k).$$
    Unknown relative abundance data:
    $$\bar{X} = \begin{bmatrix} \bar{x}_{1g_1} & \bar{x}_{1g_2} & \bar{x}_{1g_3} & \cdots & \bar{x}_{1g_k} \\ \bar{x}_{2g_1} & \bar{x}_{2g_2} & \bar{x}_{2g_3} & \cdots & \bar{x}_{2g_k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \bar{x}_{ng_1} & \bar{x}_{ng_2} & \bar{x}_{ng_3} & \cdots & \bar{x}_{ng_k} \end{bmatrix}_{n \times p}, \qquad p = p_1 + \cdots + p_k,$$
    where, for each OTU, $\bar{x}_{ig_j} = [\bar{x}_{ig_j}(1), \bar{x}_{ig_j}(2), \ldots, \bar{x}_{ig_j}(p_j)]$.
    Corresponding to $\bar{X}$, we have the random variable $\bar{x} = [\bar{x}_{g_1}, \ldots, \bar{x}_{g_k}]$ with $\bar{x}_{g_j} = [\bar{x}_{g_j}(1), \ldots, \bar{x}_{g_j}(p_j)]$.
    Then $C^T \bar{x}^T = \mathbf{1}_k$, since each subcomposition sums to one.
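The within-group normalization can be sketched numerically as follows. This is an illustration under stated assumptions, not the author's implementation: the absolute abundances `W` are simulated (they are unobserved in practice), and the group sizes and variable names are hypothetical. By construction, each row's relative abundances sum to one within every group.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumed): k = 3 groups of sizes 2, 3, 1 and n = 4 samples.
group_sizes = [2, 3, 1]
n, p = 4, sum(group_sizes)

# p x k group-indicator matrix C (row a is the unit vector of a's group).
C = np.repeat(np.eye(len(group_sizes)), group_sizes, axis=0)

# Simulated absolute abundances, strictly positive.
W = rng.lognormal(size=(n, p))

# Group totals: entry (i, j) is the sum of w_{i g_j}(k) over k.
W_bar = W @ C

# Relative abundances: divide each entry by the total of its own group.
X_bar = W / (W_bar @ C.T)

# Within-group sums of the relative abundances, one column per group.
group_sums = X_bar @ C
print(group_sums)
```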
  • 14-15. Covariance in Relative Abundance Data
    In terms of the absolute count random variable $w = [w_{g_1}, \ldots, w_{g_k}]$, an element of the relative count random variable $\bar{x}$ can be written as
    $$\bar{x}_{g_i}(k) = \frac{w_{g_i}(k)}{\bar{w}_i}, \qquad \text{where } \bar{w}_i = \sum_{k=1}^{p_i} w_{g_i}(k).$$
    Let $\bar{w} = [\bar{w}_1, \ldots, \bar{w}_k]$ be the vector of subgroup sums of the random variable $w$.
    For any $(i, j, k, l)$, we get:
    $$\operatorname{cov}(\log \bar{x}_{g_i}(k), \log \bar{x}_{g_j}(l)) = \operatorname{cov}(\log w_{g_i}(k), \log w_{g_j}(l)) - \operatorname{cov}(\log w_{g_i}(k), \log \bar{w}_j) - \operatorname{cov}(\log w_{g_j}(l), \log \bar{w}_i) + \operatorname{cov}(\log \bar{w}_i, \log \bar{w}_j).$$
    Writing this out for the full covariance matrix of $\log \bar{x}$, with $\log \bar{w}$ taken elementwise, we get
    $$\operatorname{cov}(\log \bar{x}, \log \bar{x}) = \operatorname{cov}(y, y) - \operatorname{cov}(C \log \bar{w}, y) - [\operatorname{cov}(C \log \bar{w}, y)]^T + \operatorname{cov}(C \log \bar{w}, C \log \bar{w}) \qquad (2)$$
  • 16-17. Observed Relative Abundance Data of OTU
    Based on the observed abundance data available:
    $$\tilde{X} = \begin{bmatrix} \tilde{x}_{1g_1} & \tilde{x}_{1g_2} & \tilde{x}_{1g_3} & \cdots & \tilde{x}_{1g_k} \\ \tilde{x}_{2g_1} & \tilde{x}_{2g_2} & \tilde{x}_{2g_3} & \cdots & \tilde{x}_{2g_k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \tilde{x}_{ng_1} & \tilde{x}_{ng_2} & \tilde{x}_{ng_3} & \cdots & \tilde{x}_{ng_k} \end{bmatrix}_{n \times p}, \qquad p = p_1 + \cdots + p_k,$$
    where, for each OTU, $\tilde{x}_{ig_j} = [\tilde{x}_{ig_j}(1), \tilde{x}_{ig_j}(2), \ldots, \tilde{x}_{ig_j}(p_j)]$.
    Define
    $$x_{ig_j} = \frac{\tilde{x}_{ig_j}}{\sum_{k=1}^{p_j} \tilde{x}_{ig_j}(k)}.$$
    Using $x_{ig_j}$, we have the matrix of observed relative abundances $X = (x_{ig_j})_{ij}$. Let the observation $X$ correspond to the random variable $x$.
    Using $X$, we can estimate $\operatorname{cov}(\log x, \log x) = \Sigma_x = \operatorname{cov}(\log \bar{x}, \log \bar{x})$.
  • 18. Covariance Estimation in CDA
    From the result in equation (2), with $\bar{w}$ the vector of subgroup sums and $\log \bar{w}$ taken elementwise:
    $$\Sigma_x = \Sigma_y - \operatorname{cov}(C \log \bar{w}, y) - [\operatorname{cov}(C \log \bar{w}, y)]^T + \operatorname{cov}(C \log \bar{w}, C \log \bar{w}).$$
    Consider the transformation matrix
    $$F = I - P_C, \qquad \text{where } P_C = C (C^T C)^{-1} C^T.$$
    Since $FC = 0$ and $C^T F = 0$, every term involving $C$ vanishes under pre- and post-multiplication by $F$, so
    $$F \Sigma_x F = F \Sigma_y F.$$
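The identity $F \Sigma_x F = F \Sigma_y F$ can be checked numerically on simulated data. The sketch below is an illustration under assumptions: the log absolute abundances `Y` are drawn from a standard normal purely for demonstration, the group sizes are hypothetical, and sample covariances stand in for $\Sigma_x$ and $\Sigma_y$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative setup (assumed): k = 3 groups of sizes 2, 3, 4 and n = 50 samples.
group_sizes = [2, 3, 4]
p = sum(group_sizes)
n = 50

# Group-indicator matrix C (p x k) and the projection F = I - C (C^T C)^{-1} C^T.
C = np.repeat(np.eye(len(group_sizes)), group_sizes, axis=0)
P_C = C @ np.linalg.inv(C.T @ C) @ C.T
F = np.eye(p) - P_C

# Simulated log absolute abundances y and the absolute abundances w = exp(y).
Y = rng.normal(size=(n, p))
W = np.exp(Y)

# Relative abundances within each group: w divided by its group total.
X = W / ((W @ C) @ C.T)

# Sample covariances of y and of log x.
S_y = np.cov(Y, rowvar=False)
S_x = np.cov(np.log(X), rowvar=False)

lhs = F @ S_x @ F
rhs = F @ S_y @ F
print(np.max(np.abs(lhs - rhs)))  # difference at the level of rounding error
```

The two covariance matrices themselves differ; only their projections by `F` agree, which is what makes the projected quantity estimable from relative abundances.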
  • 19-20. Existing Approach in Unconstrained Setting
    Graphical lasso formulation [Friedman et al., 2008]:
    $$\min_{\Omega} \; -\log |\Omega| + \operatorname{tr}(\Sigma_y \Omega) + \lambda_n \|\Omega\|_1$$
    Suppose that $\Sigma_y$ is known. The CLIME estimator [Cai et al., 2011] for its inverse $\Omega$ is given by
    $$\min \|\Omega\|_1 \quad \text{s.t.} \quad \|\Omega \Sigma_y - I\|_\infty \le \lambda_n.$$
    It can also be formulated as:
    $$\min \|\Omega\|_1 \quad \text{s.t.} \quad \|\Omega^{-1} - \Sigma_y\|_\infty \le \lambda_n.$$
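To make the two formulations concrete, the sketch below evaluates the graphical lasso objective and checks the CLIME constraint for a candidate $\Omega$. It is not a solver. The function names are hypothetical, and it assumes that $\|\cdot\|_1$ is the elementwise sum of absolute values (applied to all entries, including the diagonal) and that $\|\cdot\|_\infty$ is the elementwise maximum absolute value; the slides do not define the norms.

```python
import numpy as np

def glasso_objective(Omega, Sigma, lam):
    """-log det(Omega) + tr(Sigma Omega) + lam * sum |Omega_ij| (elementwise l1, assumed)."""
    # The log-determinant is only defined for positive definite Omega.
    if np.linalg.eigvalsh(Omega).min() <= 0:
        return np.inf
    _, logdet = np.linalg.slogdet(Omega)
    return -logdet + np.trace(Sigma @ Omega) + lam * np.abs(Omega).sum()

def clime_feasible(Omega, Sigma, lam):
    """Check the constraint max_ij |(Omega Sigma - I)_ij| <= lam (elementwise max, assumed)."""
    p = Sigma.shape[0]
    return np.max(np.abs(Omega @ Sigma - np.eye(p))) <= lam

# Illustrative 2 x 2 covariance matrix (assumed, not from the slides).
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
Omega = np.linalg.inv(Sigma)

obj = glasso_objective(Omega, Sigma, lam=0.1)
print(obj)
print(clime_feasible(Omega, Sigma, lam=1e-8))      # exact inverse satisfies the constraint
print(clime_feasible(np.eye(2), Sigma, lam=0.1))   # identity violates it for this Sigma
```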
  • 21. Open Problem
    On relaxing the nearness condition $\|\Omega^{-1} - \Sigma_y\|_\infty \le \lambda_n$ for the case of compositional data, we have
    $$\|F \Omega^{-1} F - F \Sigma_y F\|_\infty \le \lambda_n.$$
    Given that $F \Sigma_x F = F \Sigma_y F$, can we formulate the estimation of a sparse precision matrix as
    $$\min \|\Omega\|_1 \quad \text{s.t.} \quad \|F \Omega^{-1} F - F \Sigma_x F\|_\infty \le \lambda_n \; ?$$
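The constraint in the proposed formulation can be evaluated for any candidate $\Omega$. The sketch below only illustrates that the true precision matrix $\Sigma_y^{-1}$ satisfies it when $\Sigma_x$ differs from $\Sigma_y$ by terms of the form appearing in equation (2); it does not address the open question itself. The helper `projected_gap`, the matrices `B` and `D`, and the group sizes are assumptions, and $\|\cdot\|_\infty$ is taken to be the elementwise maximum absolute value.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative setup (assumed): k = 2 groups of sizes 2 and 3.
group_sizes = [2, 3]
p, k = sum(group_sizes), len(group_sizes)
C = np.repeat(np.eye(k), group_sizes, axis=0)
F = np.eye(p) - C @ np.linalg.inv(C.T @ C) @ C.T

# A positive definite Sigma_y and its inverse, the target precision matrix.
A = rng.normal(size=(p, p))
Sigma_y = A @ A.T + p * np.eye(p)
Omega_true = np.linalg.inv(Sigma_y)

# Sigma_x = Sigma_y - C B - (C B)^T + C D C^T, mirroring the structure of equation (2);
# B and D are arbitrary stand-ins for the unknown covariance terms.
B = rng.normal(size=(k, p))
D = np.eye(k)
Sigma_x = Sigma_y - C @ B - (C @ B).T + C @ D @ C.T

def projected_gap(Omega, Sigma_x, F):
    """Elementwise max of |F Omega^{-1} F - F Sigma_x F| (hypothetical helper)."""
    return np.max(np.abs(F @ np.linalg.inv(Omega) @ F - F @ Sigma_x @ F))

gap = projected_gap(Omega_true, Sigma_x, F)
print(gap)                                    # near zero: the true Omega is feasible
print(projected_gap(np.eye(p), Sigma_x, F))   # a generic candidate is not
```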
  • 22. References
    - John Aitchison. The statistical analysis of compositional data. Journal of the Royal Statistical Society. Series B (Methodological), pages 139–177, 1982.
    - Tony Cai, Weidong Liu, and Xi Luo. A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494):594–607, 2011.
    - Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441, 2008.
  • 23. Thank You