‘ACCOST’ for differential Hi-C analysis
Nathalie Vialaneix, INRAE/MIAT
Chrocogen, July 10th, 2020
1 / 24
2 / 24
First of all... in the previous episodes...
3 / 24
Topic
(What is this presentation about?)
When two sets of Hi-C matrices have been collected in two different
conditions, what are the available methods to compare the matrices and
identify regions that are significantly different between the conditions?
Comparison usually means: at a bin pair level.
4 / 24
Notations and formal definition of the problem
Hi-C matrices: $H^t$ for $t = 1, \dots, T$, $T$ Hi-C matrices
Conditions: 2 conditions $C_1$ and $C_2$ such that $C_1 \cup C_2 = \{1, \dots, T\}$ and $C_1 \cap C_2 = \emptyset$
Interactions: $h^t_{ij}$ is the interaction frequency (in $\mathbb{N}^+$) for bin pair $(i, j)$, where $i$ and $j$ are two genomic loci, in the matrix $t$
Question: for all pairs $(i, j)$, test the following assumption:
$$H_0^{ij}: N^{C_1}_{ij} = N^{C_2}_{ij}$$
in which $N^{C_r}_{ij}$ is the random variable that represents the number of contacts
(interaction frequency) between loci $i$ and $j$ in condition $C_r$.
5 / 24
1. Prior to differential analysis
Most methods start by correcting sequencing bias (between-matrix
normalization)
Standard sequencing depth normalization [Anders & Huber, 2010] to
obtain equal total number of counts between the different samples (R
package edgeR)
MA plot correction [Lun & Smyth, 2015] and improvement by [Stansfield
et al, 2019] to correct trend in MA (mean versus difference) plots for every
pair of samples (R packages diffHic/csaw and multiHiCcompare)
MD plot correction [Stansfield et al, 2018] to correct trend in MD (distance
versus difference) plots for every pair of samples (R package HiCcompare)
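As a rough illustration of the first and third corrections listed above, the following sketch scales two toy matrices to a common total count, then computes for each bin pair the log-ratio M and the distance D = |i - j| on which an MD trend would be fitted. It is not the edgeR or HiCcompare implementation: the function names, the pseudocount and the toy matrices are assumptions, and the trend fit itself is left out.

```python
import math

def scale_to_common_depth(matrices):
    """Scale each matrix so that all share the same total count (the mean total)."""
    totals = [sum(sum(row) for row in m) for m in matrices]
    target = sum(totals) / len(totals)
    return [[[v * target / tot for v in row] for row in m]
            for m, tot in zip(matrices, totals)]

def md_values(m1, m2, pseudocount=1.0):
    """Return (D, M) pairs for the upper triangle: D = |i - j|, M = log2(m2 / m1)."""
    out = []
    n = len(m1)
    for i in range(n):
        for j in range(i, n):
            # The pseudocount (an assumption) avoids taking the log of zero.
            m = math.log2((m2[i][j] + pseudocount) / (m1[i][j] + pseudocount))
            out.append((j - i, m))
    return out

# Toy symmetric 3x3 matrices; the second one is sequenced twice as deep.
h1 = [[10, 4, 1], [4, 12, 5], [1, 5, 8]]
h2 = [[20, 8, 2], [8, 24, 10], [2, 10, 16]]
n1, n2 = scale_to_common_depth([h1, h2])
md = md_values(n1, n2)
```

In this toy case the two matrices differ only by sequencing depth, so every M value vanishes after scaling; with real data, a residual trend of M against D (or against mean intensity for MA plots) is what the methods above remove.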
6 / 24
Normalization
7 / 24
2. Compute a $p$-value per bin
Z score computation [Stansfield et al, 2018] that is based on quantiles of
scaled and centered M values (R package HiCcompare)
that is used when there is no replicate ($T = 2$, one sample per
condition)
that is very fast and easy to use
but is a bit low on the theoretical side (no strong evidence)
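A minimal sketch of the idea, assuming the M values (log-ratios between the two samples) have already been normalized: center and scale them, then convert each Z score to a two-sided $p$-value under a standard normal reference. This is an illustration only, not the HiCcompare code; the function name and the toy values are hypothetical.

```python
import math
import statistics

def z_score_pvalues(m_values):
    """Center/scale M values and return (z, two-sided normal p-value) per bin pair."""
    mean = statistics.fmean(m_values)
    sd = statistics.stdev(m_values)
    results = []
    for m in m_values:
        z = (m - mean) / sd
        # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
        p = math.erfc(abs(z) / math.sqrt(2))
        results.append((z, p))
    return results

# Toy M values: one bin pair (index 5) stands out from the rest.
m_values = [0.1, -0.2, 0.05, 0.0, -0.1, 2.5, 0.15, -0.05]
res = z_score_pvalues(m_values)
```

With a single sample per condition there is no within-condition variance estimate, which is why the reference distribution has to come from the M values themselves.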
8 / 24
2. Compute a $p$-value per bin
Z score computation [Stansfield et al, 2018], (R package HiCcompare)
$NB$ models [Lun & Smyth, 2015] that is based on Negative Binomial GLM
and statistical tests (R package diffHic)
that needs at least 3 replicates per condition to be used
that is not restricted to two conditions and that can include various
covariates
but is statistically better justified
[Stansfield et al., 2019] (R package multiHiCcompare) also do that with small
changes (normalization...)
[Zaborowski and Wilczyński, 2020] also use this distribution but within
distance pools, and counts are explained by counts in the other condition
rather than by the condition itself
9 / 24
2. Compute a $p$-value per bin
Z score computation [Stansfield et al, 2018], (R package HiCcompare)
$NB$ models [Lun & Smyth, 2015], (R package diffHic)
In both these approaches, a $p$-value is computed for every bin pair and
$p$-values are corrected by multiple testing correction procedures (not described)
But spatial dependencies between pairs of bins are not included in the
methods!
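The slides do not say which multiple testing procedure is applied. As one common choice (an assumption here, not a statement about either package's internals), the Benjamini-Hochberg adjustment can be sketched as follows; the function name and the toy $p$-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# One p-value per bin pair (toy values).
pvals = [0.001, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(pvals)
```

Note that this adjustment treats bin pairs as exchangeable tests, which is exactly the limitation raised above: neighbouring bin pairs are not independent.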
10 / 24
2. Compute a $p$-value per bin taking spatial
dependencies into account
Using an analogy with neuroimaging and spatial Poisson processes
[Djekidel et al, 2018] (R package FIND)
needs at least 2 replicates per condition to be used
seems to be restricted to two conditions (but could maybe be easily
extended to more) and can include various covariates
is statistically (more or less) justified (from previous work on image
analysis)
uses tests at bin pair level with multiple corrections but those tests are
based on the value of the bin pair and its neighbors
is shown to work well for high resolution differential analysis (seems
to provide better results for 5kb bins)
11 / 24
2. Compute a $p$-value per bin taking spatial
dependencies into account
Using an analogy with neuroimaging and spatial Poisson processes
[Djekidel et al, 2018] (R package FIND)
Using distance based correction and Gaussian filter comparison [Ardakany
et al, 2019] (Python/Matlab scripts selfish available on Github):
not sensitive to sequencing bias and does not require (between
matrix) normalization
only suited to $T = 2$ (no replicate) for 2 conditions
12 / 24
Other tools (not reviewed for the moment... but
mentioned in the article)
HOMER (binomial based test between two samples)
ChromoR (transformation into Gaussian measurements and Bayesian factor
analysis)
HiBrowse (based on edgeR, as diffHic and others)
13 / 24
Now... coming back to ACCOST!
14 / 24
ACCOST overview
suited only for 2 conditions with replicates (even though the
computations may work even without replicates)
based on DESeq (very similar to diffHic or multiHiCcompare)
first: ICE normalization (within matrix normalization)
accounts for the distance effect in the matrix with the addition of an offset
in the model (does not require within matrix normalization, nor distance
based correction)
Python scripts available on Bitbucket
15 / 24
Main hypotheses of ACCOST
$N^t_{ij} \sim NB$ with mean $\mu^t_{ij}$ and standard deviation $\sigma^t_{ij}$ with:
$$\mu^t_{ij} = \beta^t_i \beta^t_j s^t_{|i-j|} q^{k(t)}_{ij}$$
where $k(t)$ is the condition for sample $t$ and
$\beta^t_i$ is an experiment specific locus bias for locus $i$ in
sample $t$
$s^t_{|i-j|}$ is a distance specific size factor that accounts for the genomic
distance effect
$q^{k(t)}_{ij}$ is the true (unknown) number of interactions between $i$ and $j$ in
condition $k(t)$ (on which the test is based)
a similar decomposition for $\sigma^t_{ij}$ that depends on the parametric estimation
of a function $\nu^{k(t)}(q^{k(t)}_{ij})$, which models the dispersion as a smooth non-negative
function of the interaction
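To make the decomposition of the mean concrete, here is a small sketch that evaluates $\mu^t_{ij}$ for every bin pair of one sample from made-up values of $\beta$, $s$ and $q$. All numbers and the function name are illustrative assumptions; ACCOST estimates these quantities from data.

```python
def expected_counts(beta, s, q):
    """mu[i][j] = beta[i] * beta[j] * s[|i - j|] * q[i][j] for one sample t."""
    n = len(beta)
    return [[beta[i] * beta[j] * s[abs(i - j)] * q[i][j] for j in range(n)]
            for i in range(n)]

beta = [1.0, 0.5, 2.0]          # locus biases for sample t (toy values)
s = [10.0, 4.0, 1.0]            # size factor indexed by genomic distance |i - j|
q = [[1.0, 1.0, 1.0],           # interaction level in condition k(t)
     [1.0, 1.0, 3.0],
     [1.0, 3.0, 1.0]]
mu = expected_counts(beta, s, q)
```

Only $q$ carries the condition index, so two samples from different conditions can have different biases and distance factors without this counting as a differential interaction.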
16 / 24
How are ACCOST parameters obtained?
$$\mu^t_{ij} = \beta^t_i \beta^t_j s^t_{|i-j|} q^{k(t)}_{ij}$$
$\beta^t_i$ is set as the ICE normalization factor for locus $i$ in sample $t$
$s^t_{|i-j|}$ is obtained as the median of ICE normalized counts for pairs of loci
at distance $d = |i - j|$:
$$s^t_{d} = \mathrm{median}_{|i'-j'| = d} \frac{N^t_{i'j'}}{\beta^t_{i'} \beta^t_{j'}}$$
$q^{k(t)}_{ij}$ is then obtained by averaging corrected counts across replicates of
the same condition:
$$q^{k(t)}_{ij} = \sum_{t \in C_k} \frac{1}{|C_k|} \frac{N^t_{ij}}{\beta^t_i \beta^t_j s^t_{|i-j|}}$$
$\nu^{k(t)}$ is finally estimated by a polynomial regression (details skipped for
the sake of clarity but basically very similar to what is performed in DESeq
with the distance based corrected counts $q^{k(t)}_{ij}$)
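The first three estimation steps above can be sketched as follows, taking the ICE bias vectors as given (ICE itself is not reimplemented here). This is a simplified reading of the slide, not the ACCOST scripts: function names and toy data are assumptions, the distance factor is computed per sample, and strictly positive factors are assumed to avoid division by zero.

```python
import statistics

def distance_factors(counts, beta):
    """s[d] = median over pairs at distance d of N[i][j] / (beta[i] * beta[j])."""
    n = len(counts)
    s = []
    for d in range(n):
        vals = [counts[i][i + d] / (beta[i] * beta[i + d]) for i in range(n - d)]
        s.append(statistics.median(vals))
    return s

def estimate_q(samples):
    """Average bias- and distance-corrected counts over replicates of one condition.

    `samples` is a list of (counts, beta) pairs, one per replicate.
    """
    n = len(samples[0][0])
    q = [[0.0] * n for _ in range(n)]
    for counts, beta in samples:
        s = distance_factors(counts, beta)
        for i in range(n):
            for j in range(n):
                corrected = counts[i][j] / (beta[i] * beta[j] * s[abs(i - j)])
                q[i][j] += corrected / len(samples)
    return q

def toy_counts(beta, s):
    """Build counts that follow the mean model exactly with q = 1 everywhere."""
    n = len(beta)
    return [[beta[i] * beta[j] * s[abs(i - j)] for j in range(n)] for i in range(n)]

# Two replicates with different biases and depths but the same underlying q.
rep1 = (toy_counts([1.0, 0.5, 2.0], [10.0, 4.0, 1.0]), [1.0, 0.5, 2.0])
rep2 = (toy_counts([2.0, 1.0, 1.0], [20.0, 8.0, 2.0]), [2.0, 1.0, 1.0])
s1 = distance_factors(*rep1)
q_hat = estimate_q([rep1, rep2])
```

On these noise-free toy replicates the distance factors are recovered and every entry of the estimated $q$ is 1, as expected when the only differences between samples are biases and depth.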
17 / 24
Validation of ACCOST
datasets: two human cell lines from [Rao et al, 2014], two mouse datasets
from [Dixon et al, 2012] and [Shen et al, 2012], a Plasmodium dataset with
two distinct stages of the parasite from [Ay et al, 2014]
methods: diffHic and FIND
18 / 24
Normalized counts
19 / 24
p-value distribution
20 / 24
p-value distribution
short vs long range distances
21 / 24
Significant results locations
increase of significant contacts at 50 kb corresponds to a threshold related to
LOESS normalization
22 / 24
References
Ardakany, A.R., Ay, F., and Lonardi, S. (2019). Selfish: discovery of differential chromatin
interactions via a self-similarity measure. Bioinformatics, 35(14):i145-i153.
Ay, F., Bunnik, E.M., Varoquaux, N., Bol, S.M., Prudhomme, J., Vert, J.P., Noble, W.S., and Le Roch,
K.G. (2014). Three-dimensional modeling of the P. falciparum genome during the erythrocytic
cycle reveals a strong connection between genome architecture and gene expression.
Genome Research, 24:974-988.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012).
Topological domains in mammalian genomes identified by analysis of chromatin
interactions. Nature, 485:376-380.
Djekidel, M.N., Chen, Y., and Zhang, M.Q. (2018). FIND: difFerential chromatin INteractions
Detection using a spatial Poisson process. Genome Research, 28:412-422.
Lun, A. and Smyth, G. (2015). diffHic: a Bioconductor package to detect differential genomic
interactions in Hi-C data. BMC Bioinformatics, 16:258.
Rao, S.S.P. et al. (2014). A 3D map of the human genome at kilobase resolution reveals
principles of chromatin looping. Cell, 159:1665-1680.
Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L.,
Lobanenkov, V.V. et al. (2012). A map of the cis-regulatory sequences in the mouse genome.
Nature, 488:116-120.
23 / 24
References
Stansfield, J.C., Cresswell, K.G., Vladimirov, V.I., and Dozmorov, M.G. (2018). HiCcompare: an
R-package for joint normalization and comparison of Hi-C datasets. BMC Bioinformatics,
19:279.
Stansfield, J.C., Cresswell, K.G., and Dozmorov, M.G. (2019). multiHiCcompare: joint
normalization and comparative analysis of complex Hi-C experiments. Bioinformatics,
35(17):2916-2923.
Zaborowski, R. and Wilczyński, B. (2020). DiADeM: differential analysis via dependency
modelling of chromatin interactions with robust generalized linear models. bioRxiv preprint.
24 / 24

Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practice
 

Recently uploaded

Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
sachin783648
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
Areesha Ahmad
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
sonaliswain16
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
muralinath2
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
subedisuryaofficial
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
ossaicprecious19
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 

Recently uploaded (20)

Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 

'ACCOST' for differential HiC analysis

  • 1. ‘ACCOST’ for differential HiC analysis. Nathalie Vialaneix, INRAE/MIAT. Chrocogen, July 10th, 2020.
  • 3. First of all... in the previous episodes...
  • 4. Topic (What is this presentation about?) When two sets of Hi-C matrices have been collected in two different conditions, what are the available methods to compare the matrices and identify regions that are significantly different between the conditions? Comparison usually means: at the bin pair level.
  • 5. Notations and formal definition of the problem. Hi-C matrices: $H^t$ for $t = 1, \dots, T$ ($T$ Hi-C matrices). Conditions: 2 conditions $C_1$ and $C_2$ such that $C_1 \cup C_2 = \{1, \dots, T\}$ and $C_1 \cap C_2 = \emptyset$. Interactions: $h^t_{ij}$ is the interaction frequency (in $\mathbb{N}^+$) for bin pair $(i, j)$, where $i$ and $j$ are two genomic loci, in the matrix $t$. Question: for all pairs $(i, j)$, test the following assumption: $H^{ij}_0: N^{C_1}_{ij} = N^{C_2}_{ij}$, in which $N^{C_r}_{ij}$ is the random variable that represents the number of contacts (interaction frequency) between loci $i$ and $j$ in condition $C_r$.
  • 6. 1. Prior to differential analysis. Most methods start by correcting sequencing bias (between-matrix normalization): standard sequencing depth normalization [Anders & Huber, 2010] to obtain an equal total number of counts across the different samples (R package edgeR); MA plot correction [Lun & Smyth, 2015], improved by [Stansfield et al, 2019], to correct the trend in MA (mean versus difference) plots for every pair of samples (R packages diffHic/csaw and multiHiCcompare); MD plot correction [Stansfield et al, 2018] to correct the trend in MD (distance versus difference) plots for every pair of samples (R package HiCcompare).
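As a rough illustration of the first item only (equalizing total counts between samples), here is a minimal Python sketch. It is a plain total-count rescaling, not the procedure implemented in edgeR or described in [Anders & Huber, 2010]; the function name `depth_normalize` and the choice of the mean total as common target are assumptions made for this example.

```python
def depth_normalize(matrices):
    """Scale each count matrix so that all matrices share the same total.

    `matrices` is a list of square matrices given as lists of lists.
    The common target is the mean of the per-matrix totals (an arbitrary
    choice made for this illustration).
    """
    totals = [sum(sum(row) for row in m) for m in matrices]
    target = sum(totals) / len(totals)
    return [
        [[value * target / total for value in row] for row in m]
        for m, total in zip(matrices, totals)
    ]


# Two toy 2x2 contact matrices with different sequencing depths.
sample_a = [[10, 2], [2, 6]]   # total count: 20
sample_b = [[20, 8], [8, 4]]   # total count: 40
norm_a, norm_b = depth_normalize([sample_a, sample_b])
```

After rescaling, both toy matrices have the same total, so remaining differences between them are no longer driven by sequencing depth alone. The MA and MD corrections listed above go further by removing trends that depend on mean intensity or genomic distance.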
  • 8. 2. Compute a $p$-value per bin. Z score computation [Stansfield et al, 2018], based on quantiles of scaled and centered M values (R package HiCcompare): is used when there is no replicate ($T = 2$, one sample per condition); is very fast and easy to use; but is a bit low on the theoretical side (no strong evidence).
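The Z score idea can be sketched as follows, assuming that M is the log2 ratio of the two (already normalized) counts and that the centered and scaled M values are compared with a standard Gaussian. This is a simplified illustration and not the HiCcompare implementation; the function name `m_value_pvalues` and the toy counts are hypothetical.

```python
import math
import statistics


def m_value_pvalues(counts_1, counts_2):
    """Two-sided p-values from centered and scaled M values."""
    # M value: log2 ratio between the two conditions for each bin pair.
    m_values = [math.log2(c2 / c1) for c1, c2 in zip(counts_1, counts_2)]
    mean = statistics.mean(m_values)
    sd = statistics.stdev(m_values)
    z_scores = [(m - mean) / sd for m in m_values]
    # Two-sided tail probability of a standard Gaussian: erfc(|z| / sqrt(2)).
    return [math.erfc(abs(z) / math.sqrt(2)) for z in z_scores]


# Toy counts for six bin pairs; only the last one changes strongly.
counts_c1 = [100, 80, 120, 90, 110, 100]
counts_c2 = [105, 78, 118, 95, 108, 400]
p_values = m_value_pvalues(counts_c1, counts_c2)
```

Because the reference distribution is estimated from the M values themselves, the approach needs no replicates, which is also why its statistical justification is weaker than that of a model fitted on replicated data.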
  • 9. 2. Compute a $p$-value per bin. Z score computation [Stansfield et al, 2018] (R package HiCcompare). NB models [Lun & Smyth, 2015], based on Negative Binomial GLM and statistical tests (R package diffHic): need at least 3 replicates per condition to be used, but are not restricted to two conditions, can include various covariates, and are statistically better justified. [Stansfield et al., 2019] (R package multiHiCcompare) also do that with small changes (normalization...). [Zaborowski and Wilczyński, 2020] also use this distribution, but within distance pools, and counts are explained by counts in the other condition rather than by the condition itself.
  • 10. 2. Compute a $p$-value per bin. Z score computation [Stansfield et al, 2018] (R package HiCcompare). NB models [Lun & Smyth, 2015] (R package diffHic). In both these approaches, a $p$-value is computed for every bin pair and the $p$-values are corrected by multiple testing correction procedures (not described). But spatial dependencies between pairs of bins are not included in the methods!
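The slides do not say which multiple testing correction is applied. As one common possibility (an assumption, not a statement about diffHic or HiCcompare), the Benjamini-Hochberg adjustment of bin-pair $p$-values can be sketched in a few lines of Python; the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda idx: p_values[idx])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone with respect to the raw p-value ranks.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Raw p-values for four bin pairs.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
```

Such a correction treats bin pairs as a flat list of tests: it uses no information about which bin pairs are neighbors in the matrix, which is exactly the limitation raised on this slide.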
  • 11. 2. Compute a $p$-value per bin taking spatial dependencies into account. Using an analogy with neuroimaging and spatial Poisson processes [Djekidel et al, 2018] (R package FIND): needs at least 2 replicates per condition to be used; seems to be restricted to two conditions (but could maybe be easily extended to more) and can include various covariates; is statistically (more or less) justified (from previous work on image analysis); uses tests at the bin pair level with multiple testing corrections, but those tests are based on the value of the bin pair and its neighbors; is shown to work well for high resolution differential analysis (seems to provide better results for 5 kb bins).
  • 12. 2. Compute a $p$-value per bin taking spatial dependencies into account. Using an analogy with neuroimaging and spatial Poisson processes [Djekidel et al, 2018] (R package FIND). Using distance based correction and Gaussian filter comparison [Ardakany et al, 2019] (Python/Matlab scripts selfish, available on GitHub): not sensitive to sequencing bias and does not require (between matrix) normalization; only suited to $T = 2$ (no replicate) for 2 conditions.
  • 13. Other tools (not reviewed for the moment... but mentioned in the article): HOMER (binomial based test between two samples); ChromoR (transformation into Gaussian measurements and Bayesian factor analysis); HiBrowse (based on edgeR, as diffHic and others).
  • 14. Now... coming back to ACCOST!
  • 15. ACCOST overview: suited only for 2 conditions with replicates (even though the computations may work even without replicates); based on DESeq (very similar to diffHic or multiHiCcompare); first: ICE normalization (within matrix normalization); accounts for the distance effect in the matrix with the addition of an offset in the model (does not require within matrix normalization, nor distance based correction); Python scripts available on Bitbucket.
  • 16. Main hypotheses of ACCOST. $N^t_{ij} \sim \mathrm{NB}$ with mean $\mu^t_{ij}$ and standard deviation $\sigma^t_{ij}$, with: $\mu^t_{ij} = \beta^t_i \beta^t_j s^t_{|i-j|} q^{k(t)}_{ij}$, where $k(t)$ is the condition for sample $t$ and: $\beta^t_i$ is an experiment specific vector of locus biases for locus $i$ in sample $t$; $s^t_{|i-j|}$ is a distance specific size factor that accounts for the genomic distance effect; $q^{k(t)}_{ij}$ is the true (unknown) number of interactions between $i$ and $j$ in condition $k(t)$ (on which the test is based). A similar decomposition holds for $\sigma^t_{ij}$; it depends on the parametric estimation of a function $\nu^{k(t)}(q^{k(t)}_{ij})$, which models the dispersion as a smooth non-negative function of the interaction.
  • 17. How are ACCOST parameters obtained? $\mu^t_{ij} = \beta^t_i \beta^t_j s^t_{|i-j|} q^{k(t)}_{ij}$. $\beta^t_i$ is set as the ICE normalization factor for locus $i$ in sample $t$. $s^t_{|i-j|}$ is obtained as the median of ICE normalized counts for pairs of loci at distance $d = |i-j|$: $s^t_d = \mathrm{median}_{|i'-j'|=d} \frac{N^t_{i'j'}}{\beta^t_{i'} \beta^t_{j'}}$. $q^{k(t)}_{ij}$ is then obtained by averaging corrected counts across replicates of the same condition: $q^{k(t)}_{ij} = \sum_{t \in C_k} \frac{1}{|C_k|} \frac{N^t_{ij}}{\beta^t_i \beta^t_j s^t_{|i-j|}}$. $\nu^{k(t)}$ is finally estimated by a polynomial regression (details skipped for the sake of clarity, but basically very similar to what is performed in DESeq, with the distance based corrected counts $q^{k(t)}_{ij}$).
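The two middle steps (distance factors, then condition-level means) can be sketched in Python as below. This is an illustration under stated assumptions and not the ACCOST scripts: the bias vectors $\beta^t$ are taken as given (ICE itself is not implemented), every distance is assumed to have a strictly positive median, and the function names `distance_factors` and `condition_means` are hypothetical.

```python
import statistics


def distance_factors(counts, bias):
    """s_d: median of bias-corrected counts over bin pairs at distance d."""
    n = len(counts)
    factors = []
    for d in range(n):
        corrected = [
            counts[i][i + d] / (bias[i] * bias[i + d]) for i in range(n - d)
        ]
        factors.append(statistics.median(corrected))
    return factors


def condition_means(samples):
    """q_ij: average over replicates of N_ij / (beta_i * beta_j * s_|i-j|).

    `samples` is a list of (counts, bias) pairs, one per replicate of a
    single condition; biases are assumed to come from a prior ICE run.
    """
    n = len(samples[0][0])
    q = [[0.0] * n for _ in range(n)]
    for counts, bias in samples:
        s = distance_factors(counts, bias)
        for i in range(n):
            for j in range(n):
                corrected = counts[i][j] / (bias[i] * bias[j] * s[abs(i - j)])
                q[i][j] += corrected / len(samples)
    return q


# Two toy replicates of one condition; the second is sequenced twice as deep.
rep_1 = [[8, 4, 2], [4, 16, 4], [2, 4, 8]]
rep_2 = [[16, 8, 4], [8, 32, 8], [4, 8, 16]]
unit_bias = [1.0, 1.0, 1.0]
q = condition_means([(rep_1, unit_bias), (rep_2, unit_bias)])
```

In this toy case the per-sample distance factors absorb the depth difference between the two replicates, so both contribute the same corrected values and only the enriched entry at position (1, 1) stands out in `q`.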
  • 18. Validation of ACCOST. Datasets: two human cell lines from [Rao et al, 2014], two mouse datasets from [Dixon et al, 2012] and [Shen et al, 2012], a Plasmodium dataset with two distinct stages of the parasite from [Ay et al, 2014]. Methods: diffHic and FIND.
  • 21. $p$-value distribution: short vs long range distances.
  • 22. Significant results locations: the increase of significant contacts at 50 kb corresponds to a threshold related to LOESS normalization.
  • 23. References. Ardakany, A.R., Ay, F., and Lonardi, S. (2019). Selfish: discovery of differential chromatin interactions via a self-similarity measure. Bioinformatics, 35(14): i145-i153. Ay, F., Bunnik, E.M., Varoquaux, N., Bol, S.M., Prudhomme, J., Vert, J.P., Noble, W.S., and Le Roch, K.G. (2014). Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression. Genome Research, 24: 974-988. Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485: 376-380. Djekidel, M.N., Chen, Y., and Zhang, M.Q. (2018). FIND: difFerential chromatin INteractions Detection using a spatial Poisson process. Genome Research, 28: 412-422. Lun, A. and Smyth, G. (2015). diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics, 16: 258. Rao, S.S.P. et al. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159: 1665-1680. Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L., Lobanenkov, V.V. et al. (2012). A map of the cis-regulatory sequences in the mouse genome. Nature, 488: 116-120.
  • 24. References. Stansfield, J.C., Cresswell, K.G., Vladimirov, V.I., and Dozmorov, M.G. (2018). HiCcompare: an R-package for joint normalization and comparison of HI-C datasets. BMC Bioinformatics, 19: 279. Stansfield, J.C., Cresswell, K.G., and Dozmorov, M.G. (2019). multiHiCcompare: joint normalization and comparative analysis of complex Hi-C experiments. Bioinformatics, 35(17): 2916-2923. Zaborowski, R. and Wilczyński, B. (2020). DiADeM: differential analysis via dependency modelling of chromatin interactions with robust generalized linear models. bioRxiv preprint.