This document presents an overview of weighted correlation network analysis (WGCNA), an R package used to identify clusters (modules) of highly correlated genes in a biological network. It describes the main steps of WGCNA, including data preprocessing, constructing a weighted correlation network, identifying modules of co-expressed genes, relating modules to external traits, studying relationships between modules, and finding key driver genes. The goal is to discover how groups of interacting genes work together to impact phenotypic traits.
WGCNA: an R package for weighted correlation network analysis
1. Biological Sciences faculty
Biophysics Department
Presented By
Alireza Doustmohammadi
Graduate Student in Bioinformatics
December 2019
WGCNA Tarbiat Modares University 1 of 37
4. Introduction
Identify modules:
[Daniel H. Geschwind & Genevieve Konopka. Neuroscience in the era of functional genomics and systems biology, Nature 461, 908-915]
Gene group A
Gene group C
Gene group B
5. Introduction
Relate modules to clinical traits: weight, glucose, insulin, blood pressure, cholesterol
Identify driver genes
6. Introduction
WGCNA workflow:
• Data cleaning and preprocessing
• Construct a weighted correlation network
• Identify modules
• Relate modules to external information
• Study module relationships
• Find the key drivers in interesting modules
7. Introduction
Data description:
Livers of female and male mice from a specific F2 intercross
[https://www.slideserve.com/cais/statistical-methods-for-quantitative-trait-loci-qtl-mapping]
Livers of female mice from a specific F2 intercross
Livers of male mice from a specific F2 intercross
Clinical Traits
Gene Annotation
[https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html]
8. Data Cleaning & Preprocessing
Preprocessing:
Missing values
Outlier microarray samples
9. Data Cleaning & Preprocessing
Preprocessing:
Missing values
10. Data input and cleaning
Preprocessing:
Outlier microarray samples
11. Construct weighted correlation network
Automatic, one-step network construction
Step-by-step network construction
Block-wise network construction
WGCNA can use parallel computation.
12. Construct weighted correlation network
Correlation matrix → adjacency matrix → weighted correlation network
13. Construct weighted correlation network
Construct adjacency matrix:
Strong correlation
Weak correlation
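How an adjacency matrix separates strong from weak correlations can be sketched in a few lines of Python. This is a minimal illustration, assuming the unsigned soft-thresholding form a_ij = |cor(x_i, x_j)|^beta; the function names, the toy data, and beta = 6 are illustrative assumptions and do not come from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta (assumed form)."""
    n = len(profiles)
    adj = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i][j] = abs(pearson(profiles[i], profiles[j])) ** beta
    return adj

# Toy data (hypothetical): genes 0 and 1 are strongly correlated,
# gene 2 is only weakly correlated with gene 0.
genes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.0, 2.9, 4.2, 5.1],
    [2.0, 1.0, 3.0, 1.5, 2.5],
]
adjacency = soft_threshold_adjacency(genes, beta=6)
```

Raising the correlation to a power keeps the strong link between genes 0 and 1 close to 1 while pushing the weak link between genes 0 and 2 towards 0, without imposing a hard cutoff.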
15. Construct weighted correlation network
Objective function:
Pick the lowest possible soft-thresholding power β that leads to an approximately scale-free network topology
[https://www.researchgate.net/figure/Scale-Free-Network-Left-Power-Law-Degree-Distribution-curve-Right-on-log-log-scale_fig1_310261624]
[https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-559]
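One way to check the scale-free criterion is to regress log10 p(k) on log10 k, where k is gene connectivity and p(k) its binned frequency, and to look at the R^2 of that fit. The sketch below is an assumption-laden illustration of that idea, not the package's own routine: the function name, the equal-width binning, and the toy data are all hypothetical, and it assumes positive, non-identical connectivity values.

```python
import math

def scale_free_fit(connectivity, n_bins=10):
    """Return (r_squared, slope) of log10 p(k) regressed on log10 k."""
    k_min, k_max = min(connectivity), max(connectivity)
    width = (k_max - k_min) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for k in connectivity:
        b = min(int((k - k_min) / width), n_bins - 1)  # clamp k_max into the last bin
        counts[b] += 1
        sums[b] += k
    total = len(connectivity)
    xs, ys = [], []
    for c, s in zip(counts, sums):
        if c > 0:  # empty bins carry no information
            xs.append(math.log10(s / c))      # log10 of mean connectivity in the bin
            ys.append(math.log10(c / total))  # log10 of the bin frequency p(k)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy), sxy / sxx

# Toy connectivities (hypothetical) whose frequencies fall off roughly as k ** -2.
toy_k = [float(d) for d in range(1, 11) for _ in range(1000 // (d * d))]
r2, slope = scale_free_fit(toy_k)
```

In practice one would compute this fit for a range of candidate powers and keep the lowest power whose R^2 clears a chosen cutoff with a negative slope; the cutoff value itself is a judgment call and is not given on the slide.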
16. Construct weighted correlation network
17. Identify modules
Modules = groups of co-expressed genes
Topological Overlap Measure (TOM) = similarity between genes
18. Identify modules
Compute dissimilarity between genes:
Topological Overlap Measure (TOM) matrix:
[https://www.researchgate.net/post/What_do_adjacency_matrix_and_Topology_Overlap_Matrix_from_WGCNA_package_tell_about_the_data]
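The TOM-based dissimilarity can be sketched as follows. This assumes the unsigned topological overlap TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), with l_ij the sum of a_iu * a_uj over all other genes u and k_i the connectivity of gene i; the function names and the toy adjacency matrix are hypothetical.

```python
def tom_similarity(adj):
    """Unsigned topological overlap for a symmetric adjacency matrix in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: summed adjacency of gene i to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Strength of the connections that genes i and j share via third genes.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

def tom_dissimilarity(adj):
    """Dissimilarity 1 - TOM, the input for hierarchical clustering."""
    return [[1.0 - t for t in row] for row in tom_similarity(adj)]

# Toy adjacency (hypothetical): genes 0, 1, 2 are tightly connected, gene 3 is peripheral.
toy_adj = [
    [1.0, 0.9, 0.8, 0.0],
    [0.9, 1.0, 0.7, 0.0],
    [0.8, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]
diss = tom_dissimilarity(toy_adj)
```

Genes 0 and 1 share a strong neighbour (gene 2) in addition to their direct link, so their dissimilarity is small; gene 3 shares almost nothing with gene 0, so its dissimilarity is close to 1.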
19. Identify modules
Compute dissimilarity between genes:
Topological Overlap Measure (TOM) matrix:
[https://link.springer.com/article/10.1186/s12859-019-2598-7]
20. Identify modules
Perform hierarchical clustering of genes
21. Identify modules
Divide clustered genes into modules
Fixed-height branch cut
Dynamic Tree Cut
[https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/BranchCutting/Supplement.pdf]
22. Identify modules: Divide clustered genes into modules
Dynamic Tree Cut algorithm
[https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/BranchCutting/Supplement.pdf]
23. Identify modules: Divide clustered genes into modules
Dynamic Tree Cut
[https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/BranchCutting/Supplement.pdf]
25. Identify modules
Merge very similar modules
Eigengene: the first principal component of a module's expression data
Module sizes (module label: number of genes):
0: 88, 1: 614, 2: 316, 3: 311, 4: 257, 5: 235, 6: 225, 7: 212, 8: 158, 9: 153, 10: 121, 11: 106,
12: 102, 13: 100, 14: 94, 15: 91, 16: 78, 17: 76, 18: 65, 19: 58, 20: 58, 21: 48, 22: 34
[https://www.quora.com/What-are-eigengenes-and-gene-modules]
26. Identify modules
Merge very similar modules
Calculate eigengene
• Example: module z eigengene
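[Figure: the expression matrix of module z (samples S1, S2, S3, ..., S123 in rows; genes genA, genB, genC, ..., genN in columns) is reduced by PCA to its first principal component (PC 1), with one value per sample; this vector is the module z eigengene.]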
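The eigengene calculation can be sketched without any external library by running power iteration on the standardized expression matrix. This is only an illustration under stated assumptions: the function names and toy data are hypothetical, every gene is assumed to vary across samples (no zero standard deviation), and the sign of the result is oriented to agree with the average expression profile.

```python
import math

def standardize(column):
    """Scale one gene's expression values to mean 0 and sample SD 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def module_eigengene(expr, n_iter=200):
    """expr: rows = samples, columns = genes of one module.
    Returns the first principal component as a unit-length vector (one value per sample)."""
    n_samples, n_genes = len(expr), len(expr[0])
    cols = [standardize([expr[s][g] for s in range(n_samples)]) for g in range(n_genes)]
    v = cols[0][:]  # starting vector for the power iteration
    for _ in range(n_iter):
        # Multiply v by Z Z^T without forming the samples x samples matrix.
        loadings = [sum(c[s] * v[s] for s in range(n_samples)) for c in cols]
        v = [sum(loadings[g] * cols[g][s] for g in range(n_genes)) for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # Orient the eigengene so it agrees in sign with the average standardized profile.
    avg = [sum(c[s] for c in cols) / n_genes for s in range(n_samples)]
    if sum(a * b for a, b in zip(avg, v)) < 0:
        v = [-x for x in v]
    return v

# Toy module (hypothetical): 4 samples in rows, 3 co-expressed genes in columns.
toy_expr = [
    [1.0, 2.1, 0.9],
    [2.0, 3.9, 2.2],
    [3.0, 6.2, 2.8],
    [4.0, 7.8, 4.1],
]
eigengene = module_eigengene(toy_expr)
```

The result is a single profile across samples that summarizes the whole module, which is what makes it usable for comparing modules with each other and with clinical traits.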
27. Identify modules
Merge very similar modules
Module eigengenes
28. Identify modules
Merge very similar modules
Perform hierarchical clustering of eigengenes
29. Identify modules
Merge very similar modules
Merge modules
31. Construct weighted correlation network & Identify modules
Step-by-step network construction
Automatic, one-step network construction
Block-wise network construction
32. Construct weighted correlation network & Identify modules
Step-by-step network construction
Automatic, one-step network construction
Block-wise network construction
Relation between block size and memory space:
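A rough back-of-the-envelope calculation shows why block size matters: the adjacency and TOM matrices are dense n x n, so memory grows with the square of the number of genes in a block. The sketch below assumes 8-byte double-precision values and counts a single matrix only; the function name is hypothetical, and real usage is higher because several such matrices exist at once.

```python
def matrix_memory_gib(n_genes, bytes_per_value=8):
    """Memory for one dense n_genes x n_genes matrix, in GiB (assumes 8-byte doubles)."""
    return n_genes * n_genes * bytes_per_value / 2**30

for block_size in (5000, 10000, 20000):
    print(f"{block_size:>6} genes: {matrix_memory_gib(block_size):.2f} GiB per matrix")
```

Doubling the block size quadruples the memory needed per matrix, which is why large data sets are split into blocks that each fit into the available RAM.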
33. Construct weighted correlation network & Identify modules
Block-wise network construction:
Split block
34. Construct weighted correlation network & Identify modules
Block-wise network construction: