When Is Hub Gene Selection Better
than Standard Meta-Analysis?
Peter Langfelder, Paul S. Mischel, and
Steve Horvath
2013
Protein interaction network of Caenorhabditis elegans
Almaas E J Exp Biol 2007;210:1548-1558
©2007 by The Company of Biologists Ltd
Hub, Module, Bottleneck
b. Red indicates
higher expression
and blue indicates
lower expression.
Larger nodes are
major bridges
between different
parts of a network
c. Major hub
Law et al. 2013 Systems virology: host-directed approaches to viral pathogenesis and drug targeting
Background
• Highly connected hub nodes are central to
network architecture
• Protein knockout experiments have shown
that hub proteins tend to be essential for
survival
– Yeast, Fly, Worm
• There is a debate about hub importance; the
authors argue that whole-network hubs are often
uninformative in correlation networks
Langfelder P, Mischel PS, Horvath S (2013) When Is Hub Gene Selection Better than Standard Meta-Analysis?. PLoS ONE 8(4): e61505.
doi:10.1371/journal.pone.0061505
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0061505
The Main Question
• Does hub gene selection lead to more
meaningful gene lists than a standard
statistical analysis based on significance
testing?
rankPvalue: Scale Method
1. Scales the individual importance measures in
each study to mean 0 and variance 1
2. Averages the statistics and relies on the
central limit theorem to approximate the null
distribution of the resulting meta-analysis
statistic
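The two Scale steps above can be sketched in a few lines of Python. This is a minimal illustration, not the rankPvalue implementation: the function name `scale_meta_analysis`, the input layout (one row per gene, one column per study), and the one-sided upper-tail p-value are assumptions made for this example.

```python
import numpy as np
from math import erfc, sqrt

def scale_meta_analysis(importance):
    """Hypothetical helper: importance is an (n_genes, K) array, one column per study."""
    x = np.asarray(importance, dtype=float)
    k = x.shape[1]
    # Step 1: scale each study (column) to mean 0 and variance 1.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Step 2: average across studies. The mean of K independent unit-variance
    # values has variance 1/K, so sqrt(K) * mean is approximately N(0, 1)
    # under the null (central limit theorem).
    meta_z = z.mean(axis=1) * np.sqrt(k)
    # Assumption: one-sided p-value for "consistently high" importance.
    p_high = np.array([0.5 * erfc(v / sqrt(2.0)) for v in meta_z])
    return meta_z, p_high
```

A gene whose importance measure is high in every study receives a large meta-analysis statistic and a small `p_high`; the normal approximation becomes more reasonable as the number of studies K grows.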
rankPvalue: Rank Method
1. If the assumptions of the central limit theorem
are not met, use the Rank method
2. Replaces the values of the importance measures
with their ranks
3. Ranks are divided so that the resulting values
lie in the unit interval
4. The sum of the scaled ranks is the meta-analysis
test statistic
5. Its null distribution can be obtained by convolving
the distributions of K independent uniformly
distributed variables
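The Rank steps above can be sketched as follows. Convolving K Uniform(0, 1) densities gives the Irwin-Hall distribution, whose CDF has a closed form, so no numerical convolution is needed for small K. The function names, the (rank - 0.5)/n scaling convention, and the tie handling are assumptions of this sketch rather than details taken from rankPvalue; treating discrete scaled ranks as continuous uniforms is itself an approximation.

```python
import numpy as np
from math import comb, factorial, floor

def irwin_hall_cdf(s, k):
    """CDF of the sum of k independent Uniform(0, 1) variables."""
    if s <= 0:
        return 0.0
    if s >= k:
        return 1.0
    total = sum((-1) ** j * comb(k, j) * (s - j) ** k
                for j in range(floor(s) + 1))
    return total / factorial(k)

def rank_meta_analysis(importance):
    """Hypothetical helper: importance is an (n_genes, K) array, one column per study."""
    x = np.asarray(importance, dtype=float)
    n, k = x.shape
    # Replace values by ranks 1..n within each study (ties broken by position).
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    # Assumed convention: (rank - 0.5) / n keeps values strictly inside (0, 1).
    u = (ranks - 0.5) / n
    # The sum of scaled ranks across the K studies is the test statistic.
    stat = u.sum(axis=1)
    p_low = np.array([irwin_hall_cdf(s, k) for s in stat])
    p_high = 1.0 - p_low
    return stat, p_low, p_high
```

Here `p_high` is small for genes ranked consistently near the top in all studies, and `p_low` is small for genes ranked consistently near the bottom.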
Table 1. Overview of meta-analysis methods used in this article.
Question 1
• Are whole-network hub genes relevant or
should one exclusively focus on intramodular
hubs?
– Correlation network applications show that one
should focus on intramodular hubs in trait-related
modules
Weighted Correlation Network
Analysis (WGCNA)
Finding consensus modules and intramodular hubs
• Can calibrate weighted networks before combining them
• It is straightforward to combine weighted networks across
independent data sets
• It provides module eigengenes that can be used to relate
modules to sample traits
• It affords measures of module membership, which can be
used for finding hub genes in consensus modules
• A trait-related consensus module is selected across the
individual data sets
• Variables with highest overall module membership are
identified
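As a rough illustration of the last three bullets, the sketch below computes a module eigengene as the first principal component of the standardized module expression, defines module membership as the correlation of each gene with that eigengene, and then ranks genes by membership averaged across data sets. It is written in Python rather than with the WGCNA R package. The function names are hypothetical, the plain average across data sets is an assumed stand-in for a formal meta-analysis of module membership, and the sketch assumes every gene has nonzero variance.

```python
import numpy as np

def module_eigengene(expr_module):
    """First principal component of a (n_samples, n_module_genes) matrix."""
    x = expr_module - expr_module.mean(axis=0)
    x = x / x.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(x, full_matrices=False)
    eig = u[:, 0]
    # Orient the eigengene to correlate positively with average expression.
    if np.corrcoef(eig, x.mean(axis=1))[0, 1] < 0:
        eig = -eig
    return eig

def module_membership(expr, eigengene):
    """Pearson correlation of every gene (column of expr) with the eigengene."""
    x = expr - expr.mean(axis=0)
    e = eigengene - eigengene.mean()
    return (x * e[:, None]).sum(axis=0) / np.sqrt((x ** 2).sum(axis=0) * (e ** 2).sum())

def consensus_hubs(datasets, module_idx, top=3):
    """Rank genes by module membership averaged over data sets (assumed combination rule)."""
    kme = np.mean(
        [module_membership(d, module_eigengene(d[:, module_idx])) for d in datasets],
        axis=0,
    )
    return np.argsort(kme)[::-1][:top], kme
```

Each element of `datasets` is a samples-by-genes matrix with the same gene order, and `module_idx` lists the columns belonging to the selected trait-related consensus module.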
Table 2. Overview of data sets used in this article.
Biological Data Sets
• Genes associated with adenocarcinoma
survival time in human expression data
• CpGs hypermethylated with age in human
blood and brain methylation data
• Genes positively correlated with total
cholesterol in mouse liver expression data
Figure S1: Lung Cancer Data
Figure S2: Lung Cancer Data
• Each row corresponds to one module identified in
the consensus module analysis
• First column shows meta-analysis Z statistic and
corresponding p-value
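The slides do not say how the meta-analysis Z statistic is computed. One common choice, assumed here purely for illustration, is Stouffer's method applied to Fisher-transformed correlations between the module eigengene and the trait. The function name, the correlation values, and the sample sizes below are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def stouffer_from_correlations(cors, n_samples):
    """Combine per-data-set correlations into one Z statistic (assumed method)."""
    cors = np.asarray(cors, dtype=float)
    n = np.asarray(n_samples, dtype=float)
    # Fisher transform: atanh(r) * sqrt(n - 3) is approximately N(0, 1) under the null.
    z_i = np.arctanh(cors) * np.sqrt(n - 3)
    # Stouffer: the sum of K independent N(0, 1) scores divided by sqrt(K) is N(0, 1).
    z_meta = z_i.sum() / np.sqrt(len(z_i))
    # Two-sided p-value from the standard normal distribution.
    p_meta = erfc(abs(z_meta) / sqrt(2.0))
    return z_meta, p_meta

# Hypothetical example: 8 data sets, each with a weak correlation of 0.2 and 50 samples.
z_meta, p_meta = stouffer_from_correlations([0.2] * 8, [50] * 8)
```

In this hypothetical example no single data set reaches significance on its own, yet the combined statistic does, which is how a weak but consistent association across data sets can yield a significant meta-analysis result.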
Question 4
• Do network-based gene selection strategies
lead to gene lists that are biologically more
informative than those based on standard
marginal approaches?
– Yes, gene selection based on intramodular
connectivity leads to biologically more informative
gene lists. In contrast, whole-network connectivity
leads to the least informative gene lists
Figure 1. Meta-analysis of module membership leads to gene lists
with stronger functional enrichment.
Figure 2. Marginal meta-analysis tends to lead to gene lists with
better validation in independent data.
Figure 3. Simulation studies of gene screening success of meta-
analysis methods.
Key Findings
1. Hub genes defined with respect to whole-
network connectivity are often uninteresting
in correlation networks constructed from
data from higher organisms
2. Selecting intramodular hubs in a relevant
module often leads to gene lists with cleaner
biological annotation.
3. Marginal meta-analysis leads to superior
validation success of gene-trait associations
in 2 of 3 applications

Editor's Notes

  • #14 Figure S1. Gene clustering tree based on the consensus Topological Overlap similarity across 8 adenocarcinoma (lung cancer) data sets. Each short vertical line (“leaf” of the tree) corresponds to a single gene. Branches of the clustering tree group together genes with high consensus similarity and hence define modules. The modules are indicated by colors below the clustering tree (line Modules). Grey color corresponds to genes not assigned to any of the modules. In this analysis, likely because of the poor reproducibility among the 8 data sets, we only find 5 small modules and most genes remain unassigned. The 8 color rows below the module indicator indicate the gene significance, defined as the correlation of survival time deviance with gene expression. Green color denotes negative correlation with survival deviance (i.e., the gene over-expression is associated with lower risk of death) while red color denotes positive correlation with survival deviance (i.e., the gene over-expression is associated with higher risk of death). This representation makes apparent the poor reproducibility of gene significance for survival time between the 8 data sets.
  • #15 Figure S2. Module significance for survival deviance in consensus analysis of 8 adenocarcinoma expression data sets. Each row of this table corresponds to one module identified in the consensus module analysis. Modules are labeled by the module number and (where appropriate) a GO-derived functional label. Columns 2–9 give correlations between the survival time deviance and the module eigengenes and the associated p-values. The first column shows the meta-analysis Z statistic and the corresponding p-value. We observe that module 2 exhibits a relatively weak but consistent association with survival time deviance across the 8 data sets. The meta-analysis Z score and p-value for module 2 are significant. The significant association and cell cycle-related functional annotation We note that while module 5 also attains nominal significance at the 0.05 level, a Bonferroni correction for the number of tested modules (5) would render the p-value non-significant. Functional enrichment analysis (Supplementary Table ??) did not reveal strong enrichment in biologically plausible categories, and we do not consider this module in the following.