Pathway prediction in (meta)genomes using Pathway Signature Genes
Lucas Brouwers, MSc student, Nijmegen
Goal
How can we predict the metabolic capacity of a metagenomic sample, given incomplete data?
Current practice:
The percentage of sequences assigned to a pathway is used to estimate pathway abundance
2 out of 5 sequences map to pathway X: abundance of X is 40%
We propose to identify the (metabolic) pathways present in a community, making full use of the metagenomic data available
Approach
We use the presence of OGs (orthologous groups) to predict the presence of pathways
Some species have pathway X
[Figure: species marked for pathway X (PWY X) and for OG A, OG B and OG C. Panel labels: Presence Signature, Weak Signature, Absence Signature]
Approach
630 species and their OGs
1,200 (metabolic) pathways
Signature genes add information
Do the signatures add information on the presence of a pathway? (After all, some pathways are rare, others are ubiquitous.)
Specific information is defined as a difference in Shannon entropy: how much extra information does the presence of an OG give about the presence of a pathway?
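The entropy difference can be illustrated with a small sketch. It assumes one particular reading of the slide: the uncertainty about the pathway over all species, minus the uncertainty that remains among the species carrying the OG. The function names and the toy presence/absence vectors are hypothetical, and the exact score used in the talk may differ.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a yes/no variable with P(yes) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def specific_information(has_pathway, has_og):
    """H(pathway) - H(pathway | OG present), computed over species.

    has_pathway and has_og are equal-length lists of 0/1 flags,
    one entry per species (an assumed input layout).
    """
    prior = sum(has_pathway) / len(has_pathway)
    with_og = [pw for pw, og in zip(has_pathway, has_og) if og]
    if not with_og:
        return 0.0  # the OG never occurs, so it tells us nothing
    posterior = sum(with_og) / len(with_og)
    return binary_entropy(prior) - binary_entropy(posterior)


# Toy data: four species, the pathway occurs in the first two.
pathway = [1, 1, 0, 0]
presence_og = [1, 1, 0, 0]     # OG only in species with the pathway
absence_og = [0, 0, 1, 1]      # OG only in species without the pathway
uninformative_og = [1, 0, 1, 0]

print(specific_information(pathway, presence_og))
print(specific_information(pathway, absence_og))
print(specific_information(pathway, uninformative_og))
```

In this reading, both a presence signature and an absence signature reduce the entropy. The direction of the signature is given by whether the pathway frequency among OG carriers lies above or below the overall frequency.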
What makes a signature?
[Figure; labels: N = 29,661 and N = 46,176,543]
Absence signatures in a pathway
COG0437 (221 species): Fe-S-cluster-containing hydrogenases
Formate dehydrogenase
Formylmethanofuran dehydrogenase
Glycolaldehyde dehydrogenase
Nitrate reductase
Integrate signature scores with pathway prediction in metagenomes
For quantitative analysis of pathways in metagenomic samples:
Consider chlorophyll biosynthesis:
Sa: average of all OG scores, of OGs in species without chlorophyll biosynthesis
Sp: average of all OG scores, of OGs in species with …
Si: average of all OG scores found in a metagenomic sample
Correcting for genome sizes becomes …
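The slide does not state how Sa, Sp and Si are combined. One plausible reading, shown here purely as an assumption, is to place the sample average Si on the line between the "pathway absent" reference Sa and the "pathway present" reference Sp. The function name is hypothetical, and no genome size correction is attempted.

```python
def estimate_pathway_fraction(s_a, s_p, s_i):
    """Linear interpolation of the sample score between two references.

    ASSUMPTION: this interpolation is not taken from the slides.
    s_a: average OG score in species without the pathway
    s_p: average OG score in species with the pathway
    s_i: average OG score observed in the metagenomic sample
    Returns a value clipped to the range [0, 1].
    """
    if s_p == s_a:
        raise ValueError("reference scores are identical; nothing to interpolate")
    fraction = (s_i - s_a) / (s_p - s_a)
    return min(1.0, max(0.0, fraction))


# A sample scoring a quarter of the way from Sa to Sp.
print(estimate_pathway_fraction(s_a=0.0, s_p=1.0, s_i=0.25))
```

Under this assumption, a sample whose average score equals Sa is read as containing no species with the pathway, and a sample scoring Sp as containing only such species.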
Performance in sub-sampling
Sample a percentage of the OGs from every species in STRING
Predict percentage of species with pathway & compare with actual occurrence
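The sub-sampling step could be sketched as follows. The species-to-OG dictionary, the function name and the rule of keeping at least one OG per species are all assumptions made for illustration. The real input would be the OG content of the STRING species.

```python
import random


def subsample_ogs(species_ogs, fraction, seed=0):
    """Keep a random fraction of the OGs of every species.

    species_ogs maps a species name to the set of OGs it contains
    (hypothetical layout). At least one OG is kept per species.
    """
    rng = random.Random(seed)
    sampled = {}
    for species, ogs in species_ogs.items():
        pool = sorted(ogs)  # sort for reproducibility across runs
        k = max(1, round(fraction * len(pool)))
        sampled[species] = set(rng.sample(pool, k))
    return sampled


# Toy genomes with 100 and 50 OGs.
genomes = {
    "species_1": {f"OG{i}" for i in range(100)},
    "species_2": {f"OG{i}" for i in range(50, 100)},
}
subset = subsample_ogs(genomes, fraction=0.10)
print({name: len(ogs) for name, ogs in subset.items()})
```

Pathway presence would then be predicted from the reduced OG sets and compared with the known occurrence, as the slide describes.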
Application to metagenomes
Dinsdale, E. A., Edwards, R. A., Hall, D., Angly, F., Breitbart, M., Brulc, J. M., et al. (2008). Functional metagenomic profiling of nine biomes. Nature, 452(7187), 629-32.
Different biomes:
Subterranean
Hypersaline
Freshwater
Fish
Coral
Marine
Simulated metagenomes
Sample OGs according to species distribution in metagenome
Predict percentage of species with pathway & compare with actual occurrence
Sampling 1%
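A simulated metagenome of this kind could be drawn as sketched below: each simulated sequence first picks a species according to its relative abundance and then picks one OG from that species. The data layout, the function name and the uniform choice of OG within a species are assumptions, not details given in the slides.

```python
import random
from collections import Counter


def simulate_metagenome(species_ogs, abundance, n_sequences, seed=0):
    """Draw OGs in proportion to the species distribution.

    species_ogs: species name -> set of OGs (hypothetical layout)
    abundance:   species name -> relative abundance (need not sum to 1)
    Returns a Counter with the number of times each OG was drawn.
    """
    rng = random.Random(seed)
    names = sorted(species_ogs)
    weights = [abundance[name] for name in names]
    pools = {name: sorted(species_ogs[name]) for name in names}
    og_counts = Counter()
    for species in rng.choices(names, weights=weights, k=n_sequences):
        og_counts[rng.choice(pools[species])] += 1
    return og_counts


genomes = {
    "species_1": {"OG_A", "OG_B"},
    "species_2": {"OG_B", "OG_C"},
}
counts = simulate_metagenome(
    genomes, {"species_1": 0.9, "species_2": 0.1}, n_sequences=1000
)
print(counts.most_common())
```

Because the species mixture is known, the true fraction of species carrying a pathway is known as well, which is what makes the comparison with the prediction possible.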
Pathway description of biome diversity
Pathway occurrence was predicted in 35 metagenomic datasets
Principal component analysis revealed the pathways responsible for differences between the biomes
Conclusions
Pathway signature genes allow us to interpret biomes on a pathway level
Genes do not need to be part of a pathway to predict its presence
We can quantitatively and accurately describe the pathway content of a metagenome
In this presentation, the concept of pathway signature genes is explained. These signatures are used in predicting the abundance of pathways in a metagenomically sequenced community.
This presentation was given at the M3 Special Interest Group meeting during ISMB in Stockholm, 2009.