Phylotastic metagenomics



Examples of metagenomics use cases for the Phylotastic! web tools. Presented at the Phylotastic hackathon, June 4-8, 2012.

Published in: Technology

  • You can ignore all the other bad -omic words you hear (conservome?!).
  • Regardless of methodology, focus on: species assemblages and taxonomic diversity; community patterns over space and time (cosmopolitan or regionally restricted?); community changes as a result of natural/human disturbance.
  • Marker genes across all domains (bacteria, archaea, eukaryotes and viruses): rRNA genes, protein-coding orthologs, lineage-specific gene families. Meeting notes (5/22/12 10:42): marker genes to make higher-level taxon assignments; lineage-specific gene families to narrow down assignments to lower taxonomic levels.
  • Head-tail patterns may help us to delimit species and separate out rare taxa (which will have head-tail patterns) from errors (no apparent pattern). Meeting notes (5/22/12 10:42): pplacer and EPA are great tools developed in the last few years.
  • I see name matching as not just matching species names, but also matching between NCBI taxon ID synonyms.
  • rRNA data especially needs to be interpreted in a phylogenetic context. Phylogenetic placement allows: 1) more robust taxon assignments; 2) identification of divergent/undersampled lineages (that aren't apparent via BLAST searches). What's the ecology/function of these divergent lineages?

    1. Phylotastic! Metagenomics Use Cases. Holly Bik, UC Davis
    2. -Omic Dictionary
       • Marker gene studies: amplification of a conserved homologous gene (18S, 16S rRNA) from environmental samples
       • Metagenomics: shotgun sequencing of random genomic fragments from environmental DNA
    3. Biodiversity? Phylogeography? Environmental Impacts?
    4. Diverse marine community: extract environmental DNA (EASY), amplify rRNA (EASY), high-throughput sequencing (EASY), community analysis (VERY difficult!)
    5.
    6. Explicitly Phylogenetic Approaches: aligned environmental sequences; evolutionary placement of short reads; guide tree
    7. Tree Reconciliation in PhyloSift: environmental sequences; named taxa
    8. Pruning Subtrees from Megatrees
       • User inputs a list of reference sequences with NCBI Taxon IDs → pulls down tree topology
       • Unclassified sequences in a reference phylogeny could be “named” with the most appropriate higher-level taxon
    9. Name Matching and TNRS
       • Different taxonomic synonyms have different NCBI taxon IDs
         – Shigella: 620 and E. coli: 562
         – Species/genus boundaries still debated
       • TNRS would provide a “matrix” for standardizing IDs
         – E.g. E. coli/Shigella supergroup: 12345
    10. Integrating Comparative Data
       • Metadata is a standard part of any well-constructed metagenomics study
         – Depth (marine samples)
         – Aquatic/Terrestrial
         – Temperature
         – pH
         – Dissolved Oxygen
    11. Integrating Comparative Data
       • Metadata also includes information about the sequences themselves
         – Abundance information
         – Distribution across sample sites
       • Branch thickness can be incorporated into XML tree files and visualized within Archaeopteryx
    12. Mashup with Online Data
       • Pull down NCBI metadata for a given reference sequence accession
         – Habitat metadata
         – Ecological associations, e.g. symbionts
         – Genome availability
         – Related publications
         – Pictures, etc. would be awesome
    13. Exploring Trees: Ecologically, what are these reference taxa doing??
    14. Pertinent info for biological interpretations of DNA data!!
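
The subtree-pruning and ID-standardization use cases (slides 8 and 9) can be illustrated with a small Python sketch. This is not the Phylotastic or TNRS implementation; it only shows the idea under stated assumptions. The `Node` class, the `prune`, `standardize` and `to_newick` functions, the toy megatree and the `SYNONYM_GROUPS` table are all hypothetical, and `12345` is the placeholder supergroup ID from the slide, not a real NCBI taxon ID.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Node:
    """Tree node; leaves carry a taxon ID string as their label."""
    label: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the tree restricted to leaves whose label is in `keep`.

    Internal nodes left with a single child are collapsed, so only the
    topology relating the requested taxa remains.
    """
    if not node.children:
        return Node(node.label) if node.label in keep else None
    kept = [c for c in (prune(ch, keep) for ch in node.children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return Node(node.label, kept)


def to_newick(node: Node) -> str:
    """Serialize a tree as a Newick string (without the trailing ';')."""
    if not node.children:
        return node.label or ""
    inner = ",".join(to_newick(c) for c in node.children)
    return f"({inner}){node.label or ''}"


def standardize(ids: Iterable[str], groups: Dict[str, str]) -> Set[str]:
    """Map synonymous taxon IDs onto one shared ID; unknown IDs pass through."""
    return {groups.get(i, i) for i in ids}


# Hypothetical synonym "matrix": both IDs from the slide map to the
# slide's placeholder supergroup ID (12345 is NOT a real assignment).
SYNONYM_GROUPS = {"620": "12345", "562": "12345"}

# Toy "megatree"; leaf labels stand in for NCBI taxon IDs.
megatree = Node(None, [
    Node("Bacteria", [
        Node("Enterobacteriaceae", [Node("562"), Node("620")]),
        Node("1423"),
    ]),
    Node("Eukaryota", [Node("9606"), Node("4932")]),
])

# A user-supplied list of taxon IDs pulls down only the relevant topology.
pruned = prune(megatree, {"620", "9606", "4932"})
pruned_newick = to_newick(pruned) + ";"
print(pruned_newick)

# Synonymous IDs collapse to one standardized ID before matching.
standardized = standardize(["620", "562", "9606"], SYNONYM_GROUPS)
print(sorted(standardized))
```

In this sketch the pruning step collapses unbranched internal nodes, so the returned tree shows only how the requested taxa relate to one another. A real service would also need to carry branch lengths and handle IDs that are absent from the megatree.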