Genomics
and
Metagenomics
Mads Albertsen
Introduction to community systems microbiology
2013

CENTER FOR MICROBIAL COMMUNITIES
Agenda
Genomics
• Introduction
• Assembly
• Validation
• Metabolic reconstruction (SM @ Thursday)

Metagenomics
• History
• Pitfalls
• Potentials

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction

Genome = Parts list of a single organism
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction

How to get from sequenced DNA to metabolic model?
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction
Wet lab work
• Extract DNA
• Shear DNA
• Sequence

Bioinformatics
• Reads: 50-500 bp
• Assembly → Contigs: 1-100 kbp
• Scaffolding → Scaffolds: hopefully Mbp (gaps between contigs shown as N)

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Definitions

Read
A sequenced piece of DNA (50-500 bp)

Paired-end read
Sequencing both ends of a short DNA fragment (300-600 bp insert)

Mate-pair read
Sequencing both ends of a long DNA fragment (> 1 kbp insert)

Insert size
The length of the DNA fragment

Contig
A set of overlapping DNA segments that represents a consensus region of DNA

Scaffold
Contigs separated by gaps of known length (gaps shown as N)

Coverage
The number of times a specific position in the genome is covered by reads (e.g. 4x)
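The coverage definition above can be made concrete with a small sketch. It assumes reads have already been placed on the genome as (start, length) pairs; the toy alignments and function names are illustrative and not taken from the slides.

```python
def per_position_coverage(genome_length, alignments):
    """Count how many reads cover each genome position.

    alignments: list of (start, length) tuples with 0-based start positions
    (a hypothetical, simplified stand-in for real read alignments).
    """
    coverage = [0] * genome_length
    for start, length in alignments:
        # Clip each read to the genome boundaries before counting.
        for pos in range(max(start, 0), min(start + length, genome_length)):
            coverage[pos] += 1
    return coverage


def mean_coverage(genome_length, alignments):
    """Average coverage over all positions (reported as e.g. '4x')."""
    return sum(per_position_coverage(genome_length, alignments)) / genome_length


# Toy example: four 5 bp reads on a 10 bp genome.
toy_alignments = [(0, 5), (2, 5), (4, 5), (5, 5)]
toy_coverage = per_position_coverage(10, toy_alignments)
print(toy_coverage)
print(mean_coverage(10, toy_alignments))
```

Per-position coverage is what matters for assembly: a region with zero coverage cannot be reconstructed, no matter how high the mean is.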

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

[Figure: Genome → Fragment → Sequence (paired-end reads) → Assemble (Contig 1, Contig 10, Contig 19, ...) → Scaffold 1, Scaffold 2]

Inspiration: http://goo.gl/VOZVVg

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

Sequencing: Genome (3,000,000 letters) → Reads (50-500 letters each)
Assembly: Reads (50-500 letters each) → Genome (3,000,000 letters)

Inspiration: http://goo.gl/VOZVVg

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

“It was the best of times, it was the worst of times, it was the age of
wisdom, it was the age of foolishness, it was the epoch of belief, it was
the epoch of incredulity, ...”
Dickens, Charles. A Tale of Two Cities. 1859. London: Chapman & Hall

Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

Way too much data to make all vs. all comparison
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly
Step 1: Convert reads into kmers (k = 3)

Read         Kmers
theageofwi   the hea eag age geo eof ofw fwi
sthebestof   sth the heb ebe bes est sto tof
astheageof   ast sth the hea eag age geo eof
worstoftim   wor ors rst sto tof oft fti tim
imesitwast   ime mes esi sit itw twa was ast
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010
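Step 1 can be sketched in a few lines of Python. This is a minimal illustration of the sliding-window idea using the reads from the slide, not the way Velvet implements it internally; the function name `kmers` is an assumption.

```python
def kmers(read, k):
    """Return every substring of length k, sliding one letter at a time."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


# The five example reads from the slide.
reads = ["theageofwi", "sthebestof", "astheageof", "worstoftim", "imesitwast"]

for read in reads:
    print(read, " ".join(kmers(read, 3)))
```

A read of length L yields L - k + 1 kmers, so a 10-letter read gives eight 3-mers but only two 9-mers.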

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly
Step 2: Join kmers with k-1 overlap

ast → sth → the → hea → eag → age → geo → eof

Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly
Step 2: Join kmers with k-1 overlap

[Figure: kmer graph built from all five reads. Kmers shared by several reads (e.g. the, hea, eag, age, geo, eof, sth, sto, tof) are stacked on the same node.]

Do the same for all reads…
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010
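A rough sketch of Step 2, assuming the graph is stored as a dictionary that maps each distinct kmer to the set of kmers that follow it in some read. Two consecutive kmers from the same read always overlap by k-1 letters, which is why no all-vs-all read comparison is needed. The function names are illustrative.

```python
def kmers(read, k):
    """Return every substring of length k, sliding one letter at a time."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_kmer_graph(reads, k):
    """Map each distinct kmer to the set of kmers that directly follow it."""
    graph = {}
    for read in reads:
        ks = kmers(read, k)
        for kmer in ks:
            graph.setdefault(kmer, set())  # identical kmers share one node
        for left, right in zip(ks, ks[1:]):
            graph[left].add(right)         # left[1:] == right[:-1] by construction
    return graph


reads = ["theageofwi", "sthebestof", "astheageof", "worstoftim", "imesitwast"]
graph = build_kmer_graph(reads, 3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy graph the node "the" has two successors ("hea" and "heb"), which is exactly the kind of branch that later prevents a single contig.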

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly
Step 3: Simplify the graph

Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010
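One common way to simplify such a graph is to merge every unbranched chain of kmers into a single sequence. The sketch below does only that; real assemblers such as Velvet also remove tips and bubbles caused by errors, which is not shown. The toy graph and function name are assumptions for illustration, and the input must list every kmer as a key.

```python
def simplify(graph):
    """Collapse unbranched paths of a kmer graph into contig strings.

    graph: dict mapping each kmer to the set of kmers that follow it.
    Isolated cycles without a branch point are ignored in this sketch.
    """
    preds = {node: set() for node in graph}
    for node, succs in graph.items():
        for succ in succs:
            preds[succ].add(node)

    def starts_contig(node):
        if len(preds[node]) != 1:
            return True                      # source node or merge point
        (only_pred,) = preds[node]
        return len(graph[only_pred]) != 1    # predecessor branches

    contigs = []
    for node in sorted(graph):
        if not starts_contig(node):
            continue
        contig, current = node, node
        # Extend while the path is unambiguous in both directions.
        while len(graph[current]) == 1:
            (nxt,) = graph[current]
            if len(preds[nxt]) != 1:
                break
            contig += nxt[-1]
            current = nxt
        contigs.append(contig)
    return contigs


# Toy graph for the hypothetical reads "theage" and "thebes" (k = 3).
toy_graph = {
    "the": {"hea", "heb"},
    "hea": {"eag"}, "eag": {"age"}, "age": set(),
    "heb": {"ebe"}, "ebe": {"bes"}, "bes": set(),
}
contigs = simplify(toy_graph)
print(contigs)
```

The branch at "the" splits the result into three contigs, which mirrors how repeats fragment a real assembly.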

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

[Figure: simplified graph at k = 3. Contigs: It was the | age | epoch | be | st | wor | times | of | wisdom | foolishness | belief | incredulity]

“It was the best of times, it was the worst of times, it was the age of
wisdom, it was the age of foolishness, it was the epoch of belief, it was
the epoch of incredulity, ...”
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

Dickens, Charles. A Tale of Two Cities. 1859. London: Chapman & Hall

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

What is the minimum kmer size that
results in a single contig?

Kmer = 3

Kmer = 10 Itwasthebestoftimesitwastheworstoftimesitwastheageofwisdomitwastheageoffoolishnessitwastheepochofbeliefitwastheepochofincredulity
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

Repeat = repeated DNA sequence
that can’t be spanned by reads

Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly

Why not just increase the kmer size?

Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assembly
Read: theageofwi (with one sequencing error away from the read ends)

Kmer = 3: the hea eag age geo eof ofw fwi
Kmers with errors = 3/8

Kmer = 9: theageofw heageofwi
Kmers with errors = 2/2
Example: http://goo.gl/nMWDAk
Velvet example courtesy of J. Leipzig 2010
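The 3/8 versus 2/2 comparison follows from simple counting: a single wrong letter contaminates every kmer that overlaps it. A minimal sketch, assuming one error at a known 0-based position in the read (the function name is illustrative):

```python
def kmers_hit_by_error(read_length, k, error_pos):
    """Return (affected, total) kmer counts for one error at error_pos."""
    total = read_length - k + 1
    first = max(0, error_pos - k + 1)        # first kmer start that covers the error
    last = min(error_pos, read_length - k)   # last kmer start that covers the error
    return last - first + 1, total


# A 10-letter read with an error at position 4, as in the slide.
print(kmers_hit_by_error(10, 3, 4))
print(kmers_hit_by_error(10, 9, 4))
```

Larger kmers resolve more repeats, but each sequencing error destroys a larger share of them, so the kmer size is a trade-off.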

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation

I’ve assembled my 4.3 Mbp genome
into 25 scaffolds
with an N50 of 553 kbp.
Is it a good assembly?

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation

N50
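N50 is the length of the scaffold at which, going from the longest scaffold downwards, the running total first reaches half of the assembly size. A minimal sketch with made-up scaffold lengths:

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no lengths given")


# Hypothetical scaffold lengths in kbp.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

Because N50 only looks at lengths, two assemblies with the same N50 can differ widely in correctness and completeness.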

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation
Estimating repeat content

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation
How could I close this genome?
4 repeats in 2 copies each

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation

How complete is the genome?

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation
Survey of essential single copy genes across sequenced phyla

[Figure: genes vs. phyla]
100-106 essential single copy genes
(can also be used to identify contamination)
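The idea behind the essential gene survey can be sketched as follows: count how many of the expected single copy genes are found at least once (completeness) and how many are found more than once (a hint of contamination or mixed strains). The gene names and counts below are hypothetical, and real pipelines detect the genes with HMM searches, which is not shown.

```python
def completeness_and_duplication(marker_counts, n_markers):
    """Return (% of markers found, % of markers found more than once).

    marker_counts: dict mapping marker gene name -> copies found in the assembly.
    n_markers: size of the essential single copy gene set that was searched.
    """
    found = sum(1 for copies in marker_counts.values() if copies >= 1)
    duplicated = sum(1 for copies in marker_counts.values() if copies > 1)
    return 100.0 * found / n_markers, 100.0 * duplicated / n_markers


# Hypothetical result for a set of 4 marker genes (one was not found at all).
toy_counts = {"geneA": 1, "geneB": 2, "geneC": 1}
print(completeness_and_duplication(toy_counts, 4))
```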
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation

Inspect the assembly

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Validation

• N50 does not make much sense
• Repeat content versus the number of scaffolds
• Calculate the percentage of essential genes
• Inspect the assembly

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metabolic reconstruction

4.3 Mbp genome
… and so what?
(@Thursday)

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenomics
Mads Albertsen
Introduction to community systems microbiology
2013

CENTER FOR MICROBIAL COMMUNITIES
Introduction

Genome = Parts list of a single organism
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction

Photo: D. Kunkel; color, E. Latypova

Metagenome = Parts list of the community
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction
“...functional analysis of the collective genomes of soil
microflora, which we term the metagenome of the soil.”
- J. Handelsman et al., 1998

[Figure: PubMed hits for metagenom*[Title/Abstract] shown alongside sequencing costs (http://www.genome.gov/sequencingcosts/)]

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Introduction

Metagenomics ≠ Amplicon sequencing

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Sequencing and assembly

≈3,000,000 bp per genome → 150 bp reads → contigs of ≈1000 bp or more

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Assigning information

Function

Contigs
Databases

Binning

Taxonomy
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
What has metagenomics been used for?
Exploration

Rusch et al., 2007 PLoS Biology
• 6.3 Gbp of sequence (2x human genomes, 2000x bacterial genomes)
• Most sequences were novel compared to the databases

Qin et al., 2010 Nature
• 127 human gut metagenomes
• 600 Gbp of sequence (200x human genomes)
• 3.3 million genes identified
• Minimal gut metagenome defined

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
What has metagenomics been used for?
Comparative

Dinsdale et al., 2008 Nature

• A characteristic microbial fingerprint for
each of the nine different ecosystem types

Specific functions

Hess et al., 2011 Science

• Identified 27,755 putative carbohydrate-active
genes from a cow rumen metagenome
• Expressed 90 candidates of which 57% had
enzymatic activity against cellulosic substrates

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
What has metagenomics been used for?
Extracting genomes

Garcia Martin et al., 2006 Nat. Biotechnol.
• Genome extraction from a low complexity metagenome
• Candidatus Accumulibacter phosphatis
• The first genome of a polyphosphate accumulating organism (PAO) with a major role in enhanced biological phosphorus removal

Albertsen et al., 2013 Nat. Biotechnol.
• Genome extraction of low abundant species (< 0.1%) from metagenomes
• First complete TM7 genome
• Access to genomes of the “uncultured majority”

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Pitfalls
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenomics made easy

Great resources – but use with care
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
MG-RAST example

Contigs

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Dataset overview

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Taxonomy and Function overview

Taxonomy

Function

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Compare with other samples
Samples

Functional categories

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Pitfalls
You always get billions of data points!

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Pitfalls
Is your DNA extraction OK?
... and the samples you want to compare with?

Did you sequence enough?
Did you know the GC bias of your protocol?
Did you normalize for sequencing depth?
Did you use the same sequencing platform?

Assembly = data not quantitative!
Are you comparing assembled data with reads?
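As a small illustration of the sequencing depth question, the sketch below converts raw counts per functional category into relative abundances, so that a sample sequenced ten times deeper does not look ten times richer in everything. The sample names and counts are invented, and relative abundance is only one of several possible normalizations.

```python
def relative_abundance(counts):
    """Convert raw read counts per category into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {category: n / total for category, n in counts.items()}


# Hypothetical counts: sample_b was sequenced 10x deeper than sample_a.
sample_a = {"function_1": 100, "function_2": 300}
sample_b = {"function_1": 1000, "function_2": 3000}

norm_a = relative_abundance(sample_a)
norm_b = relative_abundance(sample_b)
print(norm_a)
print(norm_b)
```

After normalization the two samples have identical profiles, which the raw counts would have hidden.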

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Databases

Contigs
Databases

Annotated metagenome

...you only see what is in the database

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
What is in the databases?

Finished genomes in IMG vs. Greengenes 16S rRNA database

          Genomes   16S
Phyla     29        90
Class     46        249
Order     100       405
Species   1268      99322*
*97% clustering

Note: only including 1 strain per species

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
MG-RAST example

Contigs

650,000 EBPR proteins with taxonomy assigned

How similar are they to the
genomes in the database?

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Sludge microbes vs. Database genomes
650,000 EBPR proteins
1,260,000 Human gut
Qin et al., 2010 Nature
RAST ID: 4448044.3

Note: not abundance weighted

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Sludge microbes vs. Database genomes
The 7 genera with most EBPR proteins assigned

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Effect of missing genomes
What is the effect of not having closely related
genomes in the database?

1. Remove a genome from the database

2. Search the removed genome against the database

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Effect of missing genomes
Accumulibacter phosphatis (4326 proteins) → blastp → Best hit: Azoarcus

Related genomes in the database
Bacteria             1268
Proteobacteria       564
Betaproteobacteria   84
Rhodocyclales        5
Rhodocyclaceae       5
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Effect of missing genomes
Accumulibacter phosphatis (4326 proteins) → blastp → MEGAN LCA

Lowest common ancestor (LCA) approach:
Hit 1: Betaproteobacteria 80% ID
Hit 2: Gammaproteobacteria 79% ID
Hit 3: Actinobacteria 59% ID
Assigned to Proteobacteria

Number of proteins assigned to each taxon:
Bacteria             325
Proteobacteria       860
Beta-                853
Rhodocyclaceae       1149
No hits              261

4326 proteins:
• 27% correctly classified on genus level
• 54% not assigned the correct class
• 101 genera identified

Related genomes in the database
Bacteria             1268
Proteobacteria       564
Betaproteobacteria   84
Rhodocyclales        5
Rhodocyclaceae       5
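The LCA example above can be reproduced with a short sketch. It is a simplification: MEGAN works on BLAST bit scores with its own parameters, whereas here percent identity and a 10% cutoff below the best hit are assumptions chosen so that the weak Actinobacteria hit is ignored. The lineages and function names are illustrative.

```python
def lowest_common_ancestor(lineages):
    """Return the shared prefix of several lineages (lists from domain downwards)."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared


def assign_taxon(hits, top_percent=10.0):
    """Assign a protein to the LCA of all hits close to the best hit.

    hits: list of (percent_identity, lineage) tuples.
    top_percent: hits within this percentage of the best score are kept
    (an assumed stand-in for MEGAN's bit score based filter).
    """
    best = max(score for score, _ in hits)
    cutoff = best * (1 - top_percent / 100.0)
    kept = [lineage for score, lineage in hits if score >= cutoff]
    return lowest_common_ancestor(kept)


# The three hits from the slide.
hits = [
    (80.0, ["Bacteria", "Proteobacteria", "Betaproteobacteria"]),
    (79.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]),
    (59.0, ["Bacteria", "Actinobacteria", "Actinobacteria"]),
]
assignment = assign_taxon(hits)
print(assignment[-1])
```

The closer the competing hits are in score, the higher up the taxonomy the protein ends up, which is why missing reference genomes push assignments towards "Proteobacteria" or just "Bacteria".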
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Effect of missing genomes
Nitrospira defluvii (4268 proteins) → blastp → MEGAN LCA

4268 proteins:
• 1% correctly classified on phylum level

Related genomes in the database
Bacteria      1268
Nitrospirae   3

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Effect of missing genomes
Nitrospira defluvii → blastp → MEGAN LCA + KEGG

What about function?

Related genomes in the database
Bacteria      1268
Nitrospirae   3

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Implication of missing genomes
Function A

Function B

Function C

Function D

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Pitfalls
You always get billions of data points!

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenomics

“If you want to understand the ecosystem,
you need to understand the individual species in the ecosystem”

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenomics

Lion + Eagle ≠ Flying Lion
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Potentials
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Who - when, where and why?

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
How do we get the genomes?

Culturing
Few microorganisms can be easily cultured (<<5%)
Microorganisms need to be studied in their environment

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
How do we get the genomes?
What you think you study

What you actually study

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
How do we get the genomes?

Culturing
Few microorganisms can be easily cultured (<<5%)
Microorganisms need to be studied in their environment

Single cell genomics
Only routinely performed in specialized labs
Very incomplete genomes (mean 40%, range 10-90%)
https://www.bigelow.org/

Metagenomics
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenomics
100++ abundant species (≈3 Mbp each) → DNA extraction → Sequencing → Reads (100-150 bp) → Assembly → Contigs (1000+ bp)

Why not full genomes?
1. Micro-diversity
2. Separation of genomes (Binning)
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Extracting genomes

Not 1 strain, but many closely related strains:
AAAAAAAAAAAAAA
AAAAAAAAATAAAA
AAAAAAAAACAAAA

What you get from the assembly:
AAAAAAAAA
AAAAA
TAAAA
CAAAA

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Extracting genomes

High micro-diversity

Low micro-diversity

Short term
enrichment

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Binning
Complex sample → “Binning” (PhD student)

Genomic signatures:
- GC / Codon usage
- Tetranucleotide frequency + statistical method

Problems:
- Short pieces of sequence (1-10 kbp)
- Local sequence divergence
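The two genomic signatures listed above can be computed per contig as in this minimal sketch. It counts tetranucleotides on one strand only; many binning tools also merge reverse complements and then feed the vectors into a statistical method such as PCA, which is not shown here. The function names and the toy sequence are illustrative.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C letters in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_frequencies(seq):
    """Return a 256-element frequency vector, one entry per ACGT tetramer."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    tetramers = ["".join(p) for p in product("ACGT", repeat=4)]
    total = sum(counts[t] for t in tetramers)  # skips windows containing N etc.
    return [counts[t] / total if total else 0.0 for t in tetramers]


toy_contig = "ACGTACGTAC"
print(gc_content(toy_contig))
signature = tetranucleotide_frequencies(toy_contig)
print(len(signature), max(signature))
```

On short contigs the 256 frequencies are estimated from very few windows, which is one reason the slide lists short sequences as a problem for composition-based binning.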
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Binning
Sequence composition-independent binning

[Figure: abundance profiles in Sample 1 and Sample 2, combined into a plot of abundance in Sample 1 against abundance in Sample 2]
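A sketch of the abundance-based idea, assuming each contig already has a coverage value in two samples: contigs from the same genome should have similar abundance in both samples, so a bin can be drawn as a region of the two-dimensional plot. The contig names, coverage values and rectangular selection are assumptions for illustration only.

```python
def select_bin(contigs, sample1_range, sample2_range):
    """Return the names of contigs whose abundances fall inside a rectangle.

    contigs: dict mapping contig name -> (abundance in sample 1, abundance in sample 2).
    sample1_range, sample2_range: (min, max) abundance limits for each sample.
    """
    lo1, hi1 = sample1_range
    lo2, hi2 = sample2_range
    return sorted(
        name
        for name, (cov1, cov2) in contigs.items()
        if lo1 <= cov1 <= hi1 and lo2 <= cov2 <= hi2
    )


# Hypothetical coverage of four contigs in two related samples.
toy_contigs = {
    "contig_1": (10.0, 50.0),
    "contig_2": (11.0, 48.0),
    "contig_3": (100.0, 5.0),
    "contig_4": (9.5, 52.0),
}
toy_bin = select_bin(toy_contigs, (8, 12), (45, 55))
print(toy_bin)
```

Two genomes with the same abundance in one sample are usually separated once a second, related sample is added, because their abundances rarely change in the same way.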
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Binning
1. Reduce micro-diversity
2. Use multiple related samples

[Figure: two plots of abundance in Sample 1 against abundance in Sample 2]

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Binning

• Nitrospira enrichment
running for years
• 3 dominant species
• No micro-diversity

H. Daims & C. Dorninger, DOME, University of Vienna

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Full-scale EBPR plant → short term enrichment (days) → SBR reactor

1. Reduction of (micro)-diversity
Albertsen et al., 2013 Nat. Biotech.
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Full-scale EBPR plant → short term enrichment → SBR reactor

2. Two different DNA extraction methods
Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Colored using a set of 100 phylogenetic marker genes

TM7-1 (1.6%)

TM7-2 (0.7%)
TM7-3 (0.2%)
TM7-4 (0.06%)

Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Colored using a set of 100 phylogenetic marker genes

[Figure: zoom on target TM7-2 (0.7%), followed by PCA on genomic signatures (axes PC1, PC2) to isolate TM7-2]

Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Colored using a set of 100 phylogenetic marker genes

Candidatus Saccharimonas aalborgensis
TM7-1 (1.6%)

Candidate phylum TM7

Saccharibacteria

Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Genome validation
Assembly inspection

Essential single copy genes
Genes (HMM models)

Phyla
Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Multi-metagenome
http://madsalbertsen.github.io/multi-metagenome/
Short: goo.gl/0ctA3
• Guides
• Workflow scripts
• Example data
• All the code
• Recommendations

Albertsen et al., 2013 Nat. Biotech.

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Complex samples

...add more samples!

S. M. Karst, AAU

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
It’s just a potential!

..and a poorly translated description of it.
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Understanding ecosystems

Extraction:
• DNA → Meta-genomics
• mRNA → Meta-transcriptomics
• Proteins → Meta-proteomics
• Metabolites → Meta-bolomics

In situ methods

Community structure
Microbial functions
Microbial needs

P-Removal:
N-Removal:
-Removal:
Foaming:
Ethanol production:

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Questions?

ma@bio.aau.dk
@MadsAlbertsen85
MadsAlbertsen

Per H. Nielsen, Simon J. McIlroy, Søren M. Karst, EB group
University of Vienna: C. Dorninger, H. Daims, M. Wagner
University of Queensland: G.W. Tyson, P. Hugenholtz

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY

[2013.10.29] albertsen genomics metagenomics

  • 1.
    Genomics and Metagenomics Mads Albertsen Introduction tocommunity systems microbiology 2013 CENTER FOR MICROBIAL COMMUNITIES
  • 2.
    Agenda Genomics • • • • Introduction Assembly Validation Metabolic reconstruction (SM@ Thursday) Metagenomics • History • Pitfalls • Potentials CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 3.
    Introduction Genome = Partslist of a single genome CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 4.
    Introduction How to getfrom sequenced DNA to metabolic model? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 5.
    Introduction Wet lab work ShearDNA Extract DNA Sequence Bioinformatics N Reads 50-500 bp Assembly Contigs 1kb – 100 kbp Scaffolding N Scaffolds Hopefully Mbp CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 6.
    Definitions > 1 kbpinsert A sequenced piece of DNA Paired-end read Sequencing both ends of a short DNA fragment Mate-pair read Sequencing both ends of a long DNA fragment The length of the DNA fragment Contig 300-600 bp insert Read Insert size 50-500 bp A set of overlapping DNA segments that represents a consensus region of DNA Scaffold Contigs separated by gaps of known length Coverage The number of times a specific position in the genome is covered by reads length N 4x CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 7.
    Assembly Genome Fragment Sequence Paired-end reads Assemble Contig 1 Contig10 Scaffold 1 Inspiration: http://goo.gl/VOZVVg Contig 19 Scaffold 2 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 8.
    Assembly Sequencing Genome (3.000.000 letters) Inspiration: http://goo.gl/VOZVVg Assembly Reads (50-500letters each) Genome (3.000.000 letters) CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 9.
    Assembly “It was thebest of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity,.... “ Dickens, Charles. A Tale of Two Cities. 1859. London: Chapman Hall Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 10.
    Assembly Way too muchdata to make all vs. all comparison Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 11.
    Assembly Step 1: Convertreads into kmers Reads theageofwi sthebestof astheageof worstoftim Imesitwast the hea eag age geo eof ofw fwi sth the heb ebe bes est sto tof ast sth the hea eag age geo eof wor ors rst sto tof oft fti tim ime mes esi sit itw twa was ast Kmers (k = 3) Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 12.
    Assembly Step 2: Joinkmers with n-1 overlap ast sth Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 the hea eag age geo eof CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 13.
    Assembly Step 2: Joinkmers with n-1 overlap ast eag eag age age geo geo eof eof ofw ebe bes est sto sto tof tof wor ast hea hea heb sth sth the the the ors rst was fwi oft fti twa itw sit esi mes ime tim Do the same for all reads… Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 14.
    Assembly Step 3: Simplifythe graph Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 15.
    Assembly Contigs It was the = ≠ incredulity age be st epoch times wor wisdom of foolishness belief “Itwas the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity,.... “ Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 Dickens, Charles. A Tale of Two Cities. 1859. London: Chapman Hall CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 16.
    Assembly What is theminimum kmer size that results in a single contig? Kmer = 3 Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 17.
    Assembly What is theminimum kmer size that results in a single contig? Kmer = 3 Kmer = 10 Itwasthebestoftimesitwastheworstoftimesitwastheageofwisdomitwastheageoffoolishnessitwastheepochofbeliefitwastheepochofincredulity Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 18.
    Assembly Repeat = repeatedDNA sequence that can’t be spanned by reads Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 19.
    Assembly Why not justincrease the kmer size? Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 20.
    Assembly theageofwi Kmer = 3 the hea eag age geo eof ofw fwi Kmer= 9 theageofw heageofwi Kmers with errors = 2/2 Errors! Kmers with errors = 3/8 Example: http://goo.gl/nMWDAk Velvet example courtesy of J. Leipzig 2010 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 21.
    Validation I’ve assembled my4.3 Mbp genome into 25 scaffolds with a N50 of 553 kbp. Is it a good assembly? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 22.
    Validation N50 CENTER FOR MICROBIALCOMMUNITIES | AALBORG UNIVERSITY
  • 23.
    Validation Estimating repeat content CENTERFOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 24.
    Validation 4 repeats in2 copies each CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 25.
    Validation How could Iclose this genome? 4 repeats in 2 copies each CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 26.
    Validation How complete isthe genome? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 27.
    Validation Survey of essentialsingle copy genes across sequenced phyla Genes 100-106 Essential single copy genes (can also be used to identify contamination) Phyla CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 28.
    Validation Inspect the assembly CENTERFOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 29.
    Validation • N50 doesnot make much sense • Repeat content versus the number of scaffolds • Calculate the percentage of essential genes • Inspect the assembly CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 30.
    Metabolic reconstruction 4.3 Mbpgenome … and so what? (@Thursday) CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 31.
    Metagenomics Mads Albertsen Introduction tocommunity systems microbiology 2013 CENTER FOR MICROBIAL COMMUNITIES
  • 32.
    Introduction Genome = Partslist of a single genome CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 33.
    Introduction Photo: D. Kunkel;color, E. Latypova Metagenome = Parts list of the community CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 34.
    Introduction ”...functional analysis ofthe collective genomes of soil microflora, which we term the metagenome of the soil.” - J. Handelsman et al., 1998 CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 35.
    Introduction ”...functional analysis ofthe collective genomes of soil microflora, which we term the metagenome of the soil.” - J. Handelsman et al., 1998 PubMed: metagenom*[Title/Abstract] CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 36.
    Introduction ”...functional analysis ofthe collective genomes of soil microflora, which we term the metagenome of the soil.” - J. Handelsman et al., 1998 Sequencing costs PubMed: metagenom*[Title/Abstract] http://www.genome.gov/sequencingcosts/ CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 37.
    Introduction Metagenomics ≠ Ampliconsequencing CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 38.
    Sequencing and assembly ≈3.000.000bp pr. genome 150 bp reads ≈1000 bp+ contigs CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 39.
  • 40.
    What have metagenomicsbeen used for? Exploration Rusch et al., 2007 Plos Biology • 6.3 Gbp of sequence (2x Human genomes, 2000 x Bacterial genomes) • Most sequences were novel compared to the databases Qin et al., 2010 Nature • • • • 127 Human gut metagenomes 600 Gbp sequence (200 x Human genomes) 3.3 million genes identified Minimal gut metagenome definded CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 41.
    What have metagenomicsbeen used for? Comparative Dinsdale et al., 2008 Nature • A characteristic microbial fingerprint for each of the nine different ecosystem types Specific functions Hess et al., 2011 Science • Identified 27.755 putative carbohydrate-active genes from a cow rumen metagenome • Expressed 90 candidates of which 57% had enzymatic activity against cellulosic substrates CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 42.
    What have metagenomicsbeen used for? Extracting genomes Garcia Martin et al., 2006 Nat. Biotechnol. Albertsen et al., 2013 Nat. Biotechnol. • Genome extraction from low complexity metagenome • Candidatus Accumulibacter phosphatis • The first genome of a polyphosphate accumulating organism (PAO) with a major role en enhanced biological phosphorus removal • Genome extraction of low abundant species (< 0.1%) from metagenomes • First complete TM7 genome • Access to genomes of the ”uncultured majority” CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 43.
    Pitfalls CENTER FOR MICROBIALCOMMUNITIES | AALBORG UNIVERSITY
  • 44.
    Metagenomics made easy Greatresources – but use with care CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 45.
    MG-RAST example Contigs CENTER FORMICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 46.
    Dataset overview CENTER FORMICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 47.
    Taxonomy and Functionoverview Taxonomy Function CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 48.
    Compare with othersamples Samples Functional categories CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 49.
    Pitfalls You always getbillions of data! CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 50.
    Pitfalls Is your DNAextraction OK? ... and the samples you want to compare with? Did you sequence enough? Did you know the GC bias of your protocol? Did you normalize for sequencing depth? Did you use the same sequencing platform? Assembly = data not quantitative! Are you comparing assembled data with reads? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 51.
    Databases Contigs Databases Annotated metagenome ...you onlysee what is in the database CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 52.
    What is inthe databases? Finshed Genomes in IMG Vs. Greengenes 16S rRNA database Genomes 16S Phyla 29 90 Class 46 249 Order 100 405 Species 1268 99322* *97% clustering Note: only including 1 strain pr. species CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 53.
    MG-RAST example Contigs 650.000 EBPRproteins with taxonomy assigned How similar are they to the genomes in the database? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  • 54.
    Sludge microbes vs. database genomes: 650.000 EBPR proteins. Note: not abundance weighted
  • 55.
    Sludge microbes vs. database genomes: 650.000 EBPR proteins; 1.260.000 Human gut (Qin et al., 2010 Nature; RAST ID: 4448044.3). Note: not abundance weighted
  • 56.
    Sludge microbes vs. database genomes: The 7 genera with most EBPR proteins assigned
  • 57.
    Effect of missing genomes: What is the effect of not having closely related genomes in the database? 1. Remove a genome from the database 2. Search the removed genome against the database
  • 58.
    Effect of missing genomes: Accumulibacter phosphatis (4326 proteins) → blastp → Best hit. Related genomes: Bacteria 1268, Proteobacteria 564, Betaproteobacteria 84, Rhodocyclales 5, Rhodocyclaceae 5
  • 59.
    Effect of missing genomes: Accumulibacter phosphatis (4326 proteins) → blastp → Best hit (figure label: Azoarcus). Related genomes: Bacteria 1268, Proteobacteria 564, Betaproteobacteria 84, Rhodocyclales 5, Rhodocyclaceae 5
  • 60.
    Effect of missing genomes: Accumulibacter phosphatis (4326 proteins) → blastp → MEGAN LCA. Lowest common ancestor (LCA) approach: Hit 1: Betaproteobacteria 80% ID; Hit 2: Gammaproteobacteria 79% ID; Hit 3: Actinobacteria 59% ID → Assigned to Proteobacteria. Related genomes: Bacteria 1268, Proteobacteria 564, Betaproteobacteria 84, Rhodocyclales 5, Rhodocyclaceae 5
  • 61.
    Effect of missing genomes: Accumulibacter phosphatis (4326 proteins) → blastp → MEGAN LCA. Lowest common ancestor (LCA) approach: Hit 1: Betaproteobacteria 80% ID; Hit 2: Gammaproteobacteria 79% ID; Hit 3: Actinobacteria 59% ID → Assigned to Proteobacteria. Result for the 4326 proteins: • 27% correctly classified on genus level • 54% not assigned the correct class • 101 genera identified. Figure labels: No hits 261, Bacteria 325, Proteobacteria 860, Beta- 853, Rhodocyclaceae 1149, Genus. Related genomes: Bacteria 1268, Proteobacteria 564, Betaproteobacteria 84, Rhodocyclales 5, Rhodocyclaceae 5
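The LCA example with three hits can be sketched in a few lines. This is a simplified illustration, not MEGAN's implementation: it assumes that hits are first filtered to those within a chosen percentage of the best hit and uses the percent identities from the slide as the score, whereas MEGAN works on alignment bit scores. The function names and the `top_percent` default are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the shared prefix of a list of lineages (root first)."""
    shared = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


def assign_taxon(hits, top_percent=10.0):
    """Assign a protein to the LCA of all hits close to the best hit.

    hits        -- list of (score, lineage) tuples, lineage listed root first
    top_percent -- keep hits scoring within this percentage of the best score
    """
    if not hits:
        return "No hits"
    best = max(score for score, _ in hits)
    threshold = best * (1 - top_percent / 100)
    kept = [lineage for score, lineage in hits if score >= threshold]
    shared = lowest_common_ancestor(kept)
    return shared[-1] if shared else "Unassigned"


# The three hits from the slide, with percent identity used as the score.
hits = [
    (80, ["Bacteria", "Proteobacteria", "Betaproteobacteria"]),
    (79, ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]),
    (59, ["Bacteria", "Actinobacteria"]),
]

print(assign_taxon(hits))                   # hit 3 is filtered out
print(assign_taxon(hits, top_percent=100))  # all hits kept
```

With the filter, the weak Actinobacteria hit is ignored and the protein lands on Proteobacteria, as on the slide; with every hit kept, the assignment falls back to Bacteria. Either way, the protein cannot be placed more precisely than its closest database relatives allow.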
  • 62.
    Effect of missing genomes: Nitrospira defluvii (4268 proteins) → blastp → MEGAN LCA. Phylum: • 1% correctly classified on phylum level. Related genomes: Bacteria 1268, Nitrospirae 3
  • 63.
    Effect of missing genomes: What about function? Nitrospira defluvii → blastp → MEGAN LCA + KEGG. Related genomes: Bacteria 1268, Nitrospirae 3
  • 64.
    Effect of missing genomes: Nitrospira defluvii → blastp → MEGAN LCA + KEGG. Related genomes: Bacteria 1268, Nitrospirae 3
  • 65.
    Effect of missing genomes: Nitrospira defluvii → blastp → MEGAN LCA + KEGG. Related genomes: Bacteria 1268, Nitrospirae 3
  • 66.
    Implication of missing genomes: Function A, Function B, Function C, Function D
  • 67.
    Pitfalls: You always get billions of data!
  • 68.
    Metagenomics: “If you want to understand the ecosystem you need to understand the individual species in the ecosystem”
  • 69.
    Metagenomics: Lion + Eagle ≠ Flying Lion
  • 70.
    Potentials
  • 71.
    Who - when, where and why?
  • 72.
    How do we get the genomes? Culturing: • Few microorganisms can be easily cultured (<<5%) • Microorganisms need to be studied in their environment
  • 73.
    How do we get the genomes? What you think you study vs. what you actually study
  • 74.
    How do we get the genomes? Culturing: • Few microorganisms can be easily cultured (<<5%) • Microorganisms need to be studied in their environment. Single cell genomics: • Only routinely performed in specialized labs • Very incomplete genomes (mean 40%, range 10-90%) https://www.bigelow.org/
  • 75.
    How do we get the genomes? Culturing: • Few microorganisms can be easily cultured (<<5%) • Microorganisms need to be studied in their environment. Single cell genomics: • Only routinely performed in specialized labs • Very incomplete genomes (mean 40%, range 10-90%) https://www.bigelow.org/ Metagenomics
  • 76.
    Metagenomics: 100++ abundant species (≈3 Mbp each) → DNA extraction → Sequencing → Reads (100-150 bp) → Assembly → Contigs (1000+ bp). Why not full genomes?
  • 77.
    Metagenomics: 100++ abundant species (≈3 Mbp each) → DNA extraction → Sequencing → Reads (100-150 bp) → Assembly → Contigs (1000+ bp). Why not full genomes? 1. Micro-diversity 2. Separation of genomes (Binning)
  • 78.
    Extracting genomes: Not 1 strain, but many closely related strains: AAAAAAAAAAAAAA, AAAAAAAAATAAAA, AAAAAAAAACAAAA. What you get after assembly: AAAAAAAAA, AAAAA, TAAAA, CAAAA
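The fragmentation caused by micro-diversity can be mimicked with a toy example. This is only an illustration of the idea on the slide, not an assembler: it assumes the three strains differ at a single position and splits the sequences where they stop agreeing. The function name is hypothetical.

```python
import os


def split_at_first_variant(strains):
    """Split strain sequences into the shared start and the divergent ends."""
    shared = os.path.commonprefix(strains)
    ends = [strain[len(shared):] for strain in strains]
    return shared, ends


# Three closely related strains differing at one position (as on the slide).
strains = [
    "A" * 14,
    "A" * 9 + "T" + "A" * 4,
    "A" * 9 + "C" + "A" * 4,
]

shared, ends = split_at_first_variant(strains)
print(shared)  # one contig for the region all strains agree on
print(ends)    # one short contig per strain variant
```

Instead of one long sequence per strain, the result is one shared piece plus several short, strain-specific pieces.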
  • 79.
    Extracting genomes: High micro-diversity → Short term enrichment → Low micro-diversity
  • 80.
    Metagenomics: 100++ abundant species (≈3 Mbp each) → DNA extraction → Sequencing → Reads (100-150 bp) → Assembly → Contigs (1000+ bp). Why not full genomes? 1. Micro-diversity 2. Separation of genomes (Binning)
  • 81.
    Binning: Complex sample → “Binning” (PhD student). Genomic signatures: - GC / Codon usage - Tetranucleotide frequency + statistical method
  • 82.
    Binning: Complex sample → “Binning” (PhD student). Genomic signatures: - GC / Codon usage - Tetranucleotide frequency + statistical method. Problems: - Short pieces of sequence (1-10 kbp) - Local sequence divergence
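The genomic signatures named above can be computed directly from a contig sequence. The sketch below is a minimal, assumed implementation: it counts overlapping 4-mers on one strand only (many tools also merge reverse complements) and skips windows containing ambiguous bases. The function names and the example sequence are hypothetical.

```python
from itertools import product


def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = sequence.upper()
    valid = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / valid if valid else 0.0


def tetranucleotide_frequencies(sequence):
    """Relative frequency of each of the 256 possible 4-mers in a sequence."""
    counts = {"".join(kmer): 0 for kmer in product("ACGT", repeat=4)}
    seq = sequence.upper()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows with N or other symbols are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: (n / total if total else 0.0) for kmer, n in counts.items()}


contig = "ACGTACGT"  # hypothetical, far shorter than a real contig
signature = tetranucleotide_frequencies(contig)
print(gc_content(contig))
print(signature["ACGT"])
```

The 256-value signature is what a statistical method (for example PCA or clustering) would then compare between contigs. On short contigs each 4-mer is seen only a few times, which is one reason the slide lists short sequences as a problem.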
  • 83.
    Binning: Sequence composition-independent binning (panels: Abundance in Sample 1, Abundance in Sample 2)
  • 84.
    Binning: Sequence composition-independent binning (panels: Abundance in Sample 1, Abundance in Sample 2; combined plot: Abundance Sample 1 vs. Abundance Sample 2)
  • 85.
    Binning: 1. Reduce micro-diversity 2. Use multiple related samples (axes: Abundance Sample 1, Abundance Sample 2)
  • 86.
    Binning: 1. Reduce micro-diversity 2. Use multiple related samples (two plots, each with axes: Abundance Sample 1, Abundance Sample 2)
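Binning on abundance in two samples can be sketched as follows. The idea is that all contigs from one genome should have similar coverage within a sample, so they cluster together when coverage in sample 1 is plotted against coverage in sample 2. The sketch assumes a target cluster has already been spotted on the plot and simply selects contigs near it on a log scale; the function name, the distance cut-off and the coverage values are hypothetical.

```python
import math


def select_bin(coverages, center, max_log2_distance=0.5):
    """Select contigs whose coverage in two samples lies close to a target.

    coverages         -- dict: contig name -> (coverage sample 1, coverage sample 2)
    center            -- (coverage sample 1, coverage sample 2) of the target cluster
    max_log2_distance -- radius around the center, in log2 coverage units
    """
    selected = []
    for name, (cov1, cov2) in coverages.items():
        if cov1 <= 0 or cov2 <= 0:
            continue  # a log scale cannot place contigs with zero coverage
        d1 = math.log2(cov1) - math.log2(center[0])
        d2 = math.log2(cov2) - math.log2(center[1])
        if math.hypot(d1, d2) <= max_log2_distance:
            selected.append(name)
    return sorted(selected)


# Hypothetical contigs: abundant in sample 1 and rare in sample 2, or the reverse.
coverages = {
    "contig_1": (100, 10),
    "contig_2": (110, 9),
    "contig_3": (10, 100),
    "contig_4": (95, 11),
    "contig_5": (100, 100),
}

print(select_bin(coverages, center=(100, 10)))
```

Here `contig_5` has the same coverage as the target in sample 1 and would be indistinguishable with one sample alone; the second sample separates it, which is why multiple related samples help.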
  • 87.
    Binning • Nitrospira enrichment running for years • 3 dominant species • No micro-diversity (H. Daims & C. Dorninger, DOME, University of Vienna)
  • 88.
    Full-scale EBPR plant → Short term enrichment (days) → SBR reactor. 1. Reduction of (micro)-diversity. Albertsen et al., 2013 Nat. Biotech.
  • 89.
    Full-scale EBPR plant → Short term enrichment → SBR reactor. 2. Two different DNA extraction methods. Albertsen et al., 2013 Nat. Biotech.
  • 90.
    Colored using a set of 100 phylogenetic marker genes. Albertsen et al., 2013 Nat. Biotech.
  • 91.
    Colored using a set of 100 phylogenetic marker genes. TM7-1 (1.6%), TM7-2 (0.7%), TM7-3 (0.2%), TM7-4 (0.06%). Albertsen et al., 2013 Nat. Biotech.
  • 92.
    Colored using a set of 100 phylogenetic marker genes. Zoom on target: TM7-2 (0.7%). Albertsen et al., 2013 Nat. Biotech.
  • 93.
    Colored using a set of 100 phylogenetic marker genes. Zoom on target: TM7-2 (0.7%). PCA on genomic signatures (axes: PC1, PC2): TM7-2. Albertsen et al., 2013 Nat. Biotech.
  • 94.
    Colored using a set of 100 phylogenetic marker genes. TM7-1 (1.6%): Candidatus Saccharimonas aalborgensis. Candidate phylum TM7 → Saccharibacteria. Albertsen et al., 2013 Nat. Biotech.
  • 95.
    Genome validation: Assembly inspection; Essential single copy genes (axes: Genes (HMM models), Phyla). Albertsen et al., 2013 Nat. Biotech.
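Validation with essential single copy genes can be reduced to a simple tally once the genes have been detected (for example with HMM models, as on the slide). The sketch below assumes the detection step has already produced a list of gene names found in the genome bin; the function name and the gene names are hypothetical placeholders.

```python
from collections import Counter


def single_copy_summary(found_genes, essential_genes):
    """Summarize essential single copy genes found in a genome bin.

    found_genes     -- list of gene names detected in the bin (repeats allowed)
    essential_genes -- set of gene names expected exactly once per genome
    """
    if not essential_genes:
        raise ValueError("essential_genes must not be empty")
    counts = Counter(gene for gene in found_genes if gene in essential_genes)
    return {
        "completeness": len(counts) / len(essential_genes),
        "duplicated": sorted(gene for gene, n in counts.items() if n > 1),
        "missing": sorted(set(essential_genes) - set(counts)),
    }


# Hypothetical gene set and hits.
essential = {"geneA", "geneB", "geneC", "geneD"}
found = ["geneA", "geneB", "geneB", "geneX"]

summary = single_copy_summary(found, essential)
print(summary)
```

Missing genes suggest an incomplete bin, while genes found more than once suggest that contigs from another organism or strain have been included.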
  • 96.
    Multi-metagenome: http://madsalbertsen.github.io/multi-metagenome/ (short: goo.gl/0ctA3) • Guides • Workflow scripts • Example data • All the code • Recommendations. Albertsen et al., 2013 Nat. Biotech.
  • 97.
    Complex samples ... add more samples! (S. M. Karst, AAU)
  • 98.
    It’s just a potential! ..and a poorly translated description of it.
  • 99.
    Understanding ecosystems. Extraction: DNA (Meta-genomics), mRNA (Meta-transcriptomics), Proteins (Meta-proteomics), Metabolites (Meta-bolomics). In situ methods. Community structure, Microbial functions, Microbial needs. P-Removal: N-Removal: -Removal: Foaming: Ethanol production:
  • 100.
    Questions? ma@bio.aau.dk @MadsAlbertsen85 MadsAlbertsen. Per H. Nielsen, Simon J. McIlroy, Søren M. Karst, EB group. University of Vienna: C. Dorninger, H. Daims, M. Wagner. University of Queensland: G.W. Tyson, P. Hugenholtz.
