A presentation to a lay audience at Melbourne Knowledge Week on how bacteria are part of our lives and what we are doing with genomics to manage them.
9. Can’t always get what you want
We had a bunch of nice long DNA sequences
(each 4 million letters long)
We got back millions of short DNA sequences
(each only 200 letters long)
We want our nice long DNA back!
(please)
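To make the shredding concrete, here is a minimal sketch that cuts a toy genome into random fixed-length pieces and works out how many times over the genome has been read. The function names `shred` and `coverage` are made up for this example, and the figure of 2,000,000 pieces is an assumption (the slide only says "millions").

```python
import random

def shred(genome, read_length, n_reads, seed=0):
    """Sample n_reads random substrings of length read_length from genome."""
    rng = random.Random(seed)
    last_start = len(genome) - read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # randint includes both ends
        reads.append(genome[start:start + read_length])
    return reads

def coverage(genome_length, read_length, n_reads):
    """Average number of times each letter of the genome is read."""
    return n_reads * read_length / genome_length

# Assumed numbers: 2,000,000 pieces of 200 letters from a 4,000,000 letter genome.
print(coverage(4_000_000, 200, 2_000_000))  # 100.0
```

Under those assumed numbers each letter is read about 100 times on average, which is why duplicate pieces are expected rather than surprising.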
11. Like a jigsaw puzzle, but ...
● No box
● Millions of pieces
● Missing and duplicate pieces
● Broken pieces
● No corner or edge pieces
→ Usually end up with ~200 sequences
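The jigsaw idea can be sketched as a toy greedy assembler: repeatedly join the two pieces whose ends overlap the most, and stop when nothing overlaps any more. This is an illustration only, not how the talk's software or any particular assembler works; `overlap`, `greedy_assemble` and the `min_len` threshold are names chosen for this example.

```python
def overlap(a, b, min_len):
    """Length of the longest end of a that matches the start of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no pair overlaps by min_len."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate pieces
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)

pieces = ["AGGCTTAC", "ACGTTGCA", "TGCAAGGC", "ACGTTGCA"]
print(greedy_assemble(pieces))  # ['ACGTTGCAAGGCTTAC']
```

If the middle piece is missing, the two remaining pieces cannot be joined and the function returns them separately. Gaps like that are one reason an assembly ends up as a set of sequences rather than one complete genome.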
12. Finding genes
Genome is ~4,000,000 letters long
Contains ~4,000 genes
Each gene is ~800 letters long
Genes start and end with special triplets
←ATGCATGATTAGCTTTTAGTCTTATAATGTCTTATATATCGCATTTAAGCCCTGATTCTATGAATG→
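A minimal sketch of the "special triplets" idea: scan the letters three at a time, open a candidate gene at the start triplet ATG and close it at the first stop triplet (TAA, TAG or TGA). It is a simplification under stated assumptions: it reads only the forward strand, treats ATG as the only start, and real gene finders use much more evidence. `find_orfs`, `min_codons` and `slide_sequence` are names made up for this example.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, letters) for each start-to-stop stretch found."""
    orfs = []
    for frame in range(3):  # three ways to group letters into triplets
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate gene
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # close it and keep scanning
    return sorted(orfs)

# The example letters shown on the slide.
slide_sequence = ("ATGCATGATTAGCTTTTAGTCTTATAATGTCTTATATATCGC"
                  "ATTTAAGCCCTGATTCTATGAATG")
for start, end, letters in find_orfs(slide_sequence):
    print(start, end, letters)

# Rough share of the genome taken up by genes, using the slide's numbers.
coding_fraction = 4_000 * 800 / 4_000_000
print(coding_fraction)  # 0.8
```

With the slide's round numbers, about 80% of the genome's letters sit inside genes.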
13. Applications
● Identify new species
● Find resistance genes
● Understand evolution
● Trace outbreak origin
14. Computational challenge
2,000 finished genomes
10,000 assembled draft genomes
200,000 downloadable genomes
2,000,000 sitting on USB disks?
Genome assembly is different
- RAM more useful than CPU
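One common reason assembly leans on memory, offered here as background rather than something the slide spells out: many assemblers keep a table of every short word (k-mer) seen in the pieces, and that table has to sit in RAM. The sketch below builds such a table for a toy input; `count_kmers` is a name made up for this example.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-letter word across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

table = count_kmers(["ACGTACGT"], 4)
print(len(table), table["ACGT"])  # 4 2
```

The counting itself is cheap for the CPU, but the table gains an entry for every distinct word, so it grows with the size of the genome and with errors in the pieces.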