Early Career Research Fellow at Royal Botanic Gardens, Kew
Jul. 7, 2015
'Omics in extreme Environments (Lightweight bioinformatics)
Presentation on lightweight bioinformatics (Raspi / cloud computing) for real-time field-based analyses.
Presented at iEOS2015, St. Andrews, 3-6th July 2015.
1. Omics in extreme Environments
(Lightweight bioinformatics)
Joe Parker
Royal Botanic Gardens, Kew
2. Compute time is (much) cheaper than you think
"… and much cheaper than your time."
Physical portability requires software portability.
3. Kew
One of the largest living and tissue collections in the world: ca. 6000 genera (~1/3 of plant genera)
2020 Strategic Output: Plant and Fungal Trees of Life
4. Why in the field
• Spatial analysis
• ID & naming
• Image recognition
9. The cloud
• Power closely linked to budget (and only as limited as the budget)
• Almost infinitely scalable
• Need a connection to get data up there (and down!)
• Fiddly setup
11. Workflow
Software: BLAST 2.2.30, MUSCLE 3.8.31, RAxML 7.2.8+
Inputs: CEGMA genes, short reads
• Setup: set up workflow, binaries, and reference / alignment data. Deploy to machines.
• BLAST: protein-protein BLAST of reads (from the MG-RAST repository, Bass Strait oil field) against 458 core eukaryote genes from CEGMA. Keep only top hits. Use the max. num_threads available.
• Concatenate hits to CEGMA alignments: append top hit sequences to the CEGMA alignments.
• For each alignment:
– Align in MUSCLE using default parameters.
– Infer a de novo phylogeny in RAxML under Dayhoff, with a random starting tree and max. PTHREADS.
• Output and parse times.
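The workflow steps above could be driven by a small script along these lines. This is only a sketch under stated assumptions, not the author's actual pipeline: the function names, file names and the `run_workflow` helper are hypothetical; `-max_target_seqs 1` is used as a rough stand-in for "keep only top hits"; `PROTGAMMADAYHOFF` assumes a gamma rate model on top of Dayhoff, which the slide does not specify; and the step that appends hit sequences to the CEGMA alignments is left out. Depending on the RAxML version, the MUSCLE output may also need converting to PHYLIP first.

```python
import os
import subprocess
import sys
import time
from pathlib import Path


def blast_cmd(reads_faa, cegma_db, out_tsv, threads):
    # blastp is protein-protein BLAST; -outfmt 6 writes tab-separated hits.
    # Query = reads, database = CEGMA core genes (pre-built with makeblastdb).
    return ["blastp", "-query", str(reads_faa), "-db", str(cegma_db),
            "-outfmt", "6", "-max_target_seqs", "1",
            "-num_threads", str(threads), "-out", str(out_tsv)]


def muscle_cmd(in_fasta, out_fasta):
    # MUSCLE 3.x command line with default parameters.
    return ["muscle", "-in", str(in_fasta), "-out", str(out_fasta)]


def raxml_cmd(alignment, run_name, threads, seed=12345):
    # -T: number of PTHREADS; -d: random starting tree; -p: random seed.
    # The seed value is arbitrary and hypothetical.
    return ["raxmlHPC-PTHREADS", "-T", str(threads),
            "-m", "PROTGAMMADAYHOFF", "-d", "-p", str(seed),
            "-s", str(alignment), "-n", run_name]


def timed(cmd):
    """Run one command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    print(f"{cmd[0]}: {elapsed:.2f} s", file=sys.stderr)
    return elapsed


def run_workflow(reads_faa, cegma_db, alignments, threads=None):
    """Time BLAST once, then MUSCLE and RAxML for each alignment."""
    threads = threads or os.cpu_count() or 1
    times = {"blast": timed(blast_cmd(reads_faa, cegma_db, "hits.tsv", threads))}
    # Appending the top-hit sequences to each alignment is omitted here.
    for aln in alignments:
        aligned = f"{aln}.aligned"
        times[f"muscle:{aln}"] = timed(muscle_cmd(aln, aligned))
        times[f"raxml:{aln}"] = timed(raxml_cmd(aligned, Path(aln).stem, threads))
    return times
```

Keeping the command construction separate from execution means the same script can be deployed unchanged to a Pi, a laptop or a cloud instance, with only the thread count differing.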
15. The cloud in practice
• Fiddly setup, easy to replicate
• Need a connection to get data up there (and down!)
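To get a feel for the connection constraint, a back-of-envelope estimate of transfer time can help. The sketch below is illustrative only: the function name, the `efficiency` fudge factor and the example figures are hypothetical and do not come from the talk.

```python
def transfer_seconds(size_bytes, link_bits_per_second, efficiency=0.8):
    """Estimate the time to move a file over a link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, dropouts); 0.8 is a guess.
    """
    if link_bits_per_second <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be > 0 and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


# Hypothetical example: 1 GB of reads over an ideal 2 Mbit/s field link.
seconds = transfer_seconds(1_000_000_000, 2_000_000, efficiency=1.0)
print(f"{seconds / 60:.0f} minutes")
```

The same estimate applies in both directions, so results coming back down need budgeting for as well.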
16. Conclusions
• The Pi offers opportunities but is not there yet; you'll still need a connection unless you're very lucky.
• Installation in situ?
• Consider cloud computing (connections can only improve)
• Portability of the workflow enhances portability of the system
– …which you should be embracing anyway for reproducibility…