Insights into the evolution and development of planarian regeneration from the genome of the flatworm Girardia tigrina
SUJAI KUMAR
2014-07-24 VIENNA EURO EVODEVO
WHAT SHOULD BIOINFORMATICS DO FOR EVODEVO?
SUJAI KUMAR
• Cartoonist and mathematics teacher in New Delhi
• MS in Educational Psychology at the University of Illinois (finding patterns in sequences: TIMSS 1999 video study)
• Self-organising systems research in New Delhi
• Sequenced four nematode genomes for PhD in Blaxter Lab, Edinburgh
• Planarian regeneration genomics in Aboobaker Lab, Oxford
Image credit: "Winkel triple projection SW" by Strebe - Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons. http://commons.wikimedia.org/wiki/File:Winkel_triple_projection_SW.jpg
Outline of this talk
1. Regeneration, planarian flatworms, and Girardia tigrina
2. Creating G tigrina genomic resources
3. Using these resources to understand regeneration
4. What should bioinformatics do for EvoDevo
1. Regeneration, planarian flatworms, and Girardia tigrina
Bely and Nyberg, 2010 DOI:10.1016/j.tree.2009.08.005
Kao, 2014. PhD Thesis “Transcriptome assembly and analysis of the freshwater planarian Schmidtea mediterranea”
[Figure: Platyhelminthes phylogeny showing Cestoda, Monogenea, Trematoda, Rhabditophora, Turbellaria, Tricladida, Macrostomorpha, Lecithoepitheliata, Rhabdocoela, and Polycladida; six taxa are marked T. Two species are marked G: Girardia tigrina (aboobakerlab.com/genomes) and Schmidtea mediterranea (smedgd.neuro.utah.edu).]
• What we know already
• Some genes and pathways that are essential for whole-body regeneration (WBR)
• Some transcription expression profiles
• No transgenics in any planarian
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
Illumina HiSeq: workhorse. Short paired reads; ~$£€ 1,000 / 100 MegaBase; mate pairs essential.
PacBio: expensive. High-quality fly genome; ~$£€ 10,000 / 100 MegaBase.
Nanopore: not a game changer just yet.
• Quality Control
• Raw data QC: FastQC
• Preliminary assembly: Blobology
• Separate components: contaminants / endosymbionts / mitochondrial
• Assess insert sizes: bad mate-pair libraries confound scaffolding
[Figure: Taxon-annotated GC-coverage (TAGC) plots, a.k.a. “Blobology”. Each point is a contig from a preliminary assembly (Caenorhabditis sp. 5). Axes: GC content and read coverage. A second plot shows Girardia tigrina.]
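The two quantities on a TAGC plot can be sketched in a few lines. This is a minimal illustration, not the Blobology code itself: the contig names, coverage values, and taxon labels below are made-up placeholders, and in practice coverage would come from read mapping and taxon labels from similarity searches.

```python
from collections import Counter


def gc_content(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T); 0.0 if none."""
    counts = Counter(sequence.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    return gc / total if total else 0.0


def tagc_points(contigs, coverage, taxon):
    """Return one (name, gc, coverage, taxon) tuple per contig, ready for plotting."""
    return [
        (name, gc_content(seq), coverage.get(name, 0.0), taxon.get(name, "no-hit"))
        for name, seq in contigs.items()
    ]


# Hypothetical example data, for illustration only.
contigs = {"contig_1": "ATGCATGC", "contig_2": "GGGCCCGA"}
coverage = {"contig_1": 42.0, "contig_2": 310.5}
taxon = {"contig_2": "Proteobacteria"}

for point in tagc_points(contigs, coverage, taxon):
    print(point)
```

Contigs that form a separate cloud in GC content and coverage, and that share a taxon label, are the candidates for contaminants or endosymbionts to separate before the final assembly.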
• Generate many assemblies
• ABySS, CLC, MaSuRCA, SGA, SPAdes, ALLPATHS-LG
• Evaluate assemblies
• FRCbam, REAPR, CGAL
• CEGMA, alignments to known sequences
• Freeze and release
• NOT a great assembly
• But it was GoodEnough™
• Next version with long-insert mate pairs
• Diploid, but high heterozygosity
Assembly version      nGt.0.3                               nGt.0.5
Raw read data         ~500M short read pairs, 160 Gbases    Consolidating near-identical contigs
Total span (Gbases)   1.898                                 1.500
Num contigs           581,558                               422,617
Span contigs >10kb    541,653,308                           536,575,093
Num contigs >10kb     29,050                                27,495
N50                   5,751                                 6,827
CEGMA                 45%                                   56%
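For readers unfamiliar with the N50 and ">10kb" rows, here is a minimal sketch of how such summary statistics are commonly computed from a list of contig lengths. The function names and the toy lengths are illustrative assumptions, not the scripts used for the nGt assemblies.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L cover at least half the total span."""
    ordered = sorted(lengths, reverse=True)
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length
    return 0  # empty input


def span_over(lengths, min_length):
    """Return (number of contigs, total span) for contigs longer than min_length."""
    selected = [length for length in lengths if length > min_length]
    return len(selected), sum(selected)


# Toy contig lengths, for illustration only.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
print(span_over(toy_lengths, 4))
```

A higher N50 with a lower contig count, as between the two versions in the table, indicates a less fragmented assembly, although N50 alone says nothing about correctness.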
• Gene prediction
• RNA-seq
• Predictors: Augustus, SNAP, GeneMark
• Consolidators: MAKER, EVM, Ensembl genebuild
• Evaluate: use Annotation Edit Distance (AED) as a metric
• Functional annotation
• InterProScan, Trinotate, Blast2GO
• Community annotation
• WebApollo, Community Annotation Portal
Annotation version   Num of genes   Num of genes with AED>0.5   Mean aa length   Num of genes with InterPro annotations
nGt.0.5.1            39,119         35,061                      268              22,747
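As a rough illustration of the AED metric named above: AED is usually described as 1 minus the average of sensitivity and specificity of a gene model against its supporting evidence, measured at the nucleotide level, so 0 means perfect agreement and 1 means no overlap. The sketch below follows that description. The exon coordinates are hypothetical, and real pipelines such as MAKER compute this internally with more care (strand, multiple evidence types).

```python
def covered_positions(exons):
    """Set of 1-based positions covered by a list of inclusive (start, end) exons."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def annotation_edit_distance(predicted_exons, evidence_exons):
    """AED = 1 - (sensitivity + specificity) / 2, computed over nucleotide positions."""
    predicted = covered_positions(predicted_exons)
    evidence = covered_positions(evidence_exons)
    if not predicted or not evidence:
        return 1.0  # nothing to compare: treat as no support
    overlap = len(predicted & evidence)
    sensitivity = overlap / len(evidence)   # fraction of evidence recovered
    specificity = overlap / len(predicted)  # fraction of prediction supported
    return 1.0 - (sensitivity + specificity) / 2


# Hypothetical gene model and evidence alignment, for illustration only.
print(annotation_edit_distance([(1, 100)], [(51, 150)]))
```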
• Genome Browser
• Blast server
• Bulk data downloads
• Interface
• Badger, Tripal, InterMine, Ensembl
3. Using these resources to understand regeneration
• Individual genes and pathways
• Transgenics
• Protein ortholog analysis
• 4 triclads, 1 other platyhelminth, 2 ecdysozoa, 4 deuterostomes
• 14k out of 40k G tigrina proteins in strict ortholog clusters
• ~8000 triclad-specific clusters
• ~800 triclad-specific clusters with all 4 species represented
• Cis-regulatory analysis
• Neoblast specific regulatory regions
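The cluster counts in the protein ortholog analysis can be thought of as a simple filter over cluster membership. The sketch below assumes clusters have already been produced by an orthology tool and are available as lists of (species, protein) pairs. The species labels, protein names, and the definition of "triclad-specific" as "every member comes from a triclad" are assumptions made for illustration.

```python
# Hypothetical labels standing in for the four triclad species in the analysis.
TRICLADS = frozenset({"triclad_A", "triclad_B", "triclad_C", "triclad_D"})


def classify_clusters(clusters, triclads=TRICLADS):
    """Return (triclad_specific_ids, all_triclads_ids) from {cluster_id: [(species, protein), ...]}."""
    specific = []
    complete = []
    for cluster_id, members in clusters.items():
        species = {sp for sp, _protein in members}
        if species and species <= triclads:      # only triclad members
            specific.append(cluster_id)
            if species == triclads:              # all four species represented
                complete.append(cluster_id)
    return specific, complete


# Made-up clusters, for illustration only.
clusters = {
    "c1": [("triclad_A", "p1"), ("triclad_B", "p2")],
    "c2": [("triclad_A", "p3"), ("triclad_B", "p4"), ("triclad_C", "p5"), ("triclad_D", "p6")],
    "c3": [("triclad_A", "p7"), ("deuterostome_X", "p8")],
}
print(classify_clusters(clusters))
```

Requiring all four species, as in the last count on the slide, trades cluster number for confidence that a cluster is not an artefact of one fragmented assembly.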
4. What should bioinformatics do for EvoDevo
• What should I do for an experimental EvoDevo lab
• Visual > Text
• View additional information in place
• Plot everything vs everything
• Create gene models visually
• Routine analyses should not require bioinformatician
• Clear explanations of how a resource was created
• Not too many versions
• Minimum standards
• What should the bioinformatics community do for me as an EvoDevo bioinformatician
• Best practice documentation for analyses
• Easy to install tools
• Minimum standards for assembly, metadata, annotation, and delivery
• Grants for coordination, tools, resources
Summary
• Please use the resources at aboobakerlab.com/genomes
• Tell us what other resources you’d like to see as standard
• Fund technology development and training
Acknowledgements
• AboobakerLab.com
• Aziz Aboobaker
• Natalia Pouchkina-Stantcheva
• Damian Kao
• Yuliana Mihaylova
• Aphrodite Zhao
• Blaxter Lab (nematodes.org)
• Ben Elsworth (Badger)
• Sequencing
• Edinburgh Genomics
• Funding
• BBSRC
• BSDB / Company of Biologists travel grant

What should Bioinformatics do for EvoDevo?

  • 1. Insights into the evolution and development of planarian regeneration from the genome of the flatworm Girardia tigrina SUJAI KUMAR 2014-07-24 VIENNA EURO EVODEVO WHAT SHOULD BIOINFORMATICS DO FOR EVODEVO?
  • 3. SUJAI KUMAR "Winkel triple projection SW" by Strebe - Own work Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons http://commons.wikimedia.org/wiki/File:Winkel_triple_projection_SW.jpg Cartoonist and mathematics teacher in New Delhi
  • 4. SUJAI KUMAR Finding patterns in sequences: TIMSS 1999 video study MS in Educational Psychology at the University of Illinois
  • 6. SUJAI KUMAR Sequenced four nematode genomes for PhD in Blaxter Lab, Edinburgh
  • 8. Outline of this talk 1. Regeneration, planarian flatworms, and Girardia tigrina 2. Creating G tigrina genomic resources 3. Using these resources to understand regeneration 4. What should bioinformatics do for EvoDevo
  • 9. 1. Regeneration, planarian flatworms, and Girardia tigrina Bely and Nyberg, 2010 DOI:10.1016/j.tree.2009.08.005
  • 10. 1. Regeneration, planarian flatworms, and Girardia tigrina Kao, 2014. PhD Thesis “Transcriptome assembly and analysis of the freshwater planarian Schmidtea mediterranea” Platyhelminthes Cestoda Monogenea Trematoda Rhabditophora Turbellaria Tricladida Macrostomorpha Lecithoepitheliata Rhabdocoela T T T T T T Girardia tigrina aboobakerlab.com/genomes G Schmidtea mediterranea smedgd.neuro.utah.edu G Polycladida
  • 11. 1. Regeneration, planarian flatworms, and Girardia tigrina • What we know already • Some genes and pathways that are essential for WBR • Some transcription expression profiles • No transgenics in any planarian
  • 12. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery
  • 13. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery Illumina HiSeq: Workhorse Short paired reads ~$£€ 1,000 / 100 MegaBase Mate pairs essential PacBio: expensive High quality fly genome ~$£€ 10,000 / 100 MegaBase Nanopore – not a game changer just yet
  • 14. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery • Quality Control • Raw data QC fastqc • Preliminary assembly Blobology • Separate components contaminants/ endosymbionts/ mitochondrial • Assess insert sizes Bad mate pair libraries confound scaffolding
  • 15. Each point is a contig from a preliminary assembly (Caenorhabditis Sp. 5) Taxon-annotated GC-Coverage (TAGC) Plots a.k.a “Blobology”
  • 17. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery • Quality Control • Raw data QC fastqc • Preliminary assembly Blobology • Separate components contaminants/ endosymbionts/ mitochondrial • Assess insert sizes Bad mate pair libraries confound scaffolding • Generate many assemblies • ABySS, CLC, MaSurCA, SGA, Spades, ALLPATHS-LG • Evaluate assemblies • FRCbam, REAPR, CGAL • CEGMA, alignments to known sequences • Freeze and release
  • 18. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery • NOT a great assembly • But it was GoodEnough™ • Next version with long-insert mate pairs • Diploid, but high heterozygosity Assembly version nGt.0.3 nGt.0.5 Raw read data ~500M short read pairs 160 GBases Consolidating near identical contigs Total Span Gbases 1.898 1.500 Num Contigs 581,558 422,617 Span Contigs >10kb 541,653,308 536,575,093 Num Contigs >10kb 29,050 27,495 N50 5,751 6,827 CEGMA 45% 56%
  • 19. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery • Gene prediction • RNA-seq • Predictors Augustus, SNAP, GeneMark • Consolidators MAKER, EVM, ENSEMBL genebuild • Evaluate use Annotation Edit Distance (AED) as a metric • Functional annotation • InterProScan, Trinotate, Blast2GO • Community annotation • WebApollo, Community Annotation Portal Annotation Version Num of Genes Num of Genes with AED>0.5 Mean aa length Num of Genes with InterPro annotations nGt.0.5.1 39,119 35,061 268 22,747
  • 20. 2. Creating G tigrina genomic resources Sequencing > Assembly > Annotation > Delivery • Genome Browser • Blast server • Bulk data downloads • Interface • Badger, Tripal, InterMine, Ensembl
  • 21. 3. Using these resources to understand regeneration • Individual genes and pathways • Transgenics • Protein ortholog analysis • 4 triclads, 1 other platyhelminth, 2 ecdysozoa, 4 deuterostomes • 14k out of 40k G tigrina proteins in strict ortholog clusters • ~8000 triclad-specific clusters • ~800 triclad-specific clusters with all 4 species represented • Cis-regulatory analysis • Neoblast specific regulatory regions
  • 22. 4. What should bioinformatics do for EvoDevo • What should I do for an experimental EvoDevo lab • Visual > Text • View additional information in place • Plot everything vs everything • Create gene models visually • Routine analyses should not require bioinformatician • Clear explanations of how a resource was created • Not too many versions • Minimum standards
  • 23. 4. What should bioinformatics do for EvoDevo • What should the bioinformatics community do for me as an EvoDevo bioinformatician • Best practice documentation for analyses • Easy to install tools • Minimum standards for assembly, metadata, annotation, and delivery • Grants for coordination, tools, resources
  • 24. Summary • Please use the resources at aboobakerlab.com/genomes • Tell us what other resources you’d like to see as standard • Fund technology development and training
  • 25. Acknowledgements • AboobakerLab.com • Aziz Aboobaker • Natalia Pouchkina-Stantcheva • Damian Kao • Yuliana Mihaylova • Aphrodite Zhao • Blaxter Lab (nematodes.org) • Ben Elsworth (Badger) • Sequencing • Edinburgh Genomics • Funding • BBSRC • BSDB / Company of Biologists travel grant
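
Slide 18 compares the two assembly versions by N50 (5,751 vs 6,827). A minimal sketch of how that statistic is usually computed, assuming the common definition (the contig length at which the cumulative length of contigs, sorted longest first, first reaches half the total span). The function name and the toy lengths are illustrative only and are not G tigrina data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumes the common definition: sort contigs longest first and
    return the length of the contig at which the running total
    first reaches half of the total assembly span.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length


# Toy contig lengths (made up for illustration).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

N50 rewards contiguity but says nothing about correctness, which is why the slides pair it with CEGMA completeness and tools such as REAPR.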

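Slide 19 uses Annotation Edit Distance (AED) to evaluate gene models. A hedged sketch of the idea, assuming the commonly cited nucleotide-level definition AED = 1 - (SN + SP) / 2, where SN is the fraction of evidence bases covered by the prediction and SP is the fraction of predicted bases supported by evidence. The function names and coordinates are hypothetical; this is not MAKER's own implementation.

```python
def covered_bases(intervals):
    """Expand half-open (start, end) exon intervals into a set of positions."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def annotation_edit_distance(prediction, evidence):
    """Return AED in [0, 1]; 0 means the prediction matches the evidence exactly."""
    pred = covered_bases(prediction)
    evid = covered_bases(evidence)
    if not pred or not evid:
        return 1.0  # no overlap is possible, so treat as maximally distant
    shared = len(pred & evid)
    sensitivity = shared / len(evid)   # evidence bases recovered
    specificity = shared / len(pred)   # predicted bases supported
    return 1 - (sensitivity + specificity) / 2


# Hypothetical exon coordinates for one gene model and its RNA-seq evidence.
predicted_exons = [(0, 100), (200, 300)]
evidence_exons = [(50, 100), (200, 350)]
print(annotation_edit_distance(predicted_exons, evidence_exons))
```

Expanding intervals into sets is simple to read but memory-hungry; for whole-genome use an interval-arithmetic version would be preferable.
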
Editor's Notes

  1. Target audience Biologists who want genomic resources for their favourite species but are not sure of what is possible Bioinformaticians who are creating these resources
  2. Pic of sequencing from Lex – highlight technologies, put costs/advantages on side Drosophila contigs better than Sanger Sequencing 42 SMRT for 160 MB cells, approx cost for 60 = 18,000, for 40 should cost ~12,000 PacBio promising 4X improvement so even lower Human 54X coverage ~ 40,000 so much less
  3. 1 in 200 bp 1 in 500 bp
  4. Delivery – whatever you choose, be sure it can support a large number of draft contigs/scaffolds – many tools that work well for a few chromosomes or a few hundred scaffolds don’t work so well for thousands (as we had)
  5. Think of a good way to get Triclad only Lopho only Protostomia only All --- (rename sp as cols, and check cols) using regexp (with counts)
  6. Work in progress. We do some things ok. 1. Planmine and Badger vs FTP downloads 2. Additional info in one place (don’t make users click around too much). Gbrowse is fabulous - http://banana-genome.cirad.fr great example, inline help as well 3. In the future – on the fly? (what takes time is the computation and recomputation for a whole genome, but individual genes/contigs should be doable on the fly) 3. Say something provocative
  7. Work in progress. We do some things ok. 1. Compare badger to – FTP download site (because it has descriptive information) 2. Additional info in one place (don’t make users click around too much). Gbrowse is fabulous - http://banana-genome.cirad.fr great example, inline help as well In the future – on the fly? (what takes time is the computation and recomputation for a whole genome, but individual genes/contigs should be doable on the fly) 3. Say something provocative
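
Note 5 asks for a way to pull out triclad-only clusters (with per-species counts) using a regexp, which is what slide 21's "triclad-specific clusters with all 4 species represented" needs. One possible sketch, assuming each protein ID carries a species prefix before a `|`. The prefixes, cluster IDs, and protein IDs below are all hypothetical placeholders, not the real data.

```python
import re
from collections import Counter

# Hypothetical species prefixes standing in for the four triclads.
TRICLADS = {"Gtig", "Smed", "Tri3", "Tri4"}
SPECIES_PREFIX = re.compile(r"^([A-Za-z0-9]+)\|")


def species_counts(members):
    """Count proteins per species prefix in one ortholog cluster."""
    counts = Counter()
    for protein_id in members:
        match = SPECIES_PREFIX.match(protein_id)
        if match:
            counts[match.group(1)] += 1
    return counts


def classify(members):
    """Label a cluster by which species it contains."""
    present = set(species_counts(members))
    if not present:
        return "unlabelled"
    if present <= TRICLADS:
        if present == TRICLADS:
            return "triclad-only, all species"
        return "triclad-only, partial"
    return "shared with non-triclads"


# Made-up clusters for illustration.
clusters = {
    "c1": ["Gtig|g1", "Smed|s1", "Tri3|a1", "Tri4|b1"],
    "c2": ["Gtig|g2", "Gtig|g3", "Smed|s2"],
    "c3": ["Gtig|g4", "Ecd1|e1"],
}
summary = Counter(classify(members) for members in clusters.values())
for label, count in sorted(summary.items()):
    print(label, count)
```

The same pattern extends to "Lopho only" or "Protostomia only" by swapping in a different prefix set.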