Presentation at ZSJ 2013 by Shigehiro Kuraku

‘the complete set of phylogenetic trees derived
from the proteome of an organism’
Sicheritz-Pontén and Andersson, 2001. Nuc. Acids Res. 29: 545

genome-wide events
+
gene family-specific events

August 2012. At Daitoku-ji Temple, Kyoto
[Figure: three alternative tree topologies (Hypothesis A, Hypothesis B, Hypothesis C) relating human, chicken, shark, cyclostomes (lamprey and hagfish), amphioxus and tunicate]

- Mol. phylogeny of 55 gene families
Kuraku et al., 2009. MBE

- Globin gene phylogeny
Hoffmann et al., 2010. PNAS

- Sea lamprey genome analysis
Smith, Kuraku et al., 2013.
Nature Genetics

- Composition of Hox/Dlx clusters
Neidert et al., 2001. PNAS
Irvine et al., 2002. J Exp Zool B
Force et al., 2002. J Exp Zool B etc
- Mol. phylogeny of 33 gene families
Escriva et al., 2002. MBE
- Amphioxus genome
Putnam et al., 2008. Nature

- ParaHox clusters
Furlong et al., 2007. MBE
Kuraku and Kuratani, 2011

Heuristic ML
JTT+G4
ML-BP/NJ-BP
(Kuraku & Kuratani, 2011. Genome Biol. Evol.)

(cf. hidden paralogy)
Informatics

Modern sequencing

Genome Resource & Analysis Unit
Center for Developmental Biology
RIKEN, Kobe, Japan

Molecular Developmental Biology
Sanger sequencing, Cell sorting with FACS, clone distribution, etc.

illumina HiSeq1500

~150 bp reads
in Rapid Run mode

Installed in November 2011
Not only sequencing

Kuraku et al., 2013. Nucleic Acids Res.

Amemiya et al., 2013
Our experiences at GRAS
・Main applications: RNA-seq & ChIP-seq
・Diverse non-model organisms for RNA-seq
・Troubleshooting with tight wet-dry communication
・Many requests with limited sample amounts
For retrieving complete genome and original transcriptome

・Sequencers ‘can’ produce ‘data’ from problematic samples
Low quality DNA/RNA, contamination, over-amplification, …

・Look carefully for acceptable pricing and service contents
e.g. How many reads do you need?

・Longer illumina reads are not necessarily beneficial
~150 bp on HiSeq and ~300 bp on MiSeq (as of September 2013)
Prep of libraries with longer inserts
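
As a rough aid for the "How many reads do you need?" question above, the following sketch applies the standard depth relation (average coverage = sequenced bases / genome size). The function name, the 2 Gb genome size and the 50x target are illustrative assumptions, not figures from the talk.

```python
def reads_needed(genome_size_bp, target_coverage, read_length_bp, paired=True):
    """Number of reads (or read pairs, if paired=True) for a target average depth."""
    bases_needed = genome_size_bp * target_coverage
    bases_per_unit = read_length_bp * (2 if paired else 1)
    # Integer ceiling division, so the target depth is reached rather than undershot.
    return -(-bases_needed // bases_per_unit)


# Hypothetical example: 2 Gb genome, 50x depth, 150 bp paired-end reads.
pairs = reads_needed(2_000_000_000, 50, 150)
print(f"{pairs:,} read pairs")
```

The estimate ignores duplicates, adapter content and uneven coverage, so a real request would normally add a safety margin.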
Species              Sequenced at     Gene model by          Sequencing technology  Published in        # of authors  Started in
sea lamprey          Wash. Univ.      Yandell lab / Ensembl  Sanger                 Nat. Genet. (2013)  59            2005?
soft-shelled turtle  BGI              BGI / Ensembl          illumina               Nat. Genet. (2013)  34            2010
coelacanth           Broad Institute  Broad / Ensembl        illumina               Nature (2013)       91            2011

Sequenced at Wash. Univ. Genome Institute

International consortium
Smith, Kuraku, et al. 2013.
Nature Genetics
Contributed analysis
Vertebrate ‘new genes’
GC & codon usage bias
Myelin-associated genes

In-house annotation effort
Trained gene prediction setting
available at Augustus web server

GC-content & codon usage bias
Qiu et al., 2011. BMC Genomics

Horizontal gene transfer
Kuraku et al., 2012. Genome Biol. Evol.

http://www.ensembl.org/Petromyzon_marinus/Info/Index

Coding genes: 10,415

Incomplete genome assembly: Pax6 missing
Incomplete gene annotation: Fgf8/17-A missing
(as of September 2013; release 73)
Amino acid composition
Methods: Correspondence analysis (CA) for frequencies of 20 amino acids

Deviation of ‘gene model’ in lamprey genome
Smith, Kuraku, et al. 2013. Nature Genetics
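
The input to a correspondence analysis of this kind is a table of amino acid frequencies, one row per gene set or species. The sketch below covers only that input step, using the standard library; the CA itself would need a linear algebra package. The sequences and the names `aa_frequencies` and `proteomes` are made-up placeholders, not data from the lamprey study.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def aa_frequencies(proteins):
    """Relative frequency of each of the 20 amino acids across a set of proteins."""
    counts = Counter()
    for seq in proteins:
        # Skip ambiguous or non-standard symbols such as X, B, Z or '*'.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy proteomes; real input would be predicted protein sets.
proteomes = {
    "species_A": ["MKTAYIAKQR", "MGLSDGEWQL"],
    "species_B": ["MPPGGAAGRA", "MAGRPGSSGA"],
}
table = {name: aa_frequencies(seqs) for name, seqs in proteomes.items()}
for name, row in table.items():
    print(name, " ".join(f"{aa}={f:.2f}" for aa, f in zip(AMINO_ACIDS, row)))
```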
Codon usage bias
Methods: RSCU (Sharp et al., 1986) and ENc (Wright, 1990)
[Figure: codon usage compared across sea lamprey, stickleback, Tetraodon, Takifugu, platypus, medaka, dog, human, mouse, ghost shark, zebrafish, chicken, anole lizard, opossum and X. tropicalis]

Heavy use of GC-rich codons
Qiu et al., 2011. BMC Genomics
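
For readers unfamiliar with RSCU, the sketch below computes it as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code and in-frame coding sequences; the function name and the toy sequence are illustrative only, and this is not the pipeline used in the cited papers.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def rscu(cds_list):
    """Relative synonymous codon usage for a collection of in-frame CDSs."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TO_AA:  # skips codons containing N etc.
                counts[codon] += 1

    families = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy example: four alanine codons, three of them GCC.
values = rscu(["GCCGCCGCCGCT"])
print({codon: v for codon, v in values.items() if CODON_TO_AA[codon] == "A"})
```

Values above 1 mark codons used more often than expected under equal usage; a genome with heavy use of GC-rich codons would show this for G- or C-ending codons.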
Genomic DNA
  ↓ Sequencing: Sanger, 454, illumina, and/or PacBio (issue: heterochromatin etc.)
Raw reads
  ↓ Assembly (issue: repeats, regions with low depth)
Genome assembly (contigs/scaffolds)
  ↓ Gene prediction, after ‘training’ (issue: ‘unusual’ genes)
‘Gene model’ (protein-coding sequences)

Reference: transcriptome, annotated genes in GenBank
(cf. Assemblathon2 - Bradnam et al., 2013)

‘NG50’ instead of N50
CEGMA (Parra et al., 2007) – coverage of CEGs
CGAL, REAPR, ALE – evaluation by identifying misassemblies

QUAST – computation of assembly summary
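
The difference between N50 and NG50 can be sketched as follows: both take the length of the contig or scaffold at which the cumulative length (longest first) reaches half of a target, but N50 uses the assembly's own total length while NG50 uses the estimated genome size. The function name and the contig lengths are illustrative assumptions.

```python
def nx50(lengths, genome_size=None):
    """N50 if genome_size is None, otherwise NG50 against the given genome size.

    Returns None when the assembly does not reach half of the target length.
    """
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


# Hypothetical contig lengths (total 290) and an estimated genome size of 400.
contigs = [80, 70, 50, 40, 30, 20]
print("N50 :", nx50(contigs))
print("NG50:", nx50(contigs, genome_size=400))
```

Because NG50 is tied to the genome size rather than to the assembly, it cannot be inflated by simply discarding short contigs, which is one reason to prefer it when comparing assemblies of the same species.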
Species            Assembly release      # of CEGs found (including ‘partial’)  Published?
human              GRCh37 (hg19)         248                                    First draft in 2001
mouse              GRCm38 (mm10)         239                                    First draft in 2002
X. tropicalis      JGI_4.2               239                                    Hellsten et al., 2010
coelacanth         LatCal1               236                                    Amemiya et al., 2013
spotted gar        LepOcu1               235
soft-shell turtle  PelSin_1.0            232                                    Wang et al., 2013
anole lizard       AnoCar2.0             231                                    Alföldi et al., 2011
zebrafish          Zv9                   230                                    Howe et al., 2013
chicken            galGal4               220
chicken            WASHUC2.63 (galGal3)  210                                    First draft in 2004
Japanese lamprey   LetCam1               199                                    Mehta et al., 2013
sea lamprey        PerMar1               172                                    Smith et al., 2013
little skate       version2              77
elephant shark     (1.4x)                58                                     Venkatesh et al., 2007

The slide also carries two ‘unpublished’ labels in the ‘Published?’ column; which rows they belong to is not recoverable here.

248 core eukaryotic genes (CEGs)
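
To make the CEG counts easier to compare, the sketch below converts a few of the slide's counts into a percentage of the 248 core eukaryotic genes. The counts are copied from the slide; the dictionary and function names are illustrative.

```python
TOTAL_CEGS = 248

# Counts of CEGs found (including 'partial'), as listed on the slide.
cegs_found = {
    "human GRCh37": 248,
    "coelacanth LatCal1": 236,
    "chicken galGal4": 220,
    "Japanese lamprey LetCam1": 199,
    "sea lamprey PerMar1": 172,
    "little skate version2": 77,
    "elephant shark (1.4x)": 58,
}


def completeness(found, total=TOTAL_CEGS):
    """Percentage of core eukaryotic genes detected in an assembly."""
    return 100 * found / total


for name, found in sorted(cegs_found.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} {found:3d}/{TOTAL_CEGS}  {completeness(found):5.1f}%")
```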

‘Annotation Turnover’ and ‘AED’ (Eilbeck et al., 2009)
Also, run CEGMA to check transcript diversity?
– Nakamura et al., 2013
- Phylogenetic property of the species of your interest
e.g. Ploidy level, distance to close relatives, …

www.genomesize.com, www.timetree.org

- Any clue about its molecular attributes?
e.g. GC-content, repeats, intron/UTR length, …
Using existing resources at SRA & Sanger traces at NCBI dbEST
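
One of the attributes above, GC-content, can be estimated directly from existing reads or ESTs before any new sequencing. A minimal sketch, assuming the sequences have already been downloaded as FASTA text; the function names and the two toy records are placeholders.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_content(seq):
    """Fraction of G+C among unambiguous bases; None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


# Toy records standing in for downloaded reads or ESTs.
fasta_text = ">read1\nGGCCAATT\n>read2\nGCGCNNAT\n"
for name, seq in read_fasta(fasta_text.splitlines()):
    print(name, round(gc_content(seq), 3))
```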
- Genome or transcriptome to sequence?
Any existing or emerging resources?

- RNA-seq: sequence identification or quantification?
- Sample prep mostly determines the fate of the project
Quantification with Qubit; rRNA removal controlled with BioAnalyzer
Replication > Depth (Rapaport et al., 2013. Genome Biol.)

- Rigorous QC of prepared libraries before sequencing
ChIP-qPCR before ChIP-seq
- Fostering more productive sequencing facilities in Japan
GRAS accepts visits from facility managers and staff

- Education of researchers with dual (wet/dry) capabilities
‘A sequencer or a bioinformatician?’
Learning material: ‘Unix & Perl for Biologists’ by Korf Lab
http://korflab.ucdavis.edu/unix_and_Perl/

- Importing latest information from overseas
→ shigehiro-kuraku@cdb.riken.jp