20081216 06 陳倩琪: Sequencing and Analysis of the Monascus Genome

    Presentation Transcript

    • Sequencing and Analysis of the Monascus pilosus Genome. Food Industry Research and Development Institute (FIRDI), Bioresource Collection and Research Center (BCRC). 陳倩琪, 王俊霖, 宋立民, 邱世浩, 邱祖培, 廖麗玲, 袁國芳, 朱文深, 廖啟成
    • Outline
      • Genome Sequence
      • Genome Analysis
    • Genome Sequence
    • Gene Feature [figure labels: genomic DNA, regulatory region, exon, intron, splicing, mRNA]
    • Sequencing Strategy for Monascus [flowchart labels: cDNA library, expressed sequence tag, unigene, function assay, functional genome; BAC/fosmid library, BAC/fosmid end sequence, BAC/fosmid clone, BAC/fosmid clone shotgun sequence; whole genome shotgun library, shotgun sequence, genome draft, finishing, annotation]
    • Monascus Genome sequence
      • Genome sequence
        • EST, expressed sequence tag
        • WGS, whole genome sequencing
        • Assembly
    • EST (expressed sequence tag) [figure: mRNA is extracted from the cell, reverse transcribed to cDNA, and replicated to ds cDNA; the 5' and 3' ends are then sequenced. Example reads: GATCGTCCTGCTAGAA, TAGGCTTGGGTAACCT, GTAACGTCCTAGCCCT]
    • Monascus EST: Statistic of EST
      • Contigs No. by library: mpa02 0; mpa03 1,369; mpa04 1,471; mpa05 2,030; mpa08 844; Tentative Unigene* 6,887
      • Qualified reads (Q20): mpa02 211; other libraries 5,365, 15,881, 10,016, 9,131; total 40,604
      • Pigment production: mpa02 ND; other libraries +, +, ++++++, +++++
      • MK production: mpa02 ND; other libraries --, +, ++, +++++++
      • *Tentative Unigene = 4,168 Contigs + 2,719 Singletons
    • EST Assembly
      • All ESTs were clustered by BLAST at a 90% nucleotide identity threshold.
      • All ESTs were assembled with CAP3, which merged two or more ESTs when they overlapped by at least 40 bases with at least 94% sequence identity.
      • [figure: EST1 and EST2 are trimmed, then assembled by CAP3 into a contig (unigene)]
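The overlap criterion on this slide (at least 40 bases at 94% identity or better) can be illustrated with a minimal sketch. This is not CAP3's algorithm: CAP3 uses gapped alignment and quality values, while the hypothetical helpers below only test ungapped suffix-to-prefix overlaps between two reads.

```python
def suffix_prefix_overlap(a, b, min_overlap=40, min_identity=0.94):
    """Return (length, identity) for the longest ungapped overlap between the
    end of `a` and the start of `b` that meets both thresholds, else None."""
    max_len = min(len(a), len(b))
    for length in range(max_len, min_overlap - 1, -1):
        tail, head = a[-length:], b[:length]
        matches = sum(x == y for x, y in zip(tail, head))
        identity = matches / length
        if identity >= min_identity:
            return length, identity
    return None


def merge_pair(a, b, **thresholds):
    """Merge `a` and `b` into one contig if they overlap, else return None."""
    hit = suffix_prefix_overlap(a, b, **thresholds)
    if hit is None:
        return None
    length, _ = hit
    # Keep all of `a`, then append the part of `b` beyond the overlap.
    return a + b[length:]
```

Two reads that share a 50-base end would merge into a single contig here, while reads with no qualifying overlap stay as singletons.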
    • Monascus EST: Statistic of EST and Unigene
      • Contigs No. by library: mpa02 0; mpa03 1,369; mpa04 1,471; mpa05 2,030; mpa08 844; Tentative Unigene* 7,488
      • Qualified reads (Q20): mpa02 211; other libraries 5,365, 15,881, 10,016, 9,131; total 40,604
      • Pigment production: mpa02 ND; other libraries +, +, ++++++, +++++
      • MK production: mpa02 ND; other libraries --, +, ++, +++++++
      • *Tentative Unigene = 4,015 Contigs + 3,473 Singletons
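    • Fungal ESTs in Public (Source: TIGR Gene Indices, date of June 18, 2008)
      • Aspergillus flavus: 8,137 tentative unigenes (4,026 EST contigs + 4,111 EST singletons)
      • Coccidioides posadasii: 14,703 tentative unigenes (6,893 EST contigs + 7,810 EST singletons)
      • Magnaporthe grisea: 26,984 tentative unigenes (11,432 EST contigs + 15,552 EST singletons)
      • Neurospora crassa: 14,496 tentative unigenes (10,927 EST contigs + 3,569 EST singletons)
      • Aspergillus nidulans: 13,556 tentative unigenes (3,662 EST contigs + 9,894 EST singletons)
      • Cryptococcus sp. (Filobasidiella neoformans): 5,674 tentative unigenes (2,384 EST contigs + 3,290 EST singletons)
      • Saccharomyces cerevisiae: 6,310 tentative unigenes (4,107 EST contigs + 2,203 EST singletons)
      • Schizosaccharomyces pombe: 5,933 tentative unigenes (2,449 EST contigs + 3,484 EST singletons)
      • Monascus pilosus: 7,488 tentative unigenes (4,015 EST contigs + 3,473 EST singletons)
      • Alternaria solani Potato_late_blight (new in 2008): 36,471 tentative unigenes (12,149 EST contigs + 24,322 EST singletons)
      • Gibberella fujikuroi (Gibberella moniliformis): 13,350 tentative unigenes (8,510 EST contigs + 4,840 EST singletons)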
    • Aspergillus nidulans Genome
      • Whole Genome Shotgun methodology
        • DNA is shattered into small fragments (~4 kb or ~40 kb)
        • Each fragment is inserted into a vector and cloned
        • The two ends of the fragment are sequenced, creating paired reads
      • Arachne Assembly methodology
        • The assembly process uses the paired reads to identify contiguous stretches of sequence (contigs)
        • Contigs are ordered and linked together into larger supercontigs by using paired reads lying in different contigs
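The linking step can be sketched in a few lines: paired reads whose two ends land in different contigs count as evidence that those contigs are neighbours. This is an illustrative simplification, not Arachne's implementation; the function name, the read-to-contig mapping, and the `min_links` threshold are all assumptions made for the example.

```python
from collections import Counter


def link_contigs(read_to_contig, pairs, min_links=2):
    """Count read pairs that bridge two different contigs.

    `read_to_contig` maps a read id to the contig it was placed in.
    `pairs` is a list of (left_read, right_read) ids from the same clone.
    Returns {(contig_a, contig_b): link_count} for contig pairs supported
    by at least `min_links` read pairs.
    """
    links = Counter()
    for left, right in pairs:
        c1, c2 = read_to_contig.get(left), read_to_contig.get(right)
        if c1 is None or c2 is None or c1 == c2:
            continue  # unplaced read, or both ends inside one contig
        links[tuple(sorted((c1, c2)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_links}
```

Requiring more than one bridging pair is a common guard against chimeric clones; a real scaffolder would also use insert size and read orientation to order and orient the contigs.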
    • Monascus WGS
      • Coverage = qualified reads × average read length / genome size
      • BAC library: mpb01-02 (80-100 kb insert)
      • Fosmid library: mpf01 (30-40 kb insert)
      • Plasmid library: mpg01-12 (2.5-3.5 kb insert), 31-34 (4.5-7.5 kb insert), 61-64 (8-10 kb insert)
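Assuming the slide relates coverage to qualified reads, average read length, and genome size in the usual way (reads times read length, divided by genome size), the arithmetic is a one-liner. The function name and the numbers in the usage line are hypothetical; the slide transcript does not give the actual read counts.

```python
def sequencing_coverage(qualified_reads, average_read_length, genome_size):
    """Average depth: total sequenced bases divided by genome size (all in bp)."""
    return qualified_reads * average_read_length / genome_size


# Hypothetical inputs, for illustration only.
depth = sequencing_coverage(300_000, 700, 30_000_000)
```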
    • Genome Assembly
      • Draft: gaps allowed
      • Finished: no gaps and a 0.01% error rate
      • [figure: Contig A and Contig B, separated by a gap, joined into a supercontig]
    • Contigs Arranged in Order
    • Linkage Information of Gap
    • Genome Assembly Draft: Arachne program, licensed from MIT/Whitehead genome center
    • Evaluation of Assembly
      • The N50 length is the length L such that 50% of the bases are contained in contigs/supercontigs of size at least L.
      • The contig N50 length is 224.5 Kb; that is, 50% of all bases are contained in contigs of at least 224.5 Kb.
      • Supercontig (N50 length = 2,527 Kb)
          No.  ID            Size (bp)    Total (bp)
          1    MPSC013001    3,566,104
          2    MPSC013002    3,562,398
          3    MPSC013003    2,915,166
          4    MPSC013004    2,676,992
          5    MPSC          2,526,505    15,247,465
      • Contig (N50 length = 224.5 Kb)
          No.  ID              Size (bp)  Total (bp)
          1    MPGC00020726    623,226
          2    MPGC00020904    596,733
          3    MPGC00020728    578,736
          4    MPGC00020911    564,379
          5    MPGC00020776    520,988
          6    MPGC00020893    517,965
          7    MPGC00020902    433,630
          8    MPGC00020697    432,590
          9    MPGC00020790    425,346
          10   MPGC00020828    403,969
          11   MPGC00020741    395,331
          12   MPGC00020775    391,998
          39   MPGC00020854    224,531    13,333,461
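The N50 definition above translates directly into code: sort the lengths from largest to smallest and walk down until the running total reaches half of all bases. A minimal sketch follows, with a hypothetical function name and a toy list of lengths rather than the assembly's full contig list, which the slide does not show.

```python
def n50(lengths):
    """Return the length L such that contigs of size >= L hold at least
    half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example: 300 bases in total; 80 + 70 reaches the halfway mark.
toy_n50 = n50([80, 70, 50, 40, 30, 20, 10])
```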
    • Contig Coverage Statistics
      • Coverage of genome
        • total length of included sequence/ genome size
        • Contigs of total length 26,428,892 bp
      • Coverage of known sequence
        • total length of included sequence/ known sequence*
        • * The known sequence consists of 13 BAC contigs with a total length of 1,256,569 bp
      • Coverage statistics (sequence included: all contigs and gaps)
        • Contigs: 709; coverage of genome: 97.88%
        • Contigs aligned to known sequence*: 37; coverage of known sequence: 99.05%
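Coverage of a known sequence, defined on this slide as included length over known length, can be computed from alignment intervals. The sketch below is one possible reading: it merges overlapping intervals so that no base is counted twice, which is an assumption, since the slides do not say how overlaps were handled. The function name and the half-open coordinate convention are also illustrative.

```python
def covered_fraction(intervals, reference_length):
    """Fraction of a reference covered by half-open [start, end) intervals."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block before starting a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / reference_length
```

For example, intervals (0, 50), (40, 80), and (90, 100) on a 100 bp reference cover 90 bases.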
    • Other Fungal Genomes in Public
      • Current Fungal Sequence Projects: 44 candidates (Source: Broad Institute of MIT & Harvard, date of Dec 12, 2008)
      • Species: Aspergillus nidulans, Aspergillus terreus, Batrachochytrium dendrobatidis, Botrytis cinerea, Candida albicans, Candida guilliermondii, Candida lusitaniae, Candida tropicalis, Chaetomium globosum, Coccidioides immitis, Coccidioides posadasii, Coprinus cinereus, Cryptococcus neoformans, Fusarium graminearum, Fusarium oxysporum, Fusarium verticillioides, Histoplasma capsulatum, Lodderomyces elongisporus, Magnaporthe grisea, Neurospora crassa, Paracoccidioides brasiliensis, Puccinia graminis, Pyrenophora tritici-repentis, Rhizopus oryzae, Saccharomyces cerevisiae, Schizosaccharomyces japonicus, Schizosaccharomyces pombe, Sclerotinia sclerotiorum, Stagonospora nodorum, Uncinocarpus reesii, Ustilago maydis
    • Genome Analysis
    • Genome Analysis
      • Purpose
        • Functional Annotation by Similarity
        • Exon and Intron Information
        • Alternative Splicing
        • Gene Prediction
    • Expressed Gene Annotation: 675 unigenes unknown (updated to 2008/11/20)
    • Highly Expressed Genes: EST redundancy (Cluster_ID, annotation, percentage of library)
      • MPUG00015767, homologue to Glyceraldehyde 3-phosphate dehydrogenase: mpa03 1.26%, mpa04 1.20%, mpa05 2.10%
      • MPUG00015614, homologue to Elongation factor 1-alpha (EF-1-alpha): mpa03 1.15%, mpa04 0.53%, mpa05 2.04%
      • MPUG00013548, weakly similar to Phosphoenolpyruvate carboxykinase [ATP]: mpa03 1.44%, mpa04 2.33%, mpa05 0.51%
      • MPUG00015777, similar to Heat shock 70 kDa protein: mpa03 0.99%, mpa04 1.62%, mpa05 0.86%
      • MPUG00016317, weakly similar to 30 kDa heat shock protein: mpa03 0.33%, mpa04 0.12%, mpa05 1.66%
      • MPUG00014394, similar to heat shock protein 90 homolog: mpa03 0.97%, mpa04 0.90%, mpa05 0.75%
    • Intron and Exon
    • [figure labels: genome, EST]
    • Monascus Intron
    • Monascus Exon
    • Alternative Splicing (AS)
    • How to Merge EST Alignment Patterns [figure: Monascus ESTs aligned to the Monascus genome; ESTs with the same alignment pattern are merged, leaving 3 patterns]
    • Monascus Alternative Splicing
      • 513 gene loci show alternative splicing
      • 1,293 different transcripts observed
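Merging ESTs that share an alignment pattern, then flagging loci with more than one pattern, can be sketched as follows. This is a simplification under stated assumptions: the pattern is taken to be the intron chain (so ESTs that differ only at their ends merge), and the input tuples, locus names, and function names are hypothetical rather than the pipeline used for these slides.

```python
from collections import defaultdict


def intron_chain(exons):
    """Return the introns implied by a list of (start, end) exon blocks."""
    ordered = sorted(exons)
    return tuple((left[1], right[0]) for left, right in zip(ordered, ordered[1:]))


def merge_patterns(alignments):
    """Group EST ids by locus, then by shared intron chain.

    `alignments` is an iterable of (est_id, locus, exons).
    Returns {locus: {intron_chain: [est_id, ...]}}.
    """
    loci = defaultdict(lambda: defaultdict(list))
    for est_id, locus, exons in alignments:
        loci[locus][intron_chain(exons)].append(est_id)
    return loci


def alternatively_spliced(loci):
    """Return {locus: pattern count} for loci with more than one pattern."""
    return {locus: len(p) for locus, p in loci.items() if len(p) > 1}
```

Summing the pattern counts over all loci gives the number of distinct transcripts. Single-exon ESTs have an empty intron chain and would need separate handling in practice.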
    • Gene Prediction
      • By ab initio method
      • Gene prediction by GlimmerHMM
      • Prediction quality (predicted genome: predicted genes; nucleotide-level accuracy; exon-level accuracy)
        • M. pilosus BCRC38072: 9,997 predicted genes; nucleotide level 0.96; exon level 0.72
        • A. fumigatus*: predicted genes NA; nucleotide level 0.90; exon level 0.42
      • * Prediction qualities for Aspergillus fumigatus are taken from Bioinformatics, 20: 2878-2879 (2004).
    • Gene Prediction
    • Gene Prediction: 9,997 predicted genes
    • Gene ontology InterPro Gene product properties
    • Genome Annotation: 891 genes unknown (updated to 2008/11/20)
    • Gene Feature
    • Lovastatin synthase gene cluster 1
    • Lovastatin synthase gene cluster 2
    • Polyketide related Gene in Monascus
    • Polyketide related Gene in Monascus
    • How to Access Monascus Database
    • Monascus Genome Database
    • Acknowledgment
      • Sponsor
        • Ministry of Economic Affairs (MOEA)
      • Bioinformatics
        • Bioinformatics group of BCRC/FIRDI
      • Sequencing
        • The Sequencing Core Facility of National Yang Ming University Genome Center (YMGC)
        • Sequencing service of VitaGenomics
        • Sequencing group of BCRC/FIRDI
    • Thanks for your attention
    • Citrinin Synthase vs Activator