    Ion torrent data analysis: Presentation Transcript

    • ION TORRENT DATA ANALYSIS, SPECIFICALLY THE 400BP CHIP. By Ronak Shah
    • OUTLINE
      • Sedum album Illumina Reference Stats
      • 400-bp reads analysis
        • BWASW
        • TMAP
      • Error Corrected 100bp, 200bp and 400bp reads analysis
        • BWASW
        • TMAP
      • Assembly
        • Original 400bp data
        • Clipped data
        • Clipped and quality trimmed
        • All Chips Original data
        • All Chips Error Corrected data
      5/18/2012 IonTorrentDataAnalysis@MonsantoCo
    • SEDUM ALBUM REFERENCE Made Using Illumina Hiseq Data
    • ILLUMINA INPUT DATA: 20BN BASE PAIR, 180MN READS, 148X COVERAGE
      Read   Read Type               Read Len (Mean)  Insert Size  Total bases     Total (read pairs)  Total (reads)  Estimated Coverage (X)
      Hiseq  Paired End              100              400          3,168,468,000   15,842,340          31,684,680     22
      Hiseq  Paired End              100              400          3,320,637,800   16,603,189          33,206,378     23
      Hiseq  Paired End              100              400          3,238,613,000   16,193,065          32,386,130     23
      Hiseq  Short Overlap merged    178              335          2,211,006,384   -                   12,401,099     16
      Hiseq  Short Overlap merged    178              335          2,209,098,067   -                   12,391,800     16
      Hiseq  Short Overlap merged    178              335          2,178,151,877   -                   12,215,609     15
      Hiseq  Short Overlap unmerged  100              335          1,550,423,600   -                   15,504,236     11
      Hiseq  Short Overlap unmerged  100              335          1,551,440,000   -                   15,514,400     11
      Hiseq  Short Overlap unmerged  100              335          1,519,735,600   -                   15,197,356     11
      Total  -                       -                -            20,947,574,328  48,638,594          180,501,688    148
      CLC-Bio Input: 20,947,574,328 total bases; 180,501,688 total reads
    • SEDUM ILLUMINA REFERENCE
      FLOW CYTOMETRY GENOME SIZE: 142MB
      ESTIMATED GENOME SIZE: 180MB
      CURRENT GENOME SIZE: 255MB
      N50 (SCAFFOLD): 2.8KB
      N50 (CONTIG): 1.6KB
      • Scaffolding Stats
        • Number of scaffolds: 219,455
        • Total size of scaffolds: 267,197,078
        • Total scaffold length as percentage of assumed genome size: 2
        • Longest scaffold: 124,757
        • Shortest scaffold: 200
        • Number of scaffolds > 1K nt: 63,451 (28.90%)
        • Number of scaffolds > 10K nt: 2,498 (1.10%)
        • Number of scaffolds > 100K nt: 1 (0.00%)
        • Mean scaffold size: 1,218
        • Median scaffold size: 464
        • N50 scaffold length: 2,848
        • Percentage of assembly in scaffolded contigs: 52.40%
        • Percentage of assembly in unscaffolded contigs: 47.60%
        • Average number of contigs per scaffold: 1.3
        • Average length of break (>25 Ns) between contigs in scaffold: 162
      • Contigs Stats
        • Number of contigs: 292,607
        • Number of contigs in scaffolds: 113,984
        • Number of contigs not in scaffolds: 178,623
        • Total size of contigs: 255,346,080
        • Longest contig: 56,108
        • Shortest contig: 176
        • Number of contigs > 1K nt: 67,774 (23.20%)
        • Number of contigs > 10K nt: 641 (0.20%)
        • Mean contig size: 873
        • Median contig size: 412
        • N50 contig length: 1,615
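N50 is the length L such that sequences of length L or longer together hold at least half of the assembled bases. The following is a minimal Python sketch of that definition; the function name and the toy lengths are illustrative and are not taken from the Sedum assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Toy example (made-up lengths): total is 32, so half is 16.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))  # 8
```

A low N50 relative to the longest sequence, as in the stats above, indicates that most of the assembly sits in short pieces.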
    • ION TORRENT 400 BP CHIP READ ANALYSIS
    • ION TORRENT INPUT DATA: 5BN BASE PAIR, 19MN READS, 37X COVERAGE
      Read         Read Type   Read Len (Mean)  Total bases    Total reads  Estimated Coverage (X)
      Ion Torrent  400bp chip  286              897,163,323    3,130,643    6
      Ion Torrent  400bp chip  241              931,376,271    3,850,295    7
      Ion Torrent  400bp chip  269              1,113,089,592  4,126,822    8
      Ion Torrent  400bp chip  252              1,098,412,220  4,350,400    8
      Ion Torrent  400bp chip  274              1,207,920,840  4,408,077    9
      Total        -           -                5,247,962,246  19,866,237   37
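The coverage figures in the input tables are consistent with dividing total bases by the 142 Mb flow cytometry genome size quoted on the reference slide. That divisor is inferred from the numbers and is not stated in the slides. A minimal Python sketch of the arithmetic, with hypothetical names:

```python
# Assumption: coverage = total sequenced bases / genome size, using the
# 142 Mb flow cytometry estimate from the reference slide as the divisor.
GENOME_SIZE_BP = 142_000_000


def estimated_coverage(total_bases, genome_size_bp=GENOME_SIZE_BP):
    """Return the expected fold coverage for a number of sequenced bases."""
    return total_bases / genome_size_bp


ion_torrent_total = 5_247_962_246  # total bases, Ion Torrent 400 bp chips
illumina_total = 20_947_574_328    # total bases, Illumina HiSeq input

print(round(estimated_coverage(ion_torrent_total)))  # 37
print(round(estimated_coverage(illumina_total)))     # 148
```

Note that the per-chip values are rounded individually, so they need not sum exactly to the rounded total.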
    • ALIGNERS USED: BWASW AND TMAP
      • Both aligners were run with default parameters.
      • The defaults shared by both aligners:
        • Mismatch penalty: 3
        • Gap open penalty: 5
        • Gap extension penalty: 2
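To make the penalties concrete, here is a small Python sketch that scores an alignment under an affine gap model. Two things are assumptions, not taken from the slides: a match score of 1, and the convention that a gap of length k costs open + k × extension. The function name is hypothetical.

```python
def alignment_score(matches, mismatches, gap_lengths,
                    match_score=1,        # assumption: not given on the slide
                    mismatch_penalty=3,   # from the slide
                    gap_open=5,           # from the slide
                    gap_extend=2):        # from the slide
    """Score an alignment; each gap of length k costs gap_open + k * gap_extend."""
    gap_cost = sum(gap_open + gap_extend * k for k in gap_lengths)
    return matches * match_score - mismatches * mismatch_penalty - gap_cost


# 95 matches, 2 mismatches, one 1-base gap and one 3-base gap:
# 95 - 2*3 - ((5 + 2*1) + (5 + 2*3)) = 95 - 6 - 18 = 71
print(alignment_score(95, 2, [1, 3]))  # 71
```

Under this model a single indel costs more than two mismatches, which matters for reads whose dominant error type is insertions and deletions.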
    • MERGED BWA RESULTS: 25% INSERTION RATE; 33% DELETION RATE; 85% MISMATCH
      • Mapping Results
        • reads: 23,277,245
        • mapped reads: 21,124,134
        • mapped bases: 3,622,712,040
        • perfectly mapped: 3,143,328
        • len max: 433
        • len mean: 171
        • len stdev: 82
        • mapq mean: 95
        • mapq stdev: 87
        • snp rate: 4%
        • ins rate: 25%
        • del rate: 33%
        • pct mismatch: 85%
        • base qual mean: 22
        • base qual stdev: 9
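The slides do not say how the insertion and deletion rates were computed. As one possible starting point, the sketch below tallies inserted, deleted and aligned bases from a SAM CIGAR string in Python; the function name and the example CIGAR are made up for illustration.

```python
import re
from collections import Counter

# One CIGAR token is a length followed by a single operation letter.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_totals(cigar):
    """Return the total number of bases per CIGAR operation."""
    totals = Counter()
    for length, op in CIGAR_TOKEN.findall(cigar):
        totals[op] += int(length)
    return totals


totals = cigar_op_totals("50M2I30M1D20M")
aligned = totals["M"] + totals["="] + totals["X"]
print(totals["I"], totals["D"], aligned)  # 2 1 100
```

Whether a rate is then expressed per read, per aligned base or per alignment is a separate choice that the slides leave open.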
    • MERGED BWA RESULTS: 91% READS MAPPED
      • Total Number of Reads: 23.3M
      • Number of Reads Mapped: 21.1M
      • Percentage of Reads Mapped: 91%
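The mapped percentage follows directly from the read counts in the mapping results tables. A short Python check using the BWA-SW and TMAP counts reported in this deck (the function name is hypothetical):

```python
def percent_mapped(mapped_reads, total_reads):
    """Return the percentage of reads that mapped."""
    return 100.0 * mapped_reads / total_reads


bwasw = percent_mapped(21_124_134, 23_277_245)  # merged BWA-SW counts
tmap = percent_mapped(17_795_383, 19_866_237)   # merged TMAP counts

print(round(bwasw), round(tmap))  # 91 90
```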
    • MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Mean Base Quality: quality keeps dropping after 100 bp
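A mean-quality-by-position curve like the one on this slide can be computed from FASTQ quality strings. The sketch below is a minimal Python version that assumes Phred+33 encoding; the function name and the toy quality strings are illustrative and do not come from the deck.

```python
from collections import defaultdict


def mean_quality_by_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position (0-based)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for quals in quality_strings:
        for pos, char in enumerate(quals):
            totals[pos] += ord(char) - offset  # decode Phred+33
            counts[pos] += 1
    # Positions are contiguous from 0, so indexing by range is safe.
    return [totals[pos] / counts[pos] for pos in range(len(counts))]


# 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
print(mean_quality_by_position(["II5", "I5"]))  # [40.0, 30.0, 20.0]
```

Because reads differ in length, later positions are averaged over fewer reads, which is worth keeping in mind when reading the tail of the curve.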
    • MERGED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Per Base Quality
    • MERGED BWA RESULTS: LOW ERRORS AT THE START; HIGH ERRORS AT THE END
      • Error Profiles: mismatch, insertion and deletion rates are high overall. They are lowest at the start of the read and increase gradually along its length.
    • MERGED BWA RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      • Error Profiles
    • MERGED BWA RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      • Error Profiles
    • MERGED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      • K-mer Profile: k-mers are over-represented from position 150 to 450.
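The slides do not name the tool behind the k-mer profile. As a rough illustration of the underlying idea, the Python sketch below counts how often each k-mer starts at each read position; a k-mer whose count at some position is far above its average across positions would show up as over-represented there. All names and the toy reads are hypothetical.

```python
from collections import Counter


def kmer_position_counts(reads, k):
    """Count occurrences of each (k-mer, start position) pair across reads."""
    counts = Counter()
    for read in reads:
        for pos in range(len(read) - k + 1):
            counts[(read[pos:pos + k], pos)] += 1
    return counts


toy_reads = ["ACGTAC", "TTGTAC"]
counts = kmer_position_counts(toy_reads, k=4)
print(counts[("GTAC", 2)])  # 2: both reads carry GTAC starting at position 2
print(counts[("ACGT", 0)])  # 1
```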
    • MERGED TMAP RESULTS: 28% INSERTION; 34% DELETION; 88% MISMATCH
      • Mapping Results
        • reads: 19,866,237
        • mapped reads: 17,795,383
        • mapped bases: 3,381,672,736
        • perfectly mapped: 2,053,578
        • len max: 433
        • len mean: 190
        • len stdev: 79
        • mapq mean: 14
        • mapq stdev: 10
        • snp rate: 5%
        • ins rate: 28%
        • del rate: 34%
        • pct mismatch: 88%
        • base qual mean: 22
        • base qual stdev: 9
    • MERGED TMAP RESULTS: 90% READS MAPPED
      • Total Number of Reads: 19.9M
      • Number of Reads Mapped: 17.8M
      • Percentage of Reads Mapped: 90%
    • MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Mean Base Quality: quality keeps dropping after 100 bp
    • MERGED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Per Base Quality
    • MERGED TMAP RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      • Error Profiles
    • MERGED TMAP RESULTS: HIGH MISMATCH, HIGH INSERTION; HIGH DELETION
      • Error Profiles
    • MERGED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      • K-mer Profile: k-mers are over-represented from position 150 to 450.
    • ERROR CORRECTED 100BP, 200BP AND 400BP READS ANALYSIS
    • ION TORRENT INPUT DATA: 13BN BASE PAIR, 72MN READS, 95X COVERAGE
      Read         Read Type                           Read Len (Mean)  Total bases     Total reads  Estimated Coverage (X)
      Ion Torrent  ORG 100bp, 200bp, 400bp chip        187              13,521,610,812  72,058,773   95
      Ion Torrent  Corrected 100bp, 200bp, 400bp chip  187              13,479,341,388  72,058,773   95
    • ORG BWA RESULTS: 21% INSERTION; 27% DELETION; 81% MISMATCH
      CORRECTED BWA RESULTS: 10% INSERTION; 15% DELETION; 70% MISMATCH
      • ORG BWA Mapping Results
        • reads: 80,098,562
        • mapped reads: 71,611,630
        • mapped bases: 10,456,280,566
        • perfectly mapped: 13,729,260
        • len max: 433
        • len mean: 146
        • len stdev: 66
        • mapq mean: 97
        • mapq stdev: 86
        • snp rate: 3.2%
        • ins rate: 21%
        • del rate: 27%
        • pct mismatch: 81%
        • base qual mean: 21
        • base qual stdev: 6
      • Corrected BWA Mapping Results
        • reads: 79,986,695
        • mapped reads: 75,639,986
        • mapped bases: 14,695,848,107
        • perfectly mapped: 23,006,719
        • len max: 678
        • len mean: 194
        • len stdev: 83
        • mapq mean: 100
        • mapq stdev: 88
        • snp rate: 2%
        • ins rate: 10%
        • del rate: 15%
        • pct mismatch: 70%
        • base qual mean: 20
        • base qual stdev: 6
    • ORG BWA RESULTS: 89% READS MAPPED
      CORRECTED BWA RESULTS: 95% READS MAPPED
      • ORG BWA Mapping Results
        • Total Number of Reads: 80.1M
        • Number of Reads Mapped: 71.6M
        • Percentage of Reads Mapped: 89%
      • Corrected BWA Mapping Results
        • Total Number of Reads: 80.0M
        • Number of Reads Mapped: 75.6M
        • Percentage of Reads Mapped: 95%
    • CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Quality keeps dropping after 100 bp
    • CORRECTED BWA RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Per Base Quality
    • CORRECTED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% LOWER THAN ORG READS
      • Error Profiles
    • CORRECTED BWA RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% LOWER THAN ORG READS
      • Error Profiles
    • CORRECTED BWA RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      • K-mer Profile: k-mers are over-represented from position 250 to 450.
    • ORG TMAP RESULTS: 20% INSERTION; 23% DELETION; 84% MISMATCH
      CORRECTED TMAP RESULTS: 13% INSERTION; 18% DELETION; 74% MISMATCH
      • ORG TMAP Mapping Results
        • reads: 72,058,773
        • mapped reads: 65,224,903
        • mapped bases: 12,211,168,843
        • perfectly mapped: 10,436,368
        • len max: 638
        • len mean: 187
        • len stdev: 81
        • mapq mean: 14
        • mapq stdev: 10
        • snp rate: 3%
        • ins rate: 20%
        • del rate: 23%
        • pct mismatch: 84%
        • base qual mean: 20
        • base qual stdev: 6
      • Corrected TMAP Mapping Results
        • reads: 72,058,773
        • mapped reads: 68,116,303
        • mapped bases: 12,763,573,084
        • perfectly mapped: 18,029,367
        • len max: 678
        • len mean: 187
        • len stdev: 81
        • mapq mean: 13
        • mapq stdev: 10
        • snp rate: 3%
        • ins rate: 13%
        • del rate: 18%
        • pct mismatch: 74%
        • base qual mean: 20
        • base qual stdev: 6
    • ORG TMAP RESULTS: 91% READS MAPPED
      CORRECTED TMAP RESULTS: 95% READS MAPPED
      • ORG TMAP Mapping Results
        • Total Number of Reads: 72.1M
        • Number of Reads Mapped: 65.2M
        • Percentage of Reads Mapped: 91%
      • Corrected TMAP Mapping Results
        • Total Number of Reads: 72.1M
        • Number of Reads Mapped: 68.1M
        • Percentage of Reads Mapped: 95%
    • CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Quality keeps dropping after 200 bp
    • CORRECTED TMAP RESULTS: BASE QUALITY DECREASES FROM 100 BP
      • Per Base Quality
    • CORRECTED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% LOWER THAN ORG READS
      • Error Profiles
    • CORRECTED TMAP RESULTS: HIGH MISMATCH; HIGH INSERTION; HIGH DELETION; BUT 10% LOWER THAN ORG READS
      • Error Profiles
    • CORRECTED TMAP RESULTS: OVER REPRESENTATION BETWEEN 150-450 BP
      • K-mer Profile: k-mers are over-represented from position 150 to 450.
    • ASSEMBLY
    • 400 BP READS: N50 421BP; MAX CONTIG 4.6KB; TOTAL BASES 201MB
      • 400 Bp Reads Assembly Stats
        • Number of contigs: 517,835
        • Total size of contigs: 201,990,292
        • Longest contig: 4,684
        • Shortest contig: 23
        • Number of contigs > 1K nt: 11,939 (2.30%)
        • Number of contigs > 10K nt: 0 (0.00%)
        • Mean contig size: 390
        • Median contig size: 329
        • N50 contig length: 421
    • 400 BP READS: N50 426BP; MAX CONTIG 4.2KB; TOTAL BASES 201MB
      • Reads clipped at length 450
      • 400 Bp Reads clipped at length 450 Assembly Stats
        • Number of contigs: 509,308
        • Total size of contigs: 201,527,141
        • Longest contig: 4,272
        • Shortest contig: 23
        • Number of contigs > 1K nt: 13,781 (2.70%)
        • Number of contigs > 10K nt: 0 (0.00%)
        • Mean contig size: 396
        • Median contig size: 331
        • N50 contig length: 426
    • 400 BP READS: N50 430BP; MAX CONTIG 5.4KB; TOTAL BASES 192MB
      • Reads clipped at length 450 with minimum quality of 15
      • 400 Bp Reads clipped at length 450 qual 15 Assembly Stats
        • Number of contigs: 478,037
        • Total size of contigs: 192,109,210
        • Longest contig: 5,378
        • Shortest contig: 23
        • Number of contigs > 1K nt: 16,737 (3.50%)
        • Number of contigs > 10K nt: 0 (0.00%)
        • Mean contig size: 402
        • Median contig size: 324
        • N50 contig length: 430
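The slides give only the two thresholds (length 450, minimum quality 15), not the trimming procedure. The Python sketch below shows one plausible reading: hard-clip each read to 450 bases, then remove trailing bases whose quality is below 15. Both the procedure and the function name are assumptions for illustration.

```python
def clip_and_trim(seq, quals, max_len=450, min_qual=15):
    """Clip a read to max_len bases, then trim low-quality bases from the 3' end.

    Assumption: quality trimming is applied only at the 3' end; the slides
    do not specify the actual method.
    """
    seq, quals = seq[:max_len], quals[:max_len]
    end = len(quals)
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    return seq[:end], quals[:end]


# Toy read, clipped to 6 bases to keep the example short.
seq, quals = clip_and_trim("ACGTACGT", [30, 30, 25, 20, 18, 14, 10, 5], max_len=6)
print(seq, quals)  # ACGTA [30, 30, 25, 20, 18]
```

Trimming only from the 3' end matches the observation earlier in the deck that quality and error rates degrade toward the end of the read.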
    • ORG READS: N50 397BP; MAX CONTIG 5.9KB; TOTAL BASES 185MB
      • Org Reads Assembly Stats
        • Number of contigs: 486,255
        • Total size of contigs: 185,584,458
        • Longest contig: 5,878
        • Shortest contig: 24
        • Number of contigs > 1K nt: 15,386 (3.20%)
        • Number of contigs > 10K nt: 0 (0.00%)
        • Mean contig size: 382
        • Median contig size: 299
        • N50 contig length: 397
    • ERROR CORRECTED READS: N50 550BP; MAX CONTIG 28KB; TOTAL BASES 203MB
      • Error Corrected Reads Assembly Stats
        • Number of contigs: 424,264
        • Total size of contigs: 203,921,151
        • Longest contig: 28,009
        • Shortest contig: 24
        • Number of contigs > 1K nt: 33,025 (7.80%)
        • Number of contigs > 10K nt: 43 (0.00%)
        • Mean contig size: 481
        • Median contig size: 328
        • N50 contig length: 550
    • QUESTIONS
    • ACKNOWLEDGEMENTS
      • Todd Michael
      • Randall Kerstetter
      • Shiaw-Pyng Yang
      • Ryan Richt
      • Xuefeng Zhou