File formats for Next Generation Sequencing

Course: File formats for Next Generation Analysis. September 2013; Univ-Nantes.

Published in: Health & Medicine, Technology

  1. Next Generation Sequencing File Formats. Pierre Lindenbaum, @yokofakun, pierre.lindenbaum@univ-nantes.fr, http://plindenbaum.blogspot.com, https://github.com/lindenb/courses. Institut du Thorax, Nantes, France. September 19, 2014.
  2. You don't need to have a deep knowledge of those formats. (Unless you're doing NGS)
  3. Understand how people have solved their BIG data problems.
  4. Why sequencing?
  5. [Image-only slide; no text.]
  6. [Image-only slide; no text.]
  7. Well, that's a little more complicated ...
  8. FASTQ
  9. FASTQ: a text-based format for storing both a DNA sequence and its corresponding quality scores.
  10. FASTQ for single end
  11. FASTQ for paired end
  12. FASTQ Example
      @IL31_4368:1:1:996:8507/1
      NTGATAAAGTAATGACAAAATAATGACATTATTGTTACTATGGTTACTGTGGGA
      +
      (94**0-)*7=06>>><<<<<<22@>6;;;5;6:;63:4?-622647..-.5.%
      @IL31_4368:1:1:996:21421/1
      NAAGTTAATTCTTCATTGTCCATTCCTCTGAAATGATTCAGAAATACTGGTAGT
      +
      (**+*2396,@<+<:@@@;;5)<0)69606>4;5>;>6&<102)0*+8:&137;
      @IL31_4368:1:1:997:10572/1
      NAATGTATGTAGACCCTTCACATTCAAAGGCAAATACAATATCATCATGTCTTC
      +
      (/9**-0032>:>>9>4@@=>??@@:-66,;>;<;6+;255,1;7>>>>3676'
      @IL31_4368:1:1:997:15684/1
      NGCAATCAATGCTATGATTGATCCTGATGGAACTTTGGAGGCTCTGAACAACAT
      +
      ()1,*37766>@@@>?@<?@@:>@0>>><-888>8;>*;966>;;;@8@4,.2.
      @IL31_4368:1:1:997:15249/1
      NCGTTATAATGGAATTATTTTTCTTCCTTTATTTAATGTGTTGACAAAGAGAAC
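The four-line layout shown in the example (name line starting with `@`, sequence, a `+` separator, quality string) can be read with a few lines of Python. This is only a minimal sketch: the names `FastqRecord` and `read_fastq` are made up for illustration, and it assumes each sequence and each quality string sits on a single line, as in the slide.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: its name (without the leading '@'), bases and quality string."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream laid out as four lines per read."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"expected '@' at start of name line: {header!r}")
        if not separator.startswith("+"):
            raise ValueError(f"expected '+' separator line, got: {separator!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


if __name__ == "__main__":
    # First record of the example slide.
    text = (
        "@IL31_4368:1:1:996:8507/1\n"
        "NTGATAAAGTAATGACAAAATAATGACATTATTGTTACTATGGTTACTGTGGGA\n"
        "+\n"
        "(94**0-)*7=06>>><<<<<<22@>6;;;5;6:;63:4?-622647..-.5.%\n"
    )
    for record in read_fastq(io.StringIO(text)):
        print(record.name, len(record.sequence))
```

Reading by fixed groups of four lines, rather than searching for `@`, matters because `@` is also a legal quality character and can appear at the start of a quality line.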
  13. FASTQ name
      @EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG

      Col      Brief description
      EAS139   the unique instrument name
      136      the run id
      FC706VJ  the flowcell id
      2        flowcell lane
      2104     tile number within the flowcell lane
      15343    'x'-coordinate of the cluster within the tile
      197393   'y'-coordinate of the cluster within the tile
      1        the member of a pair, 1 or 2 (paired-end or mate-pair reads only)
      Y        Y if the read fails filter (read is bad), N otherwise
      18       0 when none of the control bits are on, otherwise it is an even number
      ATCACG   index sequence
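The field-by-field description of the read name maps directly onto a small parser. The sketch below is illustrative only: `IlluminaReadName` and `parse_read_name` are hypothetical names, and it assumes the name has exactly the shape shown on the slide (seven colon-separated fields, a space, then four more).

```python
from typing import NamedTuple


class IlluminaReadName(NamedTuple):
    instrument: str      # unique instrument name
    run_id: int          # run id
    flowcell_id: str     # flowcell id
    lane: int            # flowcell lane
    tile: int            # tile number within the flowcell lane
    x: int               # 'x'-coordinate of the cluster within the tile
    y: int               # 'y'-coordinate of the cluster within the tile
    pair_member: int     # member of a pair, 1 or 2
    fails_filter: bool   # True when the flag is 'Y' (read is bad)
    control_bits: int    # 0 when none of the control bits are on
    index_sequence: str  # index sequence


def parse_read_name(header: str) -> IlluminaReadName:
    """Split a read name of the form shown on the slide into typed fields."""
    if header.startswith("@"):
        header = header[1:]
    location, _, flags = header.partition(" ")
    loc = location.split(":")
    flg = flags.split(":")
    if len(loc) != 7 or len(flg) != 4:
        raise ValueError(f"unexpected read name layout: {header!r}")
    if flg[1] not in ("Y", "N"):
        raise ValueError(f"filter flag must be Y or N, got {flg[1]!r}")
    return IlluminaReadName(
        instrument=loc[0],
        run_id=int(loc[1]),
        flowcell_id=loc[2],
        lane=int(loc[3]),
        tile=int(loc[4]),
        x=int(loc[5]),
        y=int(loc[6]),
        pair_member=int(flg[0]),
        fails_filter=(flg[1] == "Y"),
        control_bits=int(flg[2]),
        index_sequence=flg[3],
    )


if __name__ == "__main__":
    print(parse_read_name("@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG"))
```

Names in a different style, such as `@IL31_4368:1:1:996:8507/1` from the earlier example, do not have this layout, so this sketch rejects them with a `ValueError` instead of guessing.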
  15. FASTQ Quality
  16. FASTQ Quality: A quality value Q is an integer mapping of p (i.e., the probability that the corresponding base call is incorrect). Qsanger =
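The transcript stops before the equation is complete. For reference, the standard Sanger (Phred) definition is Q = -10 log10(p), and the sketch below converts between quality characters, Q and p under that definition. The ASCII offset of 33 is an assumption (the usual Sanger encoding); this transcript does not state the offset, and older Illumina pipelines used other conventions. The function names are illustrative.

```python
import math
from typing import List

SANGER_OFFSET = 33  # assumption: Sanger / "Phred+33" ASCII encoding


def phred_scores(quality: str, offset: int = SANGER_OFFSET) -> List[int]:
    """Decode a FASTQ quality string into integer quality values Q."""
    scores = [ord(ch) - offset for ch in quality]
    if any(q < 0 for q in scores):
        raise ValueError("character below the ASCII offset; wrong encoding?")
    return scores


def error_probability(q: int) -> float:
    """Invert Q = -10 * log10(p): probability that the base call is wrong."""
    return 10 ** (-q / 10)


def phred_from_probability(p: float) -> int:
    """Q = -10 * log10(p), rounded to the nearest integer."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return round(-10 * math.log10(p))


if __name__ == "__main__":
    for ch, q in zip("!5I", phred_scores("!5I")):
        print(ch, q, error_probability(q))
```

Under this definition Q = 20 corresponds to a 1 in 100 chance of a wrong base call and Q = 30 to 1 in 1000.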
