Assembly: before and after

A talk I gave at the Dec 2013 Assembly Masterclass at UC Davis. Really licensed under CC0. UPDATED May 2014, for the presentation I gave at the combined SeRC Nordic Assembly Workshop in Stockholm, Sweden, May 14th 2014

Usage Rights

CC Attribution-ShareAlike License

Assembly: before and after Presentation Transcript

  • 1. Assembly – before and after Lex Nederbragt lex.nederbragt@ibv.uio.no @lexnederbragt
  • 2. A warning: the list is by no means complete, nor do we have experience with all the programs mentioned
  • 3. Sample, DNA, Reads, Genome assembly (steps: DNA isolation, Sequencing, Assembly; QC after each step)
  • 4. Reads Genome assembly Assembly QC
  • 5. Fastqc
  • 6. Prinseq
  • 7. Many others… www.nipgr.res.in/ngsqctoolkit.html
  • 8. preqc (sga) http://arxiv.org/abs/1307.8026
  • 9. Reads Genome assembly Assembly Grooming
  • 10. Format conversion http://en.wikipedia.org/wiki/FASTQ_format Fastq format hell
  • 11. Adapter/quality trimming http://www.biostars.org/p/53528/ Celera assembler: overlap-based trimming. Tools: Fastx Toolkit, Seqtk, PrinSeq, NGS QC Toolkit, Trimmomatic, BioPieces, Cutadapt, …
  • 12. Mate pair splitting and orientation. Illumina paired end reads: 150 – 600 bases; Illumina mate pair reads: 2 – 40 kilobases; 454 mate pair reads: 2 – 40 kilobases, with linker
  • 13. Mate pair splitting and orientation Illumina paired end reads Illumina mate pair reads 454 mate pair reads linker junction junction + + paired end reads ‘contamination’
  • 14. Mate pair splitting and orientation Illumina paired end reads Illumina mate pair reads 454 mate pair reads linker junction junction + + paired end reads ‘contamination’ Check what orientation your assembler expects for the reads!
  • 15. Reads Genome assembly Assembly Preparing
  • 16. Error-correction Stand-alone or built into assembler
  • 17. Merging pairs List from Torsten Seemann’s blog http://thegenomefactory.blogspot.no/2012/11/tools-to-merge-overlapping-paired-end.html COPE http://sourceforge.net/projects/coperead/ SeqPrep https://github.com/jstjohn/SeqPrep FLASH http://www.cbcb.umd.edu/software/flash fastq-join http://code.google.com/p/ea-utils/wiki/FastqJoin PANDAseq https://github.com/neufeld/pandaseq mergePairs.py http://code.google.com/p/standardized-velvet-assembly-report/source/browse/trunk/mergePairs.py Recent addition
  • 18. Extend reads http://140.116.235.124/~tliu/arf-pe/
  • 19. Digital normalisation http://arxiv.org/abs/1203.4802
  • 20. Estimate kmer to use preqc (SGA) http://arxiv.org/abs/1307.8026
  • 21. Reads Genome assembly Assembly What can the reads tell us about the genome
  • 22. kmer-based: preqc (SGA) http://arxiv.org/abs/1307.8026, Kmerspectrumanalyzer, Khmer from Titus
  • 23. Reads Genome assembly Assembly This talk
  • 24. Reads Genome assembly Assembly QC
  • 25. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 26. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 27. Assemblathon stats http://korflab.ucdavis.edu/datasets/Assemblathon/Assemblathon2/Basic_metrics/assemblathon_stats.pl OR https://github.com/lexnederbragt/sequencetools/
  • 28. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 29. Gap closing IMAGE2
  • 30. Correcting bases Quiver from Pacific Biosciences
  • 31. Separate scaffolding
  • 32. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 33. Assembly merging/reconciliation
  • 34. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 35. Mapped genomic reads FRCBAM
  • 36. Mapped transcriptomic reads
  • 37. Gene finding
  • 38. Binning: per-contig read depth (Bacteroides, Proteobacteria, Cyanobacteria) Nederbragt et al, 2010
  • 39. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 40. Genome browser(s) IGV
  • 41. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 42. Comparative measures: Log Average Probability (LAP), Assembly Likelihood Evaluation (ALE). See also Howison, Zapata and Dunn (2013) Toward a statistically explicit understanding of de novo sequence assembly doi: 10.1093/bioinformatics/btt525
  • 43. Genome assembly Comparing to each other Metrics Merging Improvement Visualization Validation Comparing to reference
  • 44. Reference comparison Mauve assembly metrics
  • 45. Review
  • 46. Too many tools… http://seqanswers.com/wiki/Software/list
  • 47. Too many tools… http://wwwdev.ebi.ac.uk/fg/hts_mappers 88 short-read mappers
  • 48. Embargo!
  • 49. Benchmarking, anyone?
  • 50. All-in-one assembly pipeline doi:10.1186/1471-2105-15-126
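
Slide 27 points to scripts that report basic assembly metrics. As a rough illustration of one such metric, here is a minimal Python sketch that computes N50 from a FASTA file. It is not taken from the scripts linked on that slide; the function names `n50` and `read_fasta_lengths` are made up for this example, and the N50 definition used (the length L such that sequences of length L or longer hold at least half of all bases) is the common one, assumed here rather than stated in the talk.

```python
import io


def read_fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA file-like object."""
    length = None  # None until the first header line has been seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: walk the lengths from longest to shortest and
    return the length at which the running sum reaches half the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical three-contig assembly, used only to show the call pattern.
example = io.StringIO(">contig1\nACGTACGT\nACGT\n>contig2\nACGT\n>contig3\nAC\n")
contig_lengths = list(read_fasta_lengths(example))
print(contig_lengths, n50(contig_lengths))
```

For a real assembly, replace the `io.StringIO` object with an opened FASTA file. N50 on its own says nothing about correctness, which is why the talk also lists validation and reference comparison.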