Assembly: before and after
 

A talk I gave at the Dec 2013 Assembly Masterclass at UC Davis. Really licensed under CC0. UPDATED May 2014, for the presentation I gave at the combined SeRC Nordic Assembly Workshop in Stockholm, Sweden, May 14th 2014

Usage Rights

CC Attribution-ShareAlike License

Presentation Transcript

  • Assembly – before and after. Lex Nederbragt, lex.nederbragt@ibv.uio.no, @lexnederbragt
  • A warning: the list is by no means complete, nor do we have experience with all the programs mentioned
  • Sample → (DNA isolation) → DNA → (Sequencing) → Reads → (Assembly) → Genome assembly, with QC at each stage
  • Reads → (Assembly) → Genome assembly: QC
  • FastQC
  • PrinSeq
  • Many others… www.nipgr.res.in/ngsqctoolkit.html
  • preqc (SGA): http://arxiv.org/abs/1307.8026
  • Reads → (Assembly) → Genome assembly: Grooming
  • Format conversion: FASTQ format hell. http://en.wikipedia.org/wiki/FASTQ_format
  • Adapter/quality trimming (http://www.biostars.org/p/53528/): Celera assembler (overlap-based trimming), Fastx Toolkit, Seqtk, PrinSeq, NGS QC Toolkit, Trimmomatic, BioPieces, Cutadapt, …
  • Mate pair splitting and orientation: Illumina paired end reads (150 – 600 bases); Illumina mate pair reads (2 – 40 kilobases); 454 mate pair reads (2 – 40 kilobases, with linker)
  • Mate pair splitting and orientation: Illumina paired end reads; Illumina mate pair reads; 454 mate pair reads. Diagram labels: linker, junction, paired end read ‘contamination’. Check what orientation your assembler expects for the reads!
  • Reads → (Assembly) → Genome assembly: Preparing
  • Error correction: stand-alone or built into the assembler
  • Merging pairs. List from Torsten Seemann’s blog, http://thegenomefactory.blogspot.no/2012/11/tools-to-merge-overlapping-paired-end.html: COPE http://sourceforge.net/projects/coperead/; SeqPrep https://github.com/jstjohn/SeqPrep; FLASH http://www.cbcb.umd.edu/software/flash; fastq-join http://code.google.com/p/ea-utils/wiki/FastqJoin; PANDAseq https://github.com/neufeld/pandaseq; mergePairs.py http://code.google.com/p/standardized-velvet-assembly-report/source/browse/trunk/mergePairs.py. Recent addition
  • Extend reads: http://140.116.235.124/~tliu/arf-pe/
  • Digital normalisation: http://arxiv.org/abs/1203.4802
  • Estimate the kmer size to use: preqc (SGA), http://arxiv.org/abs/1307.8026
  • Reads → (Assembly) → Genome assembly: What can the reads tell us about the genome?
  • kmer-based: preqc (SGA), http://arxiv.org/abs/1307.8026; Kmerspectrumanalyzer; khmer from Titus Brown
  • Reads → (Assembly) → Genome assembly: This talk
  • Reads → (Assembly) → Genome assembly: QC
  • Genome assembly: Comparing to each other, Metrics, Merging, Improvement, Visualization, Validation, Comparing to reference
  • Genome assembly: Metrics
  • Assemblathon stats: http://korflab.ucdavis.edu/datasets/Assemblathon/Assemblathon2/Basic_metrics/assemblathon_stats.pl OR https://github.com/lexnederbragt/sequencetools/
  • Genome assembly: Improvement
  • Gap closing: IMAGE2
  • Correcting bases: Quiver from Pacific Biosciences
  • Separate scaffolding
  • Genome assembly: Merging
  • Assembly merging/reconciliation
  • Genome assembly: Validation
  • Mapped genomic reads: FRCbam
  • Mapped transcriptomic reads
  • Gene finding
  • Binning: Bacteroides, Proteobacteria, Cyanobacteria; per-contig read depth (Nederbragt et al., 2010)
  • Genome assembly: Visualization
  • Genome browser(s): IGV
  • Genome assembly: Comparing to each other
  • Comparative measures: Log Average Probability (LAP); Assembly Likelihood Evaluation (ALE). See also Howison, Zapata and Dunn (2013), Toward a statistically explicit understanding of de novo sequence assembly, doi: 10.1093/bioinformatics/btt525
  • Genome assembly: Comparing to reference
  • Reference comparison: Mauve assembly metrics
  • Review
  • Too many tools… http://seqanswers.com/wiki/Software/list
  • Too many tools… 88 short-read mappers: http://wwwdev.ebi.ac.uk/fg/hts_mappers
  • Embargo!
  • Benchmarking, anyone?
  • All-in-one assembly pipeline doi:10.1186/1471-2105-15-126
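
The Metrics slide above points to assemblathon_stats.pl and the sequencetools repository without showing what such a script computes. As a rough illustration only, here is a minimal Python sketch of a few basic contiguity metrics, including N50, which is one of the statistics assemblathon_stats.pl reports. This is not the code of either linked tool. The function names (`read_fasta_lengths`, `n50`, `basic_metrics`) are hypothetical and chosen for this example, and N50 is taken here as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total bases.

```python
def read_fasta_lengths(lines):
    """Yield the length of each sequence in an iterable of FASTA lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def basic_metrics(lengths):
    """Summarise an assembly given its contig or scaffold lengths."""
    lengths = list(lengths)
    return {
        "sequences": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Hypothetical toy assembly with three contigs of 10, 6 and 3 bases.
    toy_fasta = [">c1", "ACGTACGTAC", ">c2", "ACGT", "AC", ">c3", "ACG"]
    print(basic_metrics(read_fasta_lengths(toy_fasta)))
```

To run it on a real assembly, pass an open file handle to `read_fasta_lengths` instead of the toy list. Comparing the numbers for contigs and for scaffolds of the same assembly is a common first sanity check before moving on to the validation steps listed above.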