2011-04-26_various-assemblers-presentation


  1. Assembly tools and Visualisation
     Matthias Haimel
     EBI is an Outstation of the European Molecular Biology Laboratory.
  2. Overview
     • Assemblers
       • ABySS
       • SOAPdenovo
     • Visualisation
       • Tablet
       • ABySS-Explorer
     • Read mapping
       • SAM / BAM
     • Visualisation
       • Artemis
       • IGV - Integrative Genomics Viewer
     25.04.11 Assemblers
  3. ABySS: Assembly By Short Sequences
     • Genome Sciences Centre, Vancouver
     • http://www.bcgsc.ca/platform/bioinfo/software/abyss
     • Open source, BCCA Licence
     • de Bruijn graph
     • Trimming (tip clipping), bubble popping
     • Uses paired-end information to resolve ambiguities between contigs
     • Parallel (can use a cluster)
     • Files
       • Fasta / Fastq
       • Sam/Bam
       • colour-space
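To make the "de Bruijn graph" bullet concrete, here is a minimal Python sketch of how such a graph can be built from reads. It is an illustration only, not ABySS's implementation: the function name `build_de_bruijn` and the toy reads are assumptions, and real assemblers also handle reverse complements, sequencing errors, tip clipping and bubble popping.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Link each (k-1)-mer to the (k-1)-mers that follow it in some read.

    Hypothetical helper for illustration; every k-mer in a read
    contributes one edge from its prefix to its suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two overlapping toy reads (made up for this example).
reads = ["ACGTAC", "CGTACG"]
graph = build_de_bruijn(reads, k=4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Overlapping reads share k-mers, so they end up on the same path through the graph; contigs correspond to unambiguous paths.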
  4. ABySS
     • ABYSS (single end)
       • e.g. ABYSS -k27 single.fastq -o contigs.fa
     • abyss-pe (paired end)
       • e.g. abyss-pe k=27 n=10 in='read_1.fastq read_2.fastq' name=ecli
     • Multiple libraries
       • ... lib='read1 read2' read1='read1_1.fa read1_2.fa' read2='read2_1.fa read2_2.fa'
  5. SOAPdenovo
     • Beijing Genomics Institute (BGI), China
     • http://soap.genomics.org.cn/soapdenovo.html
     • Used for the panda genome assembly
     • Closed source
     • de Bruijn graph
     • Pre-set k-mer frequency threshold
     • Bubble removing
     • Build scaffold
     • Mapping reads to contigs
     • Gap filling
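The k-mer frequency threshold can be sketched in a few lines of Python. This is a toy illustration under the assumption that k-mers seen fewer than a chosen number of times are treated as likely sequencing errors and discarded; the function names, the `min_count` parameter and the reads are hypothetical and do not reflect SOAPdenovo's internals.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filter_kmers(counts, min_count):
    """Keep only k-mers seen at least min_count times (assumed rule)."""
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# The third toy read ends in a T where the others have an A.
reads = ["ACGTA", "ACGTA", "ACGTT"]
counts = count_kmers(reads, k=4)
solid = filter_kmers(counts, min_count=2)
print(solid)
```

The k-mer `CGTT` occurs only once and is dropped, which is how a single-base error disappears before the graph is built.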
  6. SOAPdenovo
     • Full run
       • e.g. SOAPdenovo all -s read.config -K 27 -o contigs.fa
     • Run sub-steps
       • pregraph: equivalent of velveth
       • contig: equivalent of velvetg
       • map: map reads to contigs
       • scaff: scaffolding
     • Configuration
       • Config file input instead of read files
       • Specify rank, usage (assembly/scaffolding), insert size
  7. Visualisation
     • Tablet (http://bioinf.scri.ac.uk/tablet/)
       • Lightweight
       • Easy to use
     • Formats
       • ACE
       • AFG
       • BAM
       • BANK (AMOS)
  8. Visualisation - Velvet
     • Tablet
       • velvetg ... -amos_file yes
     • GraphViz
       • Transform velvet graph into GraphViz format
       • Contributed by Paul Harrison
       • <velvet>/contrib/layout/
       • Velvet -> .dot file (Python script)
       • .dot -> png (graphviz)
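For readers unfamiliar with the .dot step, the following Python sketch shows what writing a graph in GraphViz format amounts to. It is not the contributed Velvet script: the function `to_dot`, the graph name and the edge list are assumptions made for illustration, and parsing Velvet's own graph file is left out.

```python
def to_dot(edges, name="assembly"):
    """Render (source, target) pairs as a GraphViz digraph (hypothetical helper)."""
    lines = [f"digraph {name} {{"]
    for source, target in edges:
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Toy edges between contig/node identifiers (made up for this example).
dot_text = to_dot([("1", "2"), ("2", "3"), ("2", "4")])
print(dot_text)
```

Saved to a file such as `graph.dot`, the text can be rendered with GraphViz, e.g. `dot -Tpng graph.dot -o graph.png`.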
  9. Visualisation
     • ABySS-Explorer (http://www.bcgsc.ca/platform/bioinfo/software/abyss-explorer)
       • Visualizes ABySS assemblies
       • Interactive graph structure
       • Filter contigs
  10. Assembler - Practical
     • Assemblers
       • ABySS
       • SOAPdenovo
     • Visualisation
       • Tablet
       • ABySS-Explorer
  11. Read mapping
     • SAM / BAM (http://samtools.sourceforge.net/SAM1.pdf)
       • Sequence Alignment / Map format (SAM)
       • Binary form of SAM (BAM)
     • Generic format
     • Flexible and simple
     • Compact (BAM)
     • Allows indexing
     • Load regions
     • Supports streaming
  12. SAM
     • Header
       • File format version information
       • Sequence dictionary (name/length/..)
       • Read group (platform/library/...)
       • Program info
     • Body
       • Alignment information
  13. SAM Header
     • @ followed by record type (two characters)
       @HD VN:1.0
       @SQ SN:chr20 LN:62435964
       @RG ID:L1 PU:SC_1_10 LB:SC_1 SM:NA12891
       @RG ID:L2 PU:SC_2_12 LB:SC_2 SM:NA12891
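As a sketch of how these header lines can be read programmatically, the Python snippet below groups them by record type. It assumes the fields are tab-separated, as in SAM files on disk; the function name `parse_sam_header` is made up for this example, and in practice a library such as pysam would normally be used instead.

```python
def parse_sam_header(lines):
    """Group SAM header lines by their two-character record type.

    Hypothetical helper: returns e.g. {"SQ": [{"SN": ..., "LN": ...}], ...}.
    Comment lines (@CO) are free text and are stored unparsed.
    """
    header = {}
    for line in lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment line
        fields = line.rstrip("\n").split("\t")
        record_type = fields[0][1:]
        if record_type == "CO":
            entry = "\t".join(fields[1:])
        else:
            entry = dict(field.split(":", 1) for field in fields[1:])
        header.setdefault(record_type, []).append(entry)
    return header


# The header from the slide, with tabs between fields.
header_lines = [
    "@HD\tVN:1.0",
    "@SQ\tSN:chr20\tLN:62435964",
    "@RG\tID:L1\tPU:SC_1_10\tLB:SC_1\tSM:NA12891",
    "@RG\tID:L2\tPU:SC_2_12\tLB:SC_2\tSM:NA12891",
]
header = parse_sam_header(header_lines)
print(header["SQ"][0]["SN"], header["SQ"][0]["LN"])
print([group["ID"] for group in header["RG"]])
```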
  14. SAM Alignment
     • Tab delimited lines
  15. SAM Alignment
     • Tab delimited lines
       Read_28833_29006_6945 99 chr20 28833 20 10M1D25M = 28993 195 AGCT... <<<<... NM:i:1 RG:Z:L1
       read_28701_28881_323b 147 chr20 28834 30 35M = 28701 -168 ACCT... <<7;:... MF:i:18 RG:Z:L2
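The two records above can be unpacked with a short Python sketch. It splits a line into the eleven mandatory SAM fields, decodes the bitwise FLAG and uses the CIGAR string to work out how much of the reference the read covers. The helper names are hypothetical, only the lower eight flag bits are listed, and the elided sequence and quality strings are passed through exactly as they appear on the slide.

```python
import re

# Lower eight FLAG bits from the SAM specification.
FLAG_BITS = {
    0x1: "paired", 0x2: "proper_pair", 0x4: "unmapped", 0x8: "mate_unmapped",
    0x10: "reverse", 0x20: "mate_reverse",
    0x40: "first_in_pair", 0x80: "second_in_pair",
}


def decode_flag(flag):
    """List the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def reference_span(cigar):
    """Number of reference bases covered (operations M, D, N, = and X)."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(length) for length, op in ops if op in "MDN=X")


def parse_alignment(line):
    """Split one tab-delimited alignment line into named fields."""
    f = line.rstrip("\n").split("\t")
    pos = int(f[3])
    return {
        "qname": f[0], "flag": decode_flag(int(f[1])), "rname": f[2],
        "pos": pos, "mapq": int(f[4]), "cigar": f[5],
        "end": pos + reference_span(f[5]) - 1,  # last covered base, 1-based
        "tags": {t.split(":", 2)[0]: t.split(":", 2)[2] for t in f[11:]},
    }


record = "\t".join([
    "Read_28833_29006_6945", "99", "chr20", "28833", "20", "10M1D25M",
    "=", "28993", "195", "AGCT...", "<<<<...", "NM:i:1", "RG:Z:L1",
])
alignment = parse_alignment(record)
print(alignment["flag"], alignment["end"], alignment["tags"])
```

FLAG 99 is 1 + 2 + 32 + 64, and the CIGAR `10M1D25M` covers 10 + 1 + 25 = 36 reference bases, so a read starting at 28833 ends at 28868.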
  16. Tools
     • Mapping reads
       • BWA
       • Bowtie
       • SSAHA2
     • Manipulate SAM/BAM
       • SAM Tools package
       • Picard
  17. BWA
     • Burrows-Wheeler Alignment Tool
     • Map (single/paired-end/long) reads to a sequence
     • Index database
       • bwa index -a bwtsw database.fasta
     • Align reads (once per read file for paired-end data)
       • bwa aln database.fasta short_read.fastq > aln_sa.sai
     • Generate alignments
       • bwa sampe database.fasta aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln.sam
     • Long reads
       • bwa bwasw database.fasta long_read.fastq > aln.sam
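The paired-end steps can be strung together as in the Python sketch below. It only assembles the command lines shown on the slide, running `bwa aln` once per read file so that `bwa sampe` receives two .sai files. The function names, the `sai_prefix` and `sam_path` parameters and the file names are assumptions; `run_commands` is defined but not called here, because it needs bwa installed and real input files.

```python
import subprocess


def bwa_paired_end_commands(reference, reads_1, reads_2,
                            sai_prefix="aln_sa", sam_path="aln.sam"):
    """Return (argv, output_file) pairs for a paired-end BWA run.

    output_file is None when the command writes its own files (bwa index).
    """
    sai_1 = f"{sai_prefix}1.sai"
    sai_2 = f"{sai_prefix}2.sai"
    return [
        (["bwa", "index", "-a", "bwtsw", reference], None),
        (["bwa", "aln", reference, reads_1], sai_1),
        (["bwa", "aln", reference, reads_2], sai_2),
        (["bwa", "sampe", reference, sai_1, sai_2, reads_1, reads_2], sam_path),
    ]


def run_commands(commands):
    """Run each command in order, redirecting stdout where requested."""
    for argv, output_file in commands:
        if output_file is None:
            subprocess.run(argv, check=True)
        else:
            # Binary mode, since .sai files are not text.
            with open(output_file, "wb") as handle:
                subprocess.run(argv, stdout=handle, check=True)


commands = bwa_paired_end_commands("database.fasta", "read1.fq", "read2.fq")
for argv, output_file in commands:
    print(" ".join(argv), f"> {output_file}" if output_file else "")
```

Keeping the command construction separate from execution makes it easy to inspect or log the pipeline before anything is run.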
  18. SAM tools
     • Utilities for SAM format
     • samtools <command> ...
     • Commands:
       • view: SAM <-> BAM
       • sort: sort BAM file
       • index: build BAM file index
       • merge: merges x BAM files
       • pileup: alignment in the pileup format
       • tview: integrated text alignment viewer
  19. Visualisation
     • IGV: Integrative Genomics Viewer (http://www.broadinstitute.org/igv/)
       • Good integration
     • Formats
       • DAS
       • BAM
       • GFF
       • ...
     • Tools
       • Run scripts
       • Export region
       • ...
  20. Visualisation
     • Artemis (http://www.sanger.ac.uk/resources/software/artemis/)
       • Sequence viewer
       • Annotation tool
     • Formats
       • EMBL
       • GENBANK
       • GFF
       • FASTA
       • BAM
  21. Mapping - Practical
     • Mapping reads + prepare for visualisation
       • BWA
       • samtools
     • Visualisation
       • IGV
