Potato SNPs


Dan Bolser and David Martin

  Next Gen Bug, Dundee
       01/18/2010



Aims of the work
1) Learn about handling RNASeq
    −   Create a SNP calling pipeline

2) Select SNPs for genetic mapping
    −   Using Illumina's GoldenGate SNP chip (OPA)

Creating a SNP calling pipeline




Align (using BWA)
1) Index the potato genome assembly
bwa index [-a bwtsw|div|is] [-c] <in.fasta>
2) Perform the alignment
bwa aln [options] <in.fasta> <in.fq> > <in.sai>
3) Output results in SAM format (single end)
bwa samse <in.fasta> <in.sai> <in.fq> > <out.sam>
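
The three BWA steps can be chained from a script. A minimal sketch in Python, assuming `bwa` is on the PATH; the function name and the file names (`potato.fasta`, `reads.fq`, ...) are hypothetical placeholders. It only builds the command lines and leaves running them to the caller.

```python
import shlex


def bwa_single_end_commands(fasta, fastq, sai, sam, algorithm="bwtsw"):
    """Return the three BWA command lines for a single-end alignment.

    `bwa aln` and `bwa samse` write to standard output, so each result
    is redirected to a file.
    """
    q = shlex.quote  # protect file names containing spaces or shell characters
    return [
        f"bwa index -a {q(algorithm)} {q(fasta)}",
        f"bwa aln {q(fasta)} {q(fastq)} > {q(sai)}",
        f"bwa samse {q(fasta)} {q(sai)} {q(fastq)} > {q(sam)}",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in bwa_single_end_commands(
        "potato.fasta", "reads.fq", "reads.sai", "reads.sam"
    ):
        print(cmd)
```

Each string can be passed to `subprocess.run(cmd, shell=True, check=True)` so that a failing step stops the pipeline.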
Align (using Bowtie)
1) Index the potato genome assembly
bowtie-build [options] <in.fasta> <ebwt>
2) Perform the alignment and output results
bowtie [options] <ebwt> <in.fq>
Convert (using SAMtools)
1) Convert SAM to BAM for sorting
samtools view -S -b <in.sam> > <in.bam>
2) Sort BAM for SNP calling
samtools sort <in.bam> <out.bam.s>


  Alignments are both compressed for long-term storage and
  sorted for variant discovery.
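
What "sorted" means here can be shown in a few lines. A sketch, assuming the usual coordinate order (reference sequence first, then leftmost position); it works on SAM text lines in memory and is only an illustration, not a replacement for `samtools sort`. The function name is hypothetical.

```python
def sort_sam_records(lines, ref_order):
    """Coordinate-sort SAM alignment lines.

    Header lines (starting with '@') are kept first, in their original order.
    Alignments are ordered by reference sequence (using `ref_order`, e.g. the
    order of the @SQ header lines) and then by leftmost position (column 4).
    """
    header = [line for line in lines if line.startswith("@")]
    records = [line for line in lines if not line.startswith("@")]
    rank = {name: i for i, name in enumerate(ref_order)}

    def key(line):
        fields = line.split("\t")
        rname, pos = fields[2], int(fields[3])
        # Unmapped reads ('*') sort after every named reference.
        return (rank.get(rname, len(rank)), pos)

    return header + sorted(records, key=key)
```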

Coverage profiles / Depth vectors

SAMtools...

    Dump a coverage profile
samtools mpileup -f <in.fasta> <my.bam.s>
    P1   244526   A   10   ...,.,,,..      BBQa`aaaa[
    P1   244527   A   10   ...,.,,,..      BBZ_`^a_a[
    P1   244528   C   10   .$.$.,.,,,..    >>RaZ`aaaa
    P1   244529   C    8   .,.,,,..        NaXaaaa`
    P1   244530   T    8   .,.,,,..        Xa_aaa`
    P1   244531   C    8   .,.,,,..        Rbabbaa
    P1   244532   T    9   .,.,,,..^~.     EE^^^^^^A
    P1   244533   T    9   .,.,,,...       BBB
    P1   244534   T    9   .$,$.,,,...     @@^^^^^^E
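
The columns above are: sequence name, position, reference base, depth, read bases and base qualities. A minimal sketch of a parser for one such line, assuming single-sample text output; the function names are hypothetical. In the read-base column `.` and `,` are matches to the reference on the forward and reverse strand, `^` plus one character marks a read start and `$` marks a read end.

```python
import re


def _strip_indels(bases):
    """Remove indel annotations such as '+2AG' or '-1T' from the base column."""
    out, i = [], 0
    while i < len(bases):
        if bases[i] in "+-":
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])  # skip the inserted/deleted bases
        else:
            out.append(bases[i])
            i += 1
    return "".join(out)


def parse_pileup_line(line):
    """Parse one pileup line into (chrom, pos, ref, depth, counts).

    `counts` tallies reference matches under "ref" and every
    non-reference base under its upper-case letter.
    """
    chrom, pos, ref, depth, bases, _quals = line.split()[:6]
    # Drop read-start markers ('^' plus one mapping-quality character)
    # and read-end markers ('$') before looking at the bases.
    bases = re.sub(r"\^.", "", bases).replace("$", "")
    bases = _strip_indels(bases)
    counts = {"ref": 0}
    for b in bases:
        if b in ".,":
            counts["ref"] += 1
        elif b.upper() in "ACGTN":
            counts[b.upper()] = counts.get(b.upper(), 0) + 1
    return chrom, int(pos), ref, int(depth), counts
```

A site where `counts` holds a substantial number of non-reference bases is a candidate SNP.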

SAMtools Bio::DB::Sam (BioPerl)
    Dump a coverage profile (2)
P41630
Matches : 9
0233333333333345555555555
 666778888888899999999999
 999999999999999999999999
 999976666666666665444444
 44443332211111111000
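
A depth vector like this can be built from read start and end positions. A sketch in plain Python rather than Bio::DB::Sam, under two assumptions: each alignment is a simple (start, end) interval, and the profile is printed as one digit per base with depths capped at 9. Both function names are hypothetical.

```python
def depth_vector(length, alignments):
    """Per-base read depth over a reference of `length` bases.

    `alignments` is an iterable of (start, end) pairs, 1-based and
    inclusive, one per aligned read (gaps within a read are ignored).
    """
    diff = [0] * (length + 1)
    for start, end in alignments:
        diff[start - 1] += 1  # coverage rises at the read start
        diff[end] -= 1        # and falls just past the read end
    depth, running = [], 0
    for change in diff[:length]:
        running += change
        depth.append(running)
    return depth


def as_digit_string(depth, cap=9):
    """Render depths as one character per base, capping values at `cap`."""
    return "".join(str(min(d, cap)) for d in depth)
```

For example, `as_digit_string(depth_vector(6, [(1, 3), (2, 5), (2, 6)]))` describes three overlapping reads on a six-base sequence.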

mpileup

    samtools mpileup collects summary
    information in the input BAMs, computes the
    likelihood of data given each possible
    genotype and stores the likelihoods in the
    BCF format.

    bcftools view applies the prior and does the
    actual calling.

    Finally, we filter.
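
To make "likelihood of data given each possible genotype" concrete, here is a deliberately simplified sketch for one site. It assumes a diploid genotype, independent reads and errors spread evenly over the three other bases; the model inside SAMtools is more elaborate, so treat this as an illustration only. The function name is hypothetical.

```python
import math


def genotype_log_likelihoods(bases, quals, ref, alt):
    """log10 P(reads | genotype) for ref/ref, ref/alt and alt/alt.

    `bases` are the observed read bases at one site and `quals` their
    Phred-scaled base qualities.
    """
    result = {}
    for genotype in ((ref, ref), (ref, alt), (alt, alt)):
        total = 0.0
        for base, q in zip(bases, quals):
            err = 10 ** (-q / 10)  # probability the base call is wrong
            # Average over the two alleles the read could have come from.
            p = sum(
                (1 - err) if base == allele else err / 3
                for allele in genotype
            ) / 2
            total += math.log10(p)
        result["/".join(genotype)] = total
    return result
```

The calling step then combines these likelihoods with a prior to pick a genotype.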
SNP call
1) Index the potato genome assembly (again!)
samtools faidx in.fasta
2) Run 'mpileup' to generate BCF (binary VCF) format
samtools mpileup -ug -f in.fasta my1.bam.s my2.bam.s > my.raw.bcf

    Actually, no SNPs are called at this step: mpileup only
    summarises the alignments as per-site genotype likelihoods
    (BAM to BCF).
VCF format
A standard format for sequence variation:
  SNPs, indels and structural variants.
Compressed and indexed.
Developed for the 1000 Genomes Project.
VCFtools is to VCF what SAMtools is to SAM.
Specification and tools available from http://vcftools.sourceforge.net
SNP call and filter
1) Call SNPs
bcftools view -bvcg my.raw.bcf > my.var.bcf
2) Filter SNPs
bcftools view my.var.bcf | vcfutils.pl varFilter > my.var.filt.vcf
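
The idea behind the filtering step can be sketched on VCF text. This is not a re-implementation of `vcfutils.pl varFilter`: it only applies a quality threshold and a depth window, and the function name and default thresholds are hypothetical.

```python
def filter_vcf_lines(lines, min_qual=20.0, min_depth=3, max_depth=100):
    """Yield header lines unchanged and only records passing the thresholds.

    Depth is read from the DP key of the INFO column (column 8);
    records without a QUAL value are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]
        # INFO is a ';'-separated list of KEY=VALUE pairs and bare flags.
        info = dict(item.partition("=")[::2] for item in fields[7].split(";"))
        depth = int(info.get("DP", 0))
        if (
            qual != "."
            and float(qual) >= min_qual
            and min_depth <= depth <= max_depth
        ):
            yield line
```

An upper depth limit is useful because unusually deep sites often come from repeats or collapsed paralogues rather than true variation.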


Aims of the work
1) Learn about handling RNASeq
    −   Create a SNP calling pipeline

2) Select SNPs for genetic mapping
    −   Using Illumina's GoldenGate SNP chip (OPA)

Select SNPs for genetic mapping
 Using Illumina's GoldenGate SNP chip (OPA)




SNP chip (OPA) construction

    A set of DM SNP positions was provided by
    the SolCAP project (RNASeq derived).

    A subset was selected for developing OPAs
    (Illumina’s SNP chip technology).

    OPAs were run, and results have now been
    compared to RNASeq.
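
One way to summarise such a comparison is a concordance rate over the SNPs genotyped by both methods. A sketch, assuming each method yields a mapping from SNP identifier to a genotype string; the function name and data layout are hypothetical and this is not the comparison code used for the slides.

```python
def genotype_concordance(chip_calls, rnaseq_calls):
    """Compare genotype calls shared by two platforms.

    Both arguments map a SNP identifier to a genotype given as a string
    of alleles (e.g. "AG"). Allele order is ignored, so "AG" equals "GA".
    Returns (n_shared, n_concordant, fraction_concordant).
    """
    shared = sorted(set(chip_calls) & set(rnaseq_calls))
    concordant = sum(
        sorted(chip_calls[s]) == sorted(rnaseq_calls[s]) for s in shared
    )
    fraction = concordant / len(shared) if shared else float("nan")
    return len(shared), concordant, fraction
```

Discordant SNPs are the ones worth inspecting by hand in the alignments.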


Comparison (using an early SAMtools)
Comparison (using new SAMtools)
Looking into the RNASeq data…




[Diagram: potato genome assembly and two RNASeq read libraries]

A lot more questions to answer…

    Track down more ‘strange’ SNPs based on
    the expected AFS of the two samples.

    Go beyond biallelic SNPs

    Check the OPA base...
    −   Was the right base probed by the chip?




Thank you for your patience!




OPAs in 5 steps...
         The DNA sample is
          activated for binding
          to paramagnetic
          particles.
OPAs in 5 steps...
         Three oligos are
          designed for each
          SNP locus: two
          Allele-Specific Oligos
          (ASOs), one for each
          allele of the SNP site,
          and one Locus-Specific
          Oligo (LSO).
OPAs in 5 steps...
        Several wash steps
         remove excess and
         mis-hybridized oligos.
        Extension of the
         appropriate ASO and
         ligation to the LSO joins
         information about the
         genotype to the
         address sequence on
         the LSO.
OPAs in 5 steps...
         The single-stranded,
          dye-labeled DNAs
          are hybridized to
          their complement
          bead type through
          their unique address
          sequences.
OPAs in 5 steps...
         Key to the assay:
         Scalable, multiplexing
          sample preparation
          (one tube reaction).
         Highly parallel array-based
           read-out.
         High-quality data:
           Average call rates
           above 99% accuracy.

Potato SNP Calling and Genetic Mapping

  • 1. Potato SNPs. Dan Bolser and David Martin. Next Gen Bug, Dundee, 01/18/2010
  • 2. Aims of the work
       1) Learn about handling RNASeq
          - Create a SNP calling pipeline
       2) Select SNPs for genetic mapping
          - Using Illumina's GoldenGate SNP chip (OPA)
  • 3. Creating a SNP calling pipeline
  • 4.
  • 5. Align (using BWA)
       1) Index the potato genome assembly
          bwa index [-a bwtsw|div|is] [-c] <in.fasta>
       2) Perform the alignment
          bwa aln [options] <in.fasta> <in.fq>
       3) Output results in SAM format (single end)
          bwa samse <in.fasta> <in.sai> <in.fq>
  • 6. Align (using Bowtie)
       1) Index the potato genome assembly
          bowtie-build [options] <in.fasta> <ebwt>
       2) Perform the alignment and output results
          bowtie [options] <ebwt> <in.fq>
  • 7.
  • 8. Convert (using SAMtools)
       1) Convert SAM to BAM for sorting
          samtools view -S -b <in.sam>
       2) Sort BAM for SNP calling
          samtools sort <in.bam> <out.bam.s>
       Alignments are both compressed for long-term storage and sorted for variant discovery.
  • 9.
  • 10. Coverage profiles / Depth vectors
  • 11. SAMtools...
       Dump a coverage profile
          samtools mpileup -f <in.fasta> <my.bam.s>
          P1  244526  A  10  ...,.,,,..      BBQa`aaaa[
          P1  244527  A  10  ...,.,,,..      BBZ_`^a_a[
          P1  244528  C  10  .$.$.,.,,,..    >>RaZ`aaaa
          P1  244529  C   8  .,.,,,..        NaXaaaa`
          P1  244530  T   8  .,.,,,..        Xa_aaa`
          P1  244531  C   8  .,.,,,..        Rbabbaa
          P1  244532  T   9  .,.,,,..^~.     EE^^^^^^A
          P1  244533  T   9  .,.,,,...       BBB
          P1  244534  T   9  .$,$.,,,...     @@^^^^^^E
  • 12. SAMtools Bio::DB::Sam (BioPerl)
       Dump a coverage profile 2
  • 13. SAMtools Bio::DB::Sam (BioPerl)
          P41630
          Matches : 9
          0233333333333345555555555
          666778888888899999999999
          999999999999999999999999
          999976666666666665444444
          44443332211111111000
  • 14.
  • 15. mpileup
       - samtools mpileup collects summary information in the input BAMs, computes the likelihood of the data given each possible genotype and stores the likelihoods in the BCF format.
       - bcftools view applies the prior and does the actual calling.
       - Finally, we filter.
  • 16. SNP call
       1) Index the potato genome assembly (again!)
          samtools faidx in.fasta
       2) Run 'mpileup' to generate BCF (the binary form of VCF)
          samtools mpileup -ug -f in.fasta my1.bam.s my2.bam.s > my.raw.bcf
       - Actually, all we did (I think) is perform a format conversion (BAM to BCF).
  • 18. VCF format: a standard format for sequence variation (SNPs, indels and structural variants). Compressed and indexed. Developed for the 1000 Genomes Project. VCFtools is to VCF what SAMtools is to SAM. Specification and tools available from http://vcftools.sourceforge.net
  • 19.
  • 20. SNP call and filter
       1) Call SNPs
          bcftools view -bvcg my.raw.bcf > my.var.bcf
       2) Filter SNPs (varFilter reads the VCF text from the pipe, so it takes no file argument here)
          bcftools view my.var.bcf | vcfutils.pl varFilter > my.var.bcf.filt
  • 21.
  • 22. Aims of the work
       1) Learn about handling RNASeq
          - Create a SNP calling pipeline
       2) Select SNPs for genetic mapping
          - Using Illumina's GoldenGate SNP chip (OPA)
  • 23. Select SNPs for genetic mapping, using Illumina's GoldenGate SNP chip (OPA)
  • 24. SNP chip (OPA) construction
       - A set of DM SNP positions was provided by the SolCAP project (RNASeq derived).
       - A subset was selected for developing OPAs (Illumina’s SNP chip technology).
       - OPAs were run, and results have now been compared to RNASeq.
  • 25. Comparison (using an early SAMtools)
  • 26. Comparison (using an early SAMtools)
  • 27. 27
  • 28.
  • 29. Comparison (using an early SAMtools)
  • 31.
  • 32.
  • 34. Looking into the RNASeq data…
  • 35.
  • 36. Figure labels: Potato genome assembly; RNASeq read library (×2)
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42. A lot more questions to answer…
       - Track down more ‘strange’ SNPs based on the expected AFS of the two samples.
       - Go beyond biallelic SNPs.
       - Check the OPA base: was the right base probed by the chip?
  • 43. Thank you for your patience!
  • 44.
  • 45. OPAs in 5 steps... The DNA sample is activated for binding to paramagnetic particles.
  • 46. OPAs in 5 steps... Three oligos are designed for each SNP locus. Two are allele-specific oligos (ASOs), one for each allele of the SNP site; the third is a locus-specific oligo (LSO).
  • 47. OPAs in 5 steps... Several wash steps remove excess and mis-hybridized oligos. Extension of the appropriate ASO and ligation to the LSO joins information about the genotype to the address sequence on the LSO.
  • 48. OPAs in 5 steps... The single-stranded, dye-labeled DNAs are hybridized to their complement bead type through their unique address sequences.
  • 49. OPAs in 5 steps... Key to the assay: scalable, multiplexing sample preparation (one-tube reaction); highly parallel array-based read-out; high-quality data, with average call rates above 99%.
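
The coverage profile on slide 11 is standard pileup text: sequence name, position, reference base, depth, read bases and base qualities. As a rough illustration of how such lines can be tallied, here is a minimal Python sketch. It is not the authors' pipeline; the function names `count_pileup_bases` and `parse_pileup_line` are hypothetical, and the sketch assumes the usual pileup read-base encoding ('.' and ',' for reference matches, letters for mismatches, '^' plus one character at a read start, '$' at a read end, '+N'/'-N' followed by N bases for indels).

```python
import re


def count_pileup_bases(bases: str) -> dict:
    """Count reference matches and mismatches in a pileup read-base string."""
    counts = {"ref": 0, "alt": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # '^' marks a read start and is followed by one mapping-quality char.
            i += 2
            continue
        if c in "+-":
            # Indel: '+' or '-' followed by a length and that many bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                i += 1  # malformed indel marker; skip the sign only
                continue
            i += 1 + len(m.group()) + int(m.group())
            continue
        if c in ".,":
            counts["ref"] += 1  # match on forward (.) or reverse (,) strand
        elif c.upper() in "ACGTN":
            counts["alt"] += 1  # mismatch on either strand
        # '$' (read end), '*' (deleted base) and anything else are not counted.
        i += 1
    return counts


def parse_pileup_line(line: str) -> dict:
    """Split one pileup line into named fields and add the base tallies."""
    chrom, pos, ref, depth, bases = line.split()[:5]
    record = {"chrom": chrom, "pos": int(pos), "ref": ref, "depth": int(depth)}
    record.update(count_pileup_bases(bases))
    return record


if __name__ == "__main__":
    # One line taken from the slide 11 example output.
    print(parse_pileup_line("P1 244528 C 10 .$.$.,.,,,.. >>RaZ`aaaa"))
```

A position where the mismatch count is a substantial fraction of the depth is a candidate SNP, but the actual calling in this talk is done by the likelihood model in samtools mpileup and bcftools, not by raw counts like these.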

Editor's Notes

  1. All three oligo sequences contain regions of genomic complementarity and universal PCR primer sites; the LSO also contains a unique address sequence that targets a particular bead type. Up to 1,536 SNPs may be interrogated simultaneously in this manner. During the primer hybridization process, the assay oligos hybridize to the genomic DNA sample bound to paramagnetic particles. Because hybridization occurs prior to any amplification steps, no amplification bias can be introduced into the assay.
  2. Extension of the appropriate ASO and ligation of the extended product to the LSO joins information about the genotype present at the SNP site to the address sequence on the LSO. Allele-specific primer extension (ASPE) is used to preferentially extend the correctly matched ASO (matched at its 3' end) up to the 5' end of the LSO primer.
  3. There is a one-to-one mapping between an address sequence on the array and the locus being scored. As a result of this labeling scheme, the PCR product consists of double-stranded DNA of which one strand, containing the complement to the Illumicode, is labeled with either Cy3 or Cy5 in an allele-specific manner, and a complementary strand is labeled with biotin. The biotinylated strand is removed and the single, fluorescently labeled strand is hybridized to the BeadArray.