6. Florence Servant, Syngenta
Transcript of "6. Florence Servant, Syngenta"

  1. Scaling up the Marker Assay Development in Plant Species. March 2013. Florence SERVANT
  2. Introduction: Molecular Assisted Selection
  3. Definition: Marker assisted selection (MAS) refers to the use of DNA markers in breeding. DNA markers are used as a substitute for, or to assist, phenotypic screening.
     ● The good plants are kept to serve as the parents for the next generation
     ● The bad plants can be disposed of, saving space and time in the greenhouse
     Assumption: DNA markers can reliably predict phenotype; they are tightly linked to target loci.
  4. Example: Marker Assisted Backcrossing
     [Figure labels: "> 6 generations"; "2-4 generations saved"]
     Source: http://www.knowledgebank.irri.org/ricebreedingcourse/Marker_assisted_breeding.htm
  5. Marker Trait Association Discovery Process (simplistic view)
     Steps: Identify right germplasm; Select one or more approach(es); Collect data; Discover MTA
     Approaches: QTL Mapping, Association Mapping, BSA, other
     Data collected: Trait/Phenotype data, Marker/Genotype data
     Outcome of the analysis: Marker Trait Association (MTA)
  6. Marker Assay Development for MAS
     [Diagram: Marker Assisted Selection needs Molecular Marker Assays; Molecular Marker Assays need Polymorphisms. Related boxes: Polymorphisms Discovery, Polymorphisms Selection]
  7. More Definitions
     ● SNP Marker Assay: Genotyping assay developed from a Single Nucleotide Polymorphism (SNP)
     ● Trait Linked SNP Marker Assay: Genotyping assay developed from a SNP linked to a trait
     ● Anonymous SNP Marker Assay: Genotyping assay developed from a SNP with unknown linkage to trait or gene
       They are used for:
       • Mapping
       • Fingerprinting / Germplasm Management
       • Association Studies
  8. Genotyping Technologies
  9. Why SNP Markers?
     ● Abundant
     ● Simple to discover
     ● Fast and cheap to genotype using:
       - Genotyping by sequencing
       - Different assay technologies
  10. Genotyping Technologies: "Towards Higher Throughput"
      [Timeline figure, 2007 to 2014]
      ● Earlier marker types: RFLP, RAPD, SSR
      ● Phase 1: Illumina GoldenGate (384-1536 plexes)
      ● Phase 2: Illumina Infinium (3K-200K plexes)
      ● Phase 3: Affymetrix Axiom? (1.5K-2.6M plexes)
      ● "Low" throughput: TaqMan (MGB & BHQ), Kaspar?
      ● Other label on the figure: Affymetrix Genotyping Chips
  11. Why SNP Markers?
      ● Abundant
      ● Simple to discover
      ● Fast and cheap to genotype using:
        - Genotyping by sequencing
        - Different assay technologies
      ● But bi-allelic (assay technology constraint)
        - Haplotypes
        - SNP tagging
  12. How SNP Markers are Genotyped
      ● Assay-free methods:
        - "Genotyping by sequencing"
        - Increasingly used when many markers are genotyped at once
        - Allows marker discovery
      ● SNP assay technologies:
        - Routinely used today
        - Cost efficient when used on many samples (>10K per project)
        - Main technologies:
          • TaqMan (BHQ and MGB)
          • Illumina GoldenGate
          • KASPar
          • Illumina Infinium
          • Affymetrix Axiom
  13. How SNP Markers are Genotyped
      [Figure: number of polymorphisms (y-axis) against number of samples (x-axis), both on log scales from 1 to 10^8. Applications, from the highest number of polymorphisms to the lowest:]
      ● NGS: Solexa or alternative technologies
      ● SNP/Gene, high density: Infinium or alternative technologies
      ● Germplasm fingerprinting: Infinium or alternative technologies
      ● Marker implementation: TaqMan type technologies
      ● Genetic purity: TaqMan type technologies
  14. HD Genotyping Chip Creation
      [Diagram components:]
      ● SNP Discovery Phase 1, Phase 2 and Phase 3
      ● Non Redundant Polymorphism DB
      ● SNP Selection for HD Genotyping Chip A; SNP Selection for HD Genotyping Chip B
      ● HD chip creation and ordering
      ● Useful SNP Markers
      ● Genomic & Genetic Annotation
      ● Genotyping LIMS
      ● Selection inputs: Designability, Cost Drivers
  15. SNP Selection Criteria
      ● Technical Criteria
        - Technology compatibility (designability score)
        - Technology sub-type (e.g. Infinium I and II)
        - Cross-technology compatibility
        - Assay information from the genotyping labs
      ● Biological Criteria
        - Minor Allele Frequency (MAF) or Polymorphism Information Content (PIC)
        - Genome distribution (based on physical or genetic maps)
        - Polymorphic within subset
        - SNP tagging
  16. Bioinformatics Context
      [Diagram components:]
      ● Uses: Fingerprinting projects (Anonymous Markers), Trait Linked Markers, Association Studies
      ● Inputs: Internal pipeline, External sources, VCF Files
      ● Tools: VCF Loader, Standard Scripts, Perl, Design Primer, "Species Translator", HPC
      ● Emerging: Text Mining, Pathway Analysis, Comparative Genomics
      ● Polymorphism DB (non redundant, stable ID)
      ● Assay Technology "Designability"
      ● Genomic & Genetic Data
      ● SNP Selection
      ● Genotyping LIMS (data volumes)
  17. Pipeline of Alignment Analysis
  18. What is a Universal Marker Assay? Technical Constraints
      Target sequence:
      …GGCTTAGCTTACTCGGCTTGCTGATGCTAGTGGCTAGG[A/C]TAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGACCGATCTAGCTT…
      [Figure: oligos and technology labels, in slide order:]
      TACTCGGCTTGC  GGCTAGGATAGCTGA  TCGTAGATAAGCTAA  GGCTAGGCTAGCTGA  TaqMan  GoldenGate
      CTGATGCTAGTGGCTAGGA  AGCTAACTAGCCTGGCTAGA  KASPar
      CTGATGCTAGTGGCTAGGC
      TAGCTTACTCGGCTTGCTGATGCTAGTGGCTAGGA  Infinium I
      TAGCTTACTCGGCTTGCTGATGCTAGTGGCTAGGT
      TAGCTTACTCGGCTTGCTGATGCTAGTGGCTAGG  Infinium II
      Use the conserved region to define primers/probes.
      Rules to format footprints:
      ● Replace indels by 'N'
      ● Replace other SNPs by IUPAC codes (or 'N' in TaqMan)
  19. What is a Universal Marker Assay? Technical Constraints
      [Figure annotations: "SNPs to target"; "SNPs only found in Wild"; "Usable region for primer"; "NOT USABLE region for primer"]
      Alignment (MSA):
      LIN1  …GGCTTAGCTTAGTCGGCTAGCTGATGCTAGTGGCTAGGATAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGATCGACCTAG…
      LIN2  …GGCTTAGCTTAGTCGGCT----GATGCTAGTGGCTAGGCTAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGATCGACCTAG…
      LIN3  …GGCTTAGCTTACTCGGCT----GATGCTAGTGGCTAGGATAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGATCGACCTAG…
      WILD1 …GGCTTAGCTTACTCGGCT----GATGCTAGTGGCTAGGATAGCTGATCGATGCTATTGCATCTATTCGATTGATCGGATCGACCTAG…
      WILD2 …GGCTTAGCTTACTCGGCT----GATGCTAGTGGCTAGGATAGCTGATCGATGCTATTGCATCTATTCGATTGATCGGATCGACCTAG…
      WILD3 …GGCTTAGCTTACTCGGCTAGCTGATGCTAGTGGCTAGGCTAGCTGATCGATGCTATTGCATCTATTCGATTGATCGGATCGAGCTAG…
      Putative marker assays (without WILD species):
      …GGCTTAGCTTA[G/C]TCGGCTNNNNGATGCTAGTGGCTAGG M TAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGACCGATCTAG…
      …GGCTTAGCTTA S TCGGCTNNNNGATGCTAGTGGCTAGG[A/C]TAGCTGATCGATGCTATAGCATCTATTCGATTGATCGGACCGATCTAG…
      Putative marker assays (with WILD species):
      …GGCTTAGCTTA[G/C]TCGGCTNNNNGATGCTAGTGGCTAGG M TAGCTGATCGATGCTATWGCATCTATTCGATTGATCGGAHCGATCTAG…
      …GGCTTAGCTTA S TCGGCTNNNNGATGCTAGTGGCTAGG[A/C]TAGCTGATCGATGCTATWGCATCTATTCGATTGATCGGAHCGATCTAG…
  20. Genomic Sequencing
  21. More Plant Genomes: Rate of Publication
      [Figure: Published Plant Genomes by year, 1998 to 2014; y-axis 0 to 50]
      Source: http://genomevolution.org/wiki/index.php/Sequenced_plant_genomes
  22. Sequencing Technologies
      [Figure: kilobases per day per machine (log scale) by year, 1975 to 2015. Labels on the figure: Gel-based systems, Sanger, Capillary sequencing, Massively parallel sequencing, 454, Solexa, SOLiD, IonTorrent, PacBio]
  23. Current Challenges
  24. Current Challenges
      ● Genotyping by Sequencing:
        - Algorithm development
        - Data integration (genotyping warehouse)
  25. Different Genotypes for Association Studies
      [Diagram: Genotyping by Sequencing (GbS), Assay based Genotyping (AbG) and Discovered Polymorphisms feed a Consolidated Genotype Database, which is used for Association Studies]
  26. Current Challenges
      ● Genotyping by Sequencing:
        - Algorithm development
        - Data integration (genotyping warehouse)
      ● New variation types:
        - Structural variations (intra-species comparisons)
        - Long insertions/introgressions (reference-guided de novo assembly)
        - Targeting Induced Local Lesions IN Genomes (TILLING) mutants
      ● Genetics Data Integration:
        - Genotyping QC to detect inefficient assays
        - Improve exploitation of MTA Global Data Management
      ● Genomics Data Integration:
        - Genome annotation for SNP selection
        - Multi-reference genome support (Pan-genomes?)
  27. Knowledge Management Pyramid
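
The footprint rules on slides 18 and 19 (replace indels by 'N', replace other SNPs by IUPAC codes, or by 'N' for TaqMan) and the MAF criterion on slide 15 can be illustrated with a short Python sketch. This is not the pipeline described in the talk: the function names `minor_allele_frequency` and `format_footprint`, the `taqman` flag and the toy alignment are all hypothetical, and the code assumes uppercase A/C/G/T sequences of equal length with '-' marking gaps.

```python
from collections import Counter

# IUPAC ambiguity codes, keyed by the alphabetically sorted bases seen in a column.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AC": "M", "AG": "R", "AT": "W", "CG": "S", "CT": "Y", "GT": "K",
    "ACG": "V", "ACT": "H", "AGT": "D", "CGT": "B", "ACGT": "N",
}


def minor_allele_frequency(column):
    """Frequency of the second most common base in a column (gaps ignored)."""
    counts = Counter(base for base in column if base != "-")
    if len(counts) < 2:
        return 0.0  # monomorphic (or empty) column
    total = sum(counts.values())
    return counts.most_common(2)[1][1] / total


def format_footprint(aligned_seqs, target_col, taqman=False):
    """Build an assay footprint from equal-length aligned sequences.

    target_col is the 0-based alignment column of the SNP to assay.
    Columns containing a gap become 'N'; other polymorphic columns become
    an IUPAC code, or 'N' when taqman is True.
    """
    footprint = []
    for i, column in enumerate(zip(*aligned_seqs)):
        bases = sorted(set(column) - {"-"})
        if "-" in column:
            footprint.append("N")  # indel position
        elif i == target_col:
            footprint.append("[" + "/".join(bases) + "]")  # SNP to target
        elif len(bases) > 1:
            footprint.append("N" if taqman else IUPAC["".join(bases)])
        else:
            footprint.append(bases[0])  # conserved position
    return "".join(footprint)


# Hypothetical toy alignment of three lines (not the sequences from the slides).
lines = ["ACGTAGCT", "ACCT-GCT", "ACGT-GAT"]
print(format_footprint(lines, target_col=2))
print(format_footprint(lines, target_col=2, taqman=True))
print(minor_allele_frequency([seq[2] for seq in lines]))
```

Adding or removing sequences from the input list changes which positions are masked, which mirrors the slide 19 comparison of footprints built with and without the WILD species.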
