TLI 2012: Data flows in integrated breeding


Published in: Technology

  1. Data Flows in Integrated Breeding. Graham McLaren
  2. Principles of data management (DM) for Integrated Breeding (IB): IB requires high standards of sample and pedigree identification and the integration of field and lab data, and data quality is of paramount importance. Data collected during breeding processes has immediate value for breeders, and it also has cumulative value over years and populations.
  3. Information Cycle for Crop Improvement
     [Diagram] Labels recoverable from the slide:
       • Genetic Resources Information Systems and Genomics and Genetics Databases: public crop information accessible via internet
       • Crop Lead Centers: curation, integration and publication of public crop information
       • Community of Practice: breeding informatics, shared information management practices
       • Institutional CIS, National CIS, Project CIS, Private CIS
       • ARI, NARS, Networks, SMEs: each with a Local CIS
  4. Compatibility of DM Schemes: Users may have existing DM systems which need to be accommodated. DM needs to be compatible across all members working on the same project. Use of analysis and decision support tools and sharing of data with partners require data to be formatted and stored in defined ways. Training and support in DM and analysis are essential for IB projects.
  5. Breeding Data Flows
     [Diagram] Labels recoverable from the slide:
       • Breeding Partner 1, 2, 3, ... n: Local Breeding Management Database (project data)
       • Breeding Project 1, 2, 3, ... n: Project Database (project shared data)
       • Crop Lead Center: Public Central Crop Information Database (shared and published data)
       • Flows: copy of project database, update to project database
     Data manager (DM):
       • Database management
       • Breeding logistics
       • Fieldbook preparation
       • Data entry/checking
       • Data management
     Project data curator:
       • QA for project data
       • Curation and integration
       • Distribution to partners
       • Project Trait Dictionary
       • Fieldbook Templates
       • Update to public DB
       • Download of public DB
       • Training of partner DMs
     Central DB curator:
       • QA for public data
       • Curation and integration
       • Distribution to projects
       • Publication on Internet
       • Global Trait Dictionary
       • Catalogue of Templates
       • Training of DMs and Curators
  6. Interaction of breeding workflow and platform elements
     [Diagram] Key recoverable from the slide:
       • Information System: ST = Sample Tracking; PIM = Pedigree Information; LIMS = Laboratory Information; FDM = Field Data; A&DS = Analysis & Decision Support
       • Platform Services: GRSS = Genetic Resource Service; MSL = Marker Service; TSL = Trait Service
     Breeding workflow, in order:
       • Genetic Resources: high density genotyping and phenotypic characterization
       • Choose parental material based on haplotype values, known genes, traits and adaptation
       • Parental Material
       • Develop crossing scheme based on genotype and phenotype compatibility
       • Crossing Block (pedigree information updated)
       • Nursery 1: high density genotyping and phenotypic evaluation
       • Selection of lines based on QTL analysis / estimation of marker breeding values
       • Nursery 2 (pedigree information updated): n cycles of selection and recombination, marker genotyping
       • Selection on index of marker values
       • Evaluation Trials: multi-location testing
       • Selection of improved lines based on trait improvement and adaptation
       • Cultivars and improved breeding lines (pedigree information updated), feeding Public Crop Information
  7. The IBP Configurable Workflow System
     Breeding Activities:
       • Project Planning: open project, specify objectives, identify team, data resources, define strategy
       • Germplasm Management: parental selection, crossing, population development
       • Germplasm Evaluation: experimental design, fieldbook production, data collection, data loading
       • Molecular Analysis: marker selection, fingerprinting, genotyping, data loading
       • Data Analysis: quality assurance, trait analysis, genetic analysis, QTL analysis, index analysis
       • Breeding Decisions: selected lines, recombines, recombination plans
     Breeding Applications:
       • Breeding Project Planning: cross prediction and strategic simulation
       • Breeding Management System: breeding nursery and pedigree record management
       • Field Trial Management System: trial field book and environment characterization system
       • Genotypic Data Management System: lab book, quality assurance and diversity analysis
       • Analytical Pipeline: statistical analysis applications and selection indices
       • Decision Support System: MB design tool, MABC, MAS, MARS and GWS
  8. The Breeding Management System
     [Diagram] Labels recoverable from the slide:
       • Functions: nursery management, pedigree maintenance, seed inventory
       • Characterization lists: passed via ST to the Genotypic Data Management System
       • Evaluation lists: passed to the Field Trial Management System
  9. Sample Tracking (ST)
  10. Genotyping Data Management System
      [Diagram] Labels recoverable from the slide:
        • Input from the Breeding Management System: characterization lists (via ST)
        • Functions: planting list, sample list, LIMS, genotyping data, quality assurance
        • Data transformation for the Analytical Pipeline: genotyping database, application file formats
  11. Tracking Genotyping Samples (ST)
  12. LIMS Genotyping order form
  13. LIMS Genotyping results
  14. Field Trial Management System
      [Diagram] Labels recoverable from the slide:
        • Input from the Breeding Management System: evaluation lists
        • Input from the CWS Configuration System: trait templates
        • Functions: fieldbook preparation (experimental design and randomization); data collection (hand-held devices, automatic measurement); environmental characterization; quality assurance; phenotyping data
        • Data transformation for the Analytical Pipeline: phenotyping database, application file formats
  15. The Trial Template
  16. Analytical Pipeline
      [Diagram] Labels recoverable from the slide:
        • Input from the Genotypic Data Management System: genotyping data
        • Input from the Field Trial Management System: phenotyping data
        • Functions: genotyping QA, diversity analysis, genetic mapping, phenotyping QA, single site analysis, multi site analysis, GxE analysis, QTL analysis, QTLxE analysis
        • Outputs to the Decision Support Tools: diversity scores, pedigree trees, COP matrices, phenotype means, genotype BLUPs, stability measures, adaptation scores, marker scores, genetic distance, genetic maps, QTL estimates
  17. LIMS Genotyping scores
  18. Decision Support and Simulation
      [Diagram] Labels recoverable from the slide:
        • Inputs from the Analytical Pipeline: diversity scores, pedigree trees, COP matrices, phenotype means, genotype BLUPs, stability measures, adaptation scores, marker scores, genetic distance, genetic maps, QTL estimates
        • Decision Support Tools: MBDT, breeding indices, OptiMas
        • Simulation Tools: QuLine, QuHybrid, QuMARS, QuGene
        • Simulation inputs: genetic models, GE systems, breeding methods
        • Breeding Decisions: germplasm lists for characterization, foreground markers, background markers, target genotypes, donor germplasm, recipient germplasm, ranked germplasm, selection lists, parental lists, crossing schemes, population sizes, selection intensity, marker densities, selection schemes, trait selection, GE targeting, optimal breeding systems
  19. ICIS COP matrix

      Lower Triangular part of Coefficient of Parentage Matrix

      ROWID  COLID  ROWNO  COLNO  COP      Optional Labels
      50533  50533  1      1      0.9577   "IR 64" "IR 64"
      70125  50533  2      1      0.2231   "IR 72" "IR 64"
      70125  70125  2      2      0.9896   "IR 72" "IR 72"
      11105  50533  3      1      0.1872   "IR 36" "IR 64"
      11105  70125  3      2      0.5108   "IR 36" "IR 72"
      11105  11105  3      3      0.9478   "IR 36" "IR 36"

      Lower Triangular part of Inverse Coefficient of Parentage Matrix

      ROWID  COLID  ROWNO  COLNO  INV-COP     Optional Labels
      50533  50533  1      1       1.1113776  "IR 64" "IR 64"
      70125  50533  2      1      -0.1900738  "IR 72" "IR 64"
      70125  70125  2      2       1.4324875  "IR 72" "IR 72"
      11105  50533  3      1      -0.1170834  "IR 36" "IR 64"
      11105  70125  3      2      -0.7344297  "IR 36" "IR 72"
      11105  11105  3      3       1.4739708  "IR 36" "IR 36"
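
The lower-triangular listing on slide 19 can be expanded into full symmetric matrices and checked for consistency. The following is a minimal sketch, not ICIS code: the helper names `to_symmetric` and `matmul` are hypothetical, and only the ROWNO, COLNO and value columns from the slide are used. It multiplies the COP matrix by the listed inverse and reports how far the product is from the identity matrix.

```python
# Lower-triangular records copied from the slide: (ROWNO, COLNO, value).
COP = [(1, 1, 0.9577), (2, 1, 0.2231), (2, 2, 0.9896),
       (3, 1, 0.1872), (3, 2, 0.5108), (3, 3, 0.9478)]
INV_COP = [(1, 1, 1.1113776), (2, 1, -0.1900738), (2, 2, 1.4324875),
           (3, 1, -0.1170834), (3, 2, -0.7344297), (3, 3, 1.4739708)]


def to_symmetric(records, n):
    """Expand lower-triangular (row, col, value) records into a full n x n matrix."""
    full = [[0.0] * n for _ in range(n)]
    for row, col, value in records:
        full[row - 1][col - 1] = value
        full[col - 1][row - 1] = value  # mirror across the diagonal
    return full


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


cop = to_symmetric(COP, 3)
inv = to_symmetric(INV_COP, 3)
product = matmul(cop, inv)

# Largest absolute deviation of the product from the identity matrix.
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(3) for j in range(3))
print(f"max deviation from identity: {max_error:.6f}")
```

Storing only the lower triangle halves the file size for a symmetric matrix, at the cost of a mirroring step like the one in `to_symmetric` when the data are loaded.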
  20. Flapjack QTL Information File
      Compulsory Fields: QTL, Chromosome, Position, Minimum, Maximum, Trait, Experiment
      Optional Fields: AddEffects, AddSE, Minlog10(P), %VarExplained, PosMinFM, PosMaxFM, LFM, RFM
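
As an illustration of the compulsory and optional fields on slide 20, the sketch below checks records for missing compulsory fields and writes them out as a table. Only the field names come from the slide. The tab delimiter, the header row, the function names and the sample record are all assumptions made for this example, so the output should not be read as the confirmed Flapjack file layout.

```python
import csv
import io

# Field names as listed on the slide.
COMPULSORY = ["QTL", "Chromosome", "Position", "Minimum", "Maximum",
              "Trait", "Experiment"]
OPTIONAL = ["AddEffects", "AddSE", "Minlog10(P)", "%VarExplained",
            "PosMinFM", "PosMaxFM", "LFM", "RFM"]


def missing_compulsory(record):
    """Return the compulsory fields that are absent or empty in a record."""
    return [f for f in COMPULSORY if record.get(f) in (None, "")]


def write_qtl_table(records, optional=()):
    """Serialize records as delimited text.

    The tab delimiter and header row are assumptions for this sketch.
    """
    unknown = [f for f in optional if f not in OPTIONAL]
    if unknown:
        raise ValueError(f"not an optional field on the slide: {unknown}")
    for record in records:
        missing = missing_compulsory(record)
        if missing:
            raise ValueError(f"record lacks compulsory fields: {missing}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COMPULSORY + list(optional),
                            delimiter="\t", lineterminator="\n",
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record; none of these values come from the presentation.
demo = {"QTL": "qtl_demo_1", "Chromosome": "1", "Position": 12.5,
        "Minimum": 10.0, "Maximum": 15.0,
        "Trait": "demo_trait", "Experiment": "demo_experiment"}
qtl_text = write_qtl_table([demo], optional=["AddEffects"])
print(qtl_text)
```

Validating before writing means an incomplete record stops the export rather than producing a file that fails later at import time.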
  21. Flapjack Map Data: The map file should contain information on the markers, the chromosome they are on, and their position within that chromosome. The markers do not need to be in any particular order, as Flapjack will group and sort them by chromosome and distance once they are loaded.
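
The grouping and sorting described on slide 21 can be sketched in a few lines. This is not Flapjack's own code: the function name `group_and_sort` and the marker records are hypothetical, and chromosome names are compared as plain strings, which is an assumption that would need adjusting for names such as "10".

```python
from collections import defaultdict

# Hypothetical, deliberately unordered records: (marker, chromosome, position).
markers = [("m3", "2", 14.0), ("m1", "1", 5.5),
           ("m4", "1", 0.0), ("m2", "2", 3.2)]


def group_and_sort(records):
    """Group markers by chromosome, then order each group by map position."""
    by_chromosome = defaultdict(list)
    for marker, chromosome, position in records:
        by_chromosome[chromosome].append((position, marker))
    # Chromosomes are ordered by name; markers within each by position.
    return {chromosome: [marker for _, marker in sorted(entries)]
            for chromosome, entries in sorted(by_chromosome.items())}


ordered = group_and_sort(markers)
print(ordered)
```

Because the ordering is rebuilt at load time, the input records can be listed in whatever order the genotyping lab exported them.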
  22. Breeding program designer
      Colour key:
        • Blue/gray: strategy
        • Green: generation
        • Yellow: selection round
        • Pink/red: trait selection step
      Buttons:
        • +: add new object at next level
        • X: delete object
        • Clone object
      Usage:
        • To start, open ‘BreedingProgram.jar’
        • Can create/drag/drop any new objects anywhere
        • Use left mouse click to drag any piece and drop it on a higher hierarchy level
        • Use centre mouse click to zoom
        • Edit in list/value boxes to set parameters
      Scott Chapman
  23. Available breeding simulation tools
        • QuLine: software that simulates breeding programs for developing inbred lines
        • QuHybrid: software that simulates breeding programs for developing hybrids
        • QuMARS: software that simulates marker-assisted recurrent selection and genome-wide selection
      Jiankang Wang
  24. What can QuLine do?
      Comparison of genetic gains from different selection methods:
        • Change in population mean
        • Change in gene frequency
        • Change in Hamming distance (distance of a selected genotype to the target genotype)
      Comparison of cross performance:
        • Selection history
        • Rogers’ genetic distance
        • Number of lines retained from each cross
      Comparison of cost efficiency:
        • Number of families
        • Individual plants per generation
      Validation of theories
      Jiankang Wang
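
Two of the measures named on slide 24, Hamming distance to a target genotype and gene frequency, can be illustrated with a small sketch. This is not QuLine's implementation. The encoding (one allele character per locus), the function names, the toy population and the selection rule (keep the two genotypes closest to the target) are all assumptions chosen to keep the example short.

```python
def hamming_distance(genotype, target):
    """Count the loci at which a genotype differs from the target genotype."""
    if len(genotype) != len(target):
        raise ValueError("genotype and target must cover the same loci")
    return sum(1 for a, b in zip(genotype, target) if a != b)


def allele_frequency(population, locus, allele):
    """Fraction of genotypes carrying the given allele at one locus."""
    return sum(1 for g in population if g[locus] == allele) / len(population)


def mean_distance(population, target):
    """Average Hamming distance of a population to the target genotype."""
    return sum(hamming_distance(g, target) for g in population) / len(population)


# Hypothetical data: one character per locus, target is all "A".
target = "AAAA"
population = ["AABB", "ABAB", "BBBB", "ABBA"]

# Assumed selection rule: keep the two genotypes closest to the target.
selected = sorted(population, key=lambda g: hamming_distance(g, target))[:2]

before = mean_distance(population, target)
after = mean_distance(selected, target)
print(f"mean Hamming distance: {before} -> {after}")
print(f"frequency of A at locus 0: "
      f"{allele_frequency(population, 0, 'A')} -> "
      f"{allele_frequency(selected, 0, 'A')}")
```

A falling mean distance and a rising frequency of the target allele are the kinds of changes the slide lists for comparing selection methods.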
  25. Integrating the applications of the Configurable Workflow System
      [Diagram] Labels recoverable from the slide:
        • Breeding Management System: nursery management, characterization lists, pedigree maintenance, evaluation lists, seed inventory
        • Genotypic Data Management System: planting list, sample list, LIMS, genotyping data, quality assurance
        • Field Trial Management System: fieldbook preparation; data collection (hand-held devices, automatic measurement); environmental characterization; quality assurance; phenotyping data
        • Analytical Pipeline: genotyping QA, diversity analysis, genetic mapping, phenotyping QA, single site analysis, multi site analysis, GxE analysis, QTL analysis, QTLxE analysis
        • Decision Support Tools: MBDT, breeding indices, OptiMas
        • Simulation Tools: QuLine, QuHybrid, QuMARS, QuGene
        • Databases shown beneath the applications: GMS, DMS, GDMS