solGS: Web-based Genomic
Selection Analysis Tool
Purposes
Gain an understanding of genomic
selection, GS model building, breeding
value prediction, and assessment of model
input and output data quality.
Brainstorm ideas to make the tool
better suit your research purposes.
Outline
 Overview
 GS and solGS
 Demo
 Exercise, bug watching, feedback
 Brainstorming
Phenotyped
&
genotyped individuals
Genomic selection…
Prediction model
Predicted
breeding
Values (GEBVs)
Genotyped selection
candidates
Training population
GS advantages
 Little or no phenotyping
 reduced cost
 Shorter breeding cycles
 Higher selection gain per unit time
 Increased prediction accuracy
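The "gain per unit time" point is usually expressed with the breeder's equation. The form below is the standard textbook statement, not something taken from the slides; the symbols are the conventional ones.

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
```

Here $i$ is the selection intensity, $r$ the selection accuracy, $\sigma_A$ the additive genetic standard deviation, and $L$ the length of one breeding cycle. Selecting on GEBVs without waiting for phenotypes mainly shortens $L$, which raises the gain per unit time even when $r$ is unchanged.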
Challenges…
 Data volume, storage
 Data structuring, cleaning, imputation
 Statistical analysis complexity
 visualization and sharing
solGS
http://cassavabase.org/solgs
What you can do with solGS…
 Store data
 Chado Natural Diversity schema
 Create training dataset
 Build models and predict breeding
values of selection candidates
 Test model accuracy
What you can do with solGS…
 Explore phenotype data
 Evaluate population structure
 Check on relationship between
GEBVs and observed phenotypes
 Calculate selection indices, correlation
 Visualize data on interactive plots
 Calculate selection response
What is the statistical approach
behind solGS?
…preparing phenotype data
 Omits individuals with no phenotype values
 Adjusts phenotype values for block
effects
 Averages across multiple trials after
adjusting for block effects
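The three phenotype steps above can be sketched as follows. This is a minimal illustration, not the solGS code: the record layout `(trial, block, clone, value)`, the helper name `prepare_phenotypes`, and the simple block adjustment (block mean minus trial mean) are all assumptions; the tool itself may fit a proper mixed model.

```python
from collections import defaultdict
from statistics import mean


def prepare_phenotypes(records):
    """records: iterable of (trial, block, clone, value); value may be None.

    Returns {clone: adjusted mean}. Hypothetical helper, not the solGS API.
    """
    # Step 1: drop missing observations; clones with no values disappear.
    obs = [r for r in records if r[3] is not None]

    # Step 2: adjust for block effects. The block effect is estimated here
    # as block mean minus trial mean (an assumed, simplified adjustment).
    by_trial = defaultdict(list)
    by_block = defaultdict(list)
    for trial, block, _, value in obs:
        by_trial[trial].append(value)
        by_block[(trial, block)].append(value)

    per_clone_trial = defaultdict(list)
    for trial, block, clone, value in obs:
        block_effect = mean(by_block[(trial, block)]) - mean(by_trial[trial])
        per_clone_trial[(clone, trial)].append(value - block_effect)

    # Step 3: average within each trial, then across trials.
    per_clone = defaultdict(list)
    for (clone, _), values in per_clone_trial.items():
        per_clone[clone].append(mean(values))
    return {clone: mean(values) for clone, values in per_clone.items()}
```

For example, a clone recorded only with missing values is absent from the returned dictionary, and two clones grown in the same two blocks keep their difference while the block means are removed.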
…preparing genotype data
 Removes monomorphic markers
 Removes markers with > 60% missing
values
 Removes markers with MAF < 5%
 Removes individuals with > 80%
missing values
 Imputes missing marker data
 Median substitution
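The genotype filters above can be sketched as follows. The thresholds come from the slide; everything else is an assumption made for illustration: markers are coded as allele dosages 0/1/2 with `None` for missing, the function name `clean_genotypes` is hypothetical, and the per-individual missing rate is computed on the markers that survive filtering.

```python
from statistics import median


def clean_genotypes(geno, max_marker_missing=0.60, min_maf=0.05,
                    max_indiv_missing=0.80):
    """geno: {individual: [dosage 0/1/2 or None, ...]}, rows of equal length.

    Returns (cleaned {individual: [dosage, ...]}, kept marker indices).
    """
    names = list(geno)
    n_markers = len(geno[names[0]])
    keep = []
    for j in range(n_markers):
        calls = [geno[n][j] for n in names if geno[n][j] is not None]
        missing = 1 - len(calls) / len(names)
        if not calls or missing > max_marker_missing:
            continue                      # too many missing values
        if len(set(calls)) == 1:
            continue                      # monomorphic marker
        freq = sum(calls) / (2 * len(calls))   # frequency of counted allele
        if min(freq, 1 - freq) < min_maf:
            continue                      # minor allele too rare
        keep.append(j)

    cleaned = {}
    for n in names:
        row = [geno[n][j] for j in keep]
        if keep and row.count(None) / len(keep) > max_indiv_missing:
            continue                      # individual with too little data
        cleaned[n] = row

    # Median substitution per marker, using the retained individuals.
    for idx in range(len(keep)):
        calls = [r[idx] for r in cleaned.values() if r[idx] is not None]
        if not calls:
            continue                      # nothing to impute from
        med = median(calls)
        for r in cleaned.values():
            if r[idx] is None:
                r[idx] = med
    return cleaned, keep
```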
…statistical modeling
 Univariate
 Two-stage analysis
 RR-BLUP
 Endelman, Plant Genome (2011)
 GBLUP
 Marker-based realized relationship matrix
 Prediction accuracy
 Based on 10-fold cross-validation
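The modeling and accuracy steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the solGS implementation: the penalty `lam` is fixed by hand (the rrBLUP package estimates variance components from the data, which this sketch does not attempt), markers are uncentered dosages, and all function names are hypothetical. The kernel form is used so that `ZZ'` visibly plays the role of an unscaled marker-based relationship matrix, which is why ridge regression on markers and GBLUP give equivalent predictions.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_predict(z_train, y_train, z_test, lam=1.0):
    """Ridge (RR-BLUP-style) prediction with a fixed, assumed penalty."""
    mu = mean(y_train)
    n = len(y_train)
    # Solve (ZZ' + lam I) alpha = y - mu.
    k = [[dot(z_train[i], z_train[j]) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, [y - mu for y in y_train])
    # Predicted value of a new individual: mu + z_new Z' alpha.
    return [mu + sum(dot(z, z_train[i]) * alpha[i] for i in range(n))
            for z in z_test]


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cv_accuracy(z, y, folds=10, lam=1.0, seed=0):
    """Mean correlation of predicted vs observed values across folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    accs = []
    for f in range(folds):
        test = idx[f::folds]
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        pred = fit_predict([z[i] for i in train], [y[i] for i in train],
                           [z[i] for i in test], lam)
        accs.append(pearson(pred, [y[i] for i in test]))
    return mean(accs)
```

In a two-stage analysis, `y` would be the adjusted phenotype means from the first stage and `z` the cleaned marker matrix; `cv_accuracy(z, y)` then gives one accuracy figure per trait.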
How does solGS work?
Websites for exercise
 Cassava-devel.sgn.cornell.edu
 Cassava-test.sgn.cornell.edu
 Review.cassavabase.org
 Cassavabase.org
 https://iita-mirror.cassavabase.org
 https://172.30.2.199
 Username: sgn
 Password: eggplant
Phenotyped
&
genotyped individuals
Genomic selection steps…
Prediction model
Predicted
breeding
Values (GEBVs)
Selection
candidates
Training dataset
Demo: Part I
Create training data set & build model
Explore model input and output
Phenotype and genetic correlation
Population structure
Selection index
Things to consider when creating a
training data set & building a model
Things to consider…Phenotype data
 Number of phenotyped individuals
 Minimum 20 clones
 Relevant to target environment
 Data quality
 Experimental design
 Measurement accuracy
 Missing values
 outliers
Things to consider…Genotype data
 Marker number, genome distribution,
polymorphism
 Data quality
 Allele calling accuracy
 Missing values (per marker, per individual)
 Minor alleles
 Heterozygosity
 LD
 Population structure
Let’s do stuff!
Single trial – single trait
 Create training data set and build model
 Trial method
 Search for trial ‘Cassava Ibadan 2002/03’
 Create a training dataset with that trial
 Description, correlation
 Build a model for FRW
 Explore model input and output
 Model accuracy
 Download GEBVs
Exercise: single trial – single trait
 Create training data set and build model
 Search for your trial
 Create a training dataset with that trial
 Check description, correlation
 Build a model for your trait
 Explore model input and output
 Population structure
 Model accuracy
 Download GEBVs
Single trial – multiple traits
 Create training data set and build models
 Search for trial ‘Cassava Ibadan 2002/03’
 Create a training dataset with that trial
 Description, correlation
 Build models for FRW and CMDS
 Explore model input and output for each model
 Genetic correlation
 Selection index
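A selection index combines the GEBVs of several traits into one ranking. The sketch below assumes a weighted sum of standardized GEBVs with user-supplied relative weights; how solGS itself scales and weights the traits is not stated on the slides, so the formula and the name `selection_index` are illustrative. FRW and CMDS are the trait abbreviations used in the demo; treating CMDS as a trait where lower values are preferred (hence a negative weight) is an assumption.

```python
from statistics import mean, pstdev


def selection_index(gebvs, weights):
    """gebvs: {trait: {clone: GEBV}}; weights: {trait: relative weight}.

    Returns [(clone, index), ...] sorted from best to worst.
    """
    # Only clones that have a GEBV for every trait can be ranked.
    clones = sorted(set.intersection(*(set(v) for v in gebvs.values())))
    index = dict.fromkeys(clones, 0.0)
    for trait, values in gebvs.items():
        vals = [values[c] for c in clones]
        mu, sd = mean(vals), pstdev(vals)
        for c in clones:
            # Standardize so traits on different scales are comparable.
            index[c] += weights[trait] * (values[c] - mu) / sd
    return sorted(index.items(), key=lambda kv: kv[1], reverse=True)


# Usage: favour high FRW and low CMDS with equal relative weights.
ranked = selection_index(
    {"FRW": {"a": 1.0, "b": 0.0, "c": -1.0},
     "CMDS": {"a": -1.0, "b": 0.0, "c": 1.0}},
    {"FRW": 1.0, "CMDS": -1.0},
)
```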
Exercise: single trial – multiple traits
 Create training data set and build models
 Search for your trial
 Create a training dataset with that trial
 Check description, correlation
 Build models for two traits at the same time
 Explore model input and output for each model
 Genetic correlation
 Calculate and download selection index
Combined trials – single trait
 Create training data set and build models using two trials
 Search for ‘cassava ibadan 02/03 & 01/02’
 Create a training dataset with the trials
 Check description, correlation
 Build a model for FRW
 Explore model input and output for the model
 Population structure
 Prediction accuracy
 Download GEBVs
Exercise: combined trials – single trait
 Create training data set and build models using two trials
 Search for your trials
 Create a training dataset with the trials
 Check description, correlation
 Build a model for your trait
 Explore model input and output for the model
 Population structure
 Prediction accuracy
 Download GEBVs
Using list – single trait
 Create training data set and build a model using a plots list
 Using the search wizard, create a plots list from trial ‘cassava ibadan 2002/03 plots’
 Create a training dataset with the list
 Check description, correlation
 Build a model for FRW
 Explore model input and output for the model
 Population structure
 Prediction accuracy
 Download GEBVs
Exercise: Using list – single trait
 Create training data set and build a model using a plots list
 Using the search wizard, create a plots list from a trial; select all plots
 Create a training dataset with the list
 Check description, correlation
 Build a model for your trait
 Explore model input and output for the model
 Population structure
 Prediction accuracy
 Download GEBVs
Demo: Part II
Predict breeding values of selection
populations
Genetic correlation
Selection index
Selection gain
Things to consider when applying a
model to predict breeding values of
selection populations
Things to consider…applying the model
 Training population vs selection
population genetic relationship
 Target environment
 Marker types used
 Population structure
Predict GEBVs of a Selection population
 Create training data set & build model
 Cassava Ibadan 2002/03
 FRW
 Search for a selection population
 Cassava Ibadan 2003/04
 Predict GEBVs for the selection
population
 Check selection response
 Download GEBVs
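One simple way to read "selection response" from predicted GEBVs is to compare the mean GEBV of the candidates you would select with the mean of all candidates. The slides do not say how solGS computes it, so the definition below, the function name `selection_gain`, and the default selected fraction are assumptions for illustration only.

```python
from statistics import mean


def selection_gain(gebvs, fraction=0.1):
    """gebvs: {candidate: GEBV}. Returns mean of the top fraction minus
    the mean of all candidates (an assumed, simplified definition)."""
    ranked = sorted(gebvs.values(), reverse=True)
    n_selected = max(1, round(len(ranked) * fraction))
    return mean(ranked[:n_selected]) - mean(ranked)


# Usage: ten candidates, keep the best 20%.
candidates = {f"clone{i}": float(i) for i in range(10)}
gain = selection_gain(candidates, fraction=0.2)
```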
Exercise: Selection Population Prediction
 Create training data set & build model
 use one of the models you already built
 Search for a selection population
 Related to the training population
 Predict GEBVs for the selection
population
 Check selection response
 Download GEBVs
Multiple Traits: Predict GEBVs of a Selection population
 Create training data set & build model
 Cassava Ibadan 2002/03
 FRW, CMDS
 Search for a selection population
 Cassava Ibadan 2003/04
 Predict GEBVs for both traits for the
selection population
 Check selection response
 Download GEBVs
Exercise: Multiple Traits selection population
prediction
 Create training data set & build model
 Use previous two models from your training
populations
 Search for a selection population
 Predict GEBVs for both traits for the
selection population
 Check genetic correlation
 Calculate selection index
List: Predict GEBVs of a Selection population
 Create training data set & build model
 Cassava Ibadan 2002/03
 FRW
 Search for a selection candidates list
 Cassava Ibadan 213 genotypes
 Predict GEBVs for the selection
population
 Check selection response
 Download GEBVs
Exercise: selection candidates list
 Create training data set & build model
 Go to a previous model page
 Create a selection candidates list
 Use search wizard to create accessions list
 Using the model predict GEBVs of the
list
 Check selection response
 Download GEBVs
Demo: Part III
 Trait search
 Search for ‘fresh root weight’
 Select trial ‘cassava ibadan 2002/03’
 Check model output
Demo: Part III
 PCA using accessions list
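To give a feel for what a PCA on an accessions list shows, the sketch below computes first-principal-component scores from marker dosages using power iteration. It is a stdlib-only illustration with a hypothetical function name, not the routine solGS runs; a real analysis would use a numerical library and more than one component.

```python
from statistics import mean


def first_pc_scores(geno, iters=200):
    """geno: list of rows of marker dosages (one row per accession).

    Returns the PC1 score of each accession.
    """
    n, m = len(geno), len(geno[0])
    # Center each marker column.
    col_means = [mean(row[j] for row in geno) for j in range(m)]
    z = [[row[j] - col_means[j] for j in range(m)] for row in geno]
    # Power iteration on Z'Z finds the leading marker-space direction.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iters):
        s = [sum(a * b for a, b in zip(r, v)) for r in z]                # Z v
        v = [sum(z[i][j] * s[i] for i in range(n)) for j in range(m)]    # Z's
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    # Scores are the projections of the centered rows onto that direction.
    return [sum(a * b for a, b in zip(r, v)) for r in z]


# Usage: two genetically distinct groups of three accessions each.
scores = first_pc_scores([
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 1, 0],
    [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],
])
```

Accessions from the same group receive scores of the same sign, which is the kind of clustering a population-structure plot makes visible.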
Brainstorm for new features
Make priority list
What features do you like in BMS?
What features would you like to be added
to Cassavabase?
Thanks to…
SolGS workshop 2016
Composing a training population:
Fitting a prediction model...
3 options
Fitting a prediction model…
Option 1:
Search using a trait name
Estimating breeding values of
selection candidates
Applying the model…
Fitting a prediction model…
Option 2:
Search for trials
Estimating breeding values of
selection candidates for multiple
traits
Applying the models…
Estimating genetic correlations
Calculating selection indices
Fitting a prediction model…
Option 3:
Use your own list of individuals
To sum up…
 Store data
 Build prediction models
 Estimate breeding values
 Additional analyses:
 Correlation analysis
 Population structure
 Selection indices
 http://cassavabase.org/solgs
 Open source code
Thanks to…
Many thanks!!
Background image: nextgencassava.org

The GIS Capability Maturity Model (2013)The GIS Capability Maturity Model (2013)
The GIS Capability Maturity Model (2013)GregBabinski
 
Skin: Structure and function of the skin
Skin: Structure and function of the skinSkin: Structure and function of the skin
Skin: Structure and function of the skinheenarahangdale01
 

Recently uploaded (20)

Pests of Maize_Dr.UPR_Identification, Binomics, Integrated Pest Management
Pests of Maize_Dr.UPR_Identification, Binomics, Integrated Pest ManagementPests of Maize_Dr.UPR_Identification, Binomics, Integrated Pest Management
Pests of Maize_Dr.UPR_Identification, Binomics, Integrated Pest Management
 
Basics Of Computers | The Computer System
Basics Of Computers | The Computer SystemBasics Of Computers | The Computer System
Basics Of Computers | The Computer System
 
structure of proteins and its type I PPT
structure of proteins and its type I PPTstructure of proteins and its type I PPT
structure of proteins and its type I PPT
 
AKSHITA A R ECOLOGICAL NICHE and Gauss lawpptx
AKSHITA A R ECOLOGICAL NICHE and Gauss lawpptxAKSHITA A R ECOLOGICAL NICHE and Gauss lawpptx
AKSHITA A R ECOLOGICAL NICHE and Gauss lawpptx
 
Geometric New Earth, Solarsystem, projection
Geometric New Earth, Solarsystem, projectionGeometric New Earth, Solarsystem, projection
Geometric New Earth, Solarsystem, projection
 
Advance pharmacology presentation..............
Advance pharmacology presentation..............Advance pharmacology presentation..............
Advance pharmacology presentation..............
 
ROLE OF HERBS IN COSMETIC SKIN CARE: ALOE AND TURMERIC
ROLE OF HERBS IN COSMETIC SKIN CARE: ALOE AND TURMERICROLE OF HERBS IN COSMETIC SKIN CARE: ALOE AND TURMERIC
ROLE OF HERBS IN COSMETIC SKIN CARE: ALOE AND TURMERIC
 
Introduction about protein and General method of analysis of protein
Introduction about protein and General method of analysis of proteinIntroduction about protein and General method of analysis of protein
Introduction about protein and General method of analysis of protein
 
Cultivating various strains of Duckweed Syllabus.pdf
Cultivating various strains of Duckweed Syllabus.pdfCultivating various strains of Duckweed Syllabus.pdf
Cultivating various strains of Duckweed Syllabus.pdf
 
AI Published & MIT Validated Perpetual Motion Machine Breakthroughs (2 New EV...
AI Published & MIT Validated Perpetual Motion Machine Breakthroughs (2 New EV...AI Published & MIT Validated Perpetual Motion Machine Breakthroughs (2 New EV...
AI Published & MIT Validated Perpetual Motion Machine Breakthroughs (2 New EV...
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
 
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
 
Proof-of-Concept Publicly Accessible Data Dashboards from the US-EPA.pptx
Proof-of-Concept Publicly Accessible Data Dashboards from the US-EPA.pptxProof-of-Concept Publicly Accessible Data Dashboards from the US-EPA.pptx
Proof-of-Concept Publicly Accessible Data Dashboards from the US-EPA.pptx
 
layers of the earths atmosphere.ppt slides for grade 9
layers of the earths atmosphere.ppt slides for grade 9layers of the earths atmosphere.ppt slides for grade 9
layers of the earths atmosphere.ppt slides for grade 9
 
20240315 ACMJ Diagrams Set 2.docx . With light, motor, coloured light, and se...
20240315 ACMJ Diagrams Set 2.docx . With light, motor, coloured light, and se...20240315 ACMJ Diagrams Set 2.docx . With light, motor, coloured light, and se...
20240315 ACMJ Diagrams Set 2.docx . With light, motor, coloured light, and se...
 
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
 
Theory of indicators: Ostwald's and Quinonoid theories
Theory of indicators: Ostwald's and Quinonoid theoriesTheory of indicators: Ostwald's and Quinonoid theories
Theory of indicators: Ostwald's and Quinonoid theories
 
Development of a Questionnaire for Identifying Personal Values in Driving
Development of a Questionnaire for Identifying Personal Values in DrivingDevelopment of a Questionnaire for Identifying Personal Values in Driving
Development of a Questionnaire for Identifying Personal Values in Driving
 
The GIS Capability Maturity Model (2013)
The GIS Capability Maturity Model (2013)The GIS Capability Maturity Model (2013)
The GIS Capability Maturity Model (2013)
 
Skin: Structure and function of the skin
Skin: Structure and function of the skinSkin: Structure and function of the skin
Skin: Structure and function of the skin
 

SolGS workshop 2016

  • 2. Purposes: Gain an understanding of genomic selection, GS model building, breeding value prediction, and assessing the quality of model input and output data. Brainstorm ideas to make the tool better suit your research purposes.
  • 3. Outline: Overview; GS and solGS; Demo; Exercise, bug watching, feedback; Brainstorming
  • 4. [Diagram] Genomic selection…: Training population (phenotyped & genotyped individuals); Prediction model; Genotyped selection candidates; Predicted breeding values (GEBVs)
  • 5. GS advantages: Little or no phenotyping; Reduced cost; Shorter breeding cycles; Higher selection gain per unit time; Increased prediction accuracy
  • 6. [Diagram] Genomic selection…: Training population (phenotyped & genotyped individuals); Prediction model; Genotyped selection candidates; Predicted breeding values (GEBVs)
  • 7. Challenges: Data volume, storage; Data structuring, cleaning, imputation; Statistical analysis complexity; Visualization and sharing
  • 9. What you can do with solGS: Store data (Chado Natural Diversity schema); Create training dataset; Build models and predict breeding values of selection candidates; Test model accuracy
  • 10. What you can do with solGS: Explore phenotype data; Evaluate population structure; Check the relationship between GEBVs and observed phenotypes; Calculate selection indices, correlation; Visualize data on interactive plots; Calculate selection response
  • 11. What is the statistical approach behind solGS?
  • 12. …preparing phenotype data: Omits individuals with completely missing phenotype values; Adjusts phenotype values for block effects; Averages across multiple trials after adjusting for block effects
  • 13. …preparing genotype data: Removes monomorphic markers; Removes markers with > 60% missing values; Removes markers with MAF < 5%; Removes individuals with > 80% missing values; Imputes missing marker data (median substitution)
  • 14. …statistical modeling: Univariate; Two-stage analysis; RR-BLUP (Endelman, Plant Genome (2010)); GBLUP (marker-based realized relationship matrix); Prediction accuracy (based on 10-fold cross-validation)
  • 15. How does solGS work?
  • 16. Websites for exercise: Cassava-devel.sgn.cornell.edu; Cassava-test.sgn.cornell.edu; Review.cassavabase.org; Cassavabase.org; https://iita-mirror.cassavabase.org; https://172.30.2.199; Username: sgn; Password: eggplant
  • 17. [Diagram] Genomic selection steps…: Training dataset (phenotyped & genotyped individuals); Prediction model; Selection candidates; Predicted breeding values (GEBVs)
  • 18. Demo: Part I. Create training data set & build model; Explore model input and output; Phenotype and genetic correlation; Population structure; Selection index
  • 19. Things to consider when creating a training data set & building a model
  • 20. Things to consider…phenotype data: Number of phenotyped individuals (minimum 20 clones); Relevant to target environment; Data quality; Experimental design; Measurement accuracy; Missing values; Outliers
  • 21. Things to consider…genotype data: Marker number, genome distribution, polymorphism; Data quality; Allele calling accuracy; Missing values (per marker, per individual); Minor alleles; Heterozygosity; LD; Population structure
  • 23. Single trial – single trait: Create training data set and build model; Trial method; Search for trial ‘Cassava Ibadan 2002/03’; Create a training dataset with that trial; Description, correlation; Build a model for FRW; Explore model input and output; Model accuracy; Download GEBVs
  • 24. Exercise: single trial – single trait. Create training data set and build model; Search for your trial; Create a training dataset with that trial; Check description, correlation; Build a model for your trait; Explore model input and output; Population structure; Model accuracy; Download GEBVs
  • 25. Single trial – multiple traits: Create training data set and build models; Search for trial ‘Cassava Ibadan 2002/03’; Create a training dataset with that trial; Description, correlation; Build models for FRW and CMDS; Explore model input and output for each model; Genetic correlation; Selection index
  • 26. Exercise: single trial – multiple traits. Create training data set and build models; Search for your trial; Create a training dataset with that trial; Check description, correlation; Build models for two traits at the same time; Explore model input and output for each model; Genetic correlation; Calculate and download selection index
  • 27. Combined trials – single trait: Create training data set and build models using two trials; Search for ‘cassava ibadan 02/03 & 01/02’; Create a training dataset with the trials; Check description, correlation; Build a model for FRW; Explore model input and output for the model; Population structure; Prediction accuracy; Download GEBVs
  • 28. Exercise: combined trials – single trait. Create training data set and build models using two trials; Search for your trials; Create a training dataset with the trials; Check description, correlation; Build a model for your trait; Explore model input and output for the model; Population structure; Prediction accuracy; Download GEBVs
  • 29. Using list – single trait: Create training data set and build a model using a plots list; Using the search wizard, create a plots list from trial ‘cassava ibadan 2002/03 plots’; Create a training dataset with the list; Check description, correlation; Build a model for FRW; Explore model input and output for the model; Population structure; Prediction accuracy; Download GEBVs
  • 30. Exercise: Using list – single trait. Create training data set and build a model using a plots list; Using the search wizard, create a plots list from a trial (select all plots); Create a training dataset with the list; Check description, correlation; Build a model for your trait; Explore model input and output for the model; Population structure; Prediction accuracy; Download GEBVs
  • 31. Demo: Part II. Predict breeding values of selection populations; Genetic correlation; Selection index; Selection gain
  • 32. Things to consider when applying a model to predict breeding values of selection populations
  • 33. Things to consider…applying the model: Genetic relationship between training population and selection population; Target environment; Marker types used; Population structure
  • 34. Predict GEBVs of a selection population: Create training data set & build model (Cassava Ibadan 2002/03, FRW); Search for a selection population (Cassava Ibadan 2003/04); Predict GEBVs for the selection population; Check selection response; Download GEBVs
  • 35. Exercise: Selection population prediction. Create training data set & build model (use one of the models you already built); Search for a selection population (related to the training population); Predict GEBVs for the selection population; Check selection response; Download GEBVs
  • 36. Multiple traits: Predict GEBVs of a selection population. Create training data set & build model (Cassava Ibadan 2002/03; FRW, CMDS); Search for a selection population (Cassava Ibadan 2003/04); Predict GEBVs for both traits for the selection population; Check selection response; Download GEBVs
  • 37. Exercise: Multiple traits selection population prediction. Create training data set & build model (use previous two models from your training populations); Search for a selection population; Predict GEBVs for both traits for the selection population; Check genetic correlation; Calculate selection index
  • 38. List: Predict GEBVs of a selection population. Create training data set & build model (Cassava Ibadan 2002/03, FRW); Search for a selection candidates list (Cassava Ibadan 213 genotypes); Predict GEBVs for the selection population; Check selection response; Download GEBVs
  • 39. Exercise: selection candidates list. Create training data set & build model (go to a previous model page); Create a selection candidates list (use search wizard to create accessions list); Using the model, predict GEBVs of the list; Check selection response; Download GEBVs
  • 40. Demo: Part III. Trait search; Search for ‘fresh root weight’; Select trial ‘cassava ibadan 2002/03’; Check model output
  • 41. Demo: Part III. PCA using accessions list
  • 42. Brainstorm for new features: Make priority list. What features do you like in BMS? What features would you like to be added in cassavabase?
  • 46. Composing a training population: Fitting a prediction model... 3 options
  • 48. Fitting a prediction model… Option 1: Search using a trait name
  • 60. Estimating breeding values of selection candidates Applying the model…
  • 63. Fitting a prediction model… Option 2: Search for trials
  • 70. Estimating breeding values of selection candidates for multiple traits. Applying the models…
  • 77. Fitting a prediction model… Option 3: use your own list of individuals
  • 80. To sum up: Store data; Build prediction models; Estimate breeding values; Additional analyses (correlation analysis, population structure, selection indices); http://cassavabase.org/solgs; Open source code
  • 83. Many thanks!! Background image: nextgencassava.org
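
The genotype-preparation rules listed on slide 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not solGS's actual code: the function name filter_and_impute, the 0/1/2 allele-dosage coding with None for missing calls, and the order in which the filters are applied are all hypothetical. Only the thresholds (60%, 5%, 80%) and the median substitution come from the slide.

```python
from statistics import median


def filter_and_impute(geno, max_marker_missing=0.60, min_maf=0.05,
                      max_ind_missing=0.80):
    """Filter markers and individuals, then impute by marker median.

    geno maps an individual's name to a list of allele dosages
    (0, 1 or 2, with None for a missing call). Hypothetical layout.
    Returns (filtered genotypes, indices of the markers that were kept).
    """
    names = list(geno)
    n_markers = len(geno[names[0]]) if names else 0

    keep_markers = []
    for j in range(n_markers):
        column = [geno[name][j] for name in names]
        observed = [g for g in column if g is not None]
        missing_rate = 1 - len(observed) / len(column)
        if not observed or missing_rate > max_marker_missing:
            continue  # too many missing values for this marker
        if len(set(observed)) == 1:
            continue  # monomorphic marker
        freq = sum(observed) / (2 * len(observed))
        if min(freq, 1 - freq) < min_maf:
            continue  # minor allele frequency below threshold
        keep_markers.append(j)

    if not keep_markers:
        return {}, []

    kept = {}
    for name in names:
        row = [geno[name][j] for j in keep_markers]
        missing_rate = sum(g is None for g in row) / len(row)
        if missing_rate <= max_ind_missing:
            kept[name] = row

    # Median substitution, marker by marker, over the retained individuals.
    for k in range(len(keep_markers)):
        observed = [row[k] for row in kept.values() if row[k] is not None]
        fill = median(observed) if observed else 0
        for row in kept.values():
            if row[k] is None:
                row[k] = fill
    return kept, keep_markers
```

Calling filter_and_impute on a small dictionary of individuals returns the cleaned genotype table together with the indices of the markers that survived, so the same marker set can later be applied to selection candidates.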

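The modeling step on slide 14 (ridge-regression BLUP scored by cross-validation) can be illustrated with a small, dependency-free sketch. Several things here are assumptions rather than anything stated in the slides: the shrinkage parameter lam is fixed by hand (RR-BLUP software normally estimates it from variance components), markers are centred on training-set means, accuracy is taken as the Pearson correlation between predicted and observed values in each held-out fold, and all function names are hypothetical.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ridge(Z, y, lam):
    """Marker effects u from (X'X + lam I) u = X'(y - mu), X = centred Z."""
    p = len(Z[0])
    mu = mean(y)
    col_means = [mean(row[j] for row in Z) for j in range(p)]
    X = [[row[j] - col_means[j] for j in range(p)] for row in Z]
    lhs = [[sum(r[j] * r[k] for r in X) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * (yi - mu) for r, yi in zip(X, y)) for j in range(p)]
    return mu, col_means, solve(lhs, rhs)


def predict(Z, mu, col_means, u):
    """Predicted value = mean + sum of centred marker scores x effects."""
    return [mu + sum((z - c) * e for z, c, e in zip(row, col_means, u))
            for row in Z]


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validate(Z, y, lam=1.0, folds=10, seed=0):
    """Mean held-out correlation over k folds (10 by default)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for f in range(folds):
        test = idx[f::folds]
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        mu, col_means, u = fit_ridge([Z[i] for i in train],
                                     [y[i] for i in train], lam)
        pred = predict([Z[i] for i in test], mu, col_means, u)
        accuracies.append(pearson(pred, [y[i] for i in test]))
    return mean(accuracies)
```

Here Z is a list of marker-score rows (one per individual) and y the matching list of phenotypes; cross_validate(Z, y) returns a single accuracy figure, while fit_ridge followed by predict gives predicted values for genotyped candidates without phenotypes.
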
Editor's Notes

  1. Advantages: no phenotyping, which means less cost and shorter time; gain per unit time is higher.