Palmira, October 2018
Mueller lab
Cassavabase-PhenoApps demo
Guillaume Bauchet, Bryan Ellerbrock, David Lyon, Naama
Menda, Nicolas Morales, Alex C. Ogbonna, Adrian Powell, Titima
Tantikanjana and Isaak Y Tecle
Mueller lab
- Bioinformatics
- Genomics
- Databases
https://btiscience.org/lukas-mueller/#lab-members
https://www.facebook.com/solgenomics/
NEXTGEN CASSAVA
Ensuring Data Quality
• Integrated electronic data capture using the
Android Fieldbook and other tablet-based
solutions
• Digital data never “leaks” into “analog”
domain
• Widely used barcoding ensures data
collection quality
• Quality filtering upon upload
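As an illustration of what quality filtering at upload time could look like, here is a minimal Python sketch. The field names (`plot`, `trait`, `value`), the set of known plots and the allowed value ranges are all hypothetical; they are not Cassavabase's actual rules.

```python
# Hypothetical limits per trait; real limits would come from the trait ontology.
TRAIT_RANGES = {
    "dry matter content percentage": (0.0, 100.0),
    "cassava mosaic disease severity 3-month evaluation": (1.0, 5.0),
}


def filter_observations(rows, known_plots, trait_ranges=TRAIT_RANGES):
    """Split uploaded rows into accepted rows and (row, reason) rejections."""
    accepted, rejected = [], []
    for row in rows:
        plot, trait, raw = row["plot"], row["trait"], row["value"]
        if plot not in known_plots:
            rejected.append((row, "unknown plot"))
            continue
        if trait not in trait_ranges:
            rejected.append((row, "unknown trait"))
            continue
        try:
            value = float(raw)
        except (TypeError, ValueError):
            rejected.append((row, "non-numeric value"))
            continue
        low, high = trait_ranges[trait]
        if not low <= value <= high:
            rejected.append((row, "value out of range"))
            continue
        # Keep the row, with the value converted to a number.
        accepted.append({**row, "value": value})
    return accepted, rejected
```

Returning the rejected rows together with a reason lets the uploader fix the file instead of silently losing observations.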
Cassavabase Status
• Has collected breeding data from all NextGen programs
• 9.7 million phenotypic observations
• 2488 trials
• 34,000 genotypes
• From phase I (2012-2017) to phase II (2018-2022):
• Increase data collection and ensure quality
• Increase database interoperability and expand the “digital ecosystem” to
farmers
Project Partners
https://github.com/solgenomics
https://yambase.org
https://sweetpotatobase.org
https://musabase.org
https://cassavabase.org
Expanding resources: BrAPI
• Breeding Application Programming Interface
(API)
• Language support: Brapi.R interface and Brapi.JS
• Data exchange
• New way for coding breeding applications
(BrAPPs)
• BrAPPs run on any data backend that supports
BrAPI
cassavabase
BMS
GOBii
Flapjack
Germinate
B4R
= Empower breeder’s toolbox to increase genetic gain
https://github.com/CIP-RIU/brapi
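To make the idea of a BrAPP talking to any BrAPI backend concrete, here is a small Python sketch of a paginated BrAPI client. It assumes the BrAPI v1 response envelope (records under `result.data`, paging under `metadata.pagination`). The helper names `http_get_json` and `brapi_pages` are hypothetical, and which calls a given server supports depends on its BrAPI version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL and decode its JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def brapi_pages(base_url, call, params=None, fetch=http_get_json, page_size=100):
    """Yield every record from a paginated BrAPI call.

    Assumes the BrAPI v1 envelope: records in result.data and the page
    count in metadata.pagination.totalPages.
    """
    page = 0
    while True:
        query = dict(params or {}, page=page, pageSize=page_size)
        url = f"{base_url.rstrip('/')}/brapi/v1/{call}?{urlencode(query)}"
        body = fetch(url)
        yield from body["result"]["data"]
        total_pages = body["metadata"]["pagination"]["totalPages"]
        page += 1
        if page >= total_pages:
            break
```

A call such as `brapi_pages("https://cassava-test.sgn.cornell.edu", "studies-search")` would then iterate over studies, provided the server implements that call. Because only the base URL changes, the same loop could be pointed at another BrAPI-compliant backend.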
Tool Example:
Genotype Visualization
Today’s Demo Content
3.00-4.00: Breeding Data Management, Sample
Tracking and FieldApp (Guillaume)
4.00-5.00: Data analysis (Isaak)
Database training websites
https://cassava-test.sgn.cornell.edu/
1: Generic password:
User: sgn
Password: eggplant
2: Personal password:
User: breeder1, breeder2, …, breeder10
Password: ISTRC18
Account Privileges
Account Type    Privileges
none            Browse, use tools
“user”          User database, forum
“submitter”     Create trials, add phenotype information, etc.
“curator”       All previous + data deletions
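The account types listed above can be read as an ordered hierarchy. The following Python sketch assumes that privileges are cumulative at every level (the source only states this for curators), and the action names are illustrative rather than taken from the Cassavabase code.

```python
# Account types ordered from least to most privileged.
ROLE_ORDER = ["none", "user", "submitter", "curator"]

# Lowest account type needed for each action (action names are illustrative).
REQUIRED_ROLE = {
    "browse": "none",
    "use_tools": "none",
    "post_to_forum": "user",
    "create_trial": "submitter",
    "add_phenotypes": "submitter",
    "delete_data": "curator",
}


def can(role, action):
    """Return True if `role` is at least as privileged as `action` requires."""
    return ROLE_ORDER.index(role) >= ROLE_ORDER.index(REQUIRED_ROLE[action])
```

For example, `can("submitter", "create_trial")` is true while `can("submitter", "delete_data")` is false.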
Database pipeline and tools: the big picture
Pipeline components:
• Create New Trial
• Fieldbook files Creation
• Collect Data & samples
• Import and Setup Trial in PhenoApps
• Upload Phenotypes
• Historical data Uploads
• Phenotype/Genotype Analysis
• Crossings & Nursery
• Search/Download accessions-seeds
• Select / add accessions
+ Trait ontologies
+ SNP marker data
Manage
-> List
-> Dataset
Manage data collection
-> barcode tools
-> label design
-> phenoApps
Pedigree & Crossing
-> cross upload
-> seedlots
-> phenoApps
Search tools
-> Single criteria
-> Wizard search
Analysis
-> Selection Index
-> Summary statistics
-> Graphical filtering
-> ANOVA, HIDAP
-> Trial comparison
-> Genomic selection
Workflows
Phenotype data collection workflow
Workflows
Tissue sampling data collection workflow
Tutorial:
https://www.slideshare.net/solgenomics/sample-tracking-tutorialistrc2018
Access data from cassavabase
https://cassava-test.sgn.cornell.edu/breeders/search
-1- Select the “2018_NGCGOBII_Gstrialdataset”
-2- Click “select all”
-3- Select “traits” such as:
fresh root yield|CO_334:0000013
top yield|CO_334:0000017
root number counting|CO_334:0000011
harvest index variable|CO_334:0000015
dry matter content by specific gravity method|CO_334:0000160
dry matter content percentage|CO_334:0000092
cassava mosaic disease severity 1-month evaluation|CO_334:0000191
cassava mosaic disease severity 3-month evaluation|CO_334:0000192
cassava mosaic disease severity 6-month evaluation|CO_334:0000194
-4- Download data in Excel
-5- Store your selection as a dataset. It will be stored under your profile on cassavabase and can be re-accessed anytime (same as a list)
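The trait names in the steps above combine a readable name and an ontology term, separated by `|`. When working with downloaded data in a script, it can help to split the two. The Python sketch below assumes that trait columns in the download carry the same `name|CO_334:ID` labels; the function names are hypothetical.

```python
def parse_trait(label):
    """Split 'name|CO_334:0000013' into the trait name and its ontology ID."""
    name, _, term_id = label.rpartition("|")
    if not name or ":" not in term_id:
        raise ValueError(f"not a 'name|PREFIX:ID' trait label: {label!r}")
    ontology, accession = term_id.split(":", 1)
    return {"name": name.strip(), "ontology": ontology, "accession": accession}


def trait_columns(header):
    """Map column index to parsed trait for header cells that look like traits."""
    found = {}
    for index, cell in enumerate(header):
        try:
            found[index] = parse_trait(cell)
        except ValueError:
            pass  # not a trait column (for example a plot or accession name)
    return found
```

Matching on the ontology accession rather than the free-text name keeps scripts working if a trait is renamed.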
• Exploratory
• Descriptive statistics
• Interactive visualization
• Pairwise multiple comparison
• Inferential
• ANOVA, correlation, population structure, clustering
• Genomic Prediction
• QTL analysis…coming soon
• GWAS…coming soon
• Efficiency
• Automation
• Reproducibility
• Access and sharing
Explore trial data
Filter interactively
Compare traits across trials
Analyze data
Check traits correlation
Run ANOVA
Calculate selection index
Check population structure (PCA)…
Partition samples into groups (clusters)
GWAS
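For the "Check traits correlation" step listed above, the underlying calculation can be sketched in a few lines of Python. This is a plain Pearson correlation that skips plots with a missing value; it is an illustration of the statistic, not the code the database runs.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two traits, skipping pairs with a missing (None) value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a trait with zero variance has no defined correlation")
    return sxy / math.sqrt(sxx * syy)
```

Each list holds one trait's values in the same plot order, for example fresh root yield and dry matter content for the plots of a trial.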
Genomic Prediction (solGS) workflow
Genomic selection…
Training population (phenotyped & genotyped individuals) -> Prediction model
Genotyped selection candidates -> Prediction model -> Predicted breeding values (GEBVs)
Prediction modeling
• Univariate
• Two-stage analysis
• GBLUP
• Marker-based realized relationship matrix
• Prediction accuracy
• Based on 2 replications of 10-fold cross-validation
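The cross-validation scheme named above can be sketched as follows. The Python function only produces the training/validation splits; fitting the GBLUP model is left out. Prediction accuracy is commonly taken as the correlation between predicted and observed values in each validation fold, averaged over all folds and replications, but that detail is an assumption here and not stated on the slide. The function name and the `seed` parameter are hypothetical.

```python
import random


def replicated_kfold(sample_ids, folds=10, replications=2, seed=0):
    """Yield (replication, fold, training_ids, validation_ids) tuples.

    Sample IDs are assumed to be unique. Each replication reshuffles the
    samples, so the fold membership differs between replications.
    """
    rng = random.Random(seed)
    for rep in range(replications):
        shuffled = list(sample_ids)
        rng.shuffle(shuffled)
        for fold in range(folds):
            # Every `folds`-th sample, starting at `fold`, is held out.
            validation = shuffled[fold::folds]
            held_out = set(validation)
            training = [s for s in shuffled if s not in held_out]
            yield rep, fold, training, validation
```

With the defaults this gives 2 x 10 = 20 model fits, and within a replication every individual is predicted exactly once.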
Creating a training dataset
Fitting a prediction model
Exploring model input
Checking the model
Exploring model output (GEBVs)
Estimating breeding values of selection candidates
Applying the model…
Selection gain?
Genetic correlation
GEBV-based multi-trait selection: selection index
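One common way to build a multi-trait selection index is a weighted sum of standardized GEBVs. The Python sketch below illustrates that idea; the slides do not give the exact formula the tool uses, so the standardization and the weighting scheme are assumptions, and the function name is hypothetical.

```python
import statistics


def selection_index(gebvs, weights):
    """Rank genotypes by a weighted sum of standardized GEBVs.

    gebvs:   {trait: {genotype: value}}
    weights: {trait: relative weight}, negative to select against a trait
    """
    if not weights:
        raise ValueError("at least one weighted trait is required")
    # Only genotypes with a GEBV for every weighted trait can be ranked.
    genotypes = set.intersection(*(set(gebvs[t]) for t in weights))
    index = dict.fromkeys(genotypes, 0.0)
    for trait, weight in weights.items():
        values = [gebvs[trait][g] for g in genotypes]
        mean, sd = statistics.mean(values), statistics.pstdev(values)
        for g in genotypes:
            z = (gebvs[trait][g] - mean) / sd if sd else 0.0
            index[g] += weight * z
    return sorted(index.items(), key=lambda item: item[1], reverse=True)
```

Standardizing each trait first keeps traits measured on large scales, such as yield, from dominating traits scored on short scales, such as disease severity.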
Summary
• Exploratory and inferential analysis
• Interactive visualization
• Adds efficiency, reproducibility
• Easy access and sharing
Contact us!
Contact SGN team: https://cassavabase.org/contact/form
User manual (online): https://solgenomics.github.io/sgn/
Online Resources
Looking for database tutorials or ontology request?
Request new traits: http://submit.rtbbase.org/
Slides: http://www.slideshare.net/solgenomics
Looking for phenoApps?
PhenoApps: https://github.com/PhenoApps
https://www.youtube.com/playlist?list=PLs7Y2nGwfz4E5_gv1H6Y4imeWDkFJDhIn
Looking for code?
Cassavabase code: https://github.com/solgenomics
BrAPI code: https://brapi.org/
BrApps: https://brapi.org/brapps.php
Thanks!
Lukas Mueller
Alex Ogbonna
Bryan Ellerbrock
Naama Menda
Isaak Tecle
Nick Morales
Chiedozie Egesi
Peter Kulakow
Robert Kawuki
Ismail Rabbi
Prasad Peteti
Afola Agbona
Titima Tantikanjana
Hernan Ceballos
Eder Oliveira

Cassavabase-PhenoApps demo ISTRC 2018