SlideShare a Scribd company logo
1 of 17
Download to read offline
Predicting Phenotype from Multi-Scale Genomic
and Environment Data using Neural Networks
and Knowledge Graphs: An Introduction to the
NSF GenoPhenoEnvo Project
Anne E Thessen, Michael Behrisch, Emily J Cain, Remco Chang, Bryan Heidorn,
Pankaj Jaiswal, David LeBauer, Ab Mosca, Monica C Munoz-Torres, Arun Ross,
Tyson Swetnam
Acknowledgements
● NSF Ideas Lab
● NSF Award 1940330 Harnessing
the Data Revolution
● Translational and Integrative
Sciences Lab OSU (tislab.org)
● Two new members: Ishita Debnath
(MSU) and Ryan Bartelme (UA)
Predicting Phenotype from Genes and Environments
● G + E = P only works
in the simplest
systems, if at all
● There’s a lot we
don’t know about
how genes are
translated into
phenotypes
● How do phenotypes
affect the
ecosystem?
How can we predict phenotype given an organism’s environmental conditions and
genomic endowment?
● State-of-the-art statistical
modeling has led to many
insights, but has been applied
to very controlled systems.
● Getting the phenotype is only
part of the answer.
● Can we use the predictive
model to reveal hidden
processes? Critical variables?
Machine Learning for Results and Process
● Pros
○ Capable of coping with non-linearity
in biological systems
○ Find hidden relationships
● Cons
○ Output is opaque
○ Not enough curated data not available
○ Data that are available not “ML ready”
GenoPhenoEnvo Project
● Goal: Develop a machine learning framework capable of predicting
phenotypes based on multi-scale data about genes and environments.
○ Leverage existing, well-structured, cross-species reference data about genes and phenotypes
○ Provide interactive data visualizations for examining and interpreting the “black-box” behavior
of ML models and their results
○ Realize a new model for relating phenotypes, genetic endowment, and environmental
characteristics
● Just started Oct 1
● Now in Year 1 Q2
Training Data
Year 1
● TERRA-REF sorghum
● Heavily controlled and measured
environment data
● Thorough genotyping and
phenotyping
● Knowledge graph links
● How do we prepare these data to
be ML ready?
Year 2
● TERRA-REF wheat
● NEON, EOS
● Citizen science phenology
What is a Knowledge Graph?
How Can Knowledge Graphs Help?
DOI: 10.1126/scitranslmed.3009262
How Can Knowledge Graphs Help?
How Can Knowledge Graphs Help?
● Constrain ML and prioritize results
● Quality control - sanity check
● Integrate heterogeneous data
○ Manage terminology
○ Manage scale and granularity
● Find new relationships
● Fill in data gaps with inferencing
Training Data - Genomic
● Use VCF files and phenotype data from
TERRA-REF
● SNP to phenotype associations with p values and
effect sizes (GWAS)
● Manhattan plot for all phenotype data
List 1: TERRA-REF data for Y1
Gene Information (G)
● Sorghum whole genome
● Sorghum genotypes[219]
Phenotype Information (P)
● Emergence Date
● End of Season Biomass
● End of Season Height
● Flowering Date
Environment Information (E)
● Soil moisture
● Air temperature
● PAR (Irradiance)
● Wind speed
● Humidity
● Precipitation and Irrigation
● Fertilizer inputs
By M. Kamran Ikram et al - Ikram MK et al (2010) Four Novel Loci (19q13, 6q24, 12q24, and
5q14) Influence the Microcirculation In Vivo. PLoS Genet. 2010 Oct 28;6(10):e1001184.
doi:10.1371/journal.pgen.1001184.g001, CC BY 2.5,
https://commons.wikimedia.org/w/index.php?curid=18056138
*example
GWAS results can be combined
with the knowledge graph results to
reduce input variables for ML
Training Data - Phenomic
● Days to flowering
● Growing Degree Days to flowering
● Days to flag leaf emergence
● Growing Degree Days to flag leaf emergence
● Canopy height
● End of season canopy height
● Above ground dry biomass at harvest
List 1: TERRA-REF data for Y1
Gene Information (G)
● Sorghum whole genome
● Sorghum genotypes[219]
Phenotype Information (P)
● Emergence Date
● End of Season Biomass
● End of Season Height
● Flowering Date
Environment Information (E)
● Soil moisture
● Air temperature
● PAR (Irradiance)
● Wind speed
● Humidity
● Precipitation and Irrigation
● Fertilizer inputs
Phenotypes can be
represented as
categories or
integers (10 cm or
decreased height?)
Training Data - Environmental
● Air temperature
● Relative humidity
● Precipitation
● Wind speed and direction
● Growing degree days
● Cumulative precipitation
List 1: TERRA-REF data for Y1
Gene Information (G)
● Sorghum whole genome
● Sorghum genotypes[219]
Phenotype Information (P)
● Emergence Date
● End of Season Biomass
● End of Season Height
● Flowering Date
Environment Information (E)
● Soil moisture
● Air temperature
● PAR (Irradiance)
● Wind speed
● Humidity
● Precipitation and Irrigation
● Fertilizer inputs
Data from weather station and gantry
Abstracted to daily average, min, and max
Machine Learning - Preliminary
1. Regression Models
2. Simple Neural Networks
3. Deep Neural Networks
Year 2 - Expanding to Ecosystems (Preliminary)
Leverage data and resources from
multiple NSF supported programs:
● Analyze genetic and remote
sensing data from NEON
● Utilize the CyVerse Data Science
Workbench
● XSEDE for ML computations
List 2: Observational & EOS data for Y2
Gene Information (G)
● Cottonwood whole genome
● Cottonwood genotypes
● Sorghum whole genome
● Sorghum genotypes
● Wheat whole genome
● Wheat genotypes
Environment Information (E)
● Soil moisture (e.g., SMAP)
● Precipitation (Daymet)
● Air temperature (Daymet)
● PAR (NARR)
● Soil Type (USDA)
Phenotype Information (P)
● Leafing date
● Flowering date
● Breaking leaf buds (#)
● Ripe fruits (#)
● Increasing leaf size (Y/N)
● Falling leaves (Y/N)
● Colored leaves (Y/N)
● Flowers or buds (#)
● Open flowers (#)
● Pollen release (Y/N)
● Recent fruit/seed drop
(Y/N)
● Fruits (#)
● NDVI (EOS)
GenoPhenoEnvo Project Information
● Join our Google Group
● Watch our GitHub Repo
github.com/genophenoenvo
● Search Twitter hashtag
#GenoPhenoEnvo
● Visit the project web page
● Anne E Thessen
● annethessen@gmail.com
Questions?

More Related Content

Similar to Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project

Open Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisOpen Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisAntica Culina
 
Phenomics in crop improvement
Phenomics in crop  improvementPhenomics in crop  improvement
Phenomics in crop improvementsukruthaa
 
My presentation at ICEM 2017: From data mining to information extraction: usi...
My presentation at ICEM 2017: From data mining to information extraction: usi...My presentation at ICEM 2017: From data mining to information extraction: usi...
My presentation at ICEM 2017: From data mining to information extraction: usi...matteodefelice
 
Affordable field high-throughput phenotyping - some tips
Affordable field high-throughput phenotyping - some tipsAffordable field high-throughput phenotyping - some tips
Affordable field high-throughput phenotyping - some tipsCIMMYT
 
Perth ausplots presentation_070616_internet_qu
Perth ausplots presentation_070616_internet_quPerth ausplots presentation_070616_internet_qu
Perth ausplots presentation_070616_internet_qubensparrowau
 
Kim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxKim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxgrssieee
 
New predictive characterization methods for accessing and using crop wild rel...
New predictive characterization methods for accessing and using crop wild rel...New predictive characterization methods for accessing and using crop wild rel...
New predictive characterization methods for accessing and using crop wild rel...Bioversity International
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticEdizonJambormias2
 
EcoTas13 BradEvans e-MAST
EcoTas13 BradEvans e-MASTEcoTas13 BradEvans e-MAST
EcoTas13 BradEvans e-MASTTERN Australia
 
agriculture decision support system
agriculture decision support systemagriculture decision support system
agriculture decision support systemPankaj Khodifad
 
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.ppt
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.pptJianqiang Ren_Simulation of regional winter wheat yield by EPIC model.ppt
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.pptgrssieee
 
Baseline study for EIA
Baseline study for EIABaseline study for EIA
Baseline study for EIAJenson Samraj
 
Drought Assessment + Impacts: A Preview
Drought Assessment + Impacts: A PreviewDrought Assessment + Impacts: A Preview
Drought Assessment + Impacts: A PreviewJenkins Macedo
 
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...John Blue
 
General ausplots school
General ausplots schoolGeneral ausplots school
General ausplots schoolbensparrowau
 
Topological Data Analysis What is it? What is it good for? How can it be use...
Topological Data Analysis  What is it? What is it good for? How can it be use...Topological Data Analysis  What is it? What is it good for? How can it be use...
Topological Data Analysis What is it? What is it good for? How can it be use...DanChitwood
 

Similar to Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project (20)

Open Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysisOpen Science and Ecological meta-anlaysis
Open Science and Ecological meta-anlaysis
 
Phenomics in crop improvement
Phenomics in crop  improvementPhenomics in crop  improvement
Phenomics in crop improvement
 
My presentation at ICEM 2017: From data mining to information extraction: usi...
My presentation at ICEM 2017: From data mining to information extraction: usi...My presentation at ICEM 2017: From data mining to information extraction: usi...
My presentation at ICEM 2017: From data mining to information extraction: usi...
 
Affordable field high-throughput phenotyping - some tips
Affordable field high-throughput phenotyping - some tipsAffordable field high-throughput phenotyping - some tips
Affordable field high-throughput phenotyping - some tips
 
NTT
NTTNTT
NTT
 
Perth ausplots presentation_070616_internet_qu
Perth ausplots presentation_070616_internet_quPerth ausplots presentation_070616_internet_qu
Perth ausplots presentation_070616_internet_qu
 
Undergraduate Modeling Workshop - Vegetation Working Group Final Presentatio...
Undergraduate Modeling Workshop -  Vegetation Working Group Final Presentatio...Undergraduate Modeling Workshop -  Vegetation Working Group Final Presentatio...
Undergraduate Modeling Workshop - Vegetation Working Group Final Presentatio...
 
Imprint of nutrient availability on WUE and Ts- α relationship
Imprint of nutrient availability on WUE and Ts- α relationshipImprint of nutrient availability on WUE and Ts- α relationship
Imprint of nutrient availability on WUE and Ts- α relationship
 
Kim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptxKim_WE3_T05_2.pptx
Kim_WE3_T05_2.pptx
 
New predictive characterization methods for accessing and using crop wild rel...
New predictive characterization methods for accessing and using crop wild rel...New predictive characterization methods for accessing and using crop wild rel...
New predictive characterization methods for accessing and using crop wild rel...
 
Newproject
NewprojectNewproject
Newproject
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogenetic
 
EcoTas13 BradEvans e-MAST
EcoTas13 BradEvans e-MASTEcoTas13 BradEvans e-MAST
EcoTas13 BradEvans e-MAST
 
agriculture decision support system
agriculture decision support systemagriculture decision support system
agriculture decision support system
 
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.ppt
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.pptJianqiang Ren_Simulation of regional winter wheat yield by EPIC model.ppt
Jianqiang Ren_Simulation of regional winter wheat yield by EPIC model.ppt
 
Baseline study for EIA
Baseline study for EIABaseline study for EIA
Baseline study for EIA
 
Drought Assessment + Impacts: A Preview
Drought Assessment + Impacts: A PreviewDrought Assessment + Impacts: A Preview
Drought Assessment + Impacts: A Preview
 
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...
Dr. Andres Perez - PRRS Epidemiology: Best Principles of Control at a Regiona...
 
General ausplots school
General ausplots schoolGeneral ausplots school
General ausplots school
 
Topological Data Analysis What is it? What is it good for? How can it be use...
Topological Data Analysis  What is it? What is it good for? How can it be use...Topological Data Analysis  What is it? What is it good for? How can it be use...
Topological Data Analysis What is it? What is it good for? How can it be use...
 

More from Anne Thessen

Unifying Genomics, Phenomics, and Environments
Unifying Genomics, Phenomics, and EnvironmentsUnifying Genomics, Phenomics, and Environments
Unifying Genomics, Phenomics, and EnvironmentsAnne Thessen
 
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...Anne Thessen
 
Bridging discrepancies across North American butterfly naming authorities: Su...
Bridging discrepancies across North American butterfly naming authorities: Su...Bridging discrepancies across North American butterfly naming authorities: Su...
Bridging discrepancies across North American butterfly naming authorities: Su...Anne Thessen
 
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...Anne Thessen
 
Next-Gen Taxonomic Descriptions for Microbial Eukaryotes
Next-Gen Taxonomic Descriptions for Microbial EukaryotesNext-Gen Taxonomic Descriptions for Microbial Eukaryotes
Next-Gen Taxonomic Descriptions for Microbial EukaryotesAnne Thessen
 
Linking biodiversity data for ecology
Linking biodiversity data for ecologyLinking biodiversity data for ecology
Linking biodiversity data for ecologyAnne Thessen
 
Data Infrastructure for Coastal and Estuarine Science
Data Infrastructure for Coastal and Estuarine ScienceData Infrastructure for Coastal and Estuarine Science
Data Infrastructure for Coastal and Estuarine ScienceAnne Thessen
 
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...Anne Thessen
 
Knowledge extraction from the Encyclopedia of Life using Python NLTK
Knowledge extraction from the Encyclopedia of Life using Python NLTKKnowledge extraction from the Encyclopedia of Life using Python NLTK
Knowledge extraction from the Encyclopedia of Life using Python NLTKAnne Thessen
 
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...Marrying models and data: Adventures in Modeling, Data Wrangling and Software...
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...Anne Thessen
 
Visualizing Evolution
Visualizing EvolutionVisualizing Evolution
Visualizing EvolutionAnne Thessen
 
The Future of Microalgal Taxonomy
The Future of Microalgal TaxonomyThe Future of Microalgal Taxonomy
The Future of Microalgal TaxonomyAnne Thessen
 
Knowledge Extraction and Semantic Linking in the Encyclopedia of Life
Knowledge Extraction and Semantic Linking in the Encyclopedia of LifeKnowledge Extraction and Semantic Linking in the Encyclopedia of Life
Knowledge Extraction and Semantic Linking in the Encyclopedia of LifeAnne Thessen
 

More from Anne Thessen (13)

Unifying Genomics, Phenomics, and Environments
Unifying Genomics, Phenomics, and EnvironmentsUnifying Genomics, Phenomics, and Environments
Unifying Genomics, Phenomics, and Environments
 
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...
Combining Phenomes and Genomes to Fill Analytical Gaps: Data Management in Ph...
 
Bridging discrepancies across North American butterfly naming authorities: Su...
Bridging discrepancies across North American butterfly naming authorities: Su...Bridging discrepancies across North American butterfly naming authorities: Su...
Bridging discrepancies across North American butterfly naming authorities: Su...
 
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...
Ontological Support of Data Discovery and Synthesis in Estuarine and Coastal ...
 
Next-Gen Taxonomic Descriptions for Microbial Eukaryotes
Next-Gen Taxonomic Descriptions for Microbial EukaryotesNext-Gen Taxonomic Descriptions for Microbial Eukaryotes
Next-Gen Taxonomic Descriptions for Microbial Eukaryotes
 
Linking biodiversity data for ecology
Linking biodiversity data for ecologyLinking biodiversity data for ecology
Linking biodiversity data for ecology
 
Data Infrastructure for Coastal and Estuarine Science
Data Infrastructure for Coastal and Estuarine ScienceData Infrastructure for Coastal and Estuarine Science
Data Infrastructure for Coastal and Estuarine Science
 
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...
Gulf of Mexico Hydrocarbon Database: Integrating Heterogeneous Data for Impro...
 
Knowledge extraction from the Encyclopedia of Life using Python NLTK
Knowledge extraction from the Encyclopedia of Life using Python NLTKKnowledge extraction from the Encyclopedia of Life using Python NLTK
Knowledge extraction from the Encyclopedia of Life using Python NLTK
 
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...Marrying models and data: Adventures in Modeling, Data Wrangling and Software...
Marrying models and data: Adventures in Modeling, Data Wrangling and Software...
 
Visualizing Evolution
Visualizing EvolutionVisualizing Evolution
Visualizing Evolution
 
The Future of Microalgal Taxonomy
The Future of Microalgal TaxonomyThe Future of Microalgal Taxonomy
The Future of Microalgal Taxonomy
 
Knowledge Extraction and Semantic Linking in the Encyclopedia of Life
Knowledge Extraction and Semantic Linking in the Encyclopedia of LifeKnowledge Extraction and Semantic Linking in the Encyclopedia of Life
Knowledge Extraction and Semantic Linking in the Encyclopedia of Life
 

Recently uploaded

Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 

Recently uploaded (20)

Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 

Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project

  • 1. Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project Anne E Thessen, Michael Behrisch, Emily J Cain, Remco Chang, Bryan Heidorn, Pankaj Jaiswal, David LeBauer, Ab Mosca, Monica C Munoz-Torres, Arun Ross, Tyson Swetnam
  • 2. Acknowledgements ● NSF Ideas Lab ● NSF Award 1940330 Harnessing the Data Revolution ● Translational and Integrative Sciences Lab OSU (tislab.org) ● Two new members: Ishita Debnath (MSU) and Ryan Bartelme (UA)
  • 3. Predicting Phenotype from Genes and Environments ● G + E = P only works in the simplest systems, if at all ● There’s a lot we don’t know about how genes are translated into phenotypes ● How do phenotypes affect the ecosystem?
  • 4. How can we predict phenotype given an organism’s environmental conditions and genomic endowment? ● State-of-the-art statistical modeling has led to many insights, but has been applied to very controlled systems. ● Getting the phenotype is only part of the answer. ● Can we use the predictive model to reveal hidden processes? Critical variables?
  • 5. Machine Learning for Results and Process ● Pros ○ Capable of coping with non-linearity in biological systems ○ Find hidden relationships ● Cons ○ Output is opaque ○ Not enough curated data not available ○ Data that are available not “ML ready”
  • 6. GenoPhenoEnvo Project ● Goal: Develop a machine learning framework capable of predicting phenotypes based on multi-scale data about genes and environments. ○ Leverage existing, well-structured, cross-species reference data about genes and phenotypes ○ Provide interactive data visualizations for examining and interpreting the “black-box” behavior of ML models and their results ○ Realize a new model for relating phenotypes, genetic endowment, and environmental characteristics ● Just started Oct 1 ● Now in Year 1 Q2
  • 7. Training Data Year 1 ● TERRA-REF sorghum ● Heavily controlled and measured environment data ● Thorough genotyping and phenotyping ● Knowledge graph links ● How do we prepare these data to be ML ready? Year 2 ● TERRA-REF wheat ● NEON, EOS ● Citizen science phenology
  • 8. What is a Knowledge Graph?
  • 9. How Can Knowledge Graphs Help? DOI: 10.1126/scitranslmed.3009262
  • 10. How Can Knowledge Graphs Help?
  • 11. How Can Knowledge Graphs Help? ● Constrain ML and prioritize results ● Quality control - sanity check ● Integrate heterogeneous data ○ Manage terminology ○ Manage scale and granularity ● Find new relationships ● Fill in data gaps with inferencing
  • 12. Training Data - Genomic ● Use VCF files and phenotype data from TERRA-REF ● SNP to phenotype associations with p values and effect sizes (GWAS) ● Manhattan plot for all phenotype data List 1: TERRA-REF data for Y1 Gene Information (G) ● Sorghum whole genome ● Sorghum genotypes[219] Phenotype Information (P) ● Emergence Date ● End of Season Biomass ● End of Season Height ● Flowering Date Environment Information (E) ● Soil moisture ● Air temperature ● PAR (Irradiance) ● Wind speed ● Humidity ● Precipitation and Irrigation ● Fertilizer inputs By M. Kamran Ikram et al - Ikram MK et al (2010) Four Novel Loci (19q13, 6q24, 12q24, and 5q14) Influence the Microcirculation In Vivo. PLoS Genet. 2010 Oct 28;6(10):e1001184. doi:10.1371/journal.pgen.1001184.g001, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=18056138 *example GWAS results can be combined with the knowledge graph results to reduce input variables for ML
  • 13. Training Data - Phenomic ● Days to flowering ● Growing Degree Days to flowering ● Days to flag leaf emergence ● Growing Degree Days to flag leaf emergence ● Canopy height ● End of season canopy height ● Above ground dry biomass at harvest List 1: TERRA-REF data for Y1 Gene Information (G) ● Sorghum whole genome ● Sorghum genotypes[219] Phenotype Information (P) ● Emergence Date ● End of Season Biomass ● End of Season Height ● Flowering Date Environment Information (E) ● Soil moisture ● Air temperature ● PAR (Irradiance) ● Wind speed ● Humidity ● Precipitation and Irrigation ● Fertilizer inputs Phenotypes can be represented as categories or integers (10 cm or decreased height?)
  • 14. Training Data - Environmental ● Air temperature ● Relative humidity ● Precipitation ● Wind speed and direction ● Growing degree days ● Cumulative precipitation List 1: TERRA-REF data for Y1 Gene Information (G) ● Sorghum whole genome ● Sorghum genotypes[219] Phenotype Information (P) ● Emergence Date ● End of Season Biomass ● End of Season Height ● Flowering Date Environment Information (E) ● Soil moisture ● Air temperature ● PAR (Irradiance) ● Wind speed ● Humidity ● Precipitation and Irrigation ● Fertilizer inputs Data from weather station and gantry Abstracted to daily average, min, and max
  • 15. Machine Learning - Preliminary 1. Regression Models 2. Simple Neural Networks 3. Deep Neural Networks
  • 16. Year 2 - Expanding to Ecosystems (Preliminary) Leverage data and resources from multiple NSF supported programs: ● Analyze genetic and remote sensing data from NEON ● Utilize the CyVerse Data Science Workbench ● XSEDE for ML computations List 2: Observational & EOS data for Y2 Gene Information (G) ● Cottonwood whole genome ● Cottonwood genotypes ● Sorghum whole genome ● Sorghum genotypes ● Wheat whole genome ● Wheat genotypes Environment Information (E) ● Soil moisture (e.g., SMAP) ● Precipitation (Daymet) ● Air temperature (Daymet) ● PAR (NARR) ● Soil Type (USDA) Phenotype Information (P) ● Leafing date ● Flowering date ● Breaking leaf buds (#) ● Ripe fruits (#) ● Increasing leaf size (Y/N) ● Falling leaves (Y/N) ● Colored leaves (Y/N) ● Flowers or buds (#) ● Open flowers (#) ● Pollen release (Y/N) ● Recent fruit/seed drop (Y/N) ● Fruits (#) ● NDVI (EOS)
  • 17. GenoPhenoEnvo Project Information ● Join our Google Group ● Watch our GitHub Repo github.com/genophenoenvo ● Search Twitter hashtag #GenoPhenoEnvo ● Visit the project web page ● Anne E Thessen ● annethessen@gmail.com Questions?