---Internal Use---
Artificial Intelligence in Life Sciences
and Agriculture
Yannick Djoumbou Feunang - Corteva Agriscience,
Indianapolis, IN
Big Data Summit 2020
Research Park UIUC
Disclaimer: The views and information expressed in this presentation are
solely those of the author and do not reflect those of Corteva Agriscience in any way.
A Typical Drug Development Process – From Targets to Products
• A very long, costly, and tedious process with an overall failure rate of over 96% (Hingorani
et al., 2019)
• Similar process for crop protection discovery
Challenges in Drug/Pesticide Development
• Cost and time pressure:
➢Avg. $1.3B and 12 years from discovery to launch in pharma (2009-2018)
➢Avg. $268M and 11.3 years from discovery to launch in Agrosciences (2010-2014)
➢High attrition rates (>96% failure)
• Greater need for productivity and sustainability
➢Increase in population, decrease in agricultural land
➢More (human, animal, crop) diseases, increased resistance
➢Environmental concerns
• High complexity and adaptability of biological systems:
➢Cannot always be simulated using first-principle (physics-based) models
▪ E.g. Solvation and flexibility of protein chains are complex phenomena
➢Also require the use of applicable, data-driven models built on large-scale data
• Despite recent efforts, relevant data is still only available at low scale
What is Artificial Intelligence (AI)?
“Artificial Intelligence is the theory and
development of computer systems able to perform
tasks that normally require human intelligence,
such as visual perception, speech recognition,
decision-making, and translation between
languages.” OED
It utilizes systems and software tools that can parse, interpret
and learn from the input data to make independent decisions
for accomplishing specific objectives
The Global AI Market In Pharma and Agrosciences Industries
• The main applications of AI in pharma include drug discovery, precision medicine, medical imaging
& diagnostics research
➢ In January 2020, Exscientia advanced the first AI-generated drug into clinical trials after only 12 months
• The main applications of AI in agriculture include data generation (sensors and imaging), crop
productivity (machine learning), and robotics
deloitte.com/insights
marketsandmarkets.com/
Key Factors To The Increasing Adoption of AI
• Increased data generation rate and availability
• Increasing computational resources: exponential increase in computing power, decrease in computing costs
(Charts: data generation and computing trends; axes in floating-point operations per second (FLOPS) and USD. Source: deloitte.com/insights)
Key Factors To The Increasing Adoption of AI
• AI and data analytics algorithms are
more mature to efficiently handle
big data:
➢Cognitive search
➢Big data visualization
➢Predictions and Forecasting
➢Synthesis automation
• AI has been part of life sciences for
several decades
towardsdatascience.com
Evolution of AI in Molecular Design (some key points)
• 1950s – 1960s: First wave
➢Semantic processing, logical reasoning, man-machine interactions
➢Modern Quantitative Structure-Activity Relationship (QSAR) practices (Hansch et al., 1962)
• 1970 – early 2010s: Second wave
➢Progress in AI-related mathematical modelling, chemical pattern recognition, auto-generation
of molecular fragments
➢Automation first enters the pharma industry (1980s)
➢First instances of constructive ML in pharma (solves problem, learns from experience - 1990s)
➢Combinatorial Chemistry
➢Emergence of several electronic chemical, biological, spectral databases (2000s)
➢Significant applications of ML in chemical safety, synthesis, ADMET, etc.
• Mid-2010s – present: Third wave
➢Deep Neural Networks are first used in QSAR (Ma et al., 2015)
➢Significant progress in Deep/Reinforcement Learning
➢Increasing architectural hardware specialization (GPU, TPU, large-scale parallel computing)
➢Exponential data generation
Molecule Discovery Process: Iterative Design-Make-Test-Analyze Cycles
Stages: Hit & Lead Generation, Lead Optimization
• Start with an idea for a molecule: (1) inspiration, (2) data mining, (3) publications
• Design and make new molecules
• Test molecules
• Analyze: learn and build hypotheses
• Iterative cycles: many iterations until product goals are met
The DMTA Cycle is the fundamental discovery process in all life sciences such as
Drug Discovery, Pesticide Discovery, Material Sciences, Formulation Development, etc.
AI Driven Design-Make-Test-Analyze Cycle
Iterative cycles: design, make, test, analyze
• Design
➢AI designs new molecules (de novo, GANs)
➢Enumerate and search billions of available molecules
➢Text and patent mining, and Natural Language Processing
• Make
➢AI predicts routes to make a molecule
➢Retrosynthesis tools
➢Automated lab (or CRO) makes it
• Test
➢Automated testing
➢Image and video analysis for grading
• Analyze
➢Automated predictive models and hypothesis generation: AI and ML
➢Global and local models to predict activities, properties, toxicity, environmental impact
➢Predict metabolites, pro-drugs
➢3D: modeling + QSAR
AI for Computer-aided Synthesis Planning, and Metabolism Prediction
• Computer-Aided Synthesis Planning (CASP):
➢Given a molecule, (how) can it be synthesized starting from available molecules?
➢Introduced by Corey and Wipke in the 1960s (LHASA, rule-based)
➢Most systems implement rule-based, or machine-learning approaches
➢Applicable in process/synthetic chemistry for yield optimization, chemical safety,
green chemistry
• Prediction of Metabolism and Degradation
➢Given a molecule, (how) is it transformed by enzymes into metabolites?
➢Introduced by Wipke in the 1980s (XENO, rule-based)
➢Most systems implement rule-based, and/or machine-learning approaches
➢Applicable in lead generation/optimization, regulatory science, environmental
science
Data Collection, Processing, and Consumption
• Publication in the scientific literature or
corporate electronic lab notebooks (ELNs)
➢ Data must be extracted, curated,
transformed, aggregated, and integrated
before any search or download
➢ Reaction data mining can be used to address
in-depth questions, and modelling tasks
• Corporate ELN data can be extracted, and
annotated before being loaded into a DataMart
• Reaction data can be used for high-
performance search, data mining, data-driven
modelling
Engkvist et al. 2018
(Supervised) ML Model Building and Validation - Workflow
• Building ML models for chemistry
➢ Featurization can involve computation of
molecular properties, automated extraction of
significant patterns, etc.
• Validating ML models for chemistry
➢ Train/test split can be based on: random
selection, time-split, chemical scaffold, etc.
Strieth-Kalthoff et al. (2020)
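The train/test split strategies listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the record fields (`id`, `scaffold`, `date`), the scaffold labels, and the function names are hypothetical, and the scaffold of each molecule is assumed to have been computed beforehand by a cheminformatics toolkit.

```python
from collections import defaultdict


def time_split(records, cutoff):
    """Train on records dated before `cutoff`, test on the rest (ISO date strings)."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def scaffold_split(records, test_fraction=0.2):
    """Keep every scaffold group entirely in train or entirely in test."""
    groups = defaultdict(list)
    for r in records:
        groups[r["scaffold"]].append(r)
    # Largest groups are placed in train first, so rarer scaffolds end up in test.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n_train_target = (1 - test_fraction) * len(records)
    train, test = [], []
    for group in ordered:
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


# Hypothetical toy records; scaffold labels are placeholders, not computed values.
records = [
    {"id": "m1", "scaffold": "benzene", "date": "2018-03-01"},
    {"id": "m2", "scaffold": "benzene", "date": "2018-07-15"},
    {"id": "m3", "scaffold": "benzene", "date": "2019-01-10"},
    {"id": "m4", "scaffold": "pyridine", "date": "2019-05-02"},
    {"id": "m5", "scaffold": "pyridine", "date": "2019-11-20"},
    {"id": "m6", "scaffold": "indole", "date": "2020-02-08"},
    {"id": "m7", "scaffold": "indole", "date": "2018-09-09"},
    {"id": "m8", "scaffold": "furan", "date": "2020-06-30"},
    {"id": "m9", "scaffold": "thiophene", "date": "2020-08-12"},
    {"id": "m10", "scaffold": "benzene", "date": "2020-01-05"},
]

s_train, s_test = scaffold_split(records, test_fraction=0.2)
t_train, t_test = time_split(records, cutoff="2020-01-01")
```

A random split would let close analogues of a test molecule appear in training, which tends to give optimistic scores; the scaffold and time splits are usually chosen to probe how a model behaves on chemistry it has not seen.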
AI for Computer-Aided Synthesis Planning (CASP)
• Start from the target molecule:
➢ Identify possible retrosynthetic disconnections, and precursor molecules
➢ Repeat the process until you get to a set of precursors that are all available
➢ This requires a module for single-step retrosynthesis, a search algorithm, and a list of available precursors
• Additionally, given a set of reactants:
➢ Predict best reaction conditions (e.g. temperature, catalysts, solvent) for optimal yield
➢ Predict forward reaction to identify major products, and possible side products
• Examples of CASP tools: ML/DL-based (e.g. ASKCOS, AiZynthFinder), and rule-based (e.g. Synthia)
Struble et al. 2020
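The recursive procedure described above (disconnect the target, then repeat until every precursor is available) can be sketched as a depth-limited search. This is a toy illustration, not how any of the named tools is implemented: the `templates` dictionary stands in for the single-step retrosynthesis module, `available` for the list of purchasable precursors, and all molecule names are placeholders.

```python
def plan_route(target, templates, available, max_depth=5):
    """Return a list of (product, precursors) steps, or None if no route is found."""
    if target in available:
        return []  # nothing to synthesize
    if max_depth == 0:
        return None  # give up on this branch
    # Try each candidate disconnection of the target in turn.
    for precursors in templates.get(target, []):
        steps = []
        for precursor in precursors:
            sub_route = plan_route(precursor, templates, available, max_depth - 1)
            if sub_route is None:
                break  # this disconnection fails; try the next one
            steps.extend(sub_route)
        else:
            # Every precursor is available or can be made: record this step last.
            return steps + [(target, precursors)]
    return None


# Hypothetical single-step disconnections (product -> list of precursor sets).
templates = {
    "target": [("intermediate_A", "reagent_X"), ("intermediate_B", "reagent_Y")],
    "intermediate_A": [("unavailable_C",)],
    "intermediate_B": [("reagent_Z", "reagent_Y")],
}
available = {"reagent_X", "reagent_Y", "reagent_Z"}

route = plan_route("target", templates, available)
for product, precursors in route:
    print(" + ".join(precursors), "->", product)
```

The plain depth-first search is chosen here only for brevity; with a learned single-step model the branching factor is large, so practical planners rely on guided search strategies instead.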
AI for Computer-Aided Synthesis Planning (CASP)
ASKCOS: (Left) Green boxes mark purchasable compounds, and the blue box marks the root target compound
(branebrutinib); (Right) A selected molecule (top) with a single-step precursor (bottom)
Struble et al. 2020
AI for Metabolism and Degradation Prediction
• Requires:
➢ Module to predict (and rank) SoMs, enzyme-substrate selectivity, or reaction groups
➢ Library of reaction templates to apply or select (via prediction) from
➢ Modules are usually specific (chemical/enzyme classes), or comprehensive (whole species)
• Some tools include: ML/DL-based (Meteor Nexus, MetaTrans), Rule-based (MetabolExpert), Hybrid
(BioTransformer)
Example workflow (tolclofos-methyl; rats, mice): ML/DL prediction of substrate selectivity (Yes/No) and of sites of metabolism (SoMs), followed by rule application from the reaction templates (hydroxylation, desulfurization, O-dealkylation, epoxidation)
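The two requirements above (a module that scores sites of metabolism, and a library of reaction templates to apply) can be sketched as a rank-then-apply step. Everything here is a hypothetical stand-in: the site-type labels, the atom indices, the scores, and the threshold are made up for illustration, the scoring function replaces a trained ML/DL model, and the sketch does not reproduce the behavior of any tool named above.

```python
# Hypothetical template library: site type -> reaction to apply.
REACTION_TEMPLATES = {
    "aromatic_CH": "Hydroxylation",
    "P=S": "Desulfurization",
    "O-alkyl": "O-dealkylation",
    "C=C": "Epoxidation",
}


def predict_transformations(sites, score_site, threshold=0.5):
    """Rank candidate sites by score, then apply the matching template, if any."""
    ranked = sorted(sites, key=score_site, reverse=True)
    results = []
    for site in ranked:
        score = score_site(site)
        rule = REACTION_TEMPLATES.get(site["type"])
        # Keep a site only if it is predicted reactive and a template covers it.
        if score >= threshold and rule is not None:
            results.append((site["atom"], rule, score))
    return results


# Made-up candidate sites and scores standing in for a model's predictions.
sites = [
    {"atom": 1, "type": "P=S"},
    {"atom": 4, "type": "O-alkyl"},
    {"atom": 9, "type": "aromatic_CH"},
    {"atom": 12, "type": "C-Cl"},
]
toy_scores = {"P=S": 0.9, "O-alkyl": 0.7, "aromatic_CH": 0.3, "C-Cl": 0.8}

predicted = predict_transformations(sites, lambda s: toy_scores[s["type"]])
for atom, reaction, score in predicted:
    print(f"atom {atom}: {reaction} (score {score})")
```

Separating the scoring step from the template library mirrors the hybrid design mentioned above: the learned part decides where a reaction is likely, and the rules decide what product to draw.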
AI for Metabolism and Degradation Prediction
BioTransformer: Examples of predicted CYP450 metabolites for the pesticide Tolclofos (http://biotransformer.ca/)
AI for Quantitative Structure Activity Relationship (QSAR) Modelling
• QSAR models are classification or regression
models that use structural features of a molecule
to predict its activity (or a property)
➢Several interrelated activities can be predicted
with one model
• QSAR helps prioritize compounds for synthesis
and/or biological evaluation
➢It reduces large libraries (10^5 to 10^7) to much
smaller sets
➢It alleviates the high costs of experimental
screening
• QSAR tools can be used for both lead identification
and optimization
Neves et al., 2018
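To make the idea of predicting activity from structural features and then prioritizing compounds concrete, here is a minimal similarity-based sketch. It assumes each molecule has already been converted to a set of feature bits (a fingerprint), uses Tanimoto similarity with a k-nearest-neighbour average as a stand-in for a real QSAR model, and all names and numbers are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_activity(query, training, k=3):
    """Average the measured activities of the k most similar training compounds."""
    neighbours = sorted(training, key=lambda t: tanimoto(query, t[0]), reverse=True)[:k]
    return sum(activity for _, activity in neighbours) / len(neighbours)


def prioritize(library, training, top_n=2, k=3):
    """Score every library compound and keep the top_n by predicted activity."""
    scored = [(name, predict_activity(bits, training, k)) for name, bits in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Made-up training set: (fingerprint bits, measured activity).
training = [
    ({1, 2, 3, 4}, 8.0),
    ({1, 2, 3, 5}, 7.5),
    ({2, 3, 4, 6}, 7.0),
    ({7, 8, 9}, 4.0),
    ({7, 8, 10}, 4.5),
    ({8, 9, 10, 11}, 5.0),
]
# Made-up virtual library to be reduced to a short list.
library = {
    "cand_A": {1, 2, 3},
    "cand_B": {7, 8, 9, 10},
    "cand_C": {2, 3, 4, 5},
}

shortlist = prioritize(library, training, top_n=2)
print(shortlist)
```

In practice the same prioritization step would be run over a far larger library, with a validated regression or classification model in place of the neighbour average.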
Summary
• AI now more than ever impacts the Design-Make-Test-Analyze cycle of molecular design
➢It enables big data ingestion and exploitation, active learning, autonomous optimization, and rapid decision-making
➢A lot of room for exploration and innovation
• Yet, several challenges remain:
➢Lack of big, diversified, and relevant data
➢AI and digital transformation require cultural change
➢The AI market is desperate for talented AI experts
• A bright future is awaiting
➢Let’s embark on this amazing journey
THANK YOU FOR LISTENING

More Related Content

What's hot

Synthetic biology for pathway engineering
Synthetic biology for pathway engineeringSynthetic biology for pathway engineering
Synthetic biology for pathway engineering
Karthikeyan Rathinam
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
Denis C. Bauer
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceAlbert Orriols-Puig
 
Introduction to synthetic biology
Introduction to synthetic biology Introduction to synthetic biology
Introduction to synthetic biology
Gaurav Bohra
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
Pragya Pai
 
Bioentrepreneurship
BioentrepreneurshipBioentrepreneurship
Bioentrepreneurship
Ilika Kaushik
 
Ai for life sciences - are we ready
Ai for life sciences  - are we readyAi for life sciences  - are we ready
Ai for life sciences - are we ready
Jack C Crawford
 
Bioinformatics Software
Bioinformatics SoftwareBioinformatics Software
Bioinformatics Software
university of education,Lahore
 
Synthetic biology
Synthetic biologySynthetic biology
Synthetic biology
Vasyl Mykytyuk
 
Synthetic Biology
Synthetic BiologySynthetic Biology
Synthetic Biology
Robert Cormia
 
Ethical issues of genetic engineering
Ethical issues of genetic engineeringEthical issues of genetic engineering
Ethical issues of genetic engineering
Mirpur University of Science and Technology, Mirpur AJK
 
What is ai ?
What is ai ?What is ai ?
What is ai ?
Arunabh Mishra
 
Machine learning in biology
Machine learning in biologyMachine learning in biology
Machine learning in biology
Pranavathiyani G
 
Bioinformatics Projects And Applications
Bioinformatics Projects And ApplicationsBioinformatics Projects And Applications
Bioinformatics Projects And Applications
Dr. Paulsharma Chakravarthy
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
Mykola Dobrochynskyy
 
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
Bernard Marr
 
COMPUTATIONAL BIOLOGY
COMPUTATIONAL BIOLOGYCOMPUTATIONAL BIOLOGY
COMPUTATIONAL BIOLOGY
Krupali Gandhi
 
History of AI, Current Trends, Prospective Trajectories
History of AI, Current Trends, Prospective TrajectoriesHistory of AI, Current Trends, Prospective Trajectories
History of AI, Current Trends, Prospective Trajectories
Giovanni Sileno
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite Prediction
Yannick Djoumbou
 
Elements of Artificial Intelligence
Elements of Artificial IntelligenceElements of Artificial Intelligence
Elements of Artificial Intelligence
CloudStakes Technology
 

What's hot (20)

Synthetic biology for pathway engineering
Synthetic biology for pathway engineeringSynthetic biology for pathway engineering
Synthetic biology for pathway engineering
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligence
 
Introduction to synthetic biology
Introduction to synthetic biology Introduction to synthetic biology
Introduction to synthetic biology
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
 
Bioentrepreneurship
BioentrepreneurshipBioentrepreneurship
Bioentrepreneurship
 
Ai for life sciences - are we ready
Ai for life sciences  - are we readyAi for life sciences  - are we ready
Ai for life sciences - are we ready
 
Bioinformatics Software
Bioinformatics SoftwareBioinformatics Software
Bioinformatics Software
 
Synthetic biology
Synthetic biologySynthetic biology
Synthetic biology
 
Synthetic Biology
Synthetic BiologySynthetic Biology
Synthetic Biology
 
Ethical issues of genetic engineering
Ethical issues of genetic engineeringEthical issues of genetic engineering
Ethical issues of genetic engineering
 
What is ai ?
What is ai ?What is ai ?
What is ai ?
 
Machine learning in biology
Machine learning in biologyMachine learning in biology
Machine learning in biology
 
Bioinformatics Projects And Applications
Bioinformatics Projects And ApplicationsBioinformatics Projects And Applications
Bioinformatics Projects And Applications
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
 
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
5 Powerful Real World Examples Of How AI Is Being Used In Healthcare
 
COMPUTATIONAL BIOLOGY
COMPUTATIONAL BIOLOGYCOMPUTATIONAL BIOLOGY
COMPUTATIONAL BIOLOGY
 
History of AI, Current Trends, Prospective Trajectories
History of AI, Current Trends, Prospective TrajectoriesHistory of AI, Current Trends, Prospective Trajectories
History of AI, Current Trends, Prospective Trajectories
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite Prediction
 
Elements of Artificial Intelligence
Elements of Artificial IntelligenceElements of Artificial Intelligence
Elements of Artificial Intelligence
 

Similar to Artificial Intelligence in Life Sciences and Agriculture.

Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Bigfinite
 
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia alliance debates   analytics 15-09-2015 16.00Pistoia alliance debates   analytics 15-09-2015 16.00
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia Alliance
 
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
Sean Ekins
 
Artificial intelligence robotics and computational fluid dynamics
Artificial intelligence robotics and computational fluid dynamics Artificial intelligence robotics and computational fluid dynamics
Artificial intelligence robotics and computational fluid dynamics
Chandrakant Kharude
 
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Chief Analytics Officer Forum
 
Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Intel big data analytics in health and life sciences personalized medicine
Intel big data analytics in health and life sciences personalized medicineIntel big data analytics in health and life sciences personalized medicine
Intel big data analytics in health and life sciences personalized medicine
Ketan Paranjape
 
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
EMC
 
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET Journal
 
Considerations and challenges in building an end to-end microbiome workflow
Considerations and challenges in building an end to-end microbiome workflowConsiderations and challenges in building an end to-end microbiome workflow
Considerations and challenges in building an end to-end microbiome workflow
Eagle Genomics
 
Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2
Value Amplify Consulting
 
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsRamil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
GigaScience, BGI Hong Kong
 
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdfApplications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
ArunPrasad880048
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
Poo Kuan Hoong
 
Emerging Challenges for Artificial Intelligence in Medicinal Chemistry
Emerging Challenges for Artificial Intelligence in Medicinal ChemistryEmerging Challenges for Artificial Intelligence in Medicinal Chemistry
Emerging Challenges for Artificial Intelligence in Medicinal Chemistry
Ed Griffen
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
Kimberley Mitchell
 
Machine Learning in Modern Medicine with Erin LeDell at Stanford Med
Machine Learning in Modern Medicine with Erin LeDell at Stanford MedMachine Learning in Modern Medicine with Erin LeDell at Stanford Med
Machine Learning in Modern Medicine with Erin LeDell at Stanford Med
Sri Ambati
 
Operation research and its application
Operation research and its applicationOperation research and its application
Operation research and its application
priya sinha
 

Similar to Artificial Intelligence in Life Sciences and Agriculture. (20)

Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...
 
AI.pptx
AI.pptxAI.pptx
AI.pptx
 
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia alliance debates   analytics 15-09-2015 16.00Pistoia alliance debates   analytics 15-09-2015 16.00
Pistoia alliance debates analytics 15-09-2015 16.00
 
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
Assay Central: A New Approach to Compiling Big Data and Preparing Machine Lea...
 
Artificial intelligence robotics and computational fluid dynamics
Artificial intelligence robotics and computational fluid dynamics Artificial intelligence robotics and computational fluid dynamics
Artificial intelligence robotics and computational fluid dynamics
 
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
 
Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...
 
Intel big data analytics in health and life sciences personalized medicine
Intel big data analytics in health and life sciences personalized medicineIntel big data analytics in health and life sciences personalized medicine
Intel big data analytics in health and life sciences personalized medicine
 
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
Strata Rx 2013 - Data Driven Drugs: Predictive Models to Improve Product Qual...
 
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of Diabetes
 
Considerations and challenges in building an end to-end microbiome workflow
Considerations and challenges in building an end to-end microbiome workflowConsiderations and challenges in building an end to-end microbiome workflow
Considerations and challenges in building an end to-end microbiome workflow
 
Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2Machine Learning Impact on IoT - Part 2
Machine Learning Impact on IoT - Part 2
 
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsRamil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
 
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdfApplications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
Applications-of-AI-in-Drug-Discovery-and-Development-PreScouter.pdf
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
Emerging Challenges for Artificial Intelligence in Medicinal Chemistry
Emerging Challenges for Artificial Intelligence in Medicinal ChemistryEmerging Challenges for Artificial Intelligence in Medicinal Chemistry
Emerging Challenges for Artificial Intelligence in Medicinal Chemistry
 
Nesher Tech I-Corps@NIH 121014
Nesher Tech I-Corps@NIH 121014Nesher Tech I-Corps@NIH 121014
Nesher Tech I-Corps@NIH 121014
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
 
Machine Learning in Modern Medicine with Erin LeDell at Stanford Med
Machine Learning in Modern Medicine with Erin LeDell at Stanford MedMachine Learning in Modern Medicine with Erin LeDell at Stanford Med
Machine Learning in Modern Medicine with Erin LeDell at Stanford Med
 
Operation research and its application
Operation research and its applicationOperation research and its application
Operation research and its application
 

Recently uploaded

Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
Predicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfPredicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdf
binhminhvu04
 
plant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptxplant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptx
yusufzako14
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
Columbia Weather Systems
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
AADYARAJPANDEY1
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
subedisuryaofficial
 
insect morphology and physiology of insect
insect morphology and physiology of insectinsect morphology and physiology of insect
insect morphology and physiology of insect
anitaento25
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
justice-and-fairness-ethics with example
justice-and-fairness-ethics with examplejustice-and-fairness-ethics with example
justice-and-fairness-ethics with example
azzyixes
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
pablovgd
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
Large scale production of streptomycin.pptx
Large scale production of streptomycin.pptxLarge scale production of streptomycin.pptx
Large scale production of streptomycin.pptx
Cherry
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
Areesha Ahmad
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
ossaicprecious19
 

Artificial Intelligence in Life Sciences and Agriculture.

  • 1. ---Internal Use--- Artificial Intelligence in Life Sciences and Agriculture Yannick Djoumbou Feunang - Corteva Agriscience, Indianapolis, IN Big Data Summit 2020 Research Park UIUC
  • 2. Disclaimer: The views and information expressed in this presentation are solely those of the author and do not reflect those of Corteva Agriscience by any means.
  • 3. A Typical Drug Development Process – From Targets to Products • A very long, costly, and tedious process with an overall failure rate of over 96% (Hingorani et al., 2019) • Similar process for crop protection discovery
  • 4. Challenges in Drug/Pesticide Development • Cost and time pressure: ➢ Avg. $1.3B and 12 years from discovery to launch in pharma (2009-2018) ➢ Avg. $268M and 11.3 years from discovery to launch in agrosciences (2010-2014) ➢ High attrition rates (>96% failure) • Greater need for productivity and sustainability: ➢ Increase in population, decrease in agricultural land ➢ More (human, animal, crop) diseases, increased resistance ➢ Environmental concerns • High complexity and adaptability of biological systems: ➢ Cannot always be simulated using first-principles (physics-based) models ▪ E.g. solvation and flexibility of protein chains are complex phenomena ➢ Also require applicable models driven by large-scale data • Despite recent efforts, relevant data is still available only at small scale
  • 5. What is Artificial Intelligence (AI)? “Artificial Intelligence is the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.” (OED) • It uses systems and software tools that can parse, interpret, and learn from input data to make independent decisions toward specific objectives
  • 6. The Global AI Market in Pharma and Agrosciences Industries • The main applications of AI in pharma include drug discovery, precision medicine, medical imaging & diagnostics, and research ➢ In January 2020, Exscientia submitted the first AI-generated drug into clinical trials after only 12 months • The main applications of AI in agriculture include data generation (sensors and imaging), crop productivity (machine learning), and robotics • Sources: deloitte.com/insights, marketsandmarkets.com
  • 7. Key Factors To The Increasing Adoption of AI • Increased data generation rate and availability • Increasing computational resources: exponential increase in computing power (floating-point operations per second, FLOPS) and decrease in computing costs (USD) • Source: deloitte.com/insights
  • 8. Key Factors To The Increasing Adoption of AI • AI and data analytics algorithms are now mature enough to handle big data efficiently: ➢ Cognitive search ➢ Big data visualization ➢ Predictions and forecasting ➢ Synthesis automation • AI has been part of life sciences for several decades • Source: towardsdatascience.com
  • 9. Evolution of AI in Molecular Design (some key points) • 1950s – 1960s: First wave ➢ Semantic processing, logical reasoning, man-machine interactions ➢ Modern Quantitative Structure-Activity Relationship (QSAR) practices (Hansch et al., 1962) • 1970s – early 2010s: Second wave ➢ Progress in AI-related mathematical modelling, chemical pattern recognition, auto-generation of molecular fragments ➢ Automation first enters the pharma industry (1980s) ➢ First instances of constructive ML in pharma (solves problems, learns from experience; 1990s) ➢ Combinatorial chemistry ➢ Emergence of several electronic chemical, biological, and spectral databases (2000s) ➢ Significant applications of ML in chemical safety, synthesis, ADMET, etc. • Mid-2010s – present: Third wave ➢ Deep Neural Networks are first used in QSAR (Ma et al., 2015) ➢ Significant progress in Deep/Reinforcement Learning ➢ Increasing hardware specialization (GPU, TPU, large-scale parallel computing) ➢ Exponential data generation
  • 10. Molecule Discovery Process: Iterative Design-Make-Test-Analyze Cycles • Stages: Hit & Lead Generation, then Lead Optimization • Start with an idea for a molecule: 1. inspiration 2. data mining 3. publications • Iterative cycles: Design and Make new molecules; Test molecules; Analyze, learn, and build hypotheses; many iterations until product goals are met • The DMTA cycle is the fundamental discovery process in all life sciences such as Drug Discovery, Pesticide Discovery, Material Sciences, Formulation Development, etc.
  • 11. AI-Driven Design-Make-Test-Analyze Cycle (iterative cycles) • Design: ➢ AI designs new molecules (de novo, GANs) ➢ Enumerate and search billions of available molecules ➢ Text and patent mining and Natural Language Processing • Make: ➢ AI predicts routes to make the molecule ➢ Retrosynthesis tools ➢ Automated lab (or CRO) makes it • Test: ➢ Automated testing ➢ Image and video analysis for grading • Analyze: ➢ Automated predictive models and hypothesis generation (AI and ML) ➢ Global and local models to predict activities, properties, toxicity, environmental impact ➢ Predict metabolites, prodrugs ➢ 3D: modeling + QSAR
  • 12. AI for Computer-Aided Synthesis Planning and Metabolism Prediction • Computer-Aided Synthesis Planning (CASP): ➢ Given a molecule, (how) can it be synthesized starting from available molecules? ➢ Introduced by Corey and Wipke in the 1960s (LHASA, rule-based) ➢ Most systems implement rule-based or machine-learning approaches ➢ Applicable in process/synthetic chemistry for yield optimization, chemical safety, green chemistry • Prediction of Metabolism and Degradation: ➢ Given a molecule, (how) is it transformed by enzymes into metabolites? ➢ Introduced by Wipke in the 1980s (XENO, rule-based) ➢ Most systems implement rule-based and/or machine-learning approaches ➢ Applicable in lead generation/optimization, regulatory science, environmental science
  • 13. Data Collection, Processing, and Consumption • Data are published in the scientific literature or recorded in corporate electronic lab notebooks (ELNs) ➢ Data must be extracted, curated, transformed, aggregated, and integrated before any search or download ➢ Reaction data mining can be used to address in-depth questions and modelling tasks • Corporate ELN data can be extracted and annotated before being loaded into a DataMart • Reaction data can be used for high-performance search, data mining, and data-driven modelling (Engkvist et al., 2018)
  • 14. (Supervised) ML Model Building and Validation - Workflow • Building ML models for chemistry ➢ Featurization can involve computation of molecular properties, automated extraction of significant patterns, etc. • Validating ML models for chemistry ➢ The train/test split can be based on random selection, time split, chemical scaffold, etc. (Strieth-Kalthoff et al., 2020)
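The split strategies named on slide 14 can be illustrated with a minimal sketch. It assumes each compound already carries a precomputed scaffold key and a measurement year; the record layout, the function names, and the toy data are all hypothetical and are not taken from the talk or from Strieth-Kalthoff et al.

```python
from collections import defaultdict

# Hypothetical records: (compound_id, scaffold_key, year_measured).
# Scaffold keys are assumed to be precomputed by a cheminformatics toolkit.
records = [
    ("c1", "benzene", 2015),
    ("c2", "benzene", 2016),
    ("c3", "pyridine", 2017),
    ("c4", "pyridine", 2018),
    ("c5", "indole", 2019),
    ("c6", "indole", 2020),
]


def time_split(records, cutoff_year):
    """Train on compounds measured before the cutoff year, test on the rest."""
    train = [r for r in records if r[2] < cutoff_year]
    test = [r for r in records if r[2] >= cutoff_year]
    return train, test


def scaffold_split(records, test_fraction=0.3):
    """Keep each scaffold group entirely in train or entirely in test."""
    groups = defaultdict(list)
    for r in records:
        groups[r[1]].append(r)
    # Fill the training set with the largest groups first;
    # groups that no longer fit go to the test set.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n_train_target = len(records) * (1 - test_fraction)
    train, test = [], []
    for group in ordered:
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


if __name__ == "__main__":
    print(time_split(records, cutoff_year=2019))
    print(scaffold_split(records))
```

A scaffold split usually gives a more pessimistic, and arguably more realistic, performance estimate than a random split, because the test compounds are structurally unlike anything seen during training.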
  • 15. AI for Computer-Aided Synthesis Planning (CASP) • Start from the target molecule: ➢ Identify possible retrosynthetic disconnections and precursor molecules ➢ Repeat the process until you reach a set of precursors that are all available ➢ This requires a module for single-step retrosynthesis, a search algorithm, and a list of available precursors • Additionally, given a set of reactants: ➢ Predict the best reaction conditions (e.g. temperature, catalysts, solvent) for optimal yield ➢ Predict the forward reaction to identify major products and possible side products • Examples of CASP tools: ML/DL-based (e.g. ASKCOS, AiZynthFinder) and rule-based (e.g. Synthia) (Struble et al., 2020)
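The recursive procedure on slide 15 (disconnect, then repeat until every precursor is available) can be sketched as a depth-first search. This is a toy under stated assumptions: the lookup table stands in for a single-step retrosynthesis module, the molecule names and the catalogue are invented, and none of it reflects how ASKCOS, AiZynthFinder, or Synthia are implemented.

```python
# Hypothetical single-step retrosynthesis "module": maps a molecule name to
# alternative precursor sets. A real system would predict these with rules or ML.
SINGLE_STEP = {
    "target": [("intermediate_a", "reagent_x"), ("intermediate_b",)],
    "intermediate_a": [("building_block_1", "building_block_2")],
    "intermediate_b": [("unobtainable",)],
}

# Hypothetical catalogue of available (purchasable) precursors.
AVAILABLE = {"reagent_x", "building_block_1", "building_block_2"}


def plan_route(molecule, max_depth=5):
    """Depth-first search for a route.

    Returns a list of (product, precursors) steps, an empty list if the
    molecule is already available, or None if no route is found.
    """
    if molecule in AVAILABLE:
        return []
    if max_depth == 0:
        return None
    for precursors in SINGLE_STEP.get(molecule, []):
        route = [(molecule, precursors)]
        for precursor in precursors:
            sub_route = plan_route(precursor, max_depth - 1)
            if sub_route is None:
                break  # this disconnection fails, try the next one
            route.extend(sub_route)
        else:
            return route  # every precursor could be resolved
    return None


if __name__ == "__main__":
    for step in plan_route("target"):
        print(step)
```

Production tools typically replace the plain depth-first search with a guided search (for example Monte Carlo tree search) so that promising disconnections are explored first.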
  • 16. AI for Computer-Aided Synthesis Planning (CASP) • ASKCOS: (Left) Green boxes mark purchasable compounds and the blue box marks the root target compound (branebrutinib); (Right) a selected molecule (top) with a single-step precursor (bottom) (Struble et al., 2020)
  • 17. AI for Metabolism and Degradation Prediction • Requires: ➢ A module to predict (and rank) sites of metabolism (SoMs), enzyme-substrate selectivity, or reaction groups ➢ A library of reaction templates to apply or select from (via prediction) ➢ Modules are usually either specific (chemical/enzyme classes) or comprehensive (whole species) • Some tools include: ML/DL-based (Meteor Nexus, MetaTrans), rule-based (MetabolExpert), hybrid (BioTransformer) • Figure: workflow for Tolclofos-methyl (rats; mice): ML/DL prediction of substrate selectivity (yes/no) and of sites of metabolism (SoMs), followed by rule application from the reaction templates (hydroxylation, desulfurization, O-dealkylation, epoxidation)
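The two-stage workflow on slide 17 (score candidate sites, then apply a matching reaction template) can be sketched as follows. Everything here is an assumption made for illustration: the site types, the fixed scores that stand in for an ML/DL site-of-metabolism model, and the hand-annotated site list are invented, and the code does not reproduce BioTransformer or any other tool named on the slide.

```python
# Hypothetical reaction templates: which transformation applies to which site type.
REACTION_TEMPLATES = {
    "aromatic_methyl": "hydroxylation",
    "P=S": "desulfurization",
    "O-methyl": "O-dealkylation",
}

# Stand-in for an ML/DL site-of-metabolism model: fixed, made-up scores per site type.
TOY_SOM_SCORES = {
    "aromatic_methyl": 0.7,
    "P=S": 0.9,
    "O-methyl": 0.6,
    "aryl_chloride": 0.1,
}


def predict_metabolism(sites, threshold=0.5):
    """Rank sites by score, keep those at or above the threshold,
    and attach the reaction template that matches each site type."""
    ranked = sorted(
        sites,
        key=lambda s: TOY_SOM_SCORES.get(s["type"], 0.0),
        reverse=True,
    )
    predictions = []
    for site in ranked:
        score = TOY_SOM_SCORES.get(site["type"], 0.0)
        reaction = REACTION_TEMPLATES.get(site["type"])
        if score >= threshold and reaction is not None:
            predictions.append((site["atom"], reaction, score))
    return predictions


# Hand-annotated, simplified site list for an organophosphate-like substrate.
# Atom indices are arbitrary labels, not real atom numbering.
example_sites = [
    {"atom": 1, "type": "aromatic_methyl"},
    {"atom": 7, "type": "P=S"},
    {"atom": 9, "type": "O-methyl"},
    {"atom": 4, "type": "aryl_chloride"},
]

if __name__ == "__main__":
    for atom, reaction, score in predict_metabolism(example_sites):
        print(atom, reaction, score)
```

Separating the scoring step from the template step mirrors the hybrid design described on the slide: the learned component decides where a reaction is likely, and the rule component decides what product to draw.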
  • 18. AI for Metabolism and Degradation Prediction • BioTransformer: Examples of predicted CYP450 metabolites for the pesticide Tolclofos (http://biotransformer.ca/)
  • 19. AI for Quantitative Structure-Activity Relationship (QSAR) Modelling • QSAR models are classification or regression models that use structural features of a molecule to predict its activity (or a property) ➢ Several interrelated activities can be predicted with one model • QSAR helps prioritize compounds for synthesis and/or biological evaluation ➢ It reduces large libraries (10^5 to 10^7 compounds) to much smaller sets ➢ It alleviates the high costs of experimental screening • QSAR tools can be used for both lead identification and optimization (Neves et al., 2018)
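The prioritization idea on slide 19 can be shown with a deliberately simple QSAR-style model: a nearest-neighbour regressor over fingerprint bit sets, using Tanimoto similarity. The fingerprints, activities, and compound names are made up for illustration, and k-nearest neighbours is a stand-in chosen for brevity rather than a method cited in the talk.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query, training, k=2):
    """Predict activity as the mean over the k most similar training compounds."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    nearest = ranked[:k]
    return sum(activity for _, activity in nearest) / len(nearest)


def prioritize(library, training, top_n=1):
    """Rank library compounds by predicted activity and keep the top_n names."""
    scored = {name: knn_predict(fp, training) for name, fp in library.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]


# Made-up fingerprints (sets of bit indices) and activities, for illustration only.
training = [
    ({1, 2, 3, 4}, 8.0),
    ({1, 2, 3, 5}, 7.0),
    ({10, 11, 12}, 4.0),
    ({10, 11, 13}, 5.0),
]
library = {
    "cmpd_A": {1, 2, 3, 6},
    "cmpd_B": {10, 11, 14},
    "cmpd_C": {1, 2, 10, 11},
}

if __name__ == "__main__":
    print(prioritize(library, training, top_n=1))
```

The same ranking step scales to large virtual libraries: only the top-scoring fraction is sent on for synthesis or biological testing, which is where the cost saving mentioned on the slide comes from.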
  • 20. Summary • AI now, more than ever, impacts the Design-Make-Test-Analyze cycle of molecular design ➢ It enables big data ingestion and exploitation, active learning, autonomous optimization, and rapid decision-making ➢ A lot of room for exploration and innovation • Yet several challenges remain: ➢ Lack of big, diversified, and relevant data ➢ AI and digital transformation require cultural change ➢ The AI market is desperate for talented AI experts • A bright future awaits ➢ Let’s embark on this amazing journey
  • 21. THANK YOU FOR LISTENING