Research opportunities in
materials design using AI/ML
Anubhav Jain
Staff Scientist
Lawrence Berkeley National Laboratory
Acknowledgements:
Kristin Persson (Materials Project), Gerbrand Ceder (Literature Mining, A-lab)
New materials are the critical ingredients
for technological innovation
DOI:10.1007/s10853-020-04434-8
DOI:10.3390/en12142750
Current and future opportunities
for applying AI/ML to materials science
Simulation Literature Experiment
The Materials Project @LBNL uses
supercomputing to generate high-quality
data sets on materials properties
• The Materials Project (www.materialsproject.org)
• Free resource of calculated and contributed
materials properties
• >150,000 inorganic compounds
• >500,000 registered users
• Most popular database for downstream machine
learning (composition or structure → property)
Databases cited by machine learning studies
Training machine learning models to
calculate density functional theory energies
Merchant, A., Batzner, S., Schoenholz, S.S. et
al. Scaling deep learning for materials
discovery. Nature 624, 80–85 (2023).
https://doi.org/10.1038/s41586-023-06735-9
Jain A. Machine learning in materials research: developments over the last decade and challenges
for the future. ChemRxiv. 2024; doi:10.26434/chemrxiv-2024-x6spt
Such ML models can be used to
accelerate materials discovery efforts
1. Train ML models to predict adsorbate energy on surfaces
2. Use ML models to screen cost-effective
materials for Se(IV) electrocatalysis
3. Experimental demonstration of
improved performance (Wei Tong)
Example: Se oxyanion remediation in
wastewater using electrocatalysis
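The first two steps above (predict, then screen) can be sketched as a simple filter-and-rank routine. This is a minimal illustration under stated assumptions, not the workflow used in the study: the `Candidate` fields, the target window, the cost cap, and every numeric value are hypothetical placeholders, and a real screen would take the adsorption energies from a trained ML model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                       # hypothetical label, not a real material
    predicted_adsorption_ev: float  # stand-in for an ML-predicted adsorbate energy (eV)
    cost_usd_per_kg: float          # stand-in for a raw-material cost estimate


def screen(candidates, target_ev, tolerance_ev, max_cost):
    """Keep candidates near the target adsorption energy and under a cost cap,
    then rank them by distance from the target (closest first)."""
    kept = [
        c for c in candidates
        if abs(c.predicted_adsorption_ev - target_ev) <= tolerance_ev
        and c.cost_usd_per_kg <= max_cost
    ]
    return sorted(kept, key=lambda c: abs(c.predicted_adsorption_ev - target_ev))


# Placeholder values for illustration only; none are measured or computed data.
pool = [
    Candidate("catalyst_A", -0.95, 12.0),
    Candidate("catalyst_B", -0.40, 3.0),
    Candidate("catalyst_C", -1.10, 900.0),
    Candidate("catalyst_D", -1.02, 25.0),
]
shortlist = screen(pool, target_ev=-1.0, tolerance_ev=0.15, max_cost=100.0)
print([c.name for c in shortlist])
```

The shortlist would then go to step 3, experimental testing. Ranking by distance from a target energy is only one possible criterion; a volcano-type activity model or a multi-objective score could take its place.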
Current and future opportunities
for applying AI/ML to materials science
Simulation Literature Experiment
Large language models make it possible to
extract data from research literature at scale
Named Entity Recognition
• Custom machine learning models to
extract the most valuable materials-related
2019 (pre-LLMs):
Custom trained language models,
limited functionality
2022 (early LLMs):
Fine-tuning commercial
language models, good
functionality
2024 (current LLMs):
Commercial language models
give good functionality with
prompt-only on long text
https://doi.org/10.1021/acs.jcim.9b00470 https://doi.org/10.1038/s41467-024-45563-x
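The "prompt-only" approach in the 2024 entry can be sketched as follows. This is a minimal sketch under assumptions: `stub_llm` is a hypothetical stand-in for a commercial language model call (no real API is used), and the record schema (`material`, `property`, `value`, `unit`) and the example passage are invented for illustration.

```python
import json

PROMPT_TEMPLATE = (
    "Extract every material and its reported property from the passage below.\n"
    "Answer with a JSON list of objects with keys "
    '"material", "property", "value", "unit".\n\n'
    "Passage:\n{passage}\n"
)


def build_prompt(passage: str) -> str:
    """Insert the passage into the fixed instruction template."""
    return PROMPT_TEMPLATE.format(passage=passage)


def parse_records(raw_reply: str) -> list:
    """Parse the model reply, keeping only records that have every required key."""
    required = {"material", "property", "value", "unit"}
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [r for r in data if isinstance(r, dict) and required.issubset(r)]


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a commercial LLM call; returns a canned reply."""
    return json.dumps([
        {"material": "ExampleOxide", "property": "band gap", "value": 1.0, "unit": "eV"},
        {"material": "ExampleOxide"},  # incomplete record, dropped by the parser
    ])


passage = "ExampleOxide shows a band gap of 1.0 eV."  # invented sentence, not from a paper
records = parse_records(stub_llm(build_prompt(passage)))
print(records)
```

Validating the reply before storing it matters at scale, because a model may return malformed or partial records for some passages.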
Predicting synthesis outcomes using
literature-derived databases & ML
https://doi.org/10.1038/s41597-022-01317-2 (Ceder group)
https://doi.org/10.26434/chemrxiv-2024-ncjlp
1. Use the literature to generate large databases of
materials synthesis procedures and outcomes
2. Train ML models to predict synthesis
outcomes (here, Au nanoparticle shape)
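Step 2 above can be illustrated with a toy classifier. This is a sketch only: the k-nearest-neighbor method is swapped in for simplicity and is not claimed to be the model used in the cited work, the two features (temperature and reagent concentration) are assumed, and every row in `TRAINING` is an invented placeholder rather than literature data.

```python
from collections import Counter
import math

# Toy rows: (temperature_C, reagent_concentration_mM) -> reported shape.
# Every number and label here is an invented placeholder, not literature data.
TRAINING = [
    ((25.0, 0.10), "sphere"),
    ((30.0, 0.12), "sphere"),
    ((28.0, 0.08), "sphere"),
    ((80.0, 0.50), "rod"),
    ((85.0, 0.55), "rod"),
    ((78.0, 0.45), "rod"),
]


def scale(features, lo, hi):
    """Min-max scale each feature so no single unit dominates the distance."""
    return tuple((f - a) / (b - a) for f, a, b in zip(features, lo, hi))


def knn_predict(query, training, k=3):
    """Predict a shape label by majority vote among the k nearest training rows."""
    columns = list(zip(*(x for x, _ in training)))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    q = scale(query, lo, hi)
    nearest = sorted(
        training, key=lambda row: math.dist(q, scale(row[0], lo, hi))
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((82.0, 0.48), TRAINING))
```

With a literature-derived database, the rows would come from extracted synthesis procedures, and the feature set would be far richer than two numeric columns.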
Current and future opportunities
for applying AI/ML to materials science
Simulation Literature Experiment
Synthesis recipe
50 mg Li2CO3
80 mg MnO
20 mg TiO2
800 °C (air)
24 hours
[Diagram: 50 mg + 80 mg + 20 mg of precursors, 800 °C for 24 hours → final product (target: LiMnTiO4)]
There are no well-defined rules
for choosing the most effective
precursors and conditions
Experimental issues like
precursor melting, volatility, or
reactivity with the container
Initial experiments often
give zero target yield.
What to do next?
Making new materials is inherently slow and unpredictable
Even when you are successful, it is very time- and labor-intensive!
From the computer to the “A-lab” (video):
Szymanski, N. J. et al. An Autonomous Laboratory for the Accelerated Synthesis of Novel Materials. Nature 2023, 624 (7990), 86–91.
~40 compounds synthesized in 3 weeks via 350+ synthesis
attempts
Making 41 unknown-to-system chemical compositions in <3 weeks
is a major achievement
71% success
per target
37% success
per recipe
N.J. Szymanski, et al. Nature. 624 (2023).
Building a materials design system that merges
simulation, experiment, robotics, and ML
“A-lab”
Materials Project
NERSC
AI recipes
based on
“reading”
literature
Iterative AI
refines recipe
to synthesize
target phase
New materials can be
virtually pre-screened
with supercomputers
and AI (“Materials Project”)
Likely synthesis routes can be
predicted using text mining
Robotic equipment and AI (“A-lab”)
can conduct experiments
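The closed loop described on this slide (propose a recipe, run it, refine, repeat) can be sketched in a few lines. This is a minimal sketch under assumptions, not the A-lab implementation: `propose_initial`, `run_experiment`, and `refine` are hypothetical stand-ins for the text-mined recipe suggestions, the robotic synthesis and characterization, and the iterative AI step, and the yield function is invented so the example runs without hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    precursors: tuple
    temperature_c: int


def propose_initial(target):
    """Stand-in for a literature-derived ('text-mined') recipe suggestion."""
    return Recipe(("precursor_1", "precursor_2"), 700)


def run_experiment(recipe):
    """Stand-in for robotic synthesis plus characterization.
    Returns a made-up target yield in [0, 1] that peaks at 900 °C."""
    return max(0.0, 1.0 - abs(recipe.temperature_c - 900) / 400)


def refine(recipe):
    """Stand-in for the iterative AI step: here, simply raise the temperature."""
    return Recipe(recipe.precursors, recipe.temperature_c + 100)


def closed_loop(target, yield_goal=0.9, max_attempts=5):
    """Run experiments until the yield goal is met or the attempt budget runs out."""
    history = []
    recipe = propose_initial(target)
    for _ in range(max_attempts):
        measured_yield = run_experiment(recipe)
        history.append((recipe, measured_yield))
        if measured_yield >= yield_goal:
            break
        recipe = refine(recipe)
    return history


history = closed_loop("hypothetical_target")
for recipe, measured_yield in history:
    print(recipe.temperature_c, round(measured_yield, 2))
```

The attempt budget matters in practice: some targets may never reach the yield goal, so the loop needs a stopping rule other than success.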
Questions?
