SlideShare a Scribd company logo
1 of 25
Download to read offline
JARVIS-ML
2D/3D materials screening and genetic
algorithm with ML model
Kamal Choudhary, Brian DeCost, Francesca Tavazza
NIST, Gaithersburg
AIMS workshop
August 07, 2018
1
Acknowledgement and Collaboration
• Carelyn Campbell, James Warren, Andrew Reid, Daniel Wheeler, Aaron Gilad Kusne, Jason Hattrick-Simpers,
Martin Green, Faical Y. Congo, Zachary Trautt, Andrew Reid, Nhan Van Nguyen, Sugata Chowdhury, Albert
Davydov, Irina Kalish, Chandler Becker, Ryan Beams, Adam Biacchi, Kevin Garrity, Russell Johnson, John
Vinson, NIST
• Logan Ward, University of Chicago
• Ankit Agrawal, Northwestern University
• Patrick Riley, Google
• Yuri Mishin, George Mason University
• Richard Hennig, University of Florida
• Tao Liang, Penn state
• Materials-Project team, Lawrence Berkeley National Laboratory
• Evan Reed, Stanford University
• Xin Zhao, Fen Zhang, Ames lab
• Sam Reeve (nanohub.org), Purdue University
• Lidia Carvalho Gomes, National University of Singapore
2
Outline
• AI in materials, Motivation
• JARVIS-FF, DFT and ML
• Representing materials to computers
• Visualizing multi-dimensional data
• Histograms for target data
• Gradient boosting decision trees
• Classification and regression models, feature importance
• Application of ML models in materials screening
• Application of ML in mapping energy landscape
• Web-app
• Conclusions
3
AI in materials
• For successful application of AI:
-High fidelity data, pertinent algorithm and validation strategy
• Unlike other AI input data (cat/dog images from facebook etc.), materials data are really
small and takes long time to generate
• Available of easily applied AI algorithms: scikit-learn, tensor-flow, lightgbm, Keras etc.
• Publicly available databases such as JARVIS-DFT, Materials-project, AFLOW, OQMD,
Harvard clean energy project, AiiDA, PDB database, COD database etc.
• Immense potential for AI in materials: medicine-design, medical-diagnosis, alloy-design,…
4https://www.youtube.com/watch?v=8r_smb-9GTg
https://www.nesgt.com/blog/2016/10/how-artificial-intelligence-is-being-used-in-the-pharmaceutical-industry
https://rowanalytics.com/blog-post/ai-in-medicine-a-historical-perspective/
Motivation
• Bridge gap between AI and materials community
• 10100 materials predicted, impossible to characterize with current theoretical
and experimental techniques
• Fully automated computational and experimental discovery as well as
industrial implementation
• Learning data from classical-physics and quantum-physics based outputs
through AI/machine learning techniques
• JARVIS : Joint Automated Repository for Various Integrated Simulations
5
https://en.wikipedia.org/wiki/Iron_Man
https://nige.wordpress.com/2008/01/30/book/interactions/
https://arxiv.org/abs/1805.07325
JARVIS-DFT
6
JARVIS-FF JARVIS-ML
( )rVmaF −==
( ) ( ) ( ) ( )
2
2
2
Eff i i iV r r E r r
m
 
 
−  + = 
 
XCeeNeEff VVVTV +++=
 EH =
• Solve Newton’s equation for atomic positions
• Approximations for V (force-fields):
EAM, EIM, MEAM, AIREBO, REAXFF, COMB, COMB3,
TERSOFF, SW etc.
• Contains:
Automated LAMMPS based force-field calculations on DFT
geometries. Some of the properties included in JARVIS-FF are
energetics, elastic constants, surface energies, defect formations
energies and phonon frequencies of materials
• Time: Takes years to fit FFs, relatively quick calculations
• Website: https://www.ctcms.nist.gov/~knc6/periodic.html
• Publications:
➢ Nature:Scientific Data 4, 160125 (2017)
➢ arXiv:1804.01024 (2018)
Force-field (classical) Density-functional theory (quantum) Machine learning (data-driven)
• Solve Scrodinger equation for electrons
• >30,000 materials data (3D, 2D, 1D, 0D)
• Contains:
Formation energy, exfoliation energy, diffraction pattern, radial
distribution function, band-structure (SOC/Non-SOC), density
of states, carrier effective mass, temperature and carrier
concentration dependent thermoelectric properties, elastic
constants and gamma-point phonons
• Time: 5000 cores for last 4 years
• Website: https://www.ctcms.nist.gov/~knc6/JVASP.html
• Publications:
➢ Nature:Scientific Reports 7, 5179 (2017)
➢ Nature:Scientific Data 5, 180082 (2018)
➢ Phys. Rev. B 98, 014107 (2018)
• Drawing the line, dimensionality reduction, curve-
fitting?
• Neural nets, decision trees, fuzzy-logic etc.
• Uses gradient boosting decision tree
• Contains:
Machine learning prediction tools, trained on JARVIS-
DFT data.
Some of the ML-predictions focus on energetics, heat of
formation, GGA/METAGGA bandgaps, bulk and shear
modulus, exfoliation energy, refractive index, magnetic
moment, carrier effective masses
• Time: Much easier and faster to train
• Website: https://www.ctcms.nist.gov/jarvisml/
• Publication:
➢ Phys. Rev. Materials 2, 08380 (2018)
Schrödinger's cat
• Enthalpy of formation and enthalpy of exfoliation
• Exfoliable materials: <200 meV/atom
Optoelectronic
properties
Elastic properties
Energetics
Magnetic properties
Transport properties
• OptB88vdW, TBmBJ and HSE06 bandgaps
• Frequency dependent dielectric functions
• 6x6 elastic tensors, Poisson’s ratio and phonons
• Magnetic moments
• Carrier effective mass, Seebeck coefficient and zT
Dimensionality of materials
• >700 2D mono/multi-layers & >30000 3D bulk
materials
• Increasing ~3000 materials/month
Webpage: https://jarvis.nist.gov
JARVIS-DFT data • Large and reliable dataset of
➢ optoelectronic properties (>18000)
➢ Elastic properties (>11000)
➢ 2D/1D/0D exfoliation energies (>800)
➢ Topological material properties, Z2 index
(>2000)
and others…
Topological
properties
Steps in JARVIS-ML model
1) Descriptor selection (perhaps the most important step!),visualization
2) Preprocessing (variance threshold, PCA etc.)
3) Train-test split of data (90%-10% for instance)
4) Hyperparameter optimization using grid-search on train data
5) Select the best model (based on R2/MAE/RMSE accuracy)
6) Prediction on test data (classification/regression)
7) Plot learning curve for complexity analysis
8) Plot feature importance (if available)
Finding the right features/descriptors: representing
atomic structure to computers
9
1557 descriptors/features for one material
• Arithmetic operations (mean, sum, std. deviation…) of
electronegativity, atomic radii, heat of fusion,….
of atoms at each site
(example: Electronegativity of Mo+Mo+S+S+S+S)/6 = 0.15
• Atomic bond distance based descriptors
• Angle based descriptors
https://github.com/usnistgov/jarvis
Visualizing multi-dimensional data with t-SNE
10
• Converts similarities between data points to joint probabilities
• Visulaization with t-SNE for ~25000 materials
http://holoviews.org/
http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
Data-spread: histograms
11
• Identifying the range of target data
• Energetics, electronic, optical, and mechanical properties
• AI generally good for interpolation
Data-spread: histograms
12
Need of explainable AI for materials: black box may
not work for materials community
13https://www.darpa.mil/program/explainable-artificial-intelligence
Gradient boosting decision tree (GBDT)
14
• Ensemble of weak decision tree models
• Consecutively fits new models to provide a more accurate estimate of the response variables
• Suppose there are N training examples: (𝑥𝑖, 𝑦𝑖) 𝑁
• GBDT model estimates the function of future variable x by the linear combination of the individual decision trees using:
𝑓𝑚 𝑥 = ෍
𝑚=1
𝑀
𝑇 𝑥; 𝜃 𝑚
where 𝑇 𝑥; 𝜃 𝑚 is the i-th decision tree, 𝜃 𝑚is its parameters and M is the number of decision trees
• Final estimation in a forward stage-wise fashion
𝑓𝑚 𝑥 = 𝑓 𝑚−1 𝑥 + 𝑇 𝑥; 𝜃 𝑚
where fm−1(x) is the model in (m−1) step. The parameter 𝜃 𝑚 is learned by the principle of
• Empirical risk minimization using:
෢𝜃 𝑚 = 𝑎𝑟𝑔 𝜃 𝑚
𝑚𝑖𝑛
σ𝑖=1
𝑁
𝐿 𝑦𝑖, 𝑓 𝑚−1 𝑥 + 𝑇 𝑥; 𝜃 𝑚
where L is the loss-function.
https://web.stanford.edu/~hastie/Papers/ESLII.pdf
Classification problem:
metal/non-metal, magnetic/non-magnetic materials
15
ROC-curve: Excellent classification models
Regression models: formation energy and bandgap
model
16
Learning curve shows scope of further improvement
Other model performance
17
Performance on 10 % held data
Property #Data-points MAECFID-DFT MAEDFT-Exp
Formation energy (eV/atom) 24549 0.12 0.136
Exfoliation energy (meV/atom) 616 37.3 -
OPT-bandgap (eV) 22404 0.32 1.33
MBJ-bandgap (eV) 10499 0.44 0.51
Bulk modulus (GPa) 10954 10.5 10.0
Shear modulus (GPa) 10954 9.5 10.0
OPT-nx (no unit) 12299 0.54 1.78
OPT-ny (no unit) 12299 0.55 -
OPT-nz (no unit) 12299 0.55 -
MBJ-nx (no unit) 6628 0.45 1.6
MBJ-ny (no unit) 6628 0.50 -
MBJ-nz (no unit) 6628 0.46 -
Explainability: feature importance
18
• Chemical features most important followed by RDF and NN
• Incrementally adding structural features decreases MAE
Introduction to Genetic Algorithm
Picture from GASP manual
• Based on ‘Survival of the fittest’ theory: fitness of crystal
structure based on energy of structure
• Parents to offspring crystal structure
• Generally energy is obtained from DFT, MD…let’s try ML ?
D. M. Deaven, Molecular geometry optimization with a genetic algorithm, Physical Review
Letters, 75 (1995)
G. Ceder, Data-mining-driven quantum mechanics for the prediction of structure, MRS
Bulletin, 31 (2006)
https://github.com/henniggroup/GASP-python/
19
Search for new materials: Genetic algorithm with ML
20
• MoS2, WS2 indeed stable as in DFT and experiments
• Need further verification foor low-lying energy structures with DFT
• New way of validating ML model for materials
2D materials screening: flexible electronics
applications
21
• ~5000 2D materials predicted
• Requires expensive DFT calculations for predicting properties such as bandgap, exfoliation energy etc.
• Use of ML drops down the time to a few seconds
• Using this technique we identified new 2D materials such as CuI, InS etc.
• Validated using DFT
Web-app: DEMO
22
https://www.ctcms.nist.gov/jarvisml/ • Pickle the learnt parameters
• Integrate with FlaskPython
Getting hands on: github sample trainng
23
https://github.com/usnistgov/jarvis/blob/master/jarvis/db/static/jarvis_ml-train.ipynb
https://github.com/usnistgov/jarvis/blob/master/jarvis/sklearn/examples/desc_example.py
On-going work
24
• Combining all materials data >5 million:
solid-state crystals, molecules, proteins in one database and their visualization
• Multi-output Regression
• Active learning: train on DFT data, add to experiments to reduce domain search
• Transfer learning: Use previously trained model to learn on new data
Summary
25
• JARVIS provides:
❑ Data (>30000 materials, 300000 properties): FF, DFT, ML
❑ Public websites (#>30000), views (>20000)
❑ Jupyter notebooks, python scripts on github
❑ Web-app for on-the fly prediction of properties using ML
❑ API-upload/download data
• Important links:
✓https://jarvis.nist.gov/
✓https://www.ctcms.nist.gov/jarvisml/
✓https://www.slideshare.net/KAMALCHOUDHARY4/
• E-mail: kamal.choudhary@nist.gov

More Related Content

What's hot

A Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge SystemsA Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge Systemsaimsnist
 
When The New Science Is In The Outliers
When The New Science Is In The OutliersWhen The New Science Is In The Outliers
When The New Science Is In The Outliersaimsnist
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningKAMAL CHOUDHARY
 
The MGI and AI
The MGI and AIThe MGI and AI
The MGI and AIaimsnist
 
Accelerated Materials Discovery & Characterization with Classical, Quantum an...
Accelerated Materials Discovery & Characterization with Classical, Quantum an...Accelerated Materials Discovery & Characterization with Classical, Quantum an...
Accelerated Materials Discovery & Characterization with Classical, Quantum an...KAMAL CHOUDHARY
 
Predicting local atomic structures from X-ray absorption spectroscopy using t...
Predicting local atomic structures from X-ray absorption spectroscopy using t...Predicting local atomic structures from X-ray absorption spectroscopy using t...
Predicting local atomic structures from X-ray absorption spectroscopy using t...aimsnist
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAnubhav Jain
 
TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...BrianDeCost
 
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...KAMAL CHOUDHARY
 
Computational Database for 3D and 2D materials to accelerate discovery
Computational Database for 3D and 2D materials to accelerate discoveryComputational Database for 3D and 2D materials to accelerate discovery
Computational Database for 3D and 2D materials to accelerate discoveryKAMAL CHOUDHARY
 
Open Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsOpen Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsAnubhav Jain
 
Graphs, Environments, and Machine Learning for Materials Science
Graphs, Environments, and Machine Learning for Materials ScienceGraphs, Environments, and Machine Learning for Materials Science
Graphs, Environments, and Machine Learning for Materials Scienceaimsnist
 
Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material Designaimsnist
 
High-throughput discovery of low-dimensional and topologically non-trivial ma...
High-throughput discovery of low-dimensional and topologically non-trivial ma...High-throughput discovery of low-dimensional and topologically non-trivial ma...
High-throughput discovery of low-dimensional and topologically non-trivial ma...KAMAL CHOUDHARY
 
Machine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methodsMachine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methodsAnubhav Jain
 
Capturing and leveraging materials science knowledge from millions of journal...
Capturing and leveraging materials science knowledge from millions of journal...Capturing and leveraging materials science knowledge from millions of journal...
Capturing and leveraging materials science knowledge from millions of journal...Anubhav Jain
 
Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...Anubhav Jain
 
Density functional theory calculations and data mining for new thermoelectric...
Density functional theory calculations and data mining for new thermoelectric...Density functional theory calculations and data mining for new thermoelectric...
Density functional theory calculations and data mining for new thermoelectric...Anubhav Jain
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Anubhav Jain
 
Morgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 distMorgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 distddm314
 

What's hot (20)

A Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge SystemsA Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge Systems
 
When The New Science Is In The Outliers
When The New Science Is In The OutliersWhen The New Science Is In The Outliers
When The New Science Is In The Outliers
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
The MGI and AI
The MGI and AIThe MGI and AI
The MGI and AI
 
Accelerated Materials Discovery & Characterization with Classical, Quantum an...
Accelerated Materials Discovery & Characterization with Classical, Quantum an...Accelerated Materials Discovery & Characterization with Classical, Quantum an...
Accelerated Materials Discovery & Characterization with Classical, Quantum an...
 
Predicting local atomic structures from X-ray absorption spectroscopy using t...
Predicting local atomic structures from X-ray absorption spectroscopy using t...Predicting local atomic structures from X-ray absorption spectroscopy using t...
Predicting local atomic structures from X-ray absorption spectroscopy using t...
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design Problems
 
TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...
 
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...
Computational Discovery of Two-Dimensional Materials, Evaluation of Force-Fie...
 
Computational Database for 3D and 2D materials to accelerate discovery
Computational Database for 3D and 2D materials to accelerate discoveryComputational Database for 3D and 2D materials to accelerate discovery
Computational Database for 3D and 2D materials to accelerate discovery
 
Open Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsOpen Source Tools for Materials Informatics
Open Source Tools for Materials Informatics
 
Graphs, Environments, and Machine Learning for Materials Science
Graphs, Environments, and Machine Learning for Materials ScienceGraphs, Environments, and Machine Learning for Materials Science
Graphs, Environments, and Machine Learning for Materials Science
 
Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material Design
 
High-throughput discovery of low-dimensional and topologically non-trivial ma...
High-throughput discovery of low-dimensional and topologically non-trivial ma...High-throughput discovery of low-dimensional and topologically non-trivial ma...
High-throughput discovery of low-dimensional and topologically non-trivial ma...
 
Machine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methodsMachine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methods
 
Capturing and leveraging materials science knowledge from millions of journal...
Capturing and leveraging materials science knowledge from millions of journal...Capturing and leveraging materials science knowledge from millions of journal...
Capturing and leveraging materials science knowledge from millions of journal...
 
Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...
 
Density functional theory calculations and data mining for new thermoelectric...
Density functional theory calculations and data mining for new thermoelectric...Density functional theory calculations and data mining for new thermoelectric...
Density functional theory calculations and data mining for new thermoelectric...
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
 
Morgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 distMorgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 dist
 

Similar to 2D/3D Materials screening and genetic algorithm with ML model

NIST-JARVIS infrastructure for Improved Materials Design
NIST-JARVIS infrastructure for Improved Materials DesignNIST-JARVIS infrastructure for Improved Materials Design
NIST-JARVIS infrastructure for Improved Materials DesignKAMAL CHOUDHARY
 
Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignKAMAL CHOUDHARY
 
Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Anubhav Jain
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...Anubhav Jain
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Anubhav Jain
 
Materials discovery through theory, computation, and machine learning
Materials discovery through theory, computation, and machine learningMaterials discovery through theory, computation, and machine learning
Materials discovery through theory, computation, and machine learningAnubhav Jain
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...Anubhav Jain
 
Materials Data in the 21st Century: From Mishmash to Moneyball
Materials Data in the 21st Century: From Mishmash to MoneyballMaterials Data in the 21st Century: From Mishmash to Moneyball
Materials Data in the 21st Century: From Mishmash to Moneyballbmeredig
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsAnubhav Jain
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Anubhav Jain
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsDeakin University
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectAnubhav Jain
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...Anubhav Jain
 
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...KAMAL CHOUDHARY
 
AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...Deakin University
 
Predicting Molecular Properties
Predicting Molecular PropertiesPredicting Molecular Properties
Predicting Molecular PropertiesYassin Youssfi
 
Conducting and Enabling Data-Driven Research Through the Materials Project
Conducting and Enabling Data-Driven Research Through the Materials ProjectConducting and Enabling Data-Driven Research Through the Materials Project
Conducting and Enabling Data-Driven Research Through the Materials ProjectAnubhav Jain
 

Similar to 2D/3D Materials screening and genetic algorithm with ML model (20)

NIST-JARVIS infrastructure for Improved Materials Design
NIST-JARVIS infrastructure for Improved Materials DesignNIST-JARVIS infrastructure for Improved Materials Design
NIST-JARVIS infrastructure for Improved Materials Design
 
Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material Design
 
Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...
 
Materials discovery through theory, computation, and machine learning
Materials discovery through theory, computation, and machine learningMaterials discovery through theory, computation, and machine learning
Materials discovery through theory, computation, and machine learning
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...
 
Materials Data in the 21st Century: From Mishmash to Moneyball
Materials Data in the 21st Century: From Mishmash to MoneyballMaterials Data in the 21st Century: From Mishmash to Moneyball
Materials Data in the 21st Century: From Mishmash to Moneyball
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of Materials
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials Project
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...
 
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...
ChemNLP: A Natural Language Processing based Library for Materials Chemistry ...
 
AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...
 
Predicting Molecular Properties
Predicting Molecular PropertiesPredicting Molecular Properties
Predicting Molecular Properties
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
 
Conducting and Enabling Data-Driven Research Through the Materials Project
Conducting and Enabling Data-Driven Research Through the Materials ProjectConducting and Enabling Data-Driven Research Through the Materials Project
Conducting and Enabling Data-Driven Research Through the Materials Project
 
qmms_wines.pptx
qmms_wines.pptxqmms_wines.pptx
qmms_wines.pptx
 

More from aimsnist

Enabling Data Science Methods for Catalyst Design and Discovery
Enabling Data Science Methods for Catalyst Design and DiscoveryEnabling Data Science Methods for Catalyst Design and Discovery
Enabling Data Science Methods for Catalyst Design and Discoveryaimsnist
 
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...aimsnist
 
Coupling AI with HiTp experiments to Discover Metallic Glasses Faster
Coupling AI with HiTp experiments to Discover Metallic Glasses FasterCoupling AI with HiTp experiments to Discover Metallic Glasses Faster
Coupling AI with HiTp experiments to Discover Metallic Glasses Fasteraimsnist
 
Classical force fields as physics-based neural networks
Classical force fields as physics-based neural networksClassical force fields as physics-based neural networks
Classical force fields as physics-based neural networksaimsnist
 
Pathways Towards a Hierarchical Discovery of Materials
Pathways Towards a Hierarchical Discovery of MaterialsPathways Towards a Hierarchical Discovery of Materials
Pathways Towards a Hierarchical Discovery of Materialsaimsnist
 
Materials Data in Action
Materials Data in ActionMaterials Data in Action
Materials Data in Actionaimsnist
 
Combinatorial Experimentation and Machine Learning for Materials Discovery
Combinatorial Experimentation and Machine Learning for Materials DiscoveryCombinatorial Experimentation and Machine Learning for Materials Discovery
Combinatorial Experimentation and Machine Learning for Materials Discoveryaimsnist
 
Progress in Natural Language Processing of Materials Science Text
Progress in Natural Language Processing of Materials Science TextProgress in Natural Language Processing of Materials Science Text
Progress in Natural Language Processing of Materials Science Textaimsnist
 

More from aimsnist (8)

Enabling Data Science Methods for Catalyst Design and Discovery
Enabling Data Science Methods for Catalyst Design and DiscoveryEnabling Data Science Methods for Catalyst Design and Discovery
Enabling Data Science Methods for Catalyst Design and Discovery
 
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...
How to Leverage Artificial Intelligence to Accelerate Data Collection and Ana...
 
Coupling AI with HiTp experiments to Discover Metallic Glasses Faster
Coupling AI with HiTp experiments to Discover Metallic Glasses FasterCoupling AI with HiTp experiments to Discover Metallic Glasses Faster
Coupling AI with HiTp experiments to Discover Metallic Glasses Faster
 
Classical force fields as physics-based neural networks
Classical force fields as physics-based neural networksClassical force fields as physics-based neural networks
Classical force fields as physics-based neural networks
 
Pathways Towards a Hierarchical Discovery of Materials
Pathways Towards a Hierarchical Discovery of MaterialsPathways Towards a Hierarchical Discovery of Materials
Pathways Towards a Hierarchical Discovery of Materials
 
Materials Data in Action
Materials Data in ActionMaterials Data in Action
Materials Data in Action
 
Combinatorial Experimentation and Machine Learning for Materials Discovery
Combinatorial Experimentation and Machine Learning for Materials DiscoveryCombinatorial Experimentation and Machine Learning for Materials Discovery
Combinatorial Experimentation and Machine Learning for Materials Discovery
 
Progress in Natural Language Processing of Materials Science Text
Progress in Natural Language Processing of Materials Science TextProgress in Natural Language Processing of Materials Science Text
Progress in Natural Language Processing of Materials Science Text
 

Recently uploaded

Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 

Recently uploaded (20)

Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 

2D/3D Materials screening and genetic algorithm with ML model

  • 1. JARVIS-ML 2D/3D materials screening and genetic algorithm with ML model Kamal Choudhary, Brian DeCost, Francesca Tavazza NIST, Gaithersburg AIMS workshop August 07, 2018 1
  • 2. Acknowledgement and Collaboration • Carelyn Campbell, James Warren, Andrew Reid, Daniel Wheeler, Aaron Gilad Kusne, Jason Hattrick-Simpers, Martin Green, Faical Y. Congo, Zachary Trautt, Andrew Reid, Nhan Van Nguyen, Sugata Chowdhury, Albert Davydov, Irina Kalish, Chandler Becker, Ryan Beams, Adam Biacchi, Kevin Garrity, Russell Johnson, John Vinson, NIST • Logan Ward, University of Chicago • Ankit Agrawal, Northwestern University • Patrick Riley, Google • Yuri Mishin, George Mason University • Richard Hennig, University of Florida • Tao Liang, Penn state • Materials-Project team, Lawrence Berkeley National Laboratory • Evan Reed, Stanford University • Xin Zhao, Fen Zhang, Ames lab • Sam Reeve (nanohub.org), Purdue University • Lidia Carvalho Gomes, National University of Singapore 2
  • 3. Outline • AI in materials, Motivation • JARVIS-FF, DFT and ML • Representing materials to computers • Visualizing multi-dimensional data • Histograms for target data • Gradient boosting decision trees • Classification and regression models, feature importance • Application of ML models in materials screening • Application of ML in mapping energy landscape • Web-app • Conclusions 3
  • 4. AI in materials • For successful application of AI: -High fidelity data, pertinent algorithm and validation strategy • Unlike other AI input data (cat/dog images from facebook etc.), materials data are really small and takes long time to generate • Available of easily applied AI algorithms: scikit-learn, tensor-flow, lightgbm, Keras etc. • Publicly available databases such as JARVIS-DFT, Materials-project, AFLOW, OQMD, Harvard clean energy project, AiiDA, PDB database, COD database etc. • Immense potential for AI in materials: medicine-design, medical-diagnosis, alloy-design,… 4https://www.youtube.com/watch?v=8r_smb-9GTg https://www.nesgt.com/blog/2016/10/how-artificial-intelligence-is-being-used-in-the-pharmaceutical-industry https://rowanalytics.com/blog-post/ai-in-medicine-a-historical-perspective/
  • 5. Motivation • Bridge gap between AI and materials community • 10100 materials predicted, impossible to characterize with current theoretical and experimental techniques • Fully automated computational and experimental discovery as well as industrial implementation • Learning data from classical-physics and quantum-physics based outputs through AI/machine learning techniques • JARVIS : Joint Automated Repository for Various Integrated Simulations 5 https://en.wikipedia.org/wiki/Iron_Man https://nige.wordpress.com/2008/01/30/book/interactions/ https://arxiv.org/abs/1805.07325
  • 6. JARVIS-DFT 6 JARVIS-FF JARVIS-ML ( )rVmaF −== ( ) ( ) ( ) ( ) 2 2 2 Eff i i iV r r E r r m     −  + =    XCeeNeEff VVVTV +++=  EH = • Solve Newton’s equation for atomic positions • Approximations for V (force-fields): EAM, EIM, MEAM, AIREBO, REAXFF, COMB, COMB3, TERSOFF, SW etc. • Contains: Automated LAMMPS based force-field calculations on DFT geometries. Some of the properties included in JARVIS-FF are energetics, elastic constants, surface energies, defect formations energies and phonon frequencies of materials • Time: Takes years to fit FFs, relatively quick calculations • Website: https://www.ctcms.nist.gov/~knc6/periodic.html • Publications: ➢ Nature:Scientific Data 4, 160125 (2017) ➢ arXiv:1804.01024 (2018) Force-field (classical) Density-functional theory (quantum) Machine learning (data-driven) • Solve Scrodinger equation for electrons • >30,000 materials data (3D, 2D, 1D, 0D) • Contains: Formation energy, exfoliation energy, diffraction pattern, radial distribution function, band-structure (SOC/Non-SOC), density of states, carrier effective mass, temperature and carrier concentration dependent thermoelectric properties, elastic constants and gamma-point phonons • Time: 5000 cores for last 4 years • Website: https://www.ctcms.nist.gov/~knc6/JVASP.html • Publications: ➢ Nature:Scientific Reports 7, 5179 (2017) ➢ Nature:Scientific Data 5, 180082 (2018) ➢ Phys. Rev. B 98, 014107 (2018) • Drawing the line, dimensionality reduction, curve- fitting? • Neural nets, decision trees, fuzzy-logic etc. • Uses gradient boosting decision tree • Contains: Machine learning prediction tools, trained on JARVIS- DFT data. Some of the ML-predictions focus on energetics, heat of formation, GGA/METAGGA bandgaps, bulk and shear modulus, exfoliation energy, refractive index, magnetic moment, carrier effective masses • Time: Much easier and faster to train • Website: https://www.ctcms.nist.gov/jarvisml/ • Publication: ➢ Phys. Rev. Materials 2, 08380 (2018) Schrödinger's cat
  • 7. • Enthalpy of formation and enthalpy of exfoliation • Exfoliable materials: <200 meV/atom Optoelectronic properties Elastic properties Energetics Magnetic properties Transport properties • OptB88vdW, TBmBJ and HSE06 bandgaps • Frequency dependent dielectric functions • 6x6 elastic tensors, Poisson’s ratio and phonons • Magnetic moments • Carrier effective mass, Seebeck coefficient and zT Dimensionality of materials • >700 2D mono/multi-layers & >30000 3D bulk materials • Increasing ~3000 materials/month Webpage: https://jarvis.nist.gov JARVIS-DFT data • Large and reliable dataset of ➢ optoelectronic properties (>18000) ➢ Elastic properties (>11000) ➢ 2D/1D/0D exfoliation energies (>800) ➢ Topological material properties, Z2 index (>2000) and others… Topological properties
  • 8. Steps in JARVIS-ML model 1) Descriptor selection (perhaps the most important step!),visualization 2) Preprocessing (variance threshold, PCA etc.) 3) Train-test split of data (90%-10% for instance) 4) Hyperparameter optimization using grid-search on train data 5) Select the best model (based on R2/MAE/RMSE accuracy) 6) Prediction on test data (classification/regression) 7) Plot learning curve for complexity analysis 8) Plot feature importance (if available)
  • 9. Finding the right features/descriptors: representing atomic structure to computers 9 1557 descriptors/features for one material • Arithmetic operations (mean, sum, std. deviation…) of electronegativity, atomic radii, heat of fusion,…. of atoms at each site (example: Electronegativity of Mo+Mo+S+S+S+S)/6 = 0.15 • Atomic bond distance based descriptors • Angle based descriptors https://github.com/usnistgov/jarvis
  • 10. Visualizing multi-dimensional data with t-SNE 10 • Converts similarities between data points to joint probabilities • Visulaization with t-SNE for ~25000 materials http://holoviews.org/ http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
  • 11. Data-spread: histograms 11 • Identifying the range of target data • Energetics, electronic, optical, and mechanical properties • AI generally good for interpolation
  • 13. Need of explainable AI for materials: black box may not work for materials community 13https://www.darpa.mil/program/explainable-artificial-intelligence
  • 14. Gradient boosting decision tree (GBDT) 14 • Ensemble of weak decision tree models • Consecutively fits new models to provide a more accurate estimate of the response variables • Suppose there are N training examples: (𝑥𝑖, 𝑦𝑖) 𝑁 • GBDT model estimates the function of future variable x by the linear combination of the individual decision trees using: 𝑓𝑚 𝑥 = ෍ 𝑚=1 𝑀 𝑇 𝑥; 𝜃 𝑚 where 𝑇 𝑥; 𝜃 𝑚 is the i-th decision tree, 𝜃 𝑚is its parameters and M is the number of decision trees • Final estimation in a forward stage-wise fashion 𝑓𝑚 𝑥 = 𝑓 𝑚−1 𝑥 + 𝑇 𝑥; 𝜃 𝑚 where fm−1(x) is the model in (m−1) step. The parameter 𝜃 𝑚 is learned by the principle of • Empirical risk minimization using: ෢𝜃 𝑚 = 𝑎𝑟𝑔 𝜃 𝑚 𝑚𝑖𝑛 σ𝑖=1 𝑁 𝐿 𝑦𝑖, 𝑓 𝑚−1 𝑥 + 𝑇 𝑥; 𝜃 𝑚 where L is the loss-function. https://web.stanford.edu/~hastie/Papers/ESLII.pdf
  • 15. Classification problem: metal/non-metal, magnetic/non-magnetic materials 15 ROC-curve: Excellent classification models
  • 16. Regression models: formation energy and bandgap model 16 Learning curve shows scope of further improvement
  • 17. Other model performance 17 Performance on 10 % held data Property #Data-points MAECFID-DFT MAEDFT-Exp Formation energy (eV/atom) 24549 0.12 0.136 Exfoliation energy (meV/atom) 616 37.3 - OPT-bandgap (eV) 22404 0.32 1.33 MBJ-bandgap (eV) 10499 0.44 0.51 Bulk modulus (GPa) 10954 10.5 10.0 Shear modulus (GPa) 10954 9.5 10.0 OPT-nx (no unit) 12299 0.54 1.78 OPT-ny (no unit) 12299 0.55 - OPT-nz (no unit) 12299 0.55 - MBJ-nx (no unit) 6628 0.45 1.6 MBJ-ny (no unit) 6628 0.50 - MBJ-nz (no unit) 6628 0.46 -
  • 18. Explainability: feature importance 18 • Chemical features most important followed by RDF and NN • Incrementally adding structural features decreases MAE
  • 19. Introduction to Genetic Algorithm Picture from GASP manual • Based on ‘Survival of the fittest’ theory: fitness of crystal structure based on energy of structure • Parents to offspring crystal structure • Generally energy is obtained from DFT, MD…let’s try ML ? D. M. Deaven, Molecular geometry optimization with a genetic algorithm, Physical Review Letters, 75 (1995) G. Ceder, Data-mining-driven quantum mechanics for the prediction of structure, MRS Bulletin, 31 (2006) https://github.com/henniggroup/GASP-python/ 19
  • 20. Search for new materials: Genetic algorithm with ML 20 • MoS2, WS2 indeed stable as in DFT and experiments • Need further verification foor low-lying energy structures with DFT • New way of validating ML model for materials
  • 21. 2D materials screening: flexible electronics applications 21 • ~5000 2D materials predicted • Requires expensive DFT calculations for predicting properties such as bandgap, exfoliation energy etc. • Use of ML drops down the time to a few seconds • Using this technique we identified new 2D materials such as CuI, InS etc. • Validated using DFT
  • 22. Web-app: DEMO 22 https://www.ctcms.nist.gov/jarvisml/ • Pickle the learnt parameters • Integrate with FlaskPython
  • 23. Getting hands on: github sample trainng 23 https://github.com/usnistgov/jarvis/blob/master/jarvis/db/static/jarvis_ml-train.ipynb https://github.com/usnistgov/jarvis/blob/master/jarvis/sklearn/examples/desc_example.py
  • 24. On-going work 24 • Combining all materials data >5 million: solid-state crystals, molecules, proteins in one database and their visualization • Multi-output Regression • Active learning: train on DFT data, add to experiments to reduce domain search • Transfer learning: Use previously trained model to learn on new data
  • 25. Summary 25 • JARVIS provides: ❑ Data (>30000 materials, 300000 properties): FF, DFT, ML ❑ Public websites (#>30000), views (>20000) ❑ Jupyter notebooks, python scripts on github ❑ Web-app for on-the fly prediction of properties using ML ❑ API-upload/download data • Important links: ✓https://jarvis.nist.gov/ ✓https://www.ctcms.nist.gov/jarvisml/ ✓https://www.slideshare.net/KAMALCHOUDHARY4/ • E-mail: kamal.choudhary@nist.gov