High-throughput computation and machine learning methods can be applied to materials design problems at scale. Density functional theory (DFT) allows modeling of materials at the quantum mechanical level but large computational resources are required. "High-throughput DFT" uses automation, parallelization across supercomputers, and data mining approaches to rapidly screen millions of potential new materials in silico before experimental validation. This helps address the challenge of discovering new materials for applications like energy technologies by searching the vast space of possible compositions and structures more efficiently than traditional experimentation alone.
High-throughput computation and machine learning methods applied to materials design
1. High-throughput computation and machine
learning methods applied to materials design
Anubhav Jain
Energy Technologies Area
Lawrence Berkeley National Laboratory
Berkeley, CA
Citrine Informatics Talk
July 3, 2018
Slides (already) posted to hackingmaterials.lbl.gov
2. New materials discovery for devices is difficult
• Novel materials with enhanced performance characteristics
could make a big dent in sustainability, scalability, and cost
• In practice, we tend to re-use the same fundamental materials
for decades
– solar power w/Si since 1950s
– graphite/LiCoO2 (basis of today’s Li battery electrodes) since 1990
– Bi2Te3 and PbTe thermoelectrics first studied ~1910
• Although there are lots of improvements to manufacturing,
microstructure, etc., there are not many new basic compositions
• Why is discovering better materials such a challenge?
3. A material is defined at multiple length scales –
stick to the fundamental scale for now
5. Atoms in a box – the materials universe is huge!
• Bag of 30 atoms
• Each atom is one of 50 elements
• Arrange on 10x10x10 lattice
• Over 10^108 possibilities!
– more than grains of sand on all beaches (10^21)
– more than number of atoms in universe (10^80)
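The order of magnitude on this slide can be checked with a few lines of Python. This is a rough sketch under one assumed counting scheme that the slide does not spell out: choose which 30 of the 1000 lattice sites are occupied, then assign one of 50 elements to each occupied site, ignoring symmetry-equivalent arrangements.

```python
import math

# Assumed counting scheme (not stated on the slide): pick the occupied
# sites, then pick an element for each occupied site.
n_sites = 10 * 10 * 10   # sites on the 10x10x10 lattice
n_atoms = 30             # atoms in the "bag"
n_elements = 50          # candidate elements per atom

site_choices = math.comb(n_sites, n_atoms)   # ways to choose occupied sites
element_choices = n_elements ** n_atoms      # ways to decorate those sites
total = site_choices * element_choices

log10_total = math.log10(total)
print(f"roughly 10^{log10_total:.1f} arrangements")
```

Under these assumptions the count lands between 10^108 and 10^109, which is consistent with the figure quoted on the slide.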
7. What constrains traditional experimentation?
“[The Chevrel] discovery resulted from a lot of
unsuccessful experiments of Mg ions insertion
into well-known hosts for Li+ ions insertion, as
well as from the thorough literature analysis
concerning the possibility of divalent ions
intercalation into inorganic materials.”
-Aurbach group, on discovery of Chevrel cathode
for multivalent (e.g., Mg2+) batteries
Levi, Levi, Chasid, Aurbach
J. Electroceramics (2009)
8. Outline
① From quantum mechanics to density functional
theory (DFT)
② “High-throughput” DFT
③ Data mining approaches to materials design
9. The basis of density functional theory
is quantum mechanics
Schrödinger equation describes all the properties
of a system through the wavefunction:
$$-\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r}) + V(\mathbf{r})\,\Psi(\mathbf{r}) = E\,\Psi(\mathbf{r})$$
Time-independent, non-relativistic Schrödinger equation
10. The wave function is formidable
• There aren’t too many real situations where we can
get a closed solution to the Schrödinger equation
• Let’s pretend we want to approach things
numerically for 1000 electrons
– There are ~500,000 electron-electron interactions to worry
about.
– Even storing the wavefunction would take ~10^8991 GB!
• Discretize the x, y, z position of each electron into a 1000-element
grid per axis = 1 billion positions per electron
• Need the wavefunction output (real + imaginary part) for each
combination of all electron positions, i.e. (1E9)^1000 * 2, or
2E9000 values
• even at 1 byte per wavefunction value (low resolution), you would
need about 2E9000 bytes, i.e. 2E8991 GB, to store the wavefunction!
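The storage estimate can be reproduced directly from the slide's stated assumptions (1000 electrons, a 1000-point grid per axis, two numbers per wavefunction value, 1 byte per number). The sketch below works with logarithms, because the value count itself has about 9000 digits; the variable names are illustrative only.

```python
import math

n_electrons = 1000
grid_per_axis = 1000      # grid points along each of x, y, z
bytes_per_value = 1       # deliberately low resolution, as on the slide

# Every distinct pair of electrons interacts once.
pair_interactions = n_electrons * (n_electrons - 1) // 2

# One electron can sit on any of 1000^3 grid points.
positions_per_electron = grid_per_axis ** 3

# The many-body wavefunction needs a real and an imaginary number for
# every joint configuration of all electrons (exact integer arithmetic).
n_values = 2 * positions_per_electron ** n_electrons

log10_bytes = math.log10(n_values * bytes_per_value)
log10_gigabytes = log10_bytes - 9   # 1 GB = 1e9 bytes

print(f"{pair_interactions:,} electron-electron pairs")
print(f"about 10^{log10_bytes:.1f} bytes = 10^{log10_gigabytes:.1f} GB")
```

With these inputs the pair count is 499,500 and the storage requirement is about 2 x 10^9000 bytes, or 2 x 10^8991 GB.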
11. Dirac summarized it best …
“The underlying physical laws necessary
for the mathematical theory of a large part
of physics and the whole of chemistry are
thus completely known, and the difficulty
is only that the exact application of these
laws leads to equations much too
complicated to be soluble.”
“It therefore becomes desirable that
approximate practical methods of applying
quantum mechanics should be developed,
which can lead to an explanation of the
main features of complex atomic systems
without too much computation.”
12. What is density functional theory (DFT)?
DFT in theory:
• replaces many-body interactions with a mean field interaction that
reproduces the same charge density as the original formulation
• proves that, given the correct charge density, it is in principle possible to
compute all ground state properties of the quantum mechanical system exactly
So, (for the ground state properties) we went from many-body wavefunctions
to mean field charge density! This was worthy of the 1998 Nobel Prize.
DFT in practice:
• accuracy depends on the choice of (some) parameters, the type of material,
the property to be studied, and whether the simulated system (crystal) is a
good approximation of reality.
13. How does one use DFT to design new materials?
A. Jain, Y. Shin, and K. A.
Persson, Nat. Rev. Mater.
1, 15004 (2016).
14. How accurate is DFT in practice?
Shown are typical DFT results for (i) Li
battery voltages, (ii) electronic band gaps,
and (iii) bulk modulus
[Figure panels: (i) battery voltages, (ii) band gaps, (iii) bulk modulus]
(i) V. L. Chevrier, S. P. Ong, R. Armiento, M. K. Y. Chan, and G. Ceder,
Phys. Rev. B 82, 075122 (2010).
(ii) M. Chan and G. Ceder, Phys. Rev. Lett. 105, 196403 (2010).
(iii) M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst,
M. Sluiter, C. K. Ande, S. Van Der Zwaag, J. J. Plata, C. Toher, S.
Curtarolo, G. Ceder, K.A. Persson, and M. Asta, Sci. Data 2, 150009
(2015).
15. Limitations of density functional theory
• System size is essentially limited to ~1000 atoms
– many important materials phenomena simply do not occur at this
length scale
• Certain materials, such as those with strong electron
correlation, remain difficult to model accurately
• Certain properties, including excited state properties
such as band gap, remain difficult to model accurately
• These are all active areas of research and improvement
to the theory, and the situation is improving on all fronts
16. Outline
① From quantum mechanics to density functional
theory (DFT)
② “High-throughput” DFT
③ Data mining approaches to materials design
19. High-throughput DFT: a key idea
Automate the DFT procedure + supercomputing power = high-throughput materials screening
• FireWorks: software for programming general computational workflows
that can be scaled across large supercomputers.
• NERSC: supercomputing center whose processor count is equivalent to
~100,000 desktop machines. Other centers are also viable.
G. Ceder & K.A. Persson, Scientific American (2015)
20. How much computer time is needed for
high-throughput DFT?
• The answer is “it really varies a lot”
– how big / complicated are the materials you are modeling?
– how complex / expensive are the properties you are
modeling?
• Ballpark numbers:
– Low range: optimize structure of ~3-atom compounds
• time to do a million materials ~ 10 million CPU-hours
– Medium range: bulk modulus of ~50 atom compounds
• time to do a million materials ~ 2 billion CPU-hours
• The largest CPU allocations from the DOE are
typically on the order of ~100 million CPU-hours
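To make the ballpark numbers concrete, the short sketch below converts them into a per-material cost and into the number of materials that a single ~100 million CPU-hour allocation could cover. It uses only the figures quoted on the slide; the variable names are illustrative.

```python
n_materials = 1_000_000

# Totals quoted on the slide for one million materials.
low_total_cpu_hours = 10_000_000         # structure optimization, ~3 atoms
medium_total_cpu_hours = 2_000_000_000   # bulk modulus, ~50 atoms
allocation_cpu_hours = 100_000_000       # large DOE allocation

low_per_material = low_total_cpu_hours / n_materials
medium_per_material = medium_total_cpu_hours / n_materials

# How many materials fit into one allocation at each cost level.
materials_low = allocation_cpu_hours / low_per_material
materials_medium = allocation_cpu_hours / medium_per_material

print(f"low range:    {low_per_material:,.0f} CPU-hours per material, "
      f"{materials_low:,.0f} materials per allocation")
print(f"medium range: {medium_per_material:,.0f} CPU-hours per material, "
      f"{materials_medium:,.0f} materials per allocation")
```

On these numbers a million cheap structure optimizations fit comfortably inside one allocation, while a million medium-cost bulk modulus calculations would need about twenty such allocations.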
21. Examples of (early) high-throughput studies
Application | Researcher | Search space | Candidates | Hit rate
Scintillators | Klintenberg et al. | 22,000 | 136 | 1/160
Scintillators | Curtarolo et al. | 11,893 | ? | ?
Topological insulators | Klintenberg et al. | 60,000 | 17 | 1/3500
Topological insulators | Curtarolo et al. | 15,000 | 28 | 1/535
High TC superconductors | Klintenberg et al. | 60,000 | 139 | 1/430
Thermoelectrics – ICSD | Curtarolo et al. | 2,500 | 20 | 1/125
Thermoelectrics – Half Heusler systems | Curtarolo et al. | 80,000 | 75 | 1/1055
Thermoelectrics – Half Heusler best ZT | Curtarolo et al. | 80,000 | 18 | 1/4400
1-photon water splitting | Jacobsen et al. | 19,000 | 20 | 1/950
2-photon water splitting | Jacobsen et al. | 19,000 | 12 | 1/1585
Transparent shields | Jacobsen et al. | 19,000 | 8 | 1/2375
Hg adsorbers | Bligaard et al. | 5,581 | 14 | 1/400
HER catalysts | Greeley et al. | 756 | 1 | 1/756*
Li ion battery cathodes | Ceder et al. | 20,000 | 4 | 1/5000*
Entries marked with * have experimentally verified the candidates.
See also: Curtarolo et al., Nature Materials 12 (2013) 191–201.
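The hit rate column is simply the search space divided by the number of candidates, written as "1 in N". A minimal sketch of that arithmetic, using a few rows copied from the table (the helper name `one_in` is hypothetical):

```python
def one_in(search_space, candidates):
    """Return N such that the hit rate is 1/N."""
    return search_space / candidates

# (application, search space, candidates) copied from the table above.
studies = [
    ("Scintillators (Klintenberg et al.)", 22_000, 136),
    ("Thermoelectrics - ICSD", 2_500, 20),
    ("1-photon water splitting", 19_000, 20),
    ("Transparent shields", 19_000, 8),
    ("Li ion battery cathodes", 20_000, 4),
]

for name, search_space, candidates in studies:
    print(f"{name}: about 1/{one_in(search_space, candidates):.0f}")
```

The quoted rates appear to be rounded, so small differences from the exact ratio are expected.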
22. Computations predict, experiments confirm
Sidorenkite-based Li-ion battery cathodes:
Chen, H.; Hao, Q.; Zivkovic, O.; Hautier, G.; Du, L.-S.; Tang,
Y.; Hu, Y.-Y.; Ma, X.; Grey, C. P.; Ceder, G. Sidorenkite
(Na3MnPO4CO3): A New Intercalation Cathode Material
for Na-Ion Batteries, Chem. Mater., 2013
LED phosphors:
Wang, Z., Ha, J., Kim, Y. H., Im, W. Bin, McKittrick, J. &
Ong, S. P. Mining Unexplored Chemistries for Phosphors
for High-Color-Quality White-Light-Emitting Diodes.
Joule 2, 914–926 (2018).
YCuTe2 thermoelectrics:
Aydemir, U; Pohls, J-H; Zhu, H; Hautier, G; Bajaj, S; Gibbs,
ZM; Chen, W; Li, G; Broberg, D; White, MA; Asta, M;
Persson, K; Ceder, G; Jain, A; Snyder, GJ. Thermoelectric
Properties of Intrinsically Doped YCuTe2 with CuTe4-based
Layered Structure. J. Mat. Chem C, 2016
More examples here: A. Jain, Y. Shin, and K. A. Persson, Nat. Rev. Mater. 1, 15004 (2016).
23. With HT-DFT, we can generate data rapidly – what to do next?
• >4500 elastic tensors
M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine,
A. Gamst, M. Sluiter, C. K. Ande, S. Van Der Zwaag, J. J.
Plata, C. Toher, S. Curtarolo, G. Ceder, K. A. Persson, and M.
Asta, Sci. Data, 2015, 2, 150009.
• >900 piezoelectric tensors
M. de Jong, W. Chen, H. Geerlings, M. Asta, and K. A.
Persson, Sci. Data, 2015, 2, 150053.
• >48000 Seebeck coefficients + cRTA transport
Ricci, Chen, Aydemir, Snyder, Rignanese, Jain, & Hautier (in
submission)
24. With HT-DFT, we can generate data rapidly – what to do next?
Goal: make it easy to
generate comparable
data sets on your own
25. A “black-box” view of performing a calculation
researcher: “What is the GGA-PBE elastic tensor of GaAs?” → “something” → Results!
26. Unfortunately, the inside of the “black box”
is usually tedious and “low-level”
researcher: “What is the GGA-PBE elastic tensor of GaAs?” → lots of tedious, low-level work… → Results!
(Input file flags, SLURM format, how to fix ZPOTRF?)
☐ set up the structure coordinates
☐ write input files, double-check all the flags
☐ copy to supercomputer
☐ submit job to queue
☐ deal with supercomputer headaches
☐ monitor job
☐ fix error jobs, resubmit to queue, wait again
☐ repeat process for subsequent calculations in workflow
☐ parse output files to obtain results
☐ copy and organize results, e.g., into Excel
27. What would be a better way?
researcher: “What is the GGA-PBE elastic tensor of GaAs?” → “something” → Results!
28. What would be a better way?
researcher: “What is the GGA-PBE elastic tensor of GaAs?” → Results!
Workflows to run:
☐ band structure
☐ surface energies
✓ elastic tensor
☐ Raman spectrum
☐ QH thermal expansion
29. Ideally the method should scale to millions of calculations
researcher: “Start with all binary oxides, replace O->S, run several different properties” → Results!
Workflows to run:
✓ band structure
✓ surface energies
✓ elastic tensor
☐ Raman spectrum
☐ QH thermal expansion
☐ spin-orbit coupling
30. Atomate tries to make it easy, automatic, and flexible to
generate data with existing simulation packages
researcher: “Run many different properties of many different materials” → Results!
31. Each simulation procedure translates high-level instructions
into a series of low-level tasks
quickly and automatically translate PI-style (minimal)
specifications into well-defined FireWorks workflows
What is the
GGA-PBE elastic
tensor of GaAs?
M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, et al.,
Charting the complete elastic properties of inorganic crystalline compounds,
Sci. Data. 2 (2015).
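The idea of expanding a short, high-level request into an ordered series of low-level tasks can be illustrated with a toy sketch. This is not atomate's or FireWorks' API: the function `expand_request` and the dependency table are hypothetical, and the task names are borrowed from the manual checklist on slide 26. Python's standard-library `graphlib` resolves the execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: each task maps to the tasks that must
# finish before it can start (names taken from the manual checklist).
TASK_DEPENDENCIES = {
    "set up the structure coordinates": set(),
    "write input files": {"set up the structure coordinates"},
    "submit job to queue": {"write input files"},
    "monitor job": {"submit job to queue"},
    "parse output files": {"monitor job"},
    "organize results": {"parse output files"},
}

def expand_request(material, method):
    """Turn a minimal request into an ordered list of low-level steps."""
    order = TopologicalSorter(TASK_DEPENDENCIES).static_order()
    return [f"{task} ({material}, {method})" for task in order]

steps = expand_request("GaAs", "GGA-PBE")
for number, step in enumerate(steps, start=1):
    print(number, step)
```

A real workflow engine would additionally handle branching, error recovery, and resubmission, which this linear sketch leaves out.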
32. Atomate contains a library of simulation procedures
VASP-based
• band structure
• spin-orbit coupling
• hybrid functional calcs
• elastic tensor
• piezoelectric tensor
• Raman spectra
• NEB
• GIBBS method
• QH thermal expansion
• AIMD
• ferroelectric
• surface adsorption
• work functions
Other
• BoltzTraP
• FEFF method
• LAMMPS MD
Mathew, K. et al Atomate: A high-level interface to generate, execute, and analyze
computational materials science workflows, Comput. Mater. Sci. 139 (2017) 140–152.
33. Full operation diagram
structure → workflow → database of all workflows → automatically submit + execute (job 1, job 2, job 3, job 4) → output files + database
34. Atomate thus encodes and standardizes knowledge about
running various kinds of simulations from domain experts
All past and present knowledge, from everyone in the group,
everyone previously in the group, and our collaborators,
about how to run calculations
K. Mathew, J. Montoya, S. Dwaraknath, A. Faghaninia, M. Aykol, S.P. Ong,
B. Bocklund, T. Smidt, H. Tang, I.H. Chu, M. Horton, J. Dagdelen, B. Wood,
Z.K. Liu, J. Neaton, K. Persson, A. Jain
35. Outline
① From quantum mechanics to density functional
theory (DFT)
② “High-throughput” DFT
③ Data mining approaches to materials design
39. Can we build a general optimizer?
• Generalizable forward solver: FireWorks
• Supercomputing power: NERSC
• Statistical optimization: various optimization libraries
(Figure: J. Mueller)
40. Rocketsled: Automatic materials screening that selects
materials to compute AND submits them to supercomputer
screening space of ~20,000 potential
ABX3 perovskite combinations as
water splitting materials –
precomputed in DFT by different group
if a machine learning algorithm was in
charge of picking the next compound
based on past data, how efficient
would it be?
41. Machine learning: the big problem in my view is connecting
data to ML algorithms through features
Lots of data on
complex objects that
you want to interrelate
Clustering, Regression, Feature
extraction, Model-building, etc.
Well developed
data-mining routines that work
only on numbers (ideally ones
with high relevance to your
problem)
Need to transform materials science objects into a set of
physically relevant numerical data (“features” or “descriptors”)
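As a toy illustration of turning a materials object into numbers, the sketch below converts a chemical formula into three composition-based features. It is not matminer code: the functions `parse_formula` and `featurize`, the chosen features, and the four-element lookup table (atomic numbers and Pauling electronegativities) are assumptions made for this example only.

```python
import re

# Small illustrative lookup: symbol -> (atomic number, Pauling electronegativity).
ELEMENT_DATA = {
    "Ga": (31, 1.81),
    "As": (33, 2.18),
    "Fe": (26, 1.83),
    "O": (8, 3.44),
}

def parse_formula(formula):
    """Parse a simple formula such as 'Fe2O3' into {'Fe': 2.0, 'O': 3.0}."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts

def featurize(formula):
    """Return a dict of numerical descriptors for one composition."""
    counts = parse_formula(formula)
    n_total = sum(counts.values())
    fractions = {el: n / n_total for el, n in counts.items()}
    electronegativities = [ELEMENT_DATA[el][1] for el in counts]
    return {
        "mean_atomic_number": sum(f * ELEMENT_DATA[el][0] for el, f in fractions.items()),
        "mean_electronegativity": sum(f * ELEMENT_DATA[el][1] for el, f in fractions.items()),
        "electronegativity_range": max(electronegativities) - min(electronegativities),
    }

print(featurize("GaAs"))
print(featurize("Fe2O3"))
```

The resulting dictionaries can be stacked into a numeric table, which is the form most data-mining routines expect.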
42. Goal of matminer: connect materials data with data mining
algorithms and data visualization libraries
Ward, L. et al. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60–69 (2018).
43. Matminer contains a library of descriptors for various materials science entities
>40 featurizer classes can generate thousands of potential descriptors
feat = EwaldEnergy([options])
y = feat.featurize([input_data])
• compatible with scikit-learn pipelining
• automatically deploy multiprocessing to parallelize over data
• include citations to methodology papers
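The featurizer pattern on this slide (one object that turns an input into numbers, names its outputs, and points to its methodology) can be mimicked in a few lines. The class below is a hypothetical stand-in written for illustration; it is not matminer code, and `StoichiometryFeaturizer` and its three descriptors are assumptions. The `fit`/`transform` pair is what lets an object of this shape act as a step in a scikit-learn style pipeline.

```python
class StoichiometryFeaturizer:
    """Hypothetical featurizer that mimics a featurize/labels/citations interface."""

    def featurize(self, comp):
        # comp: dict mapping element symbol -> amount, e.g. {"Ti": 1, "O": 2}
        total = sum(comp.values())
        fractions = [amount / total for amount in comp.values()]
        return [len(comp), max(fractions), min(fractions)]

    def feature_labels(self):
        # One label per number returned by featurize().
        return ["n_elements", "max_fraction", "min_fraction"]

    def citations(self):
        # A real featurizer would return references to its methodology paper;
        # this toy example has none.
        return []

    def fit(self, X, y=None):
        # Nothing to learn; present so the object can sit in a pipeline.
        return self

    def transform(self, X):
        # Serial loop; this is the place where a library could parallelize.
        return [self.featurize(x) for x in X]


feat = StoichiometryFeaturizer()
rows = feat.fit([]).transform([{"Na": 1, "Cl": 1}, {"Ti": 1, "O": 2}])
labels = feat.feature_labels()
for row in rows:
    print(dict(zip(labels, row)))
```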
44. Interactive Jupyter notebooks demonstrate use cases
https://github.com/hackingmaterials/matminer_examples
Many examples available:
• Retrieving data from various databases
• Predicting bulk / shear modulus
• Predicting formation energies:
– from composition alone
– with Voronoi-based structure features included
– with Coulomb matrix and Orbital Field matrix descriptors (reproducing previous studies in the literature)
• Making interactive visualizations
• Creating an ML pipeline
45. Defining local order parameters for various environments
Use a given local order parameter with a threshold for motif recognition:
If qtet > qthresh, then the motif is a tetrahedron.
Else, it is not (too much) a tetrahedron.
Tetrahedral order parameter, qtet, is defined in [1] (equation image not reproduced).
[1] Zimmermann et al., J. Am. Chem. Soc., 2015, 10.1021/jacs.5b08098
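The threshold rule can be sketched with any tetrahedral order parameter. The example below does not use the definition from [1]; as a stand-in it uses the widely used angular form q = 1 − (3/8) Σ (cos θ_jk + 1/3)², summed over the six pairs of bonds to four neighbours, which equals 1 for an ideal tetrahedron. The threshold value `Q_THRESH = 0.8` and the function names are assumptions chosen for illustration.

```python
import math
from itertools import combinations

Q_THRESH = 0.8  # hypothetical threshold, chosen only for this example


def tetrahedral_order(center, neighbors):
    """Angular tetrahedral order parameter for a site with four neighbours."""
    if len(neighbors) != 4:
        raise ValueError("this definition needs exactly four neighbours")
    units = []
    for n in neighbors:
        v = [a - b for a, b in zip(n, center)]
        norm = math.sqrt(sum(c * c for c in v))
        units.append([c / norm for c in v])
    # cos(theta) between two bonds is the dot product of their unit vectors;
    # an ideal tetrahedron has cos(theta) = -1/3 for all six pairs.
    penalty = sum(
        (sum(a * b for a, b in zip(u, w)) + 1.0 / 3.0) ** 2
        for u, w in combinations(units, 2)
    )
    return 1.0 - 0.375 * penalty


def is_tetrahedron(center, neighbors, q_thresh=Q_THRESH):
    # Motif recognition: compare the order parameter with a threshold.
    return tetrahedral_order(center, neighbors) > q_thresh


origin = (0.0, 0.0, 0.0)
tetra = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
square = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
q_tetra = tetrahedral_order(origin, tetra)
q_square = tetrahedral_order(origin, square)
print(q_tetra, is_tetrahedron(origin, tetra))
print(q_square, is_tetrahedron(origin, square))
```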
46. We have now developed mathematical order parameters for various types of local environments
47. How well do these work?
1. Order parameters clearly distinguish different environments even after thermal distortion
2. Work well in applications (defect site finding, diffusion characterization)
[1] Zimmermann et al., Frontiers in Materials, 2017, doi: 10.3389/fmats.2017.00034
49. Results on MP web site, e.g. for BCC-like structures
https://www.materialsproject.org/materials/mp-91/
Target: W
Similar structures (distance near 0): Cs3Sb, TiGaFeCo, CeMg2Cu
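"Distance near 0" suggests that each structure is summarized by a fixed-length fingerprint and that similar structures are the nearest ones in that space. A minimal sketch of such a ranking follows, assuming a Euclidean distance; the fingerprint vectors and structure names are made-up placeholders, not Materials Project data.

```python
import math


def distance(a, b):
    """Euclidean distance between two fingerprint vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(target, fingerprints):
    """Return (distance, name) pairs sorted from most to least similar."""
    return sorted((distance(target, fp), name) for name, fp in fingerprints.items())


# Placeholder fingerprints, e.g. averaged local order parameters per structure.
target_fp = [1.0, 0.0, 0.0]
library = {
    "structure_a": [0.95, 0.05, 0.0],
    "structure_b": [0.10, 0.80, 0.10],
    "structure_c": [0.90, 0.10, 0.0],
}

ranking = rank_by_similarity(target_fp, library)
for dist, name in ranking:
    print(f"{name}: {dist:.3f}")
```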
50. Text mining: learning from scientific abstracts
“Learning” what a scientific study is about from >2 million materials science abstracts
(Figure: text-mining pipeline)
• Data: Matstract corpus (unlabeled data, data labels)
• Text processing: text cleaning, tokenization
• Feature engineering: POS tag labels, word embeddings (word2vec), hand-crafted features
• Supervised learning: neural network (LSTM), logistic regression, train/test sets
• Output: named entities
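The first stages of such a pipeline (tokenization and hand-crafted features per token) can be sketched with the standard library alone. Everything here is an assumption for illustration: the regular expressions, the feature names, and the example sentence are not taken from the project's code. The formula pattern is deliberately crude and would also match all-caps acronyms.

```python
import re

# Words (optionally joined by "-" or "."), or any single punctuation mark.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:[-.][A-Za-z0-9]+)*|[^\sA-Za-z0-9]")
# Two or more element-like groups, e.g. "TiO2" or "LiFePO4" (crude heuristic).
FORMULA_RE = re.compile(r"^(?:[A-Z][a-z]?\d*(?:\.\d+)?){2,}$")


def tokenize(text):
    """Split raw text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def token_features(token):
    """Hand-crafted features that a token classifier could take as input."""
    return {
        "lower": token.lower(),
        "is_capitalized": token[0].isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "looks_like_formula": bool(FORMULA_RE.match(token)),
    }


sentence = "Thin films of TiO2 were annealed at 500 C."
tokens = tokenize(sentence)
features = [token_features(tok) for tok in tokens]
for tok, feats in zip(tokens, features):
    print(tok, feats)
```

Feature dictionaries like these, together with word embeddings, are the kind of input a supervised tagger would be trained on to label named entities.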
51. Application: a revised materials search engine
Auto-generated summaries of materials based on text mining
53. How to connect different materials properties together?
• The Materials Project today mostly compiles fundamental simulation output data
• Many users don’t really know what to “do” with this data
– e.g., they would be interested in lattice thermal conductivity, but don’t know they can get there using MP data + other models
• How can we decorate the MP database with additional properties and clearly show how we got there?
“Constitutive relations”
54. A materials map / network / atlas
• What we need is a “connected property network”
• This is a graph in which nodes are materials properties and edges are relationships between those properties
• Given a set of known properties, e.g. simulation data, one can easily figure out all the derived engineering properties one can get
Starting with the three properties in blue, one can derive many additional properties using one (orange), two (purple), or three (green) physical models.
The value of computations is not only in the direct simulation outputs!
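A property network of this kind can be sketched as a list of models, each mapping known properties to one new property, applied repeatedly until nothing new can be derived. The code below is an illustration only, not propnet's API: the `MODELS` table, the property names, and the input values are assumptions. The relations themselves are textbook isotropic elasticity and sound-velocity formulas (SI units).

```python
import math

# Each model: (input property names, output property name, function).
MODELS = [
    (("K", "G"), "E", lambda K, G: 9 * K * G / (3 * K + G)),  # Young's modulus
    (("K", "G"), "nu", lambda K, G: (3 * K - 2 * G) / (2 * (3 * K + G))),  # Poisson
    (("G", "rho"), "v_s", lambda G, rho: math.sqrt(G / rho)),  # shear wave speed
    (("K", "G", "rho"), "v_l",
     lambda K, G, rho: math.sqrt((K + 4 * G / 3) / rho)),  # longitudinal speed
    (("v_s", "v_l"), "v_m",
     lambda v_s, v_l: ((2 / v_s**3 + 1 / v_l**3) / 3) ** (-1 / 3)),  # mean speed
]


def derive_all(known):
    """Apply models until no new property can be derived (forward chaining)."""
    props = dict(known)
    changed = True
    while changed:
        changed = False
        for inputs, output, fn in MODELS:
            if output not in props and all(name in props for name in inputs):
                props[output] = fn(*(props[name] for name in inputs))
                changed = True
    return props


# Made-up starting values: bulk modulus (Pa), shear modulus (Pa), density (kg/m^3).
start = {"K": 160e9, "G": 80e9, "rho": 7800.0}
derived = derive_all(start)
for name in sorted(derived):
    print(name, derived[name])
```

Here `v_m` is a two-step property: it only becomes reachable after `v_s` and `v_l` have been derived, which is the point of treating the models as edges in a graph.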
56. We are now feeding the “atlas” into the Materials Project to derive new data
57. Conclusions
• High-throughput density functional theory, materials databases, and machine learning are a new set of tools for doing materials science
• We are developing many methods and software implementations to try to advance the field
• If you are interested, give the software a try!
(Figure: quantum mechanics, density functional theory, high-throughput DFT, materials databases, machine learning)
58. Thank you!
• Atomate
– K. Mathew (project lead) & team
• Structure order parameters
– N. Zimmermann (project lead) & team
• Rocketsled
– A. Dunn, J. Brenneck
• Matminer
– L. Ward (project lead, U. Chicago) & team
• Text mining
– V. Tshitoyan, J. Dagdelen, L. Weston
• Propnet
– M. K. Horton, D. Mrdjenovich
• All who provided feedback and contributed code to open-source software efforts!
• Funding: DOE-BES (Early Career + Materials Project Center)
• Computing: NERSC