Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
materiaIs
virtuaLab
Creating It from Bit
Designing Materials by Integrating
Quantum Mechanics,Informatics and
Computer Sci...
Electronic structure calculations are today
reliable and reasonably accurate.
February 23, 2017
tials in Quantum ESPRESSO)...
February 23, 2017
• What are the tools necessary for automation
of electronic structure calculations?Automation
• What is ...
February 23, 2017
• What are the tools necessary for automation
of electronic structure calculations?Automation
• What is ...
Our automated future
February 23, 2017
X
Write reusable
software
frameworks
57th Sanibel Symposium
“User requirements” for electronic structure
calculation automation
February 23, 2017
ˆHψ = ˆT + ˆV + ˆU⎡
⎣
⎤
⎦ψ = Eψ
???
...
Software frameworks for HT electronic structure
computations
February 23, 2017
Atomic Simulation Environment
https://wiki....
Extensive Materials Analysis Capabilities
Input/O
utput
objects
(Modular, Reusable, Extendable)
Defects and Transformation...
February 23, 2017
Global network of users
Some recent additions
• Dielectric constants
• Elastic constants and
phonons
• X...
Custodian
Simple, robust and flexible just-in-time (JIT)
job management framework.
• Wrappers to perform error checking, j...
FireWorks is the Workflow Manager
Custom material
A cool material !!
Lots of information about
cool material !!
Submit!
In...
FireWorks as a platform
Community can write any
workflow in FireWorks
à
We can automate it over
most supercomputing
resour...
February 23, 2017
• What are the tools necessary for automation
of electronic structure calculations?Automation
• What is ...
February 23, 2017 57th Sanibel Symposium
With great automation comes great quantities of
data…
February 23, 2017
Jain, et al. ,APL Mater., 2013, 1, 11002.
57th San...
User-friendly web interface
(but unfriendly for advanced
users requiring large quantities
of data!)
Materials Explorer
Bat...
Materials
Project DB
How do I
access MP
data?
WebApps
RESTfulAPI
The MaterialsAPI
• Provides programmatic access to large ...
A modern data model for high-throughput
computational research groups
February 23, 2017
Materials
Project
Large, open
elec...
February 23, 2017
• What are the tools necessary for automation
of electronic structure calculations?Automation
• What is ...
Applications
February 23, 2017 57th Sanibel Symposium
• Rapidly explore vast chemical spaces
• Exclude bad candidates usin...
Next GenerationAll-solid-state Batteries
February 23, 2017 57th Sanibel Symposium
NMC Li metalSolid electrolyte
~
Ω
Li+
So...
Predicting non-dilute diffusion properties
with first principles Lithium motion in Li10GeP2S12
(sulfur tetrahedra frozen f...
Completely automated ab initio molecular
dynamics
February 23, 2017 57th Sanibel Symposium
Dynamically
add
continuation
AI...
Li3OClxBr1-x
pristine-Na3PS4
Li4P2S6
Li3OClxBr1-x
doped-Na3PS4
Li10GeP2S12
Li10Si1.5P1.5S11.5Cl0.5
Li10SnP2S12
Li10SiP2S12...
Many Li superionic conductors haveAg analogues
February 23, 2017 57th Sanibel Symposium
Li7P3S11Ag7P3S11
a
c
Yamane et al....
February 23, 2017 57th Sanibel Symposium
Phase stability
Ehull  30 meV/atom
Promising
candidates
Li-P-S and
Li-M-P-S
compo...
ConvergedAIMD simulations confirm 3D
Superionic Conductivities
February 23, 2017 57th Sanibel Symposium
Zhu et al. Chem. M...
Li3Y(PS4)2 likely to exhibit better electrochemical
stability than current state-of-the-art
February 23, 2017 57th Sanibel...
Crystalium –World’s Largest Database of Surface
Energies andWulff Shapes
February 23, 2017 57th Sanibel Symposium
Generati...
Insights from Large Materials Datasets
February 23, 2017 57th Sanibel Symposium
DFT does surprisingly well in terms of sur...
Building software infrastructure enables new
capabilities
February 23, 2017 57th Sanibel Symposium
Tran, R.; Xu, Z.; Zhou,...
Learning new insights into dopant effect on GB
strength
February 23, 2017 57th Sanibel Symposium
or the 29 dopants in the ...
materiaIs
virtuaLab
Future
challenges
February 23, 2017
57th Sanibel Symposium
Challenge 1:Software development is typically not
viewed as“science”
Most research group software are:
a. Badly coded
b. P...
Challenge 2:DataAPIs are not that common
• Many materials data repository projects still Web 1.0.
• Development of API for...
Challenge 3:There are still problems too“large”
for first principles
February 23, 2017 57th Sanibel Symposium
CostScale
Tr...
“It from bit.”
- JohnArchibaldWheeler
Materials
Data57th Sanibel Symposium
Acknowledgements
February 23, 2017 57th Sanibel Symposium
Ong group
Iek-Heng Chu, Balachandran Radhakrishnan,
Zhuoying Zhu...
materiaIs
virtuaLab
Thank you.
February 23, 2017
57th Sanibel Symposium
Upcoming SlideShare
Loading in …5
×

Creating It from Bit - Designing Materials by Integrating Quantum Mechanics, Informatics and Computer Science

1,354 views

Published on

This is the plenary talk given by Prof Shyue Ping Ong at the 57th Sanibel Symposium held on St Simon's Island in Georgia, USA.

Abstract: Powered by methodological breakthroughs and computing advances, electronic structure methods have today become an indispensable toolkit in the materials designer’s arsenal. In this talk, I will discuss two emerging trends that holds the promise to continue to push the envelope in computational design of materials. The first trend is the development of robust software and data frameworks for the automatic generation, storage and analysis of materials data sets. The second is the advent of reliable central materials data repositories, such as the Materials Project, which provides the research community with efficient access to large quantities of property information that can be mined for trends or new materials. I will show how we have leveraged on these new tools to accelerate discovery and design in energy and structural materials as well as our efforts in contributing back to the community through further tool or data development. I will also provide my perspective on future challenges in high-throughput computational materials design.

Published in: Science
  • Be the first to comment

Creating It from Bit - Designing Materials by Integrating Quantum Mechanics, Informatics and Computer Science

  1. 1. materiaIs virtuaLab Creating It from Bit Designing Materials by Integrating Quantum Mechanics,Informatics and Computer Science Shyue Ping Ong February 23, 2017 57th Sanibel Symposium
  2. 2. Electronic structure calculations are today reliable and reasonably accurate. February 23, 2017 tials in Quantum ESPRESSO). In this case, too, the small D values indicate a good agreement between codes. This agreementmoreoverencom- passes varying degrees of numerical convergence, differences in the numerical implementation of the particular potentials, and computational dif- ferences beyond the pseudization scheme, most of which are expected to be of the same order of magnitude or smaller than the differences among all-electron codes (1 meV per atom at most). Conclusions and outlook Solid-state DFT codes have evolved considerably. The change from small and personalized codes to widespread general-purpose packages has pushed developers to aim for the best possible precision. Whereas past DFT-PBE literature on the lattice parameter of silicon indicated a spread of 0.05 Å, the most recent versions of the implementations discussed here agree on this value within 0.01 Å (Fig. 1 and tables S3 to S42). By comparing codes on a more detailed level using the D gauge, we have found the most recent methods to yield nearly indistinguishable EOS, with the associ- ated error bar comparable to that between dif- ferent high-precision experiments. This underpins thevalidityof recentDFTEOSresults andconfirms that correctly converged calculations yield reliable predictions. The implications are moreover rele- vant throughout the multidisciplinary set of fields that build upon DFT results, ranging from the physical to the biological sciences. In spite of the absence of one absolute refer- ence code, we were able to improve and demon- strate the reproducibility of DFT results by means of a pairwise comparison of a wide range of codes and methods. It is now possible to verify whether any newly developed methodology can reach the same precision described here, and new DFT applications can be shown to have used a meth- od and/or potentials that were screened in this way. The data generated in this study serve as a crucial enabler for such a reproducibility-driven paradigm shift, and future updates of available D values will be presented at http://molmod. ugent.be/deltacodesdft. The reproducibility of reported results also provides a sound basis for further improvement to the accuracy of DFT, particularly in the investigation of new DFT func- tionals, or for the development of new computa- tional approaches. This work might therefore Fig. 4. D values for comparisons between the most important DFT methods considered (in millielectron volts per atom). Shown are comparisons of all-electron (AE), PAW, ultrasoft (USPP), and norm-conserving pseudopotential (NCPP) results with all-electron results (methods are listed in alpha- betical order in each category). The labels for each method stand for code, code/specification (AE), or potential set/code (PAW, USPP, and NCPP) and are explained in full in tables S3 to S42.The color coding RESEARCH | RESEARCH ARTICLE onFebruary19,2017http://science.sciencemag.org/Downloadedfrom Lejaeghere et al. Science, 2016, 351 (6280), aad3000. Nitrides are an important class of optoel ported synthesizability of highly metasta nitrogen precursors (36, 37) suggests th spectrum of promising and technologica trides awaiting discovery. Although our study focuses on the m crystals, polymorphism and metastability is of great technological relevance to pha tronics, and protein folding (7). Our obs energy to metastability could address a d in organic molecular solids: Why do man numerous polymorphs within a small (~ whereas inorganic solids often see >100°C morph transition temperatures? The wea molecular solids yield cohesive energies o or −1 eV per molecule, about a third of t class of inorganic solids (iodides; Fig. 2B). yields a correspondingly small energy scal (38). When this small energy scale of orga is coupled with the rich structural diversity a tional degrees of freedom during molecular leads to a wide range of accessible polymorp modynamic conditions. Influence of composition The space of metastable compounds hov scape of equilibrium phases. As chemica thermodynamic system, the complexity grows. Figure 2A shows an example ca for the ternary Fe-Al-O system, plotted a tion energies referenced to the elemental S1.2 for discussion). We anticipate the th of a phase to be different when it is compe S C I E N C E A D V A N C E S | R E S E A R C H A R T I C L E HAUTIER, ONG, JAIN, MOORE, AND CEDER PHYSICAL REVIEW B 85, 155208 (2012) or meV/atom); 10 meV/atom corresponds to about 1 kJ/mol- atom. III. RESULTS Figure 2 plots the experimental reaction energies as a function of the computed reaction energies. All reactions involve binary oxides to ternary oxides and have been chosen as presented in Sec. II. The error bars indicate the experimental error on the reaction energy. The data points follow roughly the diagonal and no computed reaction energy deviates from the experimental data by more than 150 meV/atom. Figure 2 does not show any systematic increase in the DFT error with larger reaction energies. This justifies our focus in this study on absolute and not relative errors. In Fig. 3, we plot a histogram of the difference between the DFT and experimental reaction energies. GGA + U un- derestimates and overestimates the energy of reaction with the same frequency, and the mean difference between computed and experimental energies is 9.6 meV/atom. The root-mean- square (rms) deviation of the computed energies with respect to experiments is 34.4 meV/atom. Both the mean and rms are very different from the results obtained by Lany on reaction energies from the elements.52 Using pure GGA, Lany found that elemental formation energies are underestimated by GGA with a much larger rms of 240 meV/atom. Our results are closer to experiments because of the greater accuracy of DFT when comparing chemically similar compounds such as binary and ternary oxides due to errors cancellation.40 We should note that even using elemental energies that are fitted to minimize the error versus experiment in a large set of reactions, Lany reports that the error is still 70 meV/atom and much larger than what we find for the relevant reaction energies. The rms we found is consistent with the error of 3 kJ/mol-atom 600 800 l V/at) FIG. 3. (Color online) Histogram of the difference between computed ( E comp 0 K ) and experimental ( E expt 0 K ) energies of reaction (in meV/atom). (30 meV/atom) for reaction energies from the binaries in the limited set of perovskites reported by Martinez et al.29 Very often, instead of the exact reaction energy, one is interested in knowing if a ternary compound is stable enough to form with respect to the binaries. This is typically the case when a new ternary oxide phase is proposed and tested for stability versus the competing binary phases.18 From the 131 compounds for which reaction energies are negative according to experiments, all but two (Al2SiO5 and CeAlO3) are also negative according to computations. This success in predicting stability versus binary oxides of known ternary oxides can be related to the very large magnitude of reaction energies from binary to ternary oxides compared to the typical errors observed (rms of 34 meV/atom). Indeed, for the vast majority of the reactions (109 among 131), the experimental reaction en- ergies are larger than 50 meV/atom. It is unlikely then that the DFT error would be large enough to offset this large reaction energy and make a stable compound unstable versus the binary oxides. The histogram in Fig. 3 shows several reaction energies with significant errors. Failures and successes of DFT are often JSON document in the format of a Crystallographic Information File (cif), which can also be downloaded via the Materials Project website and Crystalium web application. In addition, the weighted surface energy (equation (2)), shape factor (equation (3)), and surface anisotropy (equation (4)) are given. Table 2 provides a full description of all properties available in each entry as well as their corresponding JSON key. Technical Validation The data was validated through an extensive comparison with surface energies from experiments and other DFT studies in the literature. Due to limitations in the available literature, only the data on ground state phases were compared. Comparison to experimental measurements Experimental determination of surface energy typically involves measuring the liquid surface tension and solid-liquid interfacial energy of the material20 to estimate the solid surface energy at the melting temperature, which is then extrapolated to 0 K under isotropic approximations. Surface energies for individual crystal facets are rarely available experimentally. Figure 5 compares the weighted surface energies of all crystals (equation (2)) to experimental values in the literature20,23,26–28 . It should be noted that we have adopted the latest experimental values available for comparison, i.e., values were obtained from the 2016 review by Mills et al.27 , followed by Keene28 , and finally Niessen et al.26 and Miller and Tyson20 . A one-factor linear regression line γDFT ¼ γEXP þ c was fitted for the data points. The choice of the one factor fit is motivated by the fact that standard broken bond models show that there is a direct relationship between surface energies and cohesive energies, and previous studies have found no evidence that DFT errors in the cohesive energy scale with the magnitude of the cohesive energy itself61 . We find that the DFT weighted surface energies are in excellent agreement with experimental values, with an average underestimation of only 0.01 J m− 2 and a standard error of the estimate (SEE) of 0.27 J m− 2 . The Pearson correlation coefficient r is 0.966. Crystals with surfaces that are well-known to undergo significant reconstruction tend to have errors in weighted surface energies that are larger than the SEE. The differences between the calculated and experimental surface energies can be attributed to three main factors. First, there are uncertainties in the experimental surface energies. The experimental values derived by Miller and Tyson20 are extrapolations from extreme temperatures beyond the melting point. The surface energy of Ge, Si62 , Te63 , and Se64 were determined at 77, 77, 432 and 313 K respectively while Figure 5. Comparison to experimental surface energies. Plot of experimental versus calculated weighted surface energies for ground-state elemental crystals. Structures known to reconstruct have blue data points while square data points correspond to non-metals. Points that are within the standard error of the estimate − 2 Phase stability Formation energies Tran, et al. Sci. Data 2016, 3, 160080. Sun, et al. Sci. Adv. 2016, 2 (11), e1600225. Figure 2. Distribution of calculated volume per atom, Poisson ratio, bulk modulus and shear modulus. Vector field-plot showing the distribution of the bulk and shear modulus, Poisson ratio and atomic volume for 1,181 metals, compounds and non-metals. Arrows pointing at 12 o’clock correspond to minimum volume-per-atom and move anti-clockwise in the direction of maximum volume-per-atom, which is located at 6 o’clock. Bar plots indicate the distribution of materials in terms of their shear and bulk moduli. www.nature.com/sdata/ Surface energies Elastic constants de Jong et al. Sci. Data 2015, 2, 150009. Hautier et al. Phys. Rev. B 2012, 85, 155208. 57th Sanibel Symposium Modern electronic structure codes give relatively consistent equations of state. Of course, challenges remain …
  3. 3. February 23, 2017 • What are the tools necessary for automation of electronic structure calculations?Automation • What is a model for open-vs-private data?Data • How and what can we learn from large quantities of materials data? • Can we really do in silico design of materials? Learning & Design Reliability + Reasonable Accuracy 57th Sanibel Symposium
  4. 4. February 23, 2017 • What are the tools necessary for automation of electronic structure calculations?Automation • What is a model for open-vs-private data?Data • How and what can we learn from large quantities of materials data? • Can we really do in silico design of materials? Learning & Design Reliability + Reasonable Accuracy 57th Sanibel Symposium
  5. 5. Our automated future February 23, 2017 X Write reusable software frameworks 57th Sanibel Symposium
  6. 6. “User requirements” for electronic structure calculation automation February 23, 2017 ˆHψ = ˆT + ˆV + ˆU⎡ ⎣ ⎤ ⎦ψ = Eψ ??? Need #1: Robust materials analysis software to “talk” between application, computable property and electronic structure calculations Need #2: Error detection and correction Need #3: Scientific workflow management 57th Sanibel Symposium
  7. 7. Software frameworks for HT electronic structure computations February 23, 2017 Atomic Simulation Environment https://wiki.fysik.dtu.dk/ase Materials Project1 https://www.materialsproject.org Custodianhttp://aflowlib.org http://www.aiida.net 1 Jain et al. APL Mater. 2013, 1 (1), 11002. 2 Ong et al. Comput. Mater. Sci. 2013, 68, 314–319. 3 Jain et al. Concurr. Comput. Pract. Exp. 2015, 27 (17), 5037–5059. 2 3 57th Sanibel Symposium
  8. 8. Extensive Materials Analysis Capabilities Input/O utput objects (Modular, Reusable, Extendable) Defects and TransformationsElectronic Structure XRD Patterns Phase and Pourbaix Diagrams Functional properties Comprehensively documented Continuously tested and integrated Active dev/user community Ong et al. Comput. Mater. Sci. 2013, 68, 314–319. February 23, 2017 57th Sanibel Symposium
  9. 9. February 23, 2017 Global network of users Some recent additions • Dielectric constants • Elastic constants and phonons • X-ray Absorption Spectroscopy Number of visits on pymatgen.org Very active developer community! 57th Sanibel Symposium
  10. 10. Custodian Simple, robust and flexible just-in-time (JIT) job management framework. • Wrappers to perform error checking, job management and error recovery. • Error recovery is an important aspect for HT: O(100,000) jobs + 1% error rate => O(1000) errored jobs. • Existing sub-packages for error handling forVASP, NwChem and QChem calculations. February 23, 2017 57th Sanibel Symposium
  11. 11. FireWorks is the Workflow Manager Custom material A cool material !! Lots of information about cool material !! Submit! Input generation (parameter choice) Workflow mapping Supercomputer submission / monitoring Error handling File Transfer File Parsing / DB insertion February 23, 2017 57th Sanibel Symposium
  12. 12. FireWorks as a platform Community can write any workflow in FireWorks à We can automate it over most supercomputing resources structure charge Band structure DOS Optical phonons XAFS spectra GW February 23, 2017 57th Sanibel Symposium
  13. 13. February 23, 2017 • What are the tools necessary for automation of electronic structure calculations?Automation • What is a model for open-vs-private data?Data • How and what can we learn from large quantities of materials data? • Can we really do in silico design of materials? Learning & Design Reliability + Reasonable Accuracy 57th Sanibel Symposium
  14. 14. February 23, 2017 57th Sanibel Symposium
  15. 15. With great automation comes great quantities of data… February 23, 2017 Jain, et al. ,APL Mater., 2013, 1, 11002. 57th Sanibel Symposium
  16. 16. User-friendly web interface (but unfriendly for advanced users requiring large quantities of data!) Materials Explorer Battery Explorer CrystalToolkit Structure Predictor Phase Diagram App Pourbaix Diagram App Reaction Calculator February 23, 2017 57th Sanibel Symposium Structure Electronic Structure Elastic properties XRD Energetic properties Jain, et al. ,APL Mater., 2013, 1, 11002.
  17. 17. Materials Project DB How do I access MP data? WebApps RESTfulAPI The MaterialsAPI • Provides programmatic access to large quantities of data. • Data can be used for analysis or for learning. Ong et al. Comput. Mater. Sci. 2015, 97, 209–215. February 23, 2017 57th Sanibel Symposium
  18. 18. A modern data model for high-throughput computational research groups February 23, 2017 Materials Project Large, open electronic structure databases AFLOW OQMD APIs Private “small” databases for individual research groups 57th Sanibel Symposium
  19. 19. February 23, 2017 • What are the tools necessary for automation of electronic structure calculations?Automation • What is a model for open-vs-private data?Data • How and what can we learn from large quantities of materials data? • Can we really do in silico design of materials? Learning & Design Reliability + Reasonable Accuracy 57th Sanibel Symposium
  20. 20. Applications February 23, 2017 57th Sanibel Symposium • Rapidly explore vast chemical spaces • Exclude bad candidates using a minimum of computational resources • Multi-property optimization • Identify “best” candidate(s) Screening • Analyze large data sets • Identify trends • Obtain new physics/chemistry insights Learning
  21. 21. Next GenerationAll-solid-state Batteries February 23, 2017 57th Sanibel Symposium NMC Li metalSolid electrolyte ~ Ω Li+ Solid electrolyte (SE) is non-flammable and can be stacked for higher system energy densities. Can potentially enable high voltage cathodes like NMC Can potentially enable Li metal anode, with significantly higher energies densities Design Requirements for SE q Extremely high (“super”) ionic conductivity > 0.1 mS/cm q Low electronic conductivity q Phase stability q Good electrochemical stability with electrodes (intrinsic or passivation) q Mechanical compatibility
  22. 22. Predicting non-dilute diffusion properties with first principles Lithium motion in Li10GeP2S12 (sulfur tetrahedra frozen for clarity) Eψ(r) = − h 2 2m ∇2 ψ(r)+V(r)ψ(r) Quantum mechanics F = ma Newton’s laws of motion Ab initio molecular dynamics or AIMD D = 1 2Ndt ri (t +t0 )2 −ri (t)2 i ∑ • Computational and human-time intensive • Huge quantities data need to be stored and analyzed Typical AIMD simulation: 50,000-100,000 time steps of 2 fs at 4-6 temperatures (400,000 structures, 2-3 weeks on cluster) February 23, 2017 57th Sanibel Symposium
  23. 23. Completely automated ab initio molecular dynamics February 23, 2017 57th Sanibel Symposium Dynamically add continuation AIMD job starting from previous one Dynamically add initial AIMD jobs running at different temperatures Converged? Converged? Converged? AIMD simulation AIMD simulation AIMD simulation Setup simulation box Initial relaxation Start End N N N Y Y Y Deng, Z.; Zhu, Z.; Chu, I.-H.; Ong, S. P. Chem. Mater. 2017, 29 (1), 281–288. In-house system built entirely on pymatgen, custodian and fireworks!
  24. 24. Li3OClxBr1-x pristine-Na3PS4 Li4P2S6 Li3OClxBr1-x doped-Na3PS4 Li10GeP2S12 Li10Si1.5P1.5S11.5Cl0.5 Li10SnP2S12 Li10SiP2S12 pristine-Na3PS4 Li4P2S6 Li7P3S11 Li3OClxBr1-x doped-Na3PS4 Li10GeP2S12 Li10Si1.5P1.5S11.5Cl0.5 Li10SnP2S12 Li10SiP2S12 pristine-Na3PS4 Li4P2S6 Li15P4S16Cl3 LiZnPS4 LiAl(PS3)2 MSD800K=5 Li7P3S11 Li5PS4Cl2 Li3Y(PS4)2 MSD 1200K = 7 MSD 800K Superionic conductor region Can we screen with short (~50 ps)AIMD simulations? February 23, 2017 57th Sanibel Symposium Zhu et al. Chem. Mater. 2017, acs.chemmater.6b04049. Establish minimum level of diffusivity
  25. 25. Many Li superionic conductors haveAg analogues February 23, 2017 57th Sanibel Symposium Li7P3S11Ag7P3S11 a c Yamane et al., Solid State Ionics 2007, 178 (15–18), 1163–1167. Hautier et al., Inorg. Chem., 2010, 656–663. Data-mined ionic substitution probabilities pnðX, X0 Þ e P i λi f ðnÞ i ðX,X0 Þ Z ð5Þ The λi indicate the weight given to the feature fi (n) (X, X0 ) in the probabilistic model. Z is a partition function ensuring the normalization of the probability func- tion. The exponential form chosen in eq 5 follows a commonly used convention in the machine learning community.25 2.3. Binary Feature Model. A first assumption we make is to consider that the feature functions do not depend on the number n of ions in the compound. Simply put, we assume that the ionic substitution rules are independent of the compound’s number of components (binary, ternary, quaternary, ...). Therefore, we will omit any reference to n in the probability and feature functions. Equation 5 becomes pðX, X0 Þ e P i λi fiðX,X0 Þ Z ð6Þ While the feature functions could be more complex, only simple binary substitutions are considered in this paper. This means that the likelihood for two ions to substitute to each other is independent of the nature of the other ionic species present in the compound. Mathe- matically, this translates in assuming that the relevant feature functions are simple binary features of the form: 0 ( well estab still need from the structure From structura comparis BaTiO3 b and Ba o ematical variables O2- ). We designing (X,X0 ) by ture data (x, x0 )t w D ¼ fðX Comin probabili database The ana substitut Using we follow proach to available community.25 2.3. Binary Feature Model. A first assumption we make is to consider that the feature functions do not depend on the number n of ions in the compound. Simply put, we assume that the ionic substitution rules are independent of the compound’s number of components (binary, ternary, quaternary, ...). Therefore, we will omit any reference to n in the probability and feature functions. Equation 5 becomes pðX, X0 Þ e P i λi fiðX,X0 Þ Z ð6Þ While the feature functions could be more complex, only simple binary substitutions are considered in this paper. This means that the likelihood for two ions to substitute to each other is independent of the nature of the other ionic species present in the compound. Mathe- matically, this translates in assuming that the relevant feature functions are simple binary features of the form: f a,b k ðX, X0 Þ ¼ 1 Xk ¼ a and Xk 0 ¼ b 0 else ( ð7Þ Each pair of ions a and b present in the domain Ω is assigned a set of feature functions with corresponding weights λk a,b indicating how likely the ions a and b can substitute in position k. For instance, one of the feature 2þ 2þ
  26. 26. February 23, 2017 57th Sanibel Symposium Phase stability Ehull 30 meV/atom Promising candidates Li-P-S and Li-M-P-S compounds Topological analysis rc 1.75 Å Short AIMD estimation MSD800K 5 Å2 MSD1200K/MSD800K 7 Long AIMD at multiple temperatures σ300K 1 mS/cm ICSD Diffusivity screening Initial candidates Ag-P-S and Ag-M-P-S compounds Substitute Ag for Li Dopant and composition optimization Li3Y(PS4)2 Li5PS4Cl2 Ehull = 2 meV/atom rc = 1.88 Å MSD800K = 65.1 Å2 MSD ratio = 4.5 Ehull = 17 meV/atom rc = 1.76 Å MSD800K = 77.9 Å2 MSD ratio = 3.1 Parent:Ag3Y(PS4)2 (ICSD 417658) Parent: Li5PS4Cl2 (ICSD 416587) Zhu et al. Chem. Mater. 2017, acs.chemmater.6b04049. Provisional patent filed A B Ehull A0.5B0.5
  27. 27. ConvergedAIMD simulations confirm 3D Superionic Conductivities February 23, 2017 57th Sanibel Symposium Zhu et al. Chem. Mater. 2017, acs.chemmater.6b04049. σ300K = 2.16± 1 mS/cm Ea = 278 meV σ300K = 1.78± 1 mS/cm Ea = 304 meV Ca-doped σ300K = 7.14 ± 3 mS/cm Zr-doped σ300K = 5.25 ± 1 mS/cm
  28. 28. Li3Y(PS4)2 likely to exhibit better electrochemical stability than current state-of-the-art February 23, 2017 57th Sanibel Symposium Zhu, Z.; et al., Chem. Mater., 2016,Accepted Kato, et al., Nat. Energy, 2016, 1, 16030.
  29. 29. Crystalium –World’s Largest Database of Surface Energies andWulff Shapes February 23, 2017 57th Sanibel Symposium Generation of OUCs up to max Miller index Input bulk crystal structure Relaxation calculation of OUC (hkl) Termination 1 calculation Termination 2 calculation … Surface calculations (h2k2l2) … Surface calculations (h1k1l1) Calculations completed? Parameter adjustment No Yes Surface Database Surface energy and Wulff shape calculations Materials Project Dryad Repository http://crystalium.materialsvirtuallab.orgTran, R.; Xu, Z.; Radhakrishnan, B.;Winston, D.; Sun,W.; Persson, K.A.; Ong, S. P., Sci. Data, 2016, 3, 160080.
  30. 30. Insights from Large Materials Datasets February 23, 2017 57th Sanibel Symposium DFT does surprisingly well in terms of surface energies, contrary to popular perception Trends in energies between reconstructed and unreconstructed surfaces can be reproduced. Fcc(110) “missing row” Expt. known to reconstruct! Tran, R.; Xu, Z.; Radhakrishnan, B.;Winston, D.; Sun,W.; Persson, K.A.; Ong, S. P., Sci. Data, 2016, 3, 160080. SEE = 0.27 J m−2
  31. 31. Building software infrastructure enables new capabilities February 23, 2017 57th Sanibel Symposium Tran, R.; Xu, Z.; Zhou, N.; Radhakrishnan, B.; Luo, J.; Ong, S. P.,Acta Mater., 2016, 117, 91–99, doi:10.1016/j.actamat.2016.07.005. Re, Os,Ta and W are strengthening dopants for Mo alloys.
  32. 32. Learning new insights into dopant effect on GB strength February 23, 2017 57th Sanibel Symposium or the 29 dopants in the S5(310) tilt GB. (a) based on lowest energy dopant site in GB and free surface region (positive E X seg) prefer to stay in the bulk. For dopants that segregate, those with negative E X SE (bl colour in this figure legend, the reader is referred to the web version of this article.) Classic bond- breaking theory gives 0.33 Strain effect due to dopant-host radius mismatch
  33. 33. materiaIs virtuaLab Future challenges February 23, 2017 57th Sanibel Symposium
  34. 34. Challenge 1:Software development is typically not viewed as“science” Most research group software are: a. Badly coded b. Poorly documented c. Not available to the community d. Does not last beyond the current PhD/postdoc e. All of the above February 23, 2017 57th Sanibel Symposium https://xkcd.com/292/
  35. 35. Challenge 2:DataAPIs are not that common • Many materials data repository projects still Web 1.0. • Development of API for programmatic materials data access not part of distribution strategy. • APIs need complementary software support. February 23, 2017 57th Sanibel Symposium The Materials Application Programming Interface (API): A simple, flexible and efficient API for materials data based on REpresentational State Transfer (REST) principles Shyue Ping Ong a,⇑ , Shreyas Cholia b , Anubhav Jain b , Miriam Brafman b , Dan Gunter b , Gerbrand Ceder c , Kristin A. Persson b a Department of NanoEngineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0448, La Jolla, CA 92093, USA b Lawrence Berkeley National Lab, 1 Cyclotron Rd, Berkeley, CA 94720, USA c Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA a r t i c l e i n f o Article history: Received 18 August 2014 Accepted 18 October 2014 Keywords: Materials Project Application Programming Interface High-throughput Materials genome Rest Representational state transfer a b s t r a c t In this paper, we describe the Materials Application Programming Interface (API), a simple, flexible and efficient interface to programmatically query and interact with the Materials Project database based on the REpresentational State Transfer (REST) pattern for the web. Since its creation in Aug 2012, the Materials API has been the Materials Project’s de facto platform for data access, supporting not only the Materials Project’s many collaborative efforts but also enabling new applications and analyses. We will highlight some of these analyses enabled by the Materials API, particularly those requiring consolidation of data on a large number of materials, such as data mining of structural and property trends, and generation of phase diagrams. We will conclude with a discussion of the role of the API in building a community that is developing novel applications and analyses based on Materials Project data. Ó 2014 Elsevier B.V. All rights reserved. 1. Introduction First principles methods are today a critical tool in the study and design of materials. Starting from the fundamental laws of physics with minimal assumptions and approximations, first prin- ciples techniques can access a wide range of chemistries in a rela- tively agnostic manner, making them especially powerful in materials investigations or design problems spanning diverse chemical spaces. In the past decade, electronic structure calculation codes [1–4] have reached a level of maturity that it is now possible to reliably automate and scale first principles calculations across any number of compounds. Coupled with computing advances, this develop- ment has led to the advent of high throughput (HT) first principles calculations as an investigative and design tool in materials science. Even today, there are already several examples of HT first principles computation-guided materials design efforts in hydrogen production [10], topological insulators [11], and organic semiconductors [12], with many of these efforts resulting in the discovery of novel materials that have already been synthesized and verified experimentally. This HT capability has also spurred the development of large databases of computed data on materials, such as the Materials Project [13], the AFLOWLIB library [14] and the Harvard Clean Energy Project [12]. In particular, the Materials Project [13], created by the authors of this paper, has led the charge of combining a large database of materials properties with a diverse and growing set of online anal- ysis and comprehensive open source software tools [15–17]. The Materials Project’s database today contains computed energetic properties for over 59,000 crystal structures along with over 25,000 electronic structure properties. More structures and prop- erties (e.g., elastic constants, dielectric constants, etc.) are being added on a daily basis. A series of web applications provide users with the capability to perform advanced searches and common Computational Materials Science 97 (2015) 209–215 Contents lists available at ScienceDirect Computational Materials Science journal homepage: www.elsevier.com/locate/commatsci Ong et al. Comput. Mater. Sci. 2015, 97, 209–215. A RESTful API for exchanging materials data in the AFLOWLIB.org consortium Richard H. Taylor a,b , Frisco Rose b , Cormac Toher b , Ohad Levy b,1 , Kesong Yang c , Marco Buongiorno Nardelli d,e , Stefano Curtarolo f,⇑ a National Institute of Standards and Technology, Gaithersburg, MD 20878, USA b Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, USA c Department of Nanoengineering, University of California, San Diego, La Jolla, CA 92093, USA d Department of Physics, University of North Texas, Denton, TX, USA e Department of Chemistry, University of North Texas, Denton, TX, USA f Materials Science, Electrical Engineering, Physics and Chemistry, Duke University, Durham, NC 27708, USA a r t i c l e i n f o Article history: Received 20 March 2014 Received in revised form 5 May 2014 Accepted 10 May 2014 Available online 24 July 2014 Keywords: High-throughput Combinatorial materials science Computer simulations Materials databases AFLOWLIB a b s t r a c t The continued advancement of science depends on shared and reproducible data. In the field of compu- tational materials science and rational materials design this entails the construction of large open dat- abases of materials properties. To this end, an Application Program Interface (API) following REST principles is introduced for the AFLOWLIB.org materials data repositories consortium. AUIDs (Aflowlib Unique IDentifier) and AURLs (Aflowlib Uniform Resource Locator) are assigned to the database resources according to a well-defined protocol described herein, which enables the client to access, through appro- priate queries, the desired data for post-processing. This introduces a new level of openness into the AFLOWLIB repository, allowing the community to construct high-level work-flows and tools exploiting its rich data set of calculated structural, thermodynamic, and electronic properties. Furthermore, feder- ating these tools will open the door to collaborative investigations of unprecedented scope that will dra- matically accelerate the advancement of computational materials design and development. Ó 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http:// creativecommons.org/licenses/by/3.0/). 1. Introduction Data-driven materials science has gained considerable traction over the last decade or so. This is due to the confluence of three key factors: (1) Improved computational methods and tools; (2) greater computational power; and (3) heightened awareness of the power of extensive databases in science [1]. The recent Materi- als Genome Initiative (MGI) [1,2] reflects the recognition that many important social and economic challenges of the 21st cen- tury could be solved or mitigated by advanced materials. Compu- tational materials science currently presents the most promising trended toward more public and user-friendly frameworks. The emphasis is increasingly on portability and sharing of tools and data [13–15]. Similar to the effort presented here, the Materials- Project [16] has been providing open access to its database of com- puted materials properties through a RESTful API and a python library enabling ad hoc applications [17]. Other examples of online material properties databases include that being implemented by the Engineering Virtual Organization for Cyber Design (EVOCD) [18], which contains a repository of experimental data, materials constants and computational tools for use in Integrated Computa- tional Material Engineering (ICME). The future advance of compu- Computational Materials Science 93 (2014) 178–192 Contents lists available at ScienceDirect Computational Materials Science journal homepage: www.elsevier.com/locate/commatsci Taylor et al. Comput. Mater. Sci. 2014, 93, 178–192 Pymatgen provides high-level tools for users to easily obtain and work with data via Materials API.
  36. 36. Challenge 3:There are still problems too“large” for first principles February 23, 2017 57th Sanibel Symposium CostScale Transferability Transferable Costly Short time/length scales Non-transferable Cheap Longer time/length scales First principles EmpiricalAutomation
  37. 37. “It from bit.” - JohnArchibaldWheeler Materials Data57th Sanibel Symposium
  38. 38. Acknowledgements February 23, 2017 57th Sanibel Symposium Ong group Iek-Heng Chu, Balachandran Radhakrishnan, Zhuoying Zhu,Yuh-Chieh Lin, Zhenbin Wang, Zihan Xu, Zhi Deng, Hanmei Tang,WeikeYe, Chen Zheng Collaborators Prof Shirley Meng, Christopher Kompella, Han Nguyen, Sunny Hy Prof Joanna McKittrick, Jungmin Ha Creating It from Bit
  39. 39. materiaIs virtuaLab Thank you. February 23, 2017 57th Sanibel Symposium

×