This document summarizes a talk on discovering advanced materials for energy applications using high-throughput computing and mining the scientific literature. It discusses how materials discovery and optimization typically take decades due to the vast number of possible atomic configurations. Density functional theory provides a way to computationally screen millions of potential materials by automating calculations on supercomputers. Examples are given of new battery cathode and thermoelectric materials that have been discovered through high-throughput density functional theory calculations and later experimentally confirmed.
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications (Anubhav Jain)
This document summarizes several projects from Anubhav Jain at Lawrence Berkeley National Laboratory related to using artificial intelligence and data mining for materials science. It discusses (1) developing interpretable descriptors of crystal structure based on local environments, (2) the matminer toolkit for connecting materials data to machine learning algorithms, and (3) the atomate/Rocketsled software for running high-throughput density functional theory calculations on supercomputers. It also briefly outlines a project to develop a text mining database for materials science literature.
Materials discovery through theory, computation, and machine learning (Anubhav Jain)
The document discusses using theory, computation, and machine learning to discover new materials. It explains how density functional theory (DFT) can model material properties from first principles, and how DFT calculations have been automated and run on supercomputers to enable high-throughput screening of materials. It gives examples of computational predictions that were later experimentally confirmed, such as sidorenkite cathodes for sodium-ion batteries. It also outlines related projects, including the open-source Materials Project database of DFT data on over 85,000 materials and software libraries that support high-throughput computation and materials science. Text mining of the scientific literature is discussed as a way to help predict new materials in advance.
Software tools, crystal descriptors, and machine learning applied to material... (Anubhav Jain)
1. The document discusses using density functional theory (DFT) and high-throughput computing to screen large numbers of materials for promising thermoelectric properties.
2. Early high-throughput studies screened tens of thousands of materials for applications like scintillators, topological insulators, and high-temperature superconductors, finding candidates with a hit rate of around 1 in 1000.
3. Predictions from high-throughput DFT have been experimentally confirmed for new thermoelectric, battery cathode, and CO2 capture materials. However, accurately predicting all thermoelectric properties like the figure of merit remains challenging.
Overview of accelerated materials design efforts in the Hacking Materials res... (Anubhav Jain)
This document provides an overview of accelerated materials design efforts in the Hacking Materials research group. It discusses using high-throughput computing and simulations like density functional theory to generate large datasets for materials screening. Machine learning tools like matminer are used to represent materials as feature vectors to enable predictive modeling. Text mining of scientific literature is also discussed as a way to automatically extract knowledge from millions of published articles to inform new materials discoveries. The goal is to develop automated methods that can suggest the next best computational experiments to optimize properties of interest.
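As a rough illustration of what representing a material as a feature vector can mean, the sketch below turns a chemical formula into a vector of element fractions. It is a toy in plain Python and not matminer's API: the function names and the `ELEMENTS` ordering are assumptions made for this example, and the parser handles only simple formulas without parentheses.

```python
import re

def element_fractions(formula):
    """Parse a simple formula such as 'TmAgTe2' into atomic fractions per element."""
    counts = {}
    # Each match is (element symbol, optional amount); a missing amount means 1.
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}

def featurize(formula, elements):
    """Return a fixed-length vector of element fractions in the given element order."""
    fractions = element_fractions(formula)
    return [fractions.get(element, 0.0) for element in elements]

# Hypothetical, fixed element ordering that defines the vector's columns.
ELEMENTS = ["Ag", "Cu", "Te", "Tm", "Y"]

for formula in ["TmAgTe2", "YCuTe2"]:
    print(formula, featurize(formula, ELEMENTS))
```

Because every formula maps to a vector of the same length, the output can be passed directly to a generic regression or classification model.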
High-throughput computation and machine learning methods applied to materials... (Anubhav Jain)
High-throughput computation and machine learning methods can be applied to materials design problems at scale. Density functional theory (DFT) allows modeling of materials at the quantum mechanical level but large computational resources are required. "High-throughput DFT" uses automation, parallelization across supercomputers, and data mining approaches to rapidly screen millions of potential new materials in silico before experimental validation. This helps address the challenge of discovering new materials for applications like energy technologies by searching the vast space of possible compositions and structures more efficiently than traditional experimentation alone.
Introduction (Part I): High-throughput computation and machine learning appli... (Anubhav Jain)
High-throughput computation and machine learning applied to materials design
The document discusses how high-throughput density functional theory (DFT) calculations and materials databases can help address the challenge of discovering new materials. DFT calculations are automated and run in parallel on supercomputers to rapidly screen large numbers of potential materials. This generates huge datasets that are compiled into online materials databases for the community to access and reuse. However, DFT has limitations in accuracy and certain properties remain difficult to model. Data mining approaches are discussed that apply machine learning to these large datasets to help guide materials discovery and design.
Methods, tools, and examples (Part II): High-throughput computation and machi... (Anubhav Jain)
The document discusses using high-throughput density functional theory (DFT) calculations and machine learning to aid in the design of thermoelectric materials. It describes how the author's group has used automated DFT workflows to screen over 50,000 compounds for potential thermoelectric performance. Several new materials with promising figures of merit were identified through this process, including TmAgTe2, though experimental realization proved challenging. It also discusses efforts to incorporate machine learning to help guide materials discovery and address limitations of DFT, such as accurately modeling doping concentrations. Overall, the document outlines the author's work applying computational methods at a large scale to accelerate the discovery of efficient thermoelectric materials.
Discovering advanced materials for energy applications by mining the scientif... (Anubhav Jain)
This document discusses natural language processing (NLP) techniques for extracting materials-related information from scientific literature. It describes how Matscholar uses NLP to analyze over 4 million paper abstracts, identifying entities like materials, properties, and methods. Key steps include tokenizing text, training word embeddings, and using an LSTM neural network to recognize entities in context. Applications include searching materials by property and predicting promising new materials for applications based on word vector relationships. Future work aims to improve predictions for new compositions and automatically generate databases of materials properties from literature.
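The idea of predicting materials from word vector relationships can be sketched as ranking candidate terms by cosine similarity to an application keyword. The example below is a minimal illustration under stated assumptions, not Matscholar's implementation: the three-dimensional vectors and the names `material_a`, `material_b`, and `material_c` are made up, whereas real embeddings are learned from the abstract corpus and have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, invented for illustration only.
EMBEDDINGS = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "material_a": [0.8, 0.2, 0.1],
    "material_b": [0.1, 0.9, 0.3],
    "material_c": [0.5, 0.5, 0.5],
}

def rank_by_similarity(query, candidates, embeddings):
    """Sort candidate terms by similarity to the query term, most similar first."""
    query_vector = embeddings[query]
    scored = [(name, cosine_similarity(query_vector, embeddings[name]))
              for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_by_similarity(
    "thermoelectric", ["material_a", "material_b", "material_c"], EMBEDDINGS
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Candidates near the top of such a ranking that have not yet been studied for the application are the ones a literature-based approach would flag for follow-up.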
Computational materials design with high-throughput and machine learning methods (Anubhav Jain)
This presentation covers computational materials design with high-throughput and machine learning methods. It discusses (1) using density functional theory and high-throughput screening to rapidly generate data on many materials, (2) developing data mining tools such as matminer and matbench to extract useful information from large volumes of data and connect it to machine learning algorithms, and (3) using these methods to accelerate materials innovation.
Data dissemination and materials informatics at LBNL (Anubhav Jain)
The document summarizes data dissemination and materials informatics work done at LBNL. It discusses several key points:
1) The Materials Project shares simulation data on hundreds of thousands of materials through a science gateway and REST API, with millions of data points downloaded.
2) A new feature called MPContribs allows users to contribute their own data sets to be disseminated through the Materials Project.
3) A materials data mining platform called MIDAS is being built to retrieve, analyze, and visualize materials data from several sources using machine learning algorithms.
Conducting and Enabling Data-Driven Research Through the Materials Project (Anubhav Jain)
The Materials Project provides a free database of calculated materials properties for over 150,000 materials. It aims to enable data-driven materials research by conducting high-throughput calculations and providing tools for researchers to explore the data. The presentation discusses how the Materials Project has been used to discover new functional materials, including p-type transparent conductors, thermoelectrics, and phosphors, by screening for materials with desirable predicted properties. Engaging the research community through contributions of experimental data and machine learning benchmarking helps add value to the Materials Project platform.
Combining density functional theory calculations, supercomputing, and data-dr... (Anubhav Jain)
Combining density functional theory calculations, supercomputing, and data-driven methods to design new thermoelectric materials
Anubhav Jain presents work on using computational methods, such as density functional theory calculations combined with large datasets and machine learning, to design new thermoelectric materials. He discusses how DFT can be used for high-throughput screening of many materials to discover promising candidates. He highlights the Materials Project database, which contains calculated properties of over 65,000 materials and is used by many researchers. An example is given of screening over 50,000 compounds to find new thermoelectric materials like TmAgTe2, which was later experimentally verified. The goal is to accelerate materials discovery through these computational approaches.
Combined Theory and Data-Driven Approaches to Thermoelectrics Materials Disco... (Anubhav Jain)
Combined computational and data-driven approaches are being used to discover new thermoelectric materials. AMSET is a new model that calculates mobility and Seebeck coefficient more accurately than conventional models while remaining computationally efficient. Natural language processing of literature abstracts is also being used to predict promising new thermoelectric compositions based on similarity to known materials. Several recent experimental findings have validated predictions made using this approach.
Open Source Tools for Materials Informatics (Anubhav Jain)
This document discusses open source tools for materials informatics, including Matminer and Matscholar. Matminer is a library of descriptors for materials science data that can generate features for machine learning models. It includes over 60 featurizer classes and supports scikit-learn. Matscholar applies natural language processing to over 2 million materials science abstracts to extract keywords and enable improved literature searching. The document argues that open datasets like Matbench and automated tools like Automatminer could help lower barriers for developing machine learning models in materials science by making it easier to obtain training data and evaluate model performance.
Computational screening of tens of thousands of compounds as potential thermo... (Anubhav Jain)
Computational screening of tens of thousands of compounds as potential thermoelectrics and their experimental followup
This document discusses using computational screening to identify promising thermoelectric materials. It summarizes past successes in predicting materials with high zT values through density functional theory screening of large databases. Two materials identified through screening, TmAgTe2 and YCuTe2, were experimentally synthesized with zT values close to predictions. The document also introduces a new model called AMSET that aims to more accurately calculate electronic transport properties compared to the commonly used BoltzTraP approach with a fixed relaxation time.
Software tools, crystal descriptors, and machine learning applied to material... (Anubhav Jain)
This talk introduces several open-source software tools for accelerating materials design efforts:
- Atomate enables high-throughput DFT simulations through automated workflows. It has been used to generate large datasets for the Materials Project.
- Rocketsled uses machine learning to suggest the most informative calculations to optimize a target property faster than random searches.
- Matminer provides features to represent materials for machine learning and connects to data mining tools and databases.
- Automatminer develops machine learning models automatically from raw input-output data without requiring feature engineering by users.
- Robocrystallographer analyzes crystal structures and describes them in an interpretable text format.
Software tools for data-driven research and their application to thermoelectr... (Anubhav Jain)
This document summarizes several software tools for materials data science and their application to thermoelectric materials discovery. It discusses Atomate for high-throughput calculations, Matminer for materials feature extraction and machine learning, AMSET for electron transport modeling, and integration with the Materials Project database. Example applications include using order parameters for structure characterization and a computational screening that identified new thermoelectric materials such as YCuTe2.
Prediction and Experimental Validation of New Bulk Thermoelectrics Compositio... (Anubhav Jain)
Anubhav Jain presented research using high-throughput computations to predict new thermoelectric materials. The computations screened over 48,000 compounds and identified some promising candidates, including TmAgTe2, YCuTe2, and bournonites like CuPbSbS3. Experimental validation found that TmAgTe2 achieved a maximum zT of 0.35, limited by low carrier concentration, while YCuTe2 reached a zT of 0.75. Attempts to synthesize the predicted bournonite CuPbSnSe3 were unsuccessful. Future work will improve the computational models and search for more candidate materials informed by the large computational dataset.
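For readers unfamiliar with the quantity being reported, the thermoelectric figure of merit is conventionally defined as zT = S^2 * sigma * T / kappa, where S is the Seebeck coefficient, sigma the electrical conductivity, kappa the total thermal conductivity, and T the absolute temperature. The snippet below is a small sketch of that formula; the input values are hypothetical and do not correspond to any material named in the talk.

```python
def figure_of_merit(seebeck_v_per_k, conductivity_s_per_m,
                    thermal_conductivity_w_per_m_k, temperature_k):
    """Dimensionless thermoelectric figure of merit zT = S^2 * sigma * T / kappa."""
    power_factor = seebeck_v_per_k ** 2 * conductivity_s_per_m  # W / (m K^2)
    return power_factor * temperature_k / thermal_conductivity_w_per_m_k

# Hypothetical inputs chosen only to exercise the formula.
zt = figure_of_merit(
    seebeck_v_per_k=200e-6,               # 200 microvolts per kelvin
    conductivity_s_per_m=1.0e5,
    thermal_conductivity_w_per_m_k=1.5,
    temperature_k=600.0,
)
print(f"zT = {zt:.2f}")
```

Because the Seebeck coefficient and the electrical conductivity both depend strongly on carrier concentration, a sample with too few carriers can fall well short of its predicted zT, which is consistent with the limitation reported for TmAgTe2.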
Machine learning for materials design: opportunities, challenges, and methods (Anubhav Jain)
Machine learning techniques show promise for accelerating materials design by serving as surrogates for experiments and computations, enabling "self-driving laboratories", and extracting insights from natural language text. Key opportunities include using ML to screen large areas of chemical space before running computationally expensive DFT calculations or laboratory experiments. Challenges include limited materials data, data heterogeneity across problems, and ensuring ML models can accurately extrapolate beyond the training data distribution. Overcoming these challenges could substantially reduce the decades-long timelines currently needed for new materials discovery and optimization.
Combining density functional theory calculations, supercomputing, and data-dr... (Anubhav Jain)
The document summarizes how computational materials science using density functional theory (DFT) calculations, supercomputing, and data-driven methods can help design new materials faster than traditional experimental approaches. It describes how high-throughput DFT calculations are run on supercomputers to screen large numbers of potential materials. The results are compiled in open databases like the Materials Project to be shared and reused by researchers. While computational limitations remain, combining computation and data is helping accelerate the discovery of new materials with improved properties for applications like batteries, thermoelectrics, and carbon capture.
Software tools for high-throughput materials data generation and data mining (Anubhav Jain)
Atomate and matminer are open-source Python libraries for high-throughput materials data generation and data mining. Atomate makes it easy to automatically generate large datasets by running standardized computational workflows with different simulation packages. Matminer contains tools for featurizing materials data and integrating it with machine learning algorithms and data visualization methods. Both aim to accelerate materials discovery by automating and standardizing computational workflows and data analysis tasks.
Atomate: a tool for rapid high-throughput computing and materials discovery (Anubhav Jain)
Atomate is a tool for automating materials simulations and high-throughput computations. It provides predefined workflows for common calculations like band structures, elastic tensors, and Raman spectra. Users can customize workflows and simulation parameters. FireWorks executes workflows on supercomputers and detects/recovers from failures. Data is stored in databases for analysis with tools like pymatgen. The goal is to make simulations easy and scalable by automating tedious steps and leveraging past work.
Materials design using knowledge from millions of journal articles via natura... (Anubhav Jain)
This document discusses natural language processing (NLP) techniques for materials design using information from millions of journal articles. It begins with an overview of how materials are typically discovered and optimized over decades before discussing how NLP could help address this challenge. The document then provides a high-level view of how NLP is used to extract and analyze information from millions of materials science abstracts, including data collection, tokenization, training machine learning models on labeled text, and using the models to automatically extract entities. Examples are given of how word embeddings can encode scientific concepts and relationships in ways that allow predicting promising new materials for applications like thermoelectrics. The talk concludes by discussing future directions for the NLP work.
Combining density functional theory calculations, supercomputing, and data-dr... (Anubhav Jain)
By combining density functional theory calculations, supercomputing, and data-driven methods, the speaker aims to understand and design new thermoelectric materials for waste heat recovery. He discusses using high-throughput computations and large databases like the Materials Project to efficiently search for promising thermoelectric material candidates among thousands of potential compositions. Experimental validation is then needed to confirm computational predictions.
Computational Materials Design and Data Dissemination through the Materials P... (Anubhav Jain)
The Materials Project is a free online database, designed to help researchers discover new functional materials, that contains calculated properties of over 150,000 materials. It provides data on electronic, thermal, mechanical, magnetic, and other material properties calculated using high-performance computing. The database can be accessed through a website, an API, and various apps. The code powering the Materials Project is also open source. It has been heavily used by the research community, with over 180,000 registered users conducting data-driven materials design studies. The Materials Project team is working to expand the community through initiatives like allowing experimental data contributions and benchmarking machine learning algorithms.
Software tools for calculating materials properties in high-throughput (pymat... (Anubhav Jain)
This document discusses software tools for automating materials simulations. It introduces pymatgen, atomate, and FireWorks which can be used together to define a workflow of calculations, execute the workflow on supercomputers, and recover from errors or failures. The tools allow researchers to focus on designing and analyzing simulations rather than manual setup and execution of jobs. Workflows in atomate can compute many materials properties including elastic tensors, band structures, and transport coefficients. Parameters are customizable but sensible defaults are provided. FireWorks then executes the workflows across multiple supercomputing clusters.
Available methods for predicting materials synthesizability using computation... (Anubhav Jain)
This document summarizes a talk about computational and machine learning approaches for predicting materials synthesizability. It discusses how machine learning algorithms are generating millions of potential stable compound predictions, far more than can be experimentally tested. It also examines ways to better prioritize candidate materials for synthesis, such as by assessing their likelihood of dynamical stability and calculating their finite-temperature Gibbs free energies more efficiently using machine-learned interatomic force constants. Finally, it describes efforts to integrate literature knowledge using natural language processing to further guide experimental exploration and reduce the number of experiments needed to synthesize predicted materials.
This document discusses exploring new 2D materials through computational modeling and simulations using high-performance computing (HPC). It describes the University of Bath's research agenda and need for flexible HPC resources to efficiently model experimental data from cutting-edge materials. The document outlines the HPC resources used, including a new cloud-based Janus environment, and how modeling scales up based on factors like system size and vacuum levels in 2D materials simulations. The goal is extracting maximum understanding from large experimental datasets through timely modeling and calculations.
Computational materials design with high-throughput and machine learning methodsAnubhav Jain
Computational materials design with high-throughput and machine learning methods was presented. The presentation discussed (1) using density functional theory and high-throughput screening to rapidly generate data on many materials, (2) developing data mining approaches like matminer and matbench to extract useful information and connect to machine learning algorithms from the large volumes of data, and (3) concluded with a discussion of using these methods to accelerate materials innovation.
Data dissemination and materials informatics at LBNLAnubhav Jain
The document summarizes data dissemination and materials informatics work done at LBNL. It discusses several key points:
1) The Materials Project shares simulation data on hundreds of thousands of materials through a science gateway and REST API, with millions of data points downloaded.
2) A new feature called MPContribs allows users to contribute their own data sets to be disseminated through the Materials Project.
3) A materials data mining platform called MIDAS is being built to retrieve, analyze, and visualize materials data from several sources using machine learning algorithms.
Conducting and Enabling Data-Driven Research Through the Materials ProjectAnubhav Jain
The Materials Project provides a free database of calculated materials properties for over 150,000 materials. It aims to enable data-driven materials research by conducting high-throughput calculations and providing tools for researchers to explore the data. The presentation discusses how the Materials Project has been used to discover new functional materials, including p-type transparent conductors, thermoelectrics, and phosphors, by screening for materials with desirable predicted properties. Engaging the research community through contributions of experimental data and machine learning benchmarking helps add value to the Materials Project platform.
Combining density functional theory calculations, supercomputing, and data-dr...Anubhav Jain
Combining density functional theory calculations, supercomputing, and data-driven methods to design new thermoelectric materials
Anubhav Jain presents on using computational methods like density functional theory calculations combined with large datasets and machine learning to design new thermoelectric materials. He discusses how DFT can be used for high-throughput screening of many materials to discover promising candidates. He highlights the Materials Project database which has calculated properties of over 65,000 materials and is used by many researchers. An example is given of screening over 50,000 compounds to find new thermoelectric materials like TmAgTe2 which was later experimentally verified. The goal is to accelerate materials discovery through these computational approaches.
Combined Theory and Data-Driven Approaches to Thermoelectrics Materials Disco...Anubhav Jain
Combined computational and data-driven approaches are being used to discover new thermoelectric materials. AMSET is a new model that calculates mobility and Seebeck coefficient more accurately than conventional models while remaining computationally efficient. Natural language processing of literature abstracts is also being used to predict promising new thermoelectric compositions based on similarity to known materials. Several recent experimental findings have validated predictions made using this approach.
Open Source Tools for Materials InformaticsAnubhav Jain
This document discusses open source tools for materials informatics, including Matminer and Matscholar. Matminer is a library of descriptors for materials science data that can generate features for machine learning models. It includes over 60 featurizer classes and supports scikit-learn. Matscholar applies natural language processing to over 2 million materials science abstracts to extract keywords and enable improved literature searching. The document argues that open datasets like Matbench and automated tools like Automatminer could help lower barriers for developing machine learning models in materials science by making it easier to obtain training data and evaluate model performance.
Computational screening of tens of thousands of compounds as potential thermo... (Anubhav Jain)
Computational screening of tens of thousands of compounds as potential thermoelectrics and their experimental followup
This document discusses using computational screening to identify promising thermoelectric materials. It summarizes past successes in predicting materials with high zT values through density functional theory screening of large databases. Two materials identified through screening, TmAgTe2 and YCuTe2, were experimentally synthesized with zT values close to predictions. The document also introduces a new model called AMSET that aims to more accurately calculate electronic transport properties compared to the commonly used BoltzTraP approach with a fixed relaxation time.
Software tools, crystal descriptors, and machine learning applied to material... (Anubhav Jain)
This talk introduces several open-source software tools for accelerating materials design efforts:
- Atomate enables high-throughput DFT simulations through automated workflows. It has been used to generate large datasets for the Materials Project.
- Rocketsled uses machine learning to suggest the most informative calculations to optimize a target property faster than random searches.
- Matminer provides features to represent materials for machine learning and connects to data mining tools and databases.
- Automatminer develops machine learning models automatically from raw input-output data without requiring feature engineering by users.
- Robocrystallographer analyzes crystal structures and describes them in an interpretable text format.
Software tools for data-driven research and their application to thermoelectr... (Anubhav Jain)
This document summarizes several software tools for materials data science and their application to thermoelectrics materials discovery. It discusses Atomate for high-throughput calculations, Matminer for materials feature extraction and machine learning, AMSET for electron transport modeling, and integration with the Materials Project database. Example applications are described like using order parameters for structure characterization and a computational screening identifying new thermoelectric materials like YCuTe2.
Prediction and Experimental Validation of New Bulk Thermoelectrics Compositio... (Anubhav Jain)
Anubhav Jain presented research using high-throughput computations to predict new thermoelectric materials. Computations screened over 48,000 compounds and identified some promising candidates, including TmAgTe2, YCuTe2, and bournonites like CuPbSbS3. Experimental validation found TmAgTe2 achieved a maximum zT of 0.35, limited by low carrier concentration, while YCuTe2 reached zT of 0.75. Attempts to synthesize predicted bournonite CuPbSnSe3 were unsuccessful. Future work will improve the computational models and search for more candidate materials informed by the large computational dataset.
Machine learning for materials design: opportunities, challenges, and methods (Anubhav Jain)
Machine learning techniques show promise for accelerating materials design by serving as surrogates for experiments and computations, enabling "self-driving laboratories", and extracting insights from natural language text. Key opportunities include using ML to screen large areas of chemical space before running computationally expensive DFT calculations or laboratory experiments. Challenges include limited materials data, data heterogeneity across problems, and ensuring ML models can accurately extrapolate beyond the training data distribution. Overcoming these challenges could substantially reduce the decades-long timelines currently needed for new materials discovery and optimization.
Combining density functional theory calculations, supercomputing, and data-dr... (Anubhav Jain)
The document summarizes how computational materials science using density functional theory (DFT) calculations, supercomputing, and data-driven methods can help design new materials faster than traditional experimental approaches. It describes how high-throughput DFT calculations are run on supercomputers to screen large numbers of potential materials. The results are compiled in open databases like the Materials Project to be shared and reused by researchers. While computational limitations remain, combining computation and data is helping accelerate the discovery of new materials with improved properties for applications like batteries, thermoelectrics, and carbon capture.
Software tools for high-throughput materials data generation and data mining (Anubhav Jain)
Atomate and matminer are open-source Python libraries for high-throughput materials data generation and data mining. Atomate makes it easy to automatically generate large datasets by running standardized computational workflows with different simulation packages. Matminer contains tools for featurizing materials data and integrating it with machine learning algorithms and data visualization methods. Both aim to accelerate materials discovery by automating and standardizing computational workflows and data analysis tasks.
Atomate: a tool for rapid high-throughput computing and materials discovery (Anubhav Jain)
Atomate is a tool for automating materials simulations and high-throughput computations. It provides predefined workflows for common calculations like band structures, elastic tensors, and Raman spectra. Users can customize workflows and simulation parameters. FireWorks executes workflows on supercomputers and detects/recovers from failures. Data is stored in databases for analysis with tools like pymatgen. The goal is to make simulations easy and scalable by automating tedious steps and leveraging past work.
Materials design using knowledge from millions of journal articles via natura... (Anubhav Jain)
This document discusses natural language processing (NLP) techniques for materials design using information from millions of journal articles. It begins with an overview of how materials are typically discovered and optimized over decades before discussing how NLP could help address this challenge. The document then provides a high-level view of how NLP is used to extract and analyze information from millions of materials science abstracts, including data collection, tokenization, training machine learning models on labeled text, and using the models to automatically extract entities. Examples are given of how word embeddings can encode scientific concepts and relationships in ways that allow predicting promising new materials for applications like thermoelectrics. The talk concludes by discussing future directions for the NLP work.
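The idea that word embeddings can point to promising materials can be illustrated with a small sketch: rank candidate compositions by the cosine similarity between their vectors and the vector of an application word. The three-dimensional vectors below are made up purely for illustration, and the function names are hypothetical; real embeddings would be learned from the abstract corpus and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_candidates(target_word, candidates, vectors):
    """Sort candidate names from most to least similar to the target word."""
    target = vectors[target_word]
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(vectors[name], target),
        reverse=True,
    )


# Hypothetical toy embeddings (NOT taken from any trained model).
toy_vectors = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "Bi2Te3": [0.8, 0.2, 0.1],
    "PbTe": [0.7, 0.1, 0.4],
    "NaCl": [0.1, 0.9, 0.3],
}

ranking = rank_candidates("thermoelectric", ["NaCl", "PbTe", "Bi2Te3"], toy_vectors)
print(ranking)
```

In this toy setup the ordering reflects only the hand-picked numbers; the point is the mechanism, in which compositions whose vectors lie close to an application word become candidates worth a closer look.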
Combining density functional theory calculations, supercomputing, and data-dr... (Anubhav Jain)
Combining density functional theory calculations, supercomputing, and data-driven methods, the speaker aims to understand and design new thermoelectric materials for waste heat recovery. He discusses using high-throughput computations and large databases like the Materials Project to efficiently search for promising thermoelectric materials candidates among thousands of potential compositions. Experimental validation is then needed to confirm computational predictions.
Computational Materials Design and Data Dissemination through the Materials P... (Anubhav Jain)
The Materials Project is a free online database containing calculated properties of over 150,000 materials designed to help researchers discover new functional materials. It provides data on electronic, thermal, mechanical, magnetic and other material properties calculated using high-performance computing. The database can be accessed through a website, API, and various apps. The code powering the Materials Project is also open source. It has been heavily used by the research community, with over 180,000 registered users conducting data-driven materials design studies. The Materials Project team is working to expand the community through initiatives like allowing experimental data contributions and benchmarking machine learning algorithms.
Software tools for calculating materials properties in high-throughput (pymat... (Anubhav Jain)
This document discusses software tools for automating materials simulations. It introduces pymatgen, atomate, and FireWorks which can be used together to define a workflow of calculations, execute the workflow on supercomputers, and recover from errors or failures. The tools allow researchers to focus on designing and analyzing simulations rather than manual setup and execution of jobs. Workflows in atomate can compute many materials properties including elastic tensors, band structures, and transport coefficients. Parameters are customizable but sensible defaults are provided. FireWorks then executes the workflows across multiple supercomputing clusters.
Available methods for predicting materials synthesizability using computation... (Anubhav Jain)
This document summarizes a talk about computational and machine learning approaches for predicting materials synthesizability. It discusses how machine learning algorithms are generating millions of potential stable compound predictions, far more than can be experimentally tested. It also examines ways to better prioritize candidate materials for synthesis, such as by assessing their likelihood of dynamical stability and calculating their finite-temperature Gibbs free energies more efficiently using machine-learned interatomic force constants. Finally, it describes efforts to integrate literature knowledge using natural language processing to further guide experimental exploration and reduce the number of experiments needed to synthesize predicted materials.
This document discusses exploring new 2D materials through computational modeling and simulations using high-performance computing (HPC). It describes the University of Bath's research agenda and need for flexible HPC resources to efficiently model experimental data from cutting-edge materials. The document outlines the HPC resources used, including a new cloud-based Janus environment, and how modeling scales up based on factors like system size and vacuum levels in 2D materials simulations. The goal is extracting maximum understanding from large experimental datasets through timely modeling and calculations.
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications (aimsnist)
This document summarizes four projects from Lawrence Berkeley National Laboratory related to using artificial intelligence and data mining for materials science:
1) Interpretable descriptors of crystal structure that describe local environments as fingerprints to distinguish structures.
2) The matminer toolkit which connects materials data to machine learning algorithms and data visualization.
3) The atomate and Rocketsled software for running high-throughput density functional theory calculations and building a computational optimizer.
4) A text mining approach to label the content of materials science abstracts to build a revised materials search engine and identify related materials.
Advanced Computational Materials Science: Application to Fusion and Generatio... (myatom)
This document summarizes a workshop on advanced computational materials science and its application to fusion and Generation IV fission reactors. The workshop brought together international experts to examine the role of high-performance computing in predicting materials behavior under irradiation conditions for nuclear reactors, and to evaluate the potential for computational modeling to bridge gaps in experimental data needed for reactor design. Key challenges for structural materials in fusion and Generation IV reactors are discussed, as well as recent progress and future goals in multiscale computational modeling of irradiation effects. While computational modeling shows promise, the workshop participants agreed that prototypic irradiation experiments will still be needed to fully validate models and provide sufficient data for reactor licensing and investment decisions.
The Materials Project and computational materials discovery (Anubhav Jain)
1. The Materials Project aims to accelerate materials discovery through high-throughput computational screening of materials properties using density functional theory calculations.
2. Over 60,000 compounds have been computed so far, with properties including total energies, optimized structures, band structures, and elastic tensors.
3. The goal is to compute properties for over 90,000 materials to help researchers discover new materials for applications like batteries, thermoelectrics, and other energy technologies.
This presentation is based on our research work carried out at GNDU Amritsar and DAVIET, Jalandhar. We fabricated ion track filters, nanowires, and some exotic patterns for the first time in India using simple techniques.
UCSD NANO 266 Quantum Mechanical Modelling of Materials and Nanostructures is a graduate class that provides students with a highly practical introduction to the application of first principles quantum mechanical simulations to model, understand and predict the properties of materials and nano-structures. The syllabus includes: a brief introduction to quantum mechanics and the Hartree-Fock and density functional theory (DFT) formulations; practical simulation considerations such as convergence, selection of the appropriate functional and parameters; interpretation of the results from simulations, including the limits of accuracy of each method. Several lab sessions provide students with hands-on experience in the conduct of simulations. A key aspect of the course is in the use of programming to facilitate calculations and analysis.
The Materials Project: Applications to energy storage and functional materia... (Anubhav Jain)
The Materials Project is a free online database containing calculated properties of over 150,000 materials designed to help researchers discover new functional materials. It has been used extensively in academia and industry to identify novel battery electrode materials and solid electrolytes through high-throughput computational screening. Researchers are now using the Materials Project dataset to train machine learning models to predict battery properties and screen for new materials. Related efforts aim to bridge the gap between computational design and physical synthesis by developing an automated synthesis lab to experimentally validate candidate materials identified from the database.
- Nanotechnology is the manipulation of matter on an atomic, molecular, and supramolecular scale where quantum mechanical effects are observed. It involves engineering materials and devices within the nanometer scale (1-100 nm).
- Some examples of nanotechnology include carbon nanotubes, graphene, buckminsterfullerenes, plasmonic nanoparticles, and quantum dots. Nanomaterials are characterized using techniques like atomic force microscopy, scanning electron microscopy, and transmission electron microscopy.
- Properties of materials change at the nanoscale due to increased surface area effects, quantum confinement, and single electron tunneling effects. This allows for applications in areas like energy storage, catalysis, drug delivery, and electronics.
Discovering new functional materials for clean energy and beyond using high-t... (Anubhav Jain)
The research group develops computational methods and machine learning models to design new functional materials using high-throughput computing. This includes developing databases of materials properties, benchmarking machine learning algorithms, and applying natural language processing to materials design. Recent work also involves automating materials synthesis and characterization. The group maintains several open-source software packages that power their research.
Nanotechnology and display applications.pdf (NirmalM15)
This document provides an overview of nanotechnology applications in display technology. It discusses how quantum dots enable colorful displays through quantum confinement effects. It also describes how silicon nanocrystals can be used in LEDs and lasers through controlling their size and properties. Additionally, the document outlines various nanotechnology applications that have improved modern displays, such as liquid crystal alignment layers, polymer-stabilized cholesteric textures, and metal nanoparticles in liquid crystals. Emerging display technologies discussed include electrophoretic displays, carbon nanotube field emission displays, organic light-emitting diodes, and flexible displays.
Nanotechnology Presentation For Electronic Industry (tabirsir)
Nanoelectronics aims to process, transmit, and store information using properties of matter at the nanoscale that are different from macroscale properties. Relevant length scales are a few nanometers for molecules acting as transistors or memory, and up to 999 nm for quantum dots using electron spin. While microelectronics uses gate sizes as small as 50 nm, it does not qualify as nanoelectronics as it does not exploit new physical properties related to reduced size.
The Materials Project: An Electronic Structure Database for Community-Based M... (Anubhav Jain)
The document summarizes the Materials Project, an electronic structure database for materials design maintained by Lawrence Berkeley National Laboratory. It describes how the Materials Project uses high-throughput density functional theory calculations to compute properties of over 50,000 materials in its database. Users can search for materials, analyze computed properties, and design new materials using tools on the project's website.
The Materials Project: A Community Data Resource for Accelerating New Materia... (Anubhav Jain)
The Materials Project is a free online database containing calculated properties of over 150,000 materials designed to accelerate materials design. It contains electronic, thermal, mechanical, magnetic, and other properties powered by hundreds of millions of CPU hours. Users can access core data, tools for analysis, and open-source simulation code. The Materials Project has been used to computationally design new materials that were then experimentally confirmed, such as transparent conductors and thermoelectrics. The project seeks to engage the community through contributions of experimental data, benchmarking of machine learning methods, and disseminating discoveries.
The document discusses the history and development of nanotechnology. It begins with key early milestones like Richard Feynman's 1959 talk envisioning atomic engineering and the 1981 invention of the scanning tunneling microscope. The 1985 discovery of buckyballs also represented an important early discovery. The document then discusses topics like integrated circuits, nanotechnology funding levels, applications of nanotechnology, and how properties change at the nanoscale.
Discovering advanced materials for energy applications: theory, high-throughp... (Anubhav Jain)
Anubhav Jain presented on using density functional theory and high-throughput calculations to design advanced materials for energy applications. Key points included:
1) Density functional theory can be used to model materials physics and properties by approximating many-body quantum mechanics.
2) Thermoelectric materials were discussed as an example application, where the goal is to optimize the figure of merit which depends on conductivity, Seebeck coefficient, and thermal conductivity.
3) High-throughput calculations were performed on over 50,000 materials to efficiently screen for promising thermoelectric candidates like TmAgTe2, though experimental validation is still needed due to approximations.
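The figure of merit mentioned in point 2 is conventionally written zT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, κ the thermal conductivity, and T the absolute temperature. A minimal sketch of that arithmetic follows; the function name and the input values are illustrative assumptions, not data from the talk.

```python
def figure_of_merit(seebeck_v_per_k, conductivity_s_per_m,
                    thermal_conductivity_w_per_m_k, temperature_k):
    """Dimensionless thermoelectric figure of merit zT = S^2 * sigma * T / kappa."""
    if thermal_conductivity_w_per_m_k <= 0:
        raise ValueError("thermal conductivity must be positive")
    power_factor = seebeck_v_per_k ** 2 * conductivity_s_per_m  # S^2 * sigma
    return power_factor * temperature_k / thermal_conductivity_w_per_m_k


# Illustrative inputs only (assumed, not measured values):
# S = 200 microvolts/K, sigma = 1e5 S/m, kappa = 1.5 W/(m K), T = 600 K.
zt = figure_of_merit(200e-6, 1.0e5, 1.5, 600.0)
print(f"zT = {zt:.2f}")
```

The formula makes the screening trade-off visible: a high zT needs a large Seebeck coefficient and conductivity together with a low thermal conductivity, and these properties tend to be coupled in real materials.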
Nature-inspired Solutions for Engineering: A Transformative Methodology for I... (KTN)
Nature-Inspired Engineering (NIE) is the application of fundamental scientific mechanisms, underpinning desirable properties observed in nature (e.g., resilience, scalability, efficiency), to inform the design of advanced technological solutions. As illustrated by the many applications, from energy technology, catalysis and reactor engineering, to functional materials for the built environment, electronic or optical devices, biomedical and healthcare engineering, NIE has the opportunity to inform transformative solutions to tackle some of our most pressing challenges, as well as to be a pathway to innovation.
The webcast recording is now available. Click here to watch it: https://www.youtube.com/watch?v=gPyTb_-qhgo
Find out more about the Nature Inspired Solutions special interest group at https://ktn-uk.co.uk/interests/nature-inspired-solutions
Join the Nature Inspired Solutions LinkedIn group at https://www.linkedin.com/groups/13701855/
Similar to Discovering advanced materials for energy applications (with high-throughput computing and by mining the scientific literature) (20)
Applications of Large Language Models in Materials Discovery and Design (Anubhav Jain)
The document discusses applications of large language models (LLMs) in materials discovery and design. It describes how LLMs have improved natural language processing tasks related to materials science literature by requiring less custom model training and fine-tuning. As an example, the document discusses how LLMs were used to extract doping information from scientific papers and create a database of over 200,000 doped material compositions. The document suggests LLMs will continue enhancing materials databases and interfaces by integrating search and question-answering capabilities.
An AI-driven closed-loop facility for materials synthesis (Anubhav Jain)
The document summarizes an AI-driven closed-loop facility for materials synthesis using robotics, machine learning, and optimization algorithms. The facility aims to close the loop on rapid synthesis of new materials by using automated systems to synthesize compounds predicted by algorithms, characterize the results, and feed the data back to improve predictions. In less than 3 weeks, the facility synthesized 41 new chemical compositions out of 58 computationally predicted stable compounds. The facility is now collaborating with other groups to synthesize more complex materials, with the goal of accelerating the discovery of new materials through fully automated closed-loop synthesis and characterization.
Best practices for DuraMat software dissemination (Anubhav Jain)
The document provides best practices for disseminating software produced by DuraMat-funded projects. It discusses establishing standards and guidance for software produced by DuraMat to save time and effort in development and dissemination, and to provide consistency. The document outlines three levels of dissemination depending on the software's purpose and maturity. Level 1 is for one-off scripts, level 2 is for software used over a project's lifetime, and level 3 is for ongoing, community-maintained projects. Recommendations include documentation, licensing, and use of services like GitHub, Zenodo, and continuous integration tools.
Best practices for DuraMat software dissemination (Anubhav Jain)
The document provides best practices for disseminating software produced by DuraMat-funded projects. It discusses three levels of dissemination depending on the software's purpose and maturity. For all levels, it recommends documenting code, adding licenses, and hosting on GitHub. For more mature software, it suggests continuous integration, documentation, releases on Zenodo, and submitting to journals. The goal is to effectively share software, establish consistency, and give proper credit for products.
Efficient methods for accurately calculating thermoelectric properties – elec... (Anubhav Jain)
1) AMSET is a new method for efficiently calculating electronic transport properties from first principles that provides accurate results comparable to more computationally expensive methods.
2) HiPhive uses a data fitting approach to extract interatomic force constants from a small number of non-systematic displacement calculations, avoiding the need for many systematic calculations required by traditional methods to obtain phonon and thermal properties.
3) These new efficient methods enable high-throughput screening of thermoelectric materials by providing accurate transport properties while being computationally feasible for large numbers of materials.
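The data-fitting idea in point 2 can be illustrated with a deliberately simplified one-dimensional sketch: given a handful of arbitrary displacements and the resulting forces, a harmonic force constant is recovered by least squares from F = -k·u. This is an assumed toy model with a hypothetical function name; it does not use or reproduce the HiPhive API, which fits many coupled force constants at once.

```python
def fit_harmonic_force_constant(displacements, forces):
    """Least-squares estimate of k in F = -k * u (fit through the origin)."""
    denominator = sum(u * u for u in displacements)
    if denominator == 0:
        raise ValueError("at least one non-zero displacement is required")
    numerator = sum(u * f for u, f in zip(displacements, forces))
    return -numerator / denominator


# Synthetic data: non-systematic displacements and the forces an ideal spring
# with k = 5.0 would produce (values chosen only for illustration).
true_k = 5.0
displacements = [0.010, -0.020, 0.015, -0.005]
forces = [-true_k * u for u in displacements]

fitted_k = fit_harmonic_force_constant(displacements, forces)
print(f"fitted force constant: {fitted_k:.3f}")
```

Because every displacement contributes to the same fit, a few irregular samples can constrain the parameter, which is the reason a regression approach can need fewer calculations than displacing each degree of freedom systematically.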
Natural Language Processing for Data Extraction and Synthesizability Predicti... (Anubhav Jain)
This document discusses using natural language processing and machine learning techniques to extract and analyze synthesis recipes from materials science literature. It presents work using sequence-to-sequence models to extract entities and relationships for the synthesis of gold nanorods and bismuth ferrite from research papers. Decision trees trained on the extracted data are able to reproduce conclusions about the effects of synthesis parameters from literature. However, applying these techniques to predictive synthesis still faces challenges regarding reproducibility, missing information, and lack of negative examples in literature datasets.
This document summarizes a presentation on developing an electrochemical system for selenium removal from water. The project aims to apply machine learning and automated synthesis techniques to accelerate materials development timelines. Initial calculations have reproduced experimental trends for nitrate reduction and screened candidate materials from databases. Procedures have also been established for electrode preparation, testing, and using robots to synthesize predicted candidates. While still early, progress has been made on computational screening, mitigating competing reactions, and testing baseline cathode materials for selenium removal performance and energy efficiency. The remainder of the first project year will focus on refining methods before demonstrating a commercially viable selenium removal system in years two and three.
Accelerating New Materials Design with Supercomputing and Machine Learning (Anubhav Jain)
Anubhav Jain gave a presentation summarizing his career in materials science research from high school internships through his current role leading the Materials Project. During his PhD and Alvarez fellowship, he developed high-throughput workflows and open source software like FireWorks to automate materials calculations. This allowed him to launch the Materials Project database and scale it up over time with a growing team. The Materials Project has now screened over 180,000 materials and led to successful experimental validations of computational predictions.
DuraMat CO1 Central Data Resource: How it started, how it’s going … (Anubhav Jain)
The document summarizes several projects developed as part of the DuraMat CO1 Central Data Resource initiative to analyze photovoltaic performance and degradation data. A secure data portal was developed that currently hosts data from 239 users and 271 datasets. Software tools were also created, such as pvAnalytics for data cleaning and filtering, pvOps for operational and maintenance data analysis, and pv-vision for electroluminescence image analysis. These open source tools are publicly available and have helped advance the analysis of PV degradation through access to larger datasets. Overall, the projects have established a foundation for ongoing collaborative research on PV performance and lifetime under DuraMat 2.0.
The Materials Project is a multidisciplinary project with over 250,000 registered users that accelerates materials design. A small team generates data on specific materials using advanced computations and provides organization and dissemination of the data. Over 260,000 registered users can access the data for research and contribute their own experimental or theoretical data. The project continues to deliver new calculated data and works on improving accuracy, modeling magnetic orderings, vibrational properties, and non-ordered compounds. The Materials Project allows users to contribute their own data sets and integrate them with the core data through a new MPContribs capability.
Evaluating Chemical Composition and Crystal Structure Representations using t... (Anubhav Jain)
This document discusses the Matbench testing protocol for evaluating machine learning models for materials property prediction. Matbench contains 13 standardized tasks to compare different models. Several existing models have been tested, including those using composition features and graph neural networks using structural representations. While some tasks have seen significant improvement, others have seen little progress. The document suggests ways to improve Matbench, such as adding new materials classes, properties, and evaluation metrics to further benchmark progress and encourage development of better models.
Perspectives on chemical composition and crystal structure representations fr... (Anubhav Jain)
The document discusses the Matbench testing protocol for evaluating machine learning models for materials property prediction. It summarizes the 13 different machine learning tasks in Matbench and the various models that have been tested, including Magpie, Automatminer, MODNet, CGCNN, ALIGNN, and CRABNet. The document outlines ways Matbench could be further improved, such as including a greater diversity of tasks, changing the data splitting methodology, and incorporating active learning into the scoring. The overall goal of Matbench is to provide a standard way to evaluate new machine learning algorithms for materials property prediction and measure progress in the field.
Machine Learning Platform for Catalyst Design (Anubhav Jain)
This project aims to develop new electrocatalyst materials for nitrate removal from water using machine learning and computational screening. The team performed calculations on over 1,000 potential compositions to identify promising catalysts with low costs. Experimental synthesis of candidates such as ZnNi and Zn3Co was attempted but it is unclear if the desired alloys were produced. The screening approach is now being applied to identify materials for selenium removal. If successful, low-cost catalysts could be developed to reduce the costs of electrocatalytic water treatment.
Applications of Natural Language Processing to Materials Design (Anubhav Jain)
This document discusses using natural language processing (NLP) techniques to extract useful information from unstructured text sources in materials science literature. It describes how NLP models can be trained on large datasets of materials science publications to perform tasks like chemistry-aware search, summarizing material properties, and suggesting synthesis methods. The models are developed using techniques like word embeddings, LSTM networks, and named entity recognition. The goal is to organize materials science knowledge from text into a database called Matscholar to enable new applications of the information.
Assessing Factors Underpinning PV Degradation through Data Analysis (Anubhav Jain)
The document discusses using PVPRO methods and large-scale data analysis to distinguish system and module degradation in PV systems. It involves 3 main tasks: 1) Developing an algorithm to detect off-maximum power point operation and compare it to existing tools. 2) Applying PVPRO to additional datasets to refine methods and perform degradation analysis on 25 large PV systems. 3) Connecting bill-of-materials data to degradation results from accelerated stress tests through data-driven analysis and publishing findings while anonymizing data.
Extracting and Making Use of Materials Data from Millions of Journal Articles... (Anubhav Jain)
- The document discusses using natural language processing techniques to extract materials data from millions of journal articles.
- It aims to organize the world's information on materials science by using NLP models to extract useful data from unstructured text sources like research literature in an automated manner.
- The process involves collecting raw text data, developing machine learning models to extract entities and relationships, and building search interfaces to make the extracted data accessible.
The Status of ML Algorithms for Structure-property Relationships Using Matb...
Anubhav Jain
The document discusses the development of Matbench, a standardized benchmark for evaluating machine learning algorithms for materials property prediction. Matbench includes 13 standardized datasets covering a variety of materials prediction tasks. It employs a nested cross-validation procedure to evaluate algorithms and ranks submissions on an online leaderboard. This allows for reproducible evaluation and comparison of different algorithms. Matbench has provided insights into which algorithm types work best for certain prediction problems and has helped measure overall progress in the field. Future work aims to expand Matbench with more diverse datasets and evaluation procedures to better represent real-world materials design challenges.
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Anubhav Jain
1. The document discusses using natural language processing (NLP) algorithms to extract useful information from unstructured text sources in materials science literature to help organize the world's materials science information and enable new search and analysis capabilities.
2. It describes a project called Matscholar that applies NLP techniques like named entity recognition and relation extraction to millions of article abstracts to build a searchable database with summarized materials property and application data.
3. The approach involves collecting text sources, developing machine learning models trained on annotated examples to extract entities and relations, and integrating the extracted structured data with materials property databases to enable new search and analysis functions.
Discovering advanced materials for energy applications (with high-throughput computing and by mining the scientific literature)
1. Discovering advanced materials for energy applications (with high-throughput computing and by mining the scientific literature)
Anubhav Jain
Energy Technologies Area
Lawrence Berkeley National Laboratory
Berkeley, CA
ACM Meetup, Jan 2020
Slides (already) posted to hackingmaterials.lbl.gov
2. Often, world-changing ideas are inhibited by the physical properties of available materials at the time
Electric vehicles and solar power are two technologies that had been dreamed about for many decades, yet are only seeing wide adoption today
[Images dated 1910 and 1956]
3. Typically, both new materials discovery and optimization take decades
• Often, materials are known for several decades before their functional applications are known
– MgB2 sitting on lab shelves for 50 years before its identification as a superconductor in 2001
– LiFePO4 known since 1938, only identified as a Li-ion battery cathode in 1997
• Even after discovery, optimization and commercialization still take decades
• To get a sense for why this is so hard, let’s look at the problem in more detail …
4. A material is defined at multiple length scales – stick to the fundamental scale for now
6. Atoms in a box – the materials universe is huge!
• Bag of 30 atoms
• Each atom is one of 50 elements
• Arrange on 10x10x10 lattice
• Over 10^108 possibilities!
– more than grains of sand on all beaches (10^21)
– more than number of atoms in universe (10^80)
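The count above can be checked with a short calculation. This is a sketch under one reading of the slide: the 30 atoms occupy 30 distinct sites of the 1,000-site lattice, and each occupied site independently holds one of 50 elements. The function name is illustrative and does not come from the talk.

```python
import math

def count_configurations(n_atoms=30, n_elements=50, n_sites=10 * 10 * 10):
    """Count ways to put n_atoms on distinct lattice sites, each atom being one of n_elements."""
    site_choices = math.comb(n_sites, n_atoms)  # which sites are occupied
    element_choices = n_elements ** n_atoms     # which element sits on each occupied site
    return site_choices * element_choices

total = count_configurations()
# Order of magnitude = number of decimal digits minus one
print(f"about 10^{len(str(total)) - 1} configurations")
```

Under these assumptions the total lands in the 10^108 range, consistent with the slide. Symmetry-equivalent arrangements are not removed, so this is an upper-bound style estimate.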
8. What constrains traditional approaches to materials design?
“[The Chevrel] discovery resulted from a lot of unsuccessful experiments of Mg ions insertion into well-known hosts for Li+ ions insertion, as well as from the thorough literature analysis concerning the possibility of divalent ions intercalation into inorganic materials.”
-Aurbach group, on discovery of Chevrel cathode for multivalent (e.g., Mg2+) batteries
Levi, Levi, Chasid, Aurbach, J. Electroceramics (2009)
9. Why do we need new ways of designing materials?
• Materials are:
– Important – constrain what’s possible in the physical world
– Difficult to design – many, many possibilities
– Ripe for new ways of approaching the problem
10. Researchers are starting to fundamentally re-think how we invent the materials that make up our devices
Next-generation materials design:
• Computer-aided materials design
• Natural language processing
• “Self-driving laboratories”
11. Today, computer aided design of products is ubiquitous – but what are the governing equations to model materials?
12. Materials physics is determined by quantum mechanics
Schrödinger equation describes all the properties of a system through the wavefunction:
$-\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r}) + V(\mathbf{r})\Psi(\mathbf{r}) = E\Psi(\mathbf{r})$
Time-independent, non-relativistic Schrödinger equation
13. The wave function is formidable
• There aren’t too many real situations where we can get a closed-form solution to the Schrödinger equation
• Let’s pretend we want to approach things numerically for 1000 electrons
– There are ~500,000 electron-electron interactions to worry about.
– Even storing the wavefunction would take ~10^8991 GB!
• Discretize the x, y, z position of each electron into a 1000-element grid per axis = 1 billion positions per electron
• Need the wavefunction output (real + imaginary part) for each combination of all electron positions, i.e. (1E9)^1000 * 2, or 2E9000 values
• Even at 1 byte per wavefunction value (low resolution), you would need about 2E9000 bytes, or 2E8991 GB, to store the wavefunction!
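The storage estimate can be reproduced in log space, since the numbers overflow any floating-point type. This is a minimal sketch using only the slide's stated assumptions (1000 electrons, 1000 grid points per axis, two values per amplitude, 1 byte per value); the variable names are illustrative.

```python
import math

n_electrons = 1000
grid_points_per_axis = 1000  # discretization of x, y, and z
values_per_amplitude = 2     # real and imaginary parts
bytes_per_value = 1          # deliberately low resolution

# Number of distinct electron pairs
n_pairs = n_electrons * (n_electrons - 1) // 2

# log10 of positions per electron: (10^3)^3 = 10^9
log10_positions = 3 * math.log10(grid_points_per_axis)
# log10 of the number of stored values: 2 * (10^9)^1000
log10_values = n_electrons * log10_positions + math.log10(values_per_amplitude)
# Convert bytes to GB (1 GB taken as 10^9 bytes)
log10_gigabytes = log10_values + math.log10(bytes_per_value) - 9

print(n_pairs)
print(f"values: 10^{log10_values:.1f}, storage: 10^{log10_gigabytes:.1f} GB")
```

With these inputs the pair count is just under half a million, the value count is about 2 × 10^9000, and the storage requirement is on the order of 10^8991 GB.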
14. Maybe Dirac said it best …
“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”
“It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation.”
15. What is density functional theory (DFT)?
DFT is a method to solve for the electronic structure and energetics of arbitrary materials starting from first principles. It replaces many-body interactions with a mean-field interaction that reproduces the same charge density.
In theory, it is exact for the ground state. In practice, accuracy depends on the choice of (some) parameters, the type of material, the property to be studied, and whether the simulated system (crystal) is a good approximation of reality.
DFT resulted in the 1998 Nobel Prize in Chemistry (W. Kohn). It is responsible for 2 of the top 10 cited papers of all time, across all sciences.
[Figure: electrons (e–)]
16. How does one use DFT to design new materials?
A. Jain, Y. Shin, and K. A. Persson, Nat. Rev. Mater. 1, 15004 (2016).
17. Limitations of density functional theory
• System size is essentially limited to a few thousand atoms
– many important materials phenomena simply do not occur at this length scale; other techniques available with reduced accuracy
• Certain materials, such as those with strong electron correlation, remain difficult to model accurately
• Certain properties, including excited state properties such as band gap, remain difficult to model accurately
• These are all active areas of research and improvement to the theory, and the situation is improving on all fronts
18. • Ok, so we have a computational model now that
allows us to assemble atoms in a computer and
predict their physical properties
• What next?
18
19. A big advantage of computational modeling is that it can be automated – so we can screen many ideas in parallel
Automate the DFT procedure
• FireWorks: software for programming general computational workflows that can be scaled across large supercomputers.
Supercomputing power
• NERSC: supercomputing center; processor count is equivalent to ~100,000 desktop machines. Other centers are also viable.
High-throughput materials screening
G. Ceder & K.A. Persson, Scientific American (2015)
S. Kirklin et al., Acta Mater. 102 (2016) 125-135
20. How much computer time is needed for high-throughput DFT?
• The answer is “it really varies a lot”
– how big / complicated are the materials you are modeling?
– how complex / expensive are the physical properties you are trying to predict?
• Ballpark numbers:
– Low range: optimize structure of ~3-atom compounds
• time to do a million materials ~ 10 million core-hours
– Medium range: bulk modulus of ~50 atom compounds
• time to do a million materials ~ 2 billion core-hours
– The “high range” can go almost as high as you’d like …
• A “tiered” screening strategy is common
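To put the ballpark core-hour figures in perspective, the sketch below converts them into per-material cost and idealized wall-clock time. The machine size of 100,000 cores is an assumption chosen for illustration, and perfect scaling with exclusive access to the machine is assumed, which real allocations never offer.

```python
def wall_clock_days(total_core_hours, n_cores):
    """Idealized wall-clock time in days, assuming perfect scaling and full machine access."""
    return total_core_hours / n_cores / 24

N_CORES = 100_000  # assumed machine size, for illustration only

for label, core_hours in [("low range", 10e6), ("medium range", 2e9)]:
    per_material = core_hours / 1e6  # the ballpark figures are per million materials
    days = wall_clock_days(core_hours, N_CORES)
    print(f"{label}: {per_material:g} core-hours per material, "
          f"~{days:.1f} days on {N_CORES:,} cores")
```

The gap between a few days and a few years for the same number of materials is one reason a tiered strategy is attractive: cheap calculations filter the candidate list before expensive properties are computed.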
21. Example of high-throughput materials screening: Li ion battery cathodes
[Figure: Li-ion cell with anode, electrolyte, and cathode; arrows mark Li+ and e- flow on charge and discharge]
• Anode: e.g. graphitic carbon
• Electrolyte: e.g. LiPF6 / (EC/DMC)
• Cathode: e.g. LiCoO2, LiFePO4
22. The cathode material is like a Li sponge (on the atomic scale)
The cathode material must quickly
absorb and release large
quantities of Li without
degrading
It must be cost-effective and safe
It should be light, compact, and
highly absorbent (high voltage)
22
23. Anatomy of a cathode composition
Li_a M_b (XY_c)_d
• Li_a: Li ion source
• M_b: electron donor / acceptor; examples: V4+/5+, Fe2+/3+
• (XY_c)_d: structural framework / charge neutrality; examples: O2-, (PO4)3-, (SiO4)4-
common cathodes: LiCoO2, LiMn2O4, LiFePO4
24. Calculate average voltage by computing energy differences in structures w/ or w/o Li
$V_{\mathrm{avg}}^{\mathrm{OC}} = -\frac{\Delta G}{\Delta x_{\mathrm{Li}}\,F}$
$\Delta G \approx E(\mathrm{LiMnO_2}) - [E(\mathrm{MnO_2}) + E(\mathrm{Li})]$
[Figure: GGA+U results]
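The voltage expression can be written as a small helper. This is a sketch only: it approximates the reaction free energy by the total-energy difference, takes energies in eV per formula unit, and assumes one electron is transferred per Li, so that dividing by the number of Li gives volts directly. The function name and the numeric inputs are placeholders, not computed DFT values.

```python
def average_voltage(e_lithiated, e_delithiated, e_li_metal, n_li=1):
    """Average voltage (V) vs. Li metal from total energies in eV per formula unit.

    delta_e = E(lithiated) - [E(delithiated) + n_li * E(Li metal)]
    V = -delta_e / n_li, with one electron transferred per Li.
    """
    delta_e = e_lithiated - (e_delithiated + n_li * e_li_metal)
    return -delta_e / n_li

# Placeholder energies for illustration only; these are NOT computed values.
v = average_voltage(e_lithiated=-30.0, e_delithiated=-25.0, e_li_metal=-1.9)
print(f"{v:.2f} V")
```

A more negative reaction energy (stronger Li binding in the host) gives a higher voltage, which is why the sign is flipped.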
25. Diffusion via Nudged Elastic Band (LiMnBO3)
Phase | low Li | high Li
Hexagonal | 529 meV | 723 meV
Monoclinic | 395 meV | 509 meV
• 525 meV means a micron-sized particle can be charged in 2 hours
• Every 60 meV difference represents a 10X difference in diffusion coefficient
Kim, Moore, Kang, Hautier, Jain, Ceder, J ECS (2011)
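The 60 meV rule of thumb follows from an Arrhenius dependence of the diffusion coefficient on the migration barrier. The sketch below assumes D ∝ exp(-E_a / k_B T) with identical prefactors and a temperature of 300 K; both are assumptions for illustration, not statements from the talk.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def diffusivity_ratio(barrier_difference_mev, temperature_k=300.0):
    """Ratio of diffusion coefficients for two hops whose barriers differ by the given amount.

    Assumes D is proportional to exp(-E_a / (k_B * T)) with equal prefactors.
    """
    return math.exp(barrier_difference_mev / 1000.0 / (K_B_EV_PER_K * temperature_k))

print(f"60 meV difference: {diffusivity_ratio(60):.1f}x")
# Barriers quoted on the slide: hexagonal vs. monoclinic at low Li content
print(f"529 vs 395 meV: {diffusivity_ratio(529 - 395):.0f}x")
```

At room temperature a 60 meV difference works out to roughly one order of magnitude, so the 134 meV gap between the two low-Li barriers corresponds to a bit more than two orders of magnitude.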
27. New mixed phosphate-pyrophosphate
Chemistry | Novelty | Energy density vs. LiFePO4 | % of theoretical capacity already achieved in the lab
Li9V3(P2O7)3(PO4)2 | New | 20% greater | ~65%
Origin: V to Fe substitution in Li9Fe3(P2O7)3(PO4)2*
Remarks:
• Structure has “layers” and “tunnels”
• Pyrophosphate-phosphate mixture
• Potential 2-electron material
[Figure: electrochemical test data; C/35 at RT, 2.0 mg, 3.0V – 4.7V]
Jain, Hautier, Moore, Kang, Lee, Chen, Twu, and Ceder, Journal of The Electrochemical Society 159, A622–A633 (2012).
28. One can apply this template to many different applications
• Sidorenkite-based Na-ion battery cathodes
Chen, H.; Hao, Q.; Zivkovic, O.; Hautier, G.; Du, L.-S.; Tang, Y.; Hu, Y.-Y.; Ma, X.; Grey, C. P.; Ceder, G. Sidorenkite (Na3MnPO4CO3): A New Intercalation Cathode Material for Na-Ion Batteries, Chem. Mater., 2013
• YCuTe2 thermoelectrics
Aydemir, U; Pohls, J-H; Zhu, H; Hautier, G; Bajaj, S; Gibbs, ZM; Chen, W; Li, G; Broberg, D; White, MA; Asta, M; Persson, K; Ceder, G; Jain, A; Snyder, GJ. Thermoelectric Properties of Intrinsically Doped YCuTe2 with CuTe4-based Layered Structure. J. Mat. Chem C, 2016
• Li-M-O CO2 capture compounds
Dunstan, M. T., Jain, A., Liu, W., Ong, S. P., Liu, T., Lee, J., Persson, K. A., Scott, S. A., Dennis, J. S. & Grey, C. P. Energy and Environmental Science (2016)
More examples here: A. Jain, Y. Shin, and K. A. Persson, Nat. Rev. Mater. 1, 15004 (2016).
29. 29
Examples of experimentally-confirmed materials designed
with DFT (1)
Jain, A., Shin, Y., Persson, K.A., 2016. Computational predictions of energy materials using density functional theory.
Nature Reviews Materials 1, 15004.
30. 30
Examples of experimentally-confirmed materials designed
with DFT (2)
Jain, A., Shin, Y., Persson, K.A., 2016. Computational predictions of energy materials using density functional theory.
Nature Reviews Materials 1, 15004.
31. How about commercial impact?
• This information is much harder to find, but:
– New alkaline battery from Duracell with assist from high-throughput screening from Computational Modeling Consultants
• (based on personal communication)
– New alloys for watches and phones from Apple with assist from computational alloy design by Questek
• https://www.americaninno.com/chicago/inside-the-small-evanston-company-whose-tech-was-acquired-by-apple-and-used-by-spacex/
– New alloys for 3D printing with guidance from ML-based models from Citrine
• https://citrine.io/media-post/aluminum-alloy-designed-using-citrine-platform-becomes-first-ever-officially-registered-for-3d-printing/
– New phosphor materials from Lumenari with guidance from MaterialsQM Consulting
• (own work)
32. Today, DFT is often used within a pipeline that includes machine learning – but that is a separate talk …
[Figure: pipeline linking machine learning / optimization, high-throughput DFT (expensive calculation), experiment, and external databases (DFT or expt), with “compounds to screen” and “training data” passed between them]
33. Researchers are starting to fundamentally re-think how we invent the materials that make up our devices
Next-generation materials design:
• Computer-aided materials design
• Natural language processing
• “Self-driving laboratories”
34. Can ML help us work through our backlog of information we need to assimilate from text sources?
papers to read “someday”
NLP algorithms
35. Use computers to parse research abstracts on our behalf
Extracted ~2 million abstracts of relevant scientific articles
Use natural language processing algorithms to try to extract knowledge from all this data
36. Algorithms to automatically identify keywords in the abstracts based on word2vec and LSTM networks
Weston, L. et al. Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature. J. Chem. Inf. Model. (2019).
37. Named entity recognition to detect materials, applications, etc.
• Custom machine learning models to extract the most valuable materials-related information.
• Utilizes a long short-term memory (LSTM) network trained on ~1000 hand-annotated abstracts.
• f1 scores of ~0.9. f1 score for inorganic materials extraction is >0.9.
Weston, L., et al. J. Chem. Inf. Model. (2019). doi:10.1021/acs.jcim.9b00470
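For readers unfamiliar with the metric, the f1 score is the harmonic mean of precision and recall. The sketch below computes an entity-level f1 over exact (span, label) matches; the annotations and label names are hypothetical and are not taken from the cited paper, whose exact scoring protocol may differ.

```python
def f1_score(true_entities, predicted_entities):
    """Entity-level F1: harmonic mean of precision and recall over (span, label) pairs."""
    true_set, pred_set = set(true_entities), set(predicted_entities)
    true_positives = len(true_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(true_set)
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations: ((start_token, end_token), entity label)
gold = [((0, 1), "MATERIAL"), ((4, 5), "APPLICATION"), ((7, 9), "PROPERTY")]
pred = [((0, 1), "MATERIAL"), ((4, 5), "PROPERTY"), ((7, 9), "PROPERTY")]
print(round(f1_score(gold, pred), 3))
```

Here two of three predictions match the gold annotations, so precision, recall, and f1 are all two thirds. A score near 0.9 means that roughly nine in ten entities are both found and labeled correctly.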
41. Could these techniques also be used to predict which materials we might want to screen for an application?
papers to read “someday”
NLP algorithms
43. Key concept 1: the word2vec algorithm
• We use the word2vec algorithm (Google) to turn each unique word in our corpus into a 200-dimensional vector
• These vectors encode the meaning of each word based on trying to predict context words around the target
Barazza, L. How does Word2Vec’s Skip-Gram work? Becominghuman.ai. 2017
“You shall know a word by the company it keeps”
- John Rupert Firth (1957)
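The "predict context words around the target" idea can be made concrete by listing the training pairs a skip-gram model would see. This sketch only generates (target, context) pairs; it does not train any vectors. The example sentence and the window size are arbitrary choices for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (target, context) pairs for every word within `window` positions of the target."""
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                yield target, tokens[j]

sentence = "the thermoelectric performance of doped PbTe".split()
pairs = list(skipgram_pairs(sentence, window=1))
for target, context in pairs[:4]:
    print(target, "->", context)
```

During training, the vector for each target word is adjusted so that it predicts its observed context words well, which is how words appearing in similar contexts end up with similar vectors.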
44. Word embeddings trained on “normal” text learn relationships between words
• The classic example is:
– “king” - “man” + “woman” = ? → “queen”
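The analogy arithmetic can be illustrated with hand-made vectors. The 2-D embeddings below are invented for this example (one axis loosely standing for royalty, the other for gender); real embeddings are learned from text and have hundreds of dimensions.

```python
import math

# Toy 2-D embeddings invented for illustration: (royalty, gender)
vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "apple": (-1.0, 0.2),
}

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    query = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], query))

print(analogy("king", "man", "woman"))
```

Excluding the input words from the candidates is standard practice, since the query vector is usually still closest to one of them.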
45. When trained on materials science abstracts, word2vec learns scientific concepts
[Figure: “word embedding” periodic table; crystal structures and principal oxides of the elements]
Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).
46. Key concept 2: vector dot products can be used to predict which words might co-occur in abstracts
• Dot product of a composition word with the word “thermoelectric” essentially predicts how likely that word is to appear in an abstract with the word thermoelectric
• Compositions with high dot products are typically known thermoelectrics
• Sometimes, compositions have a high dot product with “thermoelectric” but have never been studied as a thermoelectric
• These compositions usually have high computed power factors! (DFT+BoltzTraP)
Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).
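The ranking step can be sketched as follows. The 3-D vectors and the compound names are placeholders invented for illustration; in the work described, the vectors are 200-dimensional and learned from the abstract corpus.

```python
# Toy 3-D embeddings invented for illustration only.
embeddings = {
    "thermoelectric": (0.9, 0.1, 0.4),
    "compound_A": (0.8, 0.2, 0.5),
    "compound_B": (0.1, 0.9, 0.0),
    "compound_C": (0.6, 0.0, 0.3),
}

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(keyword, candidates):
    """Sort candidate words by dot product with the keyword vector, highest first."""
    target = embeddings[keyword]
    return sorted(candidates, key=lambda w: dot(embeddings[w], target), reverse=True)

ranking = rank_by_similarity("thermoelectric", ["compound_A", "compound_B", "compound_C"])
print(ranking)
```

Candidates near the top of such a ranking that have never appeared alongside the keyword in the literature are the ones flagged as predictions.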
47. Can we predict future thermoelectrics discoveries with this method?
“Go back in time” approach:
– For every year since 2001, see which compounds we would have predicted using only literature data until that point in time
– Make predictions of what materials are the most promising thermoelectrics for data until that year
– See if those materials were actually studied as thermoelectrics in subsequent years
Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).
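The bookkeeping behind the "go back in time" test can be sketched as below. The function takes the per-year predictions as given (producing them would require retraining embeddings on literature up to each cutoff year, which is not shown), and all material names and years are hypothetical.

```python
def backtest(predictions_by_year, first_studied_year):
    """For each cutoff year, list predicted materials that were first studied afterwards.

    predictions_by_year: {cutoff_year: [materials predicted from literature up to that year]}
    first_studied_year:  {material: year it was first reported for the application}
    """
    hits = {}
    for cutoff, predicted in sorted(predictions_by_year.items()):
        hits[cutoff] = [
            m for m in predicted
            if m in first_studied_year and first_studied_year[m] > cutoff
        ]
    return hits

# Hypothetical inputs for illustration only
predictions = {2005: ["mat_1", "mat_2", "mat_3"], 2010: ["mat_2", "mat_4"]}
first_studied = {"mat_1": 2008, "mat_2": 2012, "mat_4": 2009}

print(backtest(predictions, first_studied))
```

Comparing the hit rate of the top-ranked predictions against that of randomly chosen materials is what indicates whether the ranking carries real predictive signal.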
48. How about “forward” predictions?
• Thus far, 2 of our top 20 predictions made in ~August 2018 have already been reported in the literature for the first time as thermoelectrics
– Li3Sb was the subject of a computational study (predicted zT=2.42) in Oct 2018
– SnTe2 was experimentally found to be a moderately good thermoelectric (expt zT=0.71) in Dec 2018
• We are working with an experimentalist on one of the predictions (but “spare time” project)
[1] Yang et al. "Low lattice thermal conductivity and excellent thermoelectric behavior in Li3Sb and Li3Bi." Journal of Physics: Condensed Matter 30.42 (2018): 425401
[2] Wang et al. "Ultralow lattice thermal conductivity and electronic properties of monolayer 1T phase semimetal SiTe2 and SnTe2." Physica E: Low-dimensional Systems and Nanostructures 108 (2019): 53-59
49. How is this working?
“Context words” link together information from different sources
50. Conclusions
• Developing new materials is of fundamental importance to realizing new physical technologies
• Today, it is possible to start designing phases of matter in a computer (or supercomputer)
• New advancements in computation and machine learning will bring us closer to being able to design new substances from our desks
51. Acknowledgements
• High-throughput DFT
– Gerbrand Ceder and “BURP” team
– Funding: Bosch / Umicore
• Natural language processing
– Gerbrand Ceder, Kristin Persson, and “Matscholar” team
– Funding: Toyota Research Institute
• Overall work funded by US Department of Energy