The Materials Project
Ecosystem
A Complete Software and Data
Platform for Materials Informatics
Shyue Ping Ong, University of California, San Diego
“Information wants to be free.”
– Stewart Brand, 1960s
“Information wants to be free and
code wants to be wrong.”
– RSA Conference 2008
“Materials information and code
wants to be free and right.”
The Materials Project is an open science
project to make the computed properties of
all known inorganic materials publicly
available to all researchers to accelerate
materials innovation.
June 2011: Materials Genome Initiative which
aims to “fund computational tools, software, new
methods for material characterization, and the
development of open standards and databases that
will make the process of discovery and development
of advanced materials faster, less expensive, and
more predictable”
https://www.materialsproject.org
As of Jun 5 2015
¨  Over 58,000 unique compounds, and growing
¨  Diverse set of many properties
¤  Structural (lattice parameters, atomic positions, etc.)
¤  Energetic (formation energies, phase stability, etc.)
¤  Electronic structure (DOS, band structures)
¤  Elastic constants
¨  Suite of Web Apps for materials analysis
User-friendly Web Apps
Materials Explorer: Search for materials by formula,
elements or properties
Battery Explorer: Search for battery materials by
voltage, capacity and other properties
Crystal Toolkit: Design new materials from existing
materials
Structure Predictor: Predict novel structures
Phase Diagram App: Generate compositional and
grand canonical phase diagrams
Pourbaix Diagram App: Generate Pourbaix
diagrams
Reaction Calculator: Balance reactions and calculate
their enthalpies
Materials Project data in User papers
M. Meinert, M.P. Geisler, Phase stability of chromium based
compensated ferrimagnets with inverse Heusler structure, J.
Magn. Magn. Mater. 341 (2013) 72–74.
J. Rustad, Density functional calculations of the enthalpies of
formation of rare-earth orthophosphates, Am. Mineral. 97
(2012) 791–799.
M. Fondell, T.J. Jacobsson, M. Boman, T. Edvinsson, Optical
quantum confinement in low dimensional hematite, J. Mater.
Chem. A. 2 (2014) 3352.
Web frontend is only the tip of the iceberg…
pymatgen
FireWorks
REST API
custodian
MPWorks
MPEnv
rubicon
Hierarchical design of codebases
keeps infrastructure nimble to changes
WORKFLOW CODE
CHEMISTRY CODE
Many types of use cases
FireWorks pymatgen custodian MPWorks
Crystal workflows
FireWorks pymatgen custodian rubicon (private)
Molecule workflows
pymatgen
FireWorks
external
MAST, MaterialsHub
external
Berlin ML, JGI, MoDeNa
Sustainable software development
¨  Open-source
¤  Managed via
¤  More eyes => robustness
¤  Contributions from all over the world
¨  Benevolent dictators
¤  Unified vision
¤  Quality control
¨  Clear documentation
¤  Prevent code rot
¤  More users
¨  Continuous integration and testing
¤  Ensure code is always working
Python Materials Genomics (pymatgen)
¨  Core materials analysis powering the Materials
Project
¨  Defines core extensible Python objects for materials
data representation.
¨  Provides a robust and well-documented set of
structure and thermodynamic analysis tools relevant to
many applications.
¨  Establishes an open platform for researchers to
collaboratively develop sophisticated analyses of
materials data.
Extensive Materials Analysis Capabilities
Input/
Output
objects
(Modular, Reusable, Extendable)
Defects and Transformations
Electronic Structure
XRD Patterns
Phase and Pourbaix Diagrams
Functional properties
Comprehensively
documented
Continuously tested
and integrated
Active dev/user community
www.pymatgen.org stats
•  > 6000 views per month on average
•  (~50% increase from previous year)
v2.9.12 → v3.0.13
*Python 2/3 compatible!
Other improvements
•  ABINIT support
•  Defects (Haranczyk/LBNL)
•  Q-Chem (JCESR)
•  Bug fixes & improvements
Very active user community!
81 forks (developers making changes and contributing)
The rate of commits has slowed somewhat, as expected for
a maturing and robust code base.
Pymatgen-db
¨  Database add-on for pymatgen. Enables the
creation of Materials Project-style MongoDB
(www.mongodb.org) databases for management of
materials data. Key features:
¤  Query engine for easy translation of MongoDB docs to
useful pymatgen objects for analysis purposes.
¤  Includes a clean and intuitive web ui (the Materials
Genomics UI) for exploring Mongo collections.
¤  http://pythonhosted.org//pymatgen-db/
Custodian
¨  Simple, robust and flexible just-in-time
(JIT) job management framework.
¤  Wrappers to perform error checking,
job management and error recovery.
¤  Error recovery is an important aspect
for HT: O(100,000) jobs + 1% error
rate => O(1000) errored jobs.
¤  Existing sub-packages for error
handling for VASP, NWChem and
Q-Chem calculations.
¨  Blue: Controlled by subclasses of Job
¨  Red: Defined by ErrorHandlers.
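The control flow described above (a Job runs, ErrorHandlers check for problems and apply corrections, and the job is rerun) can be sketched in plain Python. This is a minimal illustration and not custodian's actual API: the names `FlakyJob`, `HalveStepHandler` and `run_with_recovery`, and the "step size" failure mode, are all hypothetical.

```python
class FlakyJob:
    """Hypothetical job that fails until a setting has been corrected."""

    def __init__(self):
        self.settings = {"step_size": 1.0}
        self.error = None

    def run(self):
        # Pretend the calculation only converges with a small step size.
        self.error = "diverged" if self.settings["step_size"] > 0.3 else None


class HalveStepHandler:
    """Hypothetical handler: detects divergence and halves the step size."""

    def check(self, job):
        return job.error == "diverged"

    def correct(self, job):
        job.settings["step_size"] /= 2


def run_with_recovery(job, handlers, max_errors=5):
    """Run the job, letting handlers fix errors, until it succeeds."""
    for attempt in range(1, max_errors + 1):
        job.run()
        triggered = [h for h in handlers if h.check(job)]
        if not triggered:
            return attempt  # number of runs that were needed
        for handler in triggered:
            handler.correct(job)
    raise RuntimeError("too many errors; giving up")


job = FlakyJob()
attempts = run_with_recovery(job, [HalveStepHandler()])
print(attempts, job.settings)
```

The cap on the number of corrections matters in a high-throughput setting: a job that cannot be repaired should fail loudly rather than loop forever.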
Concrete Example for VASP
calculations
¨  An extensive set of rules has been codified for running VASP
calculations
¨  Significantly reduces error rate of calculations (< 1%)
VaspJob class
¨  auto_npar: automatically sets NPAR in INCAR to a
relatively optimal number based on the detected number of
processors. Enhances VASP calculation efficiency by ~10-30%!
¨  auto_gamma: if this is a gamma-only calculation and a
gamma-compiled version of VASP exists, use it. Another
10-20% increase in efficiency!
¨  Even without error handling, custodian already significantly
improves resource utilization when running VASP calculations!
VaspJob(vasp_cmd, output_file="vasp.out",
        auto_npar=True, auto_gamma=True,
        …<other options>…)
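To make the idea behind auto_npar concrete, the sketch below picks a parallelization factor from the core count. The rule used here (the smallest divisor of the core count that is at least its integer square root) and the name `pick_npar` are assumptions for illustration only; the slides do not state which rule custodian applies.

```python
import math


def pick_npar(ncores):
    """Smallest divisor d of ncores with isqrt(ncores) <= d < ncores, else 1."""
    for npar in range(math.isqrt(ncores), ncores):
        if ncores % npar == 0:
            return npar
    return 1  # e.g. prime core counts have no suitable divisor


for n in (16, 24, 32):
    print(n, pick_npar(n))
```

A rule of this kind keeps NPAR a divisor of the core count, so the cores split evenly into groups.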
FireWorks is the Workflow Manager
Custom material: "A cool material!!", with lots of information about it → Submit!
FireWorks then handles:
¨  Input generation (parameter choice)
¨  Workflow mapping
¨  Supercomputer submission / monitoring
¨  Error handling
¨  File transfer
¨  File parsing / DB insertion
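The stages named on this slide form a dependency chain, which is the kind of structure a workflow manager executes. The sketch below expresses that chain as a graph and runs it in dependency order using only the Python standard library. It is not the FireWorks API, and the strictly linear ordering of the stages is an assumption read off the slide.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it starts.
# The linear ordering is an assumption, not something FireWorks prescribes.
dependencies = {
    "input generation": set(),
    "workflow mapping": {"input generation"},
    "supercomputer submission": {"workflow mapping"},
    "error handling": {"supercomputer submission"},
    "file transfer": {"error handling"},
    "file parsing / DB insertion": {"file transfer"},
}


def run_workflow(deps):
    """Execute stages in dependency order and return the order used."""
    order = list(TopologicalSorter(deps).static_order())
    for stage in order:
        print("running:", stage)  # a real system would launch a job here
    return order


order = run_workflow(dependencies)
```

Describing a workflow as data (stages plus dependencies) is what lets a manager rerun, branch or parallelize stages without changing the stages themselves.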
FireWorks as a platform
Community can write any workflow in FireWorks
→ We can automate it over most supercomputing resources
Example workflows: structure, charge, band structure, DOS, optical, phonons, XAFS spectra, GW
Workflows in Development by Internal/
External Collaborations
¨  Elastic constants (in production)
¨  Thermal properties (Phonon / GIBBS: in testing)
¨  Surfaces (in testing)
¨  GW / hybrid calculations
¨  ABINIT workflows (Geoffroy Hautier, UCL)
¨  Any code can be added and automated
Materials
Project DB
How do I
access MP
data?
Option 1: Direct access
Most flexible and powerful, but
•  User needs to know db language
•  Security is an issue
•  Fragile – if db tech or schema
changes, user’s analysis breaks
Option 2: Web Apps
Pros
•  Intuitive and user-friendly
•  Secure
Cons
•  Significant loss in flexibility
and power
WebApps
Option 3: Web Apps built on RESTful API
Pros
•  Intuitive and user-friendly
•  Secure
•  Programmatic access for developers and researchers
The Materials API
An open platform for accessing Materials
Project data based on REpresentational State
Transfer (REST) principles.
Flexible and scalable to cater to large
number of users, with different access
privileges.
Simple to use and code agnostic.
A REST API maps a URL to a resource.
Example:
GET https://api.dropbox.com/1/account/info
Returns information about a user’s account.
Methods: GET, POST, PUT, DELETE, etc.
Response: Usually JSON or XML or both
Who implements REST APIs?
https://www.materialsproject.org/rest/v2/materials/Fe2O3/vasp/energy
¨  Preamble: https://www.materialsproject.org/rest/v2
¨  Request type: materials
¨  Identifier: Fe2O3 (typically a formula (Fe2O3), id (1234) or chemical system (Li-Fe-O))
¨  Data type: vasp (vasp, exp, etc.)
¨  Property: energy
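The URL anatomy above can be captured in a small helper that joins the parts in order. The function name `materials_url` is hypothetical; the preamble, request type and example values are the ones shown on the slide.

```python
BASE = "https://www.materialsproject.org/rest/v2"  # preamble from the slide


def materials_url(identifier, data_type, prop, base=BASE):
    """Join preamble, request type, identifier, data type and property."""
    return "/".join([base, "materials", identifier, data_type, prop])


print(materials_url("Fe2O3", "vasp", "energy"))
```

Swapping the identifier for an id or a chemical system such as Li-Fe-O yields the other forms of request described above.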
Secure access
An individual API key provides secure access
with defined privileges.
All https requests must supply API key as
either a “x-api-key” header or a GET/POST
“API_KEY” parameter.
API key available at
https://www.materialsproject.org/dashboard
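A minimal sketch of attaching the key as the "x-api-key" header, using only the Python standard library. The key value is a placeholder, `build_request` is a hypothetical helper name, and the request is only constructed here, not sent, since the endpoint shown in these slides may no longer be served.

```python
import urllib.request


def build_request(url, api_key):
    """Create a GET request carrying the API key in the x-api-key header."""
    return urllib.request.Request(url, headers={"x-api-key": api_key})


req = build_request(
    "https://www.materialsproject.org/rest/v2/materials/Fe2O3/vasp/energy",
    "YOUR_API_KEY",  # placeholder, not a real key
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```

Sending the key in a header rather than as a URL parameter keeps it out of server logs and browser history.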
Sample output (JSON)
¨  Intuitive response
format
¨  Machine-readable
(JSON parsers
available for most
programming
languages)
¨  Metadata provides
provenance for
tracking
{
  "created_at": "2014-07-18T11:23:25.415382",
  "valid_response": true,
  "version": {
    "pymatgen": "2.9.9",
    "db": "2014.04.18",
    "rest": "1.0"
  },
  "response": [
    {
      "energy": -67.16532048,
      "material_id": "mp-24972"
    },
    {
      "energy": -132.33035197,
      "material_id": "mp-542309"
    },
    {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}
  ],
  "copyright": "Materials Project, 2012"
}
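A response of this shape is straightforward to consume with a standard JSON parser. The sketch below uses a trimmed copy of the sample, keeping only the two entries whose values are visible; the helper name `energies_by_id` is hypothetical.

```python
import json

# Trimmed copy of the sample response, with only the two visible entries.
raw = """
{
  "valid_response": true,
  "response": [
    {"energy": -67.16532048, "material_id": "mp-24972"},
    {"energy": -132.33035197, "material_id": "mp-542309"}
  ]
}
"""


def energies_by_id(text):
    """Map material_id -> energy, refusing responses flagged invalid."""
    doc = json.loads(text)
    if not doc.get("valid_response"):
        raise ValueError("API reported an invalid response")
    return {entry["material_id"]: entry["energy"] for entry in doc["response"]}


energies = energies_by_id(raw)
print(energies)
```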
Can I really access any piece of data
in the Materials Project?
Github-powered RESTful documentation
http://bit.ly/materialsapi
Via the shockingly powerful
https://www.materialsproject.org/rest/v2/query
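As a rough sketch of how a request to the query endpoint might be prepared: the slides do not show the request format, so the form field names `criteria` and `properties`, the MongoDB-style criteria dictionary, and the field name `pretty_formula` are all assumptions here, and `build_query_body` is a hypothetical helper. The body is only built, not sent.

```python
import json
import urllib.parse

QUERY_URL = "https://www.materialsproject.org/rest/v2/query"


def build_query_body(criteria, properties):
    """Encode a query as form data (field names are assumed, see above)."""
    fields = {
        "criteria": json.dumps(criteria),
        "properties": json.dumps(properties),
    }
    return urllib.parse.urlencode(fields).encode("utf-8")


body = build_query_body({"pretty_formula": "Fe2O3"}, ["material_id", "energy"])
print(QUERY_URL, body)
```

The appeal of a single query endpoint is that one request format covers arbitrary combinations of filters and returned properties, instead of one URL pattern per property.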
Demo
http://localhost:8888/notebooks
The Materials API + pymatgen in Education
– UCSD’s NANO 106
¨  Data mined over the Materials Project’s 49,000+ unique
crystals
http://www.bit.ly/sg_stats
P21/c is the most common
space group, comprising
~9.8% of all compounds
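A statistic like this is a frequency count over the space-group symbol of each crystal. The sketch below shows the counting step on a tiny made-up list; the symbols are toy data, not Materials Project results, and `most_common_fraction` is a hypothetical helper name.

```python
from collections import Counter

# Toy list of space-group symbols; NOT Materials Project data.
space_groups = ["P21/c", "Fm-3m", "P21/c", "Pnma", "P-1", "P21/c", "Pnma"]


def most_common_fraction(symbols):
    """Return the most frequent symbol and its share of all entries."""
    symbol, count = Counter(symbols).most_common(1)[0]
    return symbol, count / len(symbols)


symbol, fraction = most_common_fraction(space_groups)
print(f"{symbol}: {fraction:.1%}")
```

In the classroom setting described above, the list of symbols would come from the Materials API rather than being typed in by hand.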
The Materials Virtual Lab @ UCSD’s
One-click AIMD
Starting candidates
Topological Screening
(augmented by DFT)
Stability (phase &
EW) screening
Diffusivity
Optimized
candidates
Automated “one-click” MD
workflow based on pymatgen,
custodian and fireworks
AIMD SDSC
Multi-week AIMD simulation
Statistical exclusionary
screening
Y. Mo, S. P. Ong, G. Ceder, “Insights into Diffusion Mechanisms in P2
Layered Oxide Materials by First-Principles Calculations”, submitted
Automated pathway
extraction + NEB
Coming soon (full
launch in next few
weeks)!!
Sounds good, where do I learn more?
¨  The Materials Project
¤  https://www.materialsproject.org/open
¨  The Materials API Github Doc
¤  http://bit.ly/materialsapi
¨  The Materials Virtual Lab (MAVRL) @ UCSD
¤  Slides from Workshop on MP infrastructure (http://mavrl.org/software)
Thank you.

More Related Content

What's hot

Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in RAnqi Fu
 
PigSPARQL: A SPARQL Query Processing Baseline for Big Data
PigSPARQL: A SPARQL Query Processing Baseline for Big DataPigSPARQL: A SPARQL Query Processing Baseline for Big Data
PigSPARQL: A SPARQL Query Processing Baseline for Big DataAlexander Schätzle
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanBoris Glavic
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationTanu Malik
 
OREChem Services and Workflows
OREChem Services and WorkflowsOREChem Services and Workflows
OREChem Services and Workflowsmarpierc
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersIan Foster
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14Sri Ambati
 
OGCE Project Overview
OGCE Project OverviewOGCE Project Overview
OGCE Project Overviewmarpierc
 
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...jaxLondonConference
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMfnothaft
 
The Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentThe Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentRutger Vos
 
Spark the next top compute model
Spark   the next top compute modelSpark   the next top compute model
Spark the next top compute modelDean Wampler
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
H2O World - Munging, modeling, and pipelines using Python - Hank Roark
H2O World - Munging, modeling, and pipelines using Python - Hank RoarkH2O World - Munging, modeling, and pipelines using Python - Hank Roark
H2O World - Munging, modeling, and pipelines using Python - Hank RoarkSri Ambati
 
H2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangH2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangSri Ambati
 
Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)Kim Herzig
 
Ase2010 shang
Ase2010 shangAse2010 shang
Ase2010 shangSAIL_QU
 
Remote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJSRemote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJSSumant Tambe
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsTanu Malik
 
Indiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway SupportIndiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway Supportmarpierc
 

What's hot (20)

Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
 
PigSPARQL: A SPARQL Query Processing Baseline for Big Data
PigSPARQL: A SPARQL Query Processing Baseline for Big DataPigSPARQL: A SPARQL Query Processing Baseline for Big Data
PigSPARQL: A SPARQL Query Processing Baseline for Big Data
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, Ian
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database Virtualization
 
OREChem Services and Workflows
OREChem Services and WorkflowsOREChem Services and Workflows
OREChem Services and Workflows
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and Supercomputers
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14
 
OGCE Project Overview
OGCE Project OverviewOGCE Project Overview
OGCE Project Overview
 
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
 
The Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentThe Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environment
 
Spark the next top compute model
Spark   the next top compute modelSpark   the next top compute model
Spark the next top compute model
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
H2O World - Munging, modeling, and pipelines using Python - Hank Roark
H2O World - Munging, modeling, and pipelines using Python - Hank RoarkH2O World - Munging, modeling, and pipelines using Python - Hank Roark
H2O World - Munging, modeling, and pipelines using Python - Hank Roark
 
H2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangH2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy Wang
 
Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)
 
Ase2010 shang
Ase2010 shangAse2010 shang
Ase2010 shang
 
Remote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJSRemote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJS
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
 
Indiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway SupportIndiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway Support
 

Viewers also liked

Data dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNLData dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNLAnubhav Jain
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Anubhav Jain
 
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...Anubhav Jain
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Anubhav Jain
 
Combining High-Throughput Computing and Statistical Learning to Develop and U...
Combining High-Throughput Computing and Statistical Learning to Develop and U...Combining High-Throughput Computing and Statistical Learning to Develop and U...
Combining High-Throughput Computing and Statistical Learning to Develop and U...Anubhav Jain
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Anubhav Jain
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science researchAnubhav Jain
 
Application of the Materials Project database and data mining towards the des...
Application of the Materials Project database and data mining towards the des...Application of the Materials Project database and data mining towards the des...
Application of the Materials Project database and data mining towards the des...Anubhav Jain
 

Viewers also liked (8)

Data dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNLData dissemination and materials informatics at LBNL
Data dissemination and materials informatics at LBNL
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...
 
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...
Targeted Band Structure Design and Thermoelectric Materials Discovery Using H...
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...
 
Combining High-Throughput Computing and Statistical Learning to Develop and U...
Combining High-Throughput Computing and Statistical Learning to Develop and U...Combining High-Throughput Computing and Statistical Learning to Develop and U...
Combining High-Throughput Computing and Statistical Learning to Develop and U...
 
Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...Combining density functional theory calculations, supercomputing, and data-dr...
Combining density functional theory calculations, supercomputing, and data-dr...
 
Software tools to facilitate materials science research
Software tools to facilitate materials science researchSoftware tools to facilitate materials science research
Software tools to facilitate materials science research
 
Application of the Materials Project database and data mining towards the des...
Application of the Materials Project database and data mining towards the des...Application of the Materials Project database and data mining towards the des...
Application of the Materials Project database and data mining towards the des...
 

Similar to The Materials Project Ecosystem - A Complete Software and Data Platform for Materials Informatics

Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Anubhav Jain
 
Scientific
Scientific Scientific
Scientific marpierc
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Herman Wu
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"Pinar Alper
 
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...Shadab Ali Khan
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowKaxil Naik
 
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesDiscovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesIan Foster
 
Making project data avalialble eNanomapper through Database
Making project data avalialble eNanomapper through  DatabaseMaking project data avalialble eNanomapper through  Database
Making project data avalialble eNanomapper through DatabaseNina Jeliazkova
 
Data munging and analysis
Data munging and analysisData munging and analysis
Data munging and analysisRaminder Singh
 
XSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata TutorialXSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata Tutorialmarpierc
 
jlettvin.resume.20160922.STAR
jlettvin.resume.20160922.STARjlettvin.resume.20160922.STAR
jlettvin.resume.20160922.STARJonathan Lettvin
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningAnubhav Jain
 
SiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddhartha Mitra
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowDaniel S. Katz
 
Ogce Workflow Suite
Ogce Workflow SuiteOgce Workflow Suite
Ogce Workflow Suitesmarru
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Anubhav Jain
 
Deep dive into the native multi model database ArangoDB
Deep dive into the native multi model database ArangoDBDeep dive into the native multi model database ArangoDB
Deep dive into the native multi model database ArangoDBArangoDB Database
 

Similar to The Materials Project Ecosystem - A Complete Software and Data Platform for Materials Informatics (20)

Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
 
Scientific
Scientific Scientific
Scientific
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
OpenML Tutorial ECMLPKDD 2015
OpenML Tutorial ECMLPKDD 2015OpenML Tutorial ECMLPKDD 2015
OpenML Tutorial ECMLPKDD 2015
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"
 
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...
Conceptualizing And Prototyping A Scalable Genomic Data Analysis Pipeline: Us...
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache Airflow
 
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesDiscovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
 
Making project data avalialble eNanomapper through Database
Making project data avalialble eNanomapper through  DatabaseMaking project data avalialble eNanomapper through  Database
Making project data avalialble eNanomapper through Database
 
Data munging and analysis
Data munging and analysisData munging and analysis
Data munging and analysis
 
XSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata TutorialXSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata Tutorial
 
jlettvin.resume.20160922.STAR
jlettvin.resume.20160922.STARjlettvin.resume.20160922.STAR
jlettvin.resume.20160922.STAR
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
 
SiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdf
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
 
04 open source_tools
04 open source_tools04 open source_tools
04 open source_tools
 
Ogce Workflow Suite
Ogce Workflow SuiteOgce Workflow Suite
Ogce Workflow Suite
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
VictorCassen
VictorCassenVictorCassen
VictorCassen
 
Deep dive into the native multi model database ArangoDB
Deep dive into the native multi model database ArangoDBDeep dive into the native multi model database ArangoDB
Deep dive into the native multi model database ArangoDB
 


The Materials Project Ecosystem - A Complete Software and Data Platform for Materials Informatics

  • 1. The Materials Project Ecosystem A Complete Software and Data Platform for Materials Informatics Shyue Ping Ong, University of California, San Diego
  • 2. “Information wants to be free.” – Stewart Brand, 1960s
  • 3. “Information wants to be free and code wants to be wrong.” – RSA Conference 2008
  • 4. “Materials information and code wants to be free and right.”
  • 5. The Materials Project is an open science project to make the computed properties of all known inorganic materials publicly available to all researchers to accelerate materials innovation. June 2011: the Materials Genome Initiative is launched, aiming to “fund computational tools, software, new methods for material characterization, and the development of open standards and databases that will make the process of discovery and development of advanced materials faster, less expensive, and more predictable”. https://www.materialsproject.org
  • 6. As of Jun 5 2015: over 58,000 unique compounds, and growing; a diverse set of properties, including structural (lattice parameters, atomic positions, etc.), energetic (formation energies, phase stability, etc.), electronic structure (DOS, band structures) and elastic constants; and a suite of web apps for materials analysis.
  • 7. User-friendly Web Apps. Materials Explorer: search for materials by formula, elements or properties. Battery Explorer: search for battery materials by voltage, capacity and other properties. Crystal Toolkit: design new materials from existing materials. Structure Predictor: predict novel structures. Phase Diagram App: generate compositional and grand canonical phase diagrams. Pourbaix Diagram App: generate Pourbaix diagrams. Reaction Calculator: balance reactions and calculate their enthalpies.
  • 8. Materials Project data in user papers: M. Meinert, M.P. Geisler, Phase stability of chromium based compensated ferrimagnets with inverse Heusler structure, J. Magn. Magn. Mater. 341 (2013) 72–74; J. Rustad, Density functional calculations of the enthalpies of formation of rare-earth orthophosphates, Am. Mineral. 97 (2012) 791–799; M. Fondell, T.J. Jacobsson, M. Boman, T. Edvinsson, Optical quantum confinement in low dimensional hematite, J. Mater. Chem. A. 2 (2014) 3352.
  • 9. Web frontend is only the tip of the iceberg… Beneath it: pymatgen, FireWorks, REST API, custodian, MPWorks, MPEnv, rubicon.
  • 10.
  • 11. Hierarchical design of codebases keeps infrastructure nimble to changes. [Diagram layers: workflow code, chemistry code]
  • 12. Many types of use cases. Crystal workflows: FireWorks, pymatgen, custodian, MPWorks. Molecule workflows: FireWorks, pymatgen, custodian, rubicon (private). External users of pymatgen and FireWorks: MAST, MaterialsHub; Berlin ML, JGI, MoDeNa.
  • 13. Sustainable software development. Open-source: managed via [logo not captured]; more eyes => robustness; contributions from all over the world. Benevolent dictators: unified vision; quality control. Clear documentation: prevents code rot; more users. Continuous integration and testing: ensures code is always working.
  • 14. Python Materials Genomics (pymatgen): the core materials analysis library powering the Materials Project. It defines core extensible Python objects for materials data representation; provides a robust and well-documented set of structure and thermodynamic analysis tools relevant to many applications; and establishes an open platform for researchers to collaboratively develop sophisticated analyses of materials data.
  • 15. Extensive Materials Analysis Capabilities: input/output objects (modular, reusable, extendable); defects and transformations; electronic structure; XRD patterns; phase and Pourbaix diagrams; functional properties. Comprehensively documented; continuously tested and integrated; active dev/user community.
  • 16. www.pymatgen.org stats: > 6000 views per month on average (~50% increase from previous year). v2.9.12 → v3.0.13, now Python 2/3 compatible! Other improvements: ABINIT support; defects (Haranczyk/LBNL); Qchem (JCESR); bug fixes & improvements. Very active user community: 81 forks (developers making changes and contributing). The actual commit rate has slowed somewhat, as expected for a maturing and robust code base.
  • 17. Pymatgen-db: database add-on for pymatgen. Enables the creation of Materials Project-style MongoDB (www.mongodb.org) databases for management of materials data. Key features: a query engine for easy translation of MongoDB docs to useful pymatgen objects for analysis purposes; a clean and intuitive web UI (the Materials Genomics UI) for exploring Mongo collections. http://pythonhosted.org//pymatgen-db/
  • 18. Custodian: a simple, robust and flexible just-in-time (JIT) job management framework. It provides wrappers to perform error checking, job management and error recovery. Error recovery is an important aspect of high-throughput (HT) work: O(100,000) jobs at a 1% error rate => O(1000) errored jobs. Existing sub-packages handle errors for VASP, NWChem and Q-Chem calculations. [Flowchart legend] Blue: controlled by subclasses of Job. Red: defined by ErrorHandlers.
  • 19. Concrete example for VASP calculations: an extensive set of rules has been codified for running VASP calculations, which significantly reduces the error rate of calculations (< 1%).
  • 20. VaspJob class. auto_npar: automatically modifies NPAR in INCAR to a relatively optimal number based on the detected number of processors, enhancing VASP calculation efficiency by ~10-30%. auto_gamma: if this is a gamma-only calculation and a gamma-compiled version of VASP exists, use it, for another 10-20% increase in efficiency. Even without error handling, custodian already significantly improves resource utilization of running VASP calculations! Usage: VaspJob(vasp_cmd, output_file="vasp.out", auto_npar=True, auto_gamma=True, …<other options>...)
  • 21. FireWorks is the Workflow Manager. [Diagram: custom material (a cool material) → Submit! → lots of information about the cool material. In between, the workflow manager handles input generation (parameter choice), workflow mapping, supercomputer submission / monitoring, error handling, file transfer, and file parsing / DB insertion.]
  • 22. FireWorks as a platform: the community can write any workflow in FireWorks → we can automate it over most supercomputing resources. [Diagram labels: structure, charge, band structure, DOS, optical, phonons, XAFS spectra, GW]
  • 23. Workflows in development by internal/external collaborations: elastic constants (in production); thermal properties (phonon / GIBBS: in testing); surfaces (in testing); GW / hybrid calculations; ABINIT workflows (Geoffroy Hautier, UCL). Any code can be added and automated.
  • 24. How do I access MP data? [Diagram: Materials Project DB]
  • 25. How do I access MP data? Option 1: direct access to the Materials Project DB. Most flexible and powerful, but: the user needs to know the db language; security is an issue; fragile: if the db tech or schema changes, the user’s analysis breaks.
  • 26. How do I access MP data? Option 2: Web Apps on top of the Materials Project DB. Pros: intuitive and user-friendly; secure. Cons: significant loss in flexibility and power.
  • 27. How do I access MP data? Option 3: Web Apps built on a RESTful API over the Materials Project DB. Pros: intuitive and user-friendly; secure; programmatic access for developers and researchers.
  • 28. The Materials API: an open platform for accessing Materials Project data based on REpresentational State Transfer (REST) principles. Flexible and scalable to cater to a large number of users with different access privileges. Simple to use and code agnostic.
  • 29. A REST API maps a URL to a resource. Example: GET https://api.dropbox.com/1/account/info returns information about a user’s account. Methods: GET, POST, PUT, DELETE, etc. Response: usually JSON or XML or both.
  • 31. Anatomy of a request URL: https://www.materialsproject.org/rest/v2/materials/Fe2O3/vasp/energy. Preamble: https://www.materialsproject.org/rest/v2. Request type: materials. Identifier: Fe2O3; typically a formula (Fe2O3), id (1234) or chemical system (Li-Fe-O). Data type: vasp (vasp, exp, etc.). Property: energy.
  • 32. Secure access: an individual API key provides secure access with defined privileges. All https requests must supply the API key as either an “x-api-key” header or a GET/POST “API_KEY” parameter. API key available at https://www.materialsproject.org/dashboard
  • 33. Sample output (JSON): intuitive response format; machine-readable (JSON parsers available for most programming languages); metadata provides provenance for tracking. Sample: { created_at: "2014-07-18T11:23:25.415382", valid_response: true, version: { pymatgen: "2.9.9", db: "2014.04.18", rest: "1.0" }, response: [ { energy: -67.16532048, material_id: "mp-24972" }, { energy: -132.33035197, material_id: "mp-542309" }, {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…} ], copyright: "Materials Project, 2012" }
  • 34. Can I really access any piece of data in the Materials Project? Yes, via the shockingly powerful https://www.materialsproject.org/rest/v2/query. Github-powered RESTful documentation: http://bit.ly/materialsapi
  • 36. The Materials API + pymatgen in Education: UCSD’s NANO 106. Data mined over the Materials Project’s 49,000+ unique crystals (http://www.bit.ly/sg_stats): P21/c is the most common space group, comprising ~9.8% of all compounds.
  • 37. The Materials Virtual Lab @ UCSD’s One-click AIMD. [Screening funnel: starting candidates → topological screening (augmented by DFT) → stability (phase & EW) screening → diffusivity → optimized candidates.] Automated “one-click” MD workflow based on pymatgen, custodian and fireworks: AIMD at SDSC; multi-week AIMD simulation; statistical exclusionary screening; automated pathway extraction + NEB. Y. Mo, S. P. Ong, G. Ceder, “Insights into Diffusion Mechanisms in P2 Layered Oxide Materials by First-Principles Calculations”, submitted.
  • 38. Coming soon (full launch in next few weeks)!!
  • 39. Sounds good, where do I learn more? The Materials Project: https://www.materialsproject.org/open ; the Materials API Github Doc: http://bit.ly/materialsapi ; the Materials Virtual Lab (MAVRL) @ UCSD: slides from Workshop on MP infrastructure ( http://mavrl.org/software ).
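
Slides 31 and 32 describe how a Materials API URL is put together and how the API key is supplied. The sketch below assembles such a URL and attaches the key as an "x-api-key" header using only the Python standard library. The helper names build_url and build_request are hypothetical, the key is a placeholder, and nothing is sent over the network; the endpoint layout is taken from the slides and may have changed since 2015.

```python
from urllib.parse import quote
from urllib.request import Request

# Preamble as shown on slide 31 (2015-era endpoint; may no longer be served).
BASE_URL = "https://www.materialsproject.org/rest/v2"


def build_url(identifier, data_type="vasp", prop=None, request_type="materials"):
    """Join preamble / request type / identifier / data type [/ property]."""
    parts = [BASE_URL, request_type, quote(identifier, safe=""), data_type]
    if prop:
        parts.append(prop)
    return "/".join(parts)


def build_request(url, api_key):
    """Prepare a GET request carrying the key in an x-api-key header.

    The request is only constructed here, not sent.
    """
    return Request(url, headers={"x-api-key": api_key}, method="GET")


url = build_url("Fe2O3", "vasp", "energy")
request = build_request(url, "YOUR_API_KEY")  # placeholder, not a real key
print(url)
```

Passing the prepared request to urllib.request.urlopen would perform the call; the same header could be set with any HTTP client, which is what the slides mean by "code agnostic".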
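
Slide 33 shows the JSON envelope returned by the API. As a minimal sketch of consuming it, the snippet below parses a trimmed copy of that sample (only the two entries that are expanded on the slide; the eight collapsed ones are omitted) and maps material ids to energies. The function name energies_by_id is hypothetical.

```python
import json

# Trimmed copy of the slide-33 sample; the collapsed entries are left out.
SAMPLE = """
{
  "created_at": "2014-07-18T11:23:25.415382",
  "valid_response": true,
  "version": {"pymatgen": "2.9.9", "db": "2014.04.18", "rest": "1.0"},
  "response": [
    {"energy": -67.16532048, "material_id": "mp-24972"},
    {"energy": -132.33035197, "material_id": "mp-542309"}
  ],
  "copyright": "Materials Project, 2012"
}
"""


def energies_by_id(raw):
    """Return {material_id: energy} from a Materials API style response."""
    doc = json.loads(raw)
    if not doc.get("valid_response"):
        raise ValueError("API reported an invalid response")
    return {entry["material_id"]: entry["energy"] for entry in doc["response"]}


energies = energies_by_id(SAMPLE)
for material_id, energy in sorted(energies.items()):
    print(material_id, energy)
```

The version block in the envelope is what gives the provenance mentioned on the slide: a stored result can be tied to the pymatgen, database and REST versions that produced it.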
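
Slide 18 describes custodian's division of labour: jobs run the calculation, error handlers check the output and apply corrections, and the job is retried. The sketch below illustrates that check-and-recover loop in plain Python. FlakyJob, BadSettingHandler and run_with_recovery are hypothetical stand-ins written for this illustration, not custodian's actual classes or API.

```python
class FlakyJob:
    """Hypothetical job that fails until its setting has been corrected."""

    def __init__(self):
        self.setting = "bad"
        self.runs = 0

    def run(self):
        self.runs += 1
        return "ok" if self.setting == "good" else "error: bad setting"


class BadSettingHandler:
    """Hypothetical handler: recognise one error signature and repair it."""

    def check(self, output):
        return output.startswith("error: bad setting")

    def correct(self, job):
        job.setting = "good"


def run_with_recovery(job, handlers, max_errors=5):
    """Run the job; after a detected error apply a correction and retry.

    Returns the number of runs needed. A run with no detected error
    counts as success.
    """
    for attempt in range(1, max_errors + 1):
        output = job.run()
        triggered = [h for h in handlers if h.check(output)]
        if not triggered:
            return attempt
        triggered[0].correct(job)
    raise RuntimeError("job still failing after %d attempts" % max_errors)


# Scale argument from the slide: 100,000 jobs at a 1% error rate.
n_jobs, error_percent = 100_000, 1
expected_failures = n_jobs * error_percent // 100

attempts = run_with_recovery(FlakyJob(), [BadSettingHandler()])
print(expected_failures, attempts)
```

At that scale roughly a thousand failures would otherwise need manual attention, which is why the recovery step is automated rather than left to the user.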