Using MongoDB for Materials Discovery

How the Materials Project uses MongoDB


    Presentation Transcript

    • Using MongoDB for Materials Discovery. Michael Kocher and Dan Gunter, Lawrence Berkeley National Lab
    • Energy Mission at LBNL: Li-ion batteries; photovoltaics (solar cells); thermoelectrics; biofuels; new computational tools; cutting-edge spectroscopic tools (Advanced Light Source)
    • The current materials design model is slow: 18 years on average from a new material's discovery to commercialization. (Eagar, T.W. "Bringing New Materials to the Market." Technology Review, Feb 1995, 98, 42.)
    • Materials Genome Initiative: A Renaissance of American Manufacturing. “To help businesses discover, develop, and deploy new materials twice as fast, we're launching what we call the Materials Genome Initiative. The invention of silicon circuits and lithium-ion batteries made computers and iPods and iPads possible -- but it took years to get those technologies from the drawing board to the marketplace. We can do it faster.” - President Obama at Carnegie Mellon University, 6/24/2011
    • What is a Material?
    • NaCl Silicon
    • LiCoO2 (Li, O, Co)
    • What can we compute using quantum mechanics? Volume, density, total energy and formation energy, whether the material is metallic, etc. No empirical parameters!
    • “The Google of Material Science Data” + MIT and LBNL collaboration
    • Inverting the Problem
    • Detailed Properties
    • Machine Learning: How often can you substitute Mg for Ca? What about Na, V, P, O? [Diagram: materials.bson (Structure 1 through Structure 6) → Learning Algorithm → new materials] Prof. Gerbrand Ceder (DOI: 10.1103/PhysRevLett.91.135503)
    • Materials Project: A Play in Three Acts. I. Data generation using HTC; II. Data storage; III. Data analysis/logging
    • Act I: Managing Calculations. A centralized distributed model is the only way to go; the hub is at LBNL; state is stored in the db; overview of running many MPI jobs at many different HPC centers
    • MasterQueue. [Diagram labels: crystal; builder.x (create a new engine, add to queue); master_queue.bson, ‘The Brain’; manager.x instances pull from the queue on HPC systems: Franklin, Hopper, Carver (NERSC, Oakland); lr1, lr2 (Lawrencium, Berkeley)]
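
The queue pattern on this slide (a builder adds work, many managers pull it, and the state lives in the database) can be sketched in plain Python. This is only an illustration under stated assumptions: the `MasterQueue` class, the state names `WAITING`/`RUNNING`/`DONE`/`FAILED`, and the job descriptions are hypothetical and do not come from the slides, and an in-memory list with a lock stands in for the MongoDB collection.

```python
import threading


class MasterQueue:
    """In-memory stand-in for a job collection; names and states are illustrative."""

    def __init__(self):
        self._jobs = []
        self._lock = threading.Lock()

    def add(self, job_id, engine):
        """Builder side: register a new engine as a waiting job."""
        with self._lock:
            self._jobs.append(
                {"_id": job_id, "engine": engine, "state": "WAITING", "host": None}
            )

    def pull(self, host):
        """Manager side: claim the oldest waiting job, or return None if none is left."""
        with self._lock:  # the lock makes "find a waiting job and mark it" one step
            for job in self._jobs:
                if job["state"] == "WAITING":
                    job["state"] = "RUNNING"
                    job["host"] = host
                    return dict(job)
        return None

    def finish(self, job_id, ok=True):
        """Record the outcome so the central store always reflects the current state."""
        with self._lock:
            for job in self._jobs:
                if job["_id"] == job_id:
                    job["state"] = "DONE" if ok else "FAILED"

    def summary(self):
        with self._lock:
            return {job["_id"]: (job["state"], job["host"]) for job in self._jobs}


queue = MasterQueue()
queue.add(1, "relax LiCoO2")  # hypothetical job descriptions
queue.add(2, "relax NaCl")

first = queue.pull("hopper")
second = queue.pull("lr1")
third = queue.pull("carver")  # nothing left to claim
queue.finish(first["_id"])
```

The important property is that claiming a job is a single atomic step, so two managers on different machines never run the same calculation. Against a real MongoDB server that step would typically be one atomic find-and-modify operation rather than a Python lock.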
    • Centralized Logging and Management. [Diagram: manager.x instances on O1 and Cathode (MIT); Hopper, Franklin, and Carver (NERSC, Oakland); lr1 and lr2 (LBNL); DLX (Kentucky)] Example MongoDB: query = {'elements': {'$all': ['Li', 'O']}, 'nelectrons': {'$lte': 200}}
    • Act II: Core Data Storage
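
The query on this slide selects documents whose `elements` array contains both Li and O and whose `nelectrons` is at most 200. The sketch below evaluates just those two operators in plain Python so the semantics can be checked without a server. The sample documents and their `nelectrons` values are invented for illustration, and the `matches` helper is hypothetical, not part of MongoDB or of the authors' code.

```python
# Invented sample documents; only the field names come from the slide's query.
docs = [
    {"formula": "Li2O", "elements": ["Li", "O"], "nelectrons": 14},
    {"formula": "LiCoO2", "elements": ["Li", "Co", "O"], "nelectrons": 46},
    {"formula": "NaCl", "elements": ["Na", "Cl"], "nelectrons": 28},
    {"formula": "Li32O16", "elements": ["Li", "O"], "nelectrons": 224},
]

query = {"elements": {"$all": ["Li", "O"]}, "nelectrons": {"$lte": 200}}


def matches(doc, query):
    """Evaluate the two operators used on the slide ($all, $lte) for one document."""
    for field, condition in query.items():
        value = doc.get(field)
        for op, arg in condition.items():
            if op == "$all":
                # The array field must contain every listed value.
                if not isinstance(value, list) or not set(arg) <= set(value):
                    return False
            elif op == "$lte":
                if value is None or not value <= arg:
                    return False
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
    return True


hits = [d["formula"] for d in docs if matches(d, query)]
print(hits)
```

With pymongo, the same `query` dictionary would be passed unchanged to `collection.find(query)`, where `collection` is whatever collection holds the materials documents.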
    • Very Complex Documents
    • Powerful Querying. Every crystal that has (Li or Na or K), (Mn), (O or S or F or Si), plus one other element except (Zn or Ni or Fe or Cu or Co):
      { "lattice.volume" : { "$lt" : 500 },
        "elements" : { "$all" : ["Mn"], "$size" : 4, "$nin" : ["Zn", "Ni", "Fe", "Cu", "Co"] },
        "atoms" : { "$elemMatch" : { "oxidation_state" : 3, "symbol" : "Mn" } },
        "$where" : "match_all(this.element_names, ['Li', 'Na', 'K'], ['Mn'], ['O', 'S', 'F', 'Si'])" }
    • pre-MongoDB :( Search for materials with Li and O, excluding duplicates:
      ((SELECT structure.structureid FROM structure
          NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
        WHERE structureid IN
          ((SELECT structure.structureid FROM structure NATURAL INNER JOIN elemententry
              WHERE elemententry.symbol = 'Li'
            INTERSECT
            SELECT structure.structureid FROM structure NATURAL INNER JOIN elemententry
              WHERE elemententry.symbol = 'O')
           INTERSECT
           SELECT structure.structureid FROM structure
             NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
             WHERE database.title = 'ICSD'))
       EXCEPT
       (SELECT structure.structureid FROM structure
          WHERE structure.entryid IN (SELECT duplicateentry.entryid FROM duplicateentry)))
      EXCEPT
      (SELECT structure.structureid FROM structure
         WHERE structure.entryid IN (SELECT entryid FROM removals))
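
The chemistry rule stated on this slide can be written out in plain Python to make the selection logic explicit. The slides do not show the body of `match_all`, so the version below is an assumed reading (every group must contribute at least one element), and `is_candidate` is a hypothetical helper that covers only the element-based parts of the query, not the volume or oxidation-state conditions.

```python
EXCLUDED = {"Zn", "Ni", "Fe", "Cu", "Co"}


def match_all(element_names, *groups):
    """Assumed behavior: True when each group shares at least one element with the list."""
    names = set(element_names)
    return all(names & set(group) for group in groups)


def is_candidate(elements):
    """Element-based part of the slide's rule, applied to one crystal's element list."""
    return (
        len(elements) == 4                   # "$size": 4
        and not (set(elements) & EXCLUDED)   # "$nin": none of the excluded metals
        and match_all(
            elements,
            ["Li", "Na", "K"],
            ["Mn"],
            ["O", "S", "F", "Si"],
        )
    )


# Hypothetical element lists used only to exercise the rule.
examples = {
    "Li-Mn-O-P": ["Li", "Mn", "O", "P"],
    "Li-Mn-O-Fe": ["Li", "Mn", "O", "Fe"],
    "Li-Mn-O": ["Li", "Mn", "O"],
    "Na-Mn-S-Ti": ["Na", "Mn", "S", "Ti"],
}
selected = [name for name, elements in examples.items() if is_candidate(elements)]
print(selected)
```

Expressing the rule as one predicate is what makes the document-store query so much shorter than the relational version: the element list lives inside each document, so no joins are needed to test it.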
    • Map/Reduce. [Diagram: Calculation 12, Calculation 13 ✓, Calculation 14, Calculation 15; tasks.bson → MR → materials.bson]
    • Every App uses MongoDB: structure_predictors.bson, candidate_materials.bson, diffraction_patterns.bson (by G. Hautier)
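
The slides do not show the map and reduce functions themselves. One plausible use, assumed here rather than taken from the presentation, is to collapse many calculation documents into one summary document per material. The sketch below runs that idea in plain Python; the field names, the energies, and the choice to keep the lowest energy are all invented for illustration.

```python
from collections import defaultdict

# Invented calculation documents; task ids echo the slide's "Calculation 12..15".
tasks = [
    {"task_id": 12, "formula": "LiCoO2", "energy_per_atom": -5.61, "completed": True},
    {"task_id": 13, "formula": "LiCoO2", "energy_per_atom": -5.70, "completed": True},
    {"task_id": 14, "formula": "NaCl", "energy_per_atom": -3.40, "completed": True},
    {"task_id": 15, "formula": "NaCl", "energy_per_atom": -3.10, "completed": False},
]


def map_task(task):
    """Map step: emit (key, value) pairs, skipping calculations that did not finish."""
    if task["completed"]:
        yield task["formula"], {
            "task_ids": [task["task_id"]],
            "best_energy_per_atom": task["energy_per_atom"],
        }


def reduce_values(key, values):
    """Reduce step: merge values for one key; output has the same shape as the input."""
    merged = {"task_ids": [], "best_energy_per_atom": float("inf")}
    for value in values:
        merged["task_ids"].extend(value["task_ids"])
        merged["best_energy_per_atom"] = min(
            merged["best_energy_per_atom"], value["best_energy_per_atom"]
        )
    return merged


def map_reduce(docs):
    """Group mapped values by key, then reduce each group to one document."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_task(doc):
            groups[key].append(value)
    return {key: reduce_values(key, values) for key, values in groups.items()}


materials = map_reduce(tasks)
print(materials)
```

The reduce output deliberately has the same shape as the mapped values, which is the property a map/reduce framework relies on when it calls the reduce step more than once for the same key.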
    • Structure Predictor
    • Diffraction Pattern
    • Act III: Analytics and Logging
    • Rich Error Analysis: Experimental vs. Calculated
    • Integrated logging just makes sense: semi-structured data is easily stored; logs can be correlated with all other data; automation layer: failed tasks; web/app layer
    • Conclusions: MongoDB is a very versatile tool; used in several different cases; elegant query syntax; very useful for scientific data storage; a lot of exciting future ideas
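
The point about semi-structured logs that can be correlated with other data can be illustrated in plain Python. Everything in this sketch is hypothetical: the log fields, the task records, and the `failed_tasks` helper are invented to show the idea, with dictionaries standing in for MongoDB documents.

```python
from collections import Counter

# Invented task records, keyed by task id.
tasks = {
    12: {"formula": "LiCoO2", "host": "hopper"},
    13: {"formula": "NaCl", "host": "lr1"},
    14: {"formula": "LiMnO2", "host": "hopper"},
}

# Log entries share a few fields but may carry different extras (semi-structured).
logs = [
    {"task_id": 12, "level": "INFO", "msg": "started"},
    {"task_id": 12, "level": "ERROR", "msg": "walltime exceeded", "walltime_s": 86400},
    {"task_id": 13, "level": "INFO", "msg": "converged", "n_steps": 41},
    {"task_id": 14, "level": "ERROR", "msg": "not converged", "n_steps": 200},
]


def failed_tasks(logs, tasks):
    """Join ERROR log entries with their task records to describe each failure."""
    failed = []
    for entry in logs:
        if entry["level"] == "ERROR":
            task = tasks[entry["task_id"]]
            failed.append(
                {
                    "task_id": entry["task_id"],
                    "formula": task["formula"],
                    "host": task["host"],
                    "msg": entry["msg"],
                }
            )
    return failed


failures = failed_tasks(logs, tasks)
errors_per_host = Counter(f["host"] for f in failures)
print(failures)
print(errors_per_host)
```

Because the logs sit next to the calculation data, an automation layer could use a result like `failures` to decide which tasks to resubmit, and a web layer could show `errors_per_host` without a separate logging system.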
    • Acknowledgements
    • Thanks!