Materials informatics

Transcript

  • 1. Materials informatics. Evgeny Blokhin, Max-Planck Institute for Solid State Research, Stuttgart, Germany. Chelyabinsk SUSU’2013 summer workshop.
  • 2. Outline: (1) Data mining in materials science; (2) Blue Obelisk; (3) Python programming language
  • 3. What is data mining? Data mining draws on statistics, databases, information theory, machine learning, artificial intelligence, and optimization.
  • 4. Tasks of data mining: (1) classification; (2) prediction; (3) visualization; (4) reasoning; (5) analysis; (6) expert systems
  • 5. Big data in materials science. Example: over roughly the last 4 years, my theoretician colleagues and I have produced over 9000 simulation output files and over 50 articles.
  • 6. New type of modeling software:
    1. Accelrys Pipeline Pilot and Materials Studio, http://accelrys.com/products
    2. AFLOW framework and Aflowlib.org repository, http://www.aflowlib.org
    3. AIDA, Bosch LLC
    4. Blue Obelisk Data Repository (XSLT, XML), http://bodr.sourceforge.net
    5. CCLib (Python), http://cclib.sf.net
    6. CDF (Python), http://kitchingroup.cheme.cmu.edu/cdf
    7. CMR (Python), https://wiki.fysik.dtu.dk/cmr
    8. Computational Chemistry Comparison and Benchmark Database, http://cccbdb.nist.gov
    9. cctbx: Computational Crystallography Toolbox, http://cctbx.sourceforge.net
    10. ESTEST (Python, XQuery), http://estest.ucdavis.edu
    11. J-ICE online viewer (based on Jmol, Java), http://j-ice.sourceforge.net
    12. Materials Project (Python), http://www.materialsproject.org
    13. PAULING FILE, the world's largest database for inorganic compounds, http://paulingfile.com
    14. Quixote, http://quixote.wikispot.org
    15. Scipio (Java), https://scipio.iciq.es
    16. WebMO: Web-based interface to computational chemistry packages (Java, Perl), http://webmo.net
  • 7. …and smart codes:
    ENCUT = 500
    IBRION = 2
    ISIF = 3
    NSW = 20
    IDIOT = 3
    NELMIN = 5
    EDIFF = 1.0e-08
    EDIFFG = -1.0e-08
    IALGO = 38
    ISMEAR = 0
    LREAL = .FALSE.
    LWAVE = .FALSE.
    *** VASP MASTER: I AM SURE YOU KNOW WHAT YOU ARE DOING ***
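Input fragments like the one on slide 7 are plain `TAG = value` text, so they are easy to turn into structured data for later mining. The following is a minimal sketch of such a reader, not part of VASP or of any package named in these slides; the function names `parse_incar` and `_convert` are hypothetical, and the sketch assumes one tag per line, `#` or `!` comments, and Fortran-style logicals.

```python
import re


def _convert(raw):
    """Convert a raw value string to bool, int, or float where possible."""
    upper = raw.upper()
    if upper in (".TRUE.", "T"):
        return True
    if upper in (".FALSE.", "F"):
        return False
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # leave anything unrecognized as a string


def parse_incar(text):
    """Parse 'TAG = value' lines into a dict (hypothetical helper)."""
    tags = {}
    for line in text.splitlines():
        # Drop trailing comments (assumed to start with '#' or '!').
        line = re.split(r"[#!]", line, maxsplit=1)[0].strip()
        if "=" not in line:
            continue
        key, raw = (part.strip() for part in line.split("=", 1))
        tags[key.upper()] = _convert(raw)
    return tags


incar = """
ENCUT = 500
IBRION = 2
EDIFF = 1.0e-08
LREAL = .FALSE.
"""
tags = parse_incar(incar)
print(tags)
```

Once the tags are in a dict, thousands of output directories can be compared or queried by their settings instead of being read by hand.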
  • 8. Example of inference over an ontology. Diagram nodes: d-metal oxides; band gap problem; standard DFT GGA approach; Hartree-Fock admixing; LCAO approximation; usage of Gaussian basis sets; good atomization energy.
  • 9. Open data, open standards, open source in chemistry
  • 10. Open data, open standards, open source in chemistry: (1) Elsevier, Wiley, Springer publishers are “evil”; (2) “The right to read is the right to mine”; (3) “Jailbreaking” the scientific data from PDFs: access, reuse, integrity; (4) Why is the level of collaboration so low?
  • 11. Materials Project: Prof. G. Ceder, MIT, Boston
  • 12. Python programming language. Guido van Rossum (Google, Dropbox), http://goo.gl/FtFS7h
  • 13. Advantages of Python: syntax (indentation-based blocks, syntactic sugar, speech-like, flexible, expressive); VERY fast prototyping; great popularity in the scientific community; 100% cross-platform and portable
  • 14. Disadvantages of Python: relatively slow compared to compiled languages such as C++ or Fortran; Global Interpreter Lock (GIL); historically not popular in some narrow scientific areas (“reigns” of Java)
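  • 15. Two examples:
    squares = [x**2 for x in range(10)]  # list comprehension; the name avoids shadowing the built-in list
    numbers = [10, 4, 2, -1, 6]
    small = list(filter(lambda x: x < 5, numbers))  # filter() returns an iterator in Python 3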
  • 16. NumPy: (1) multi-dimensional array manipulation (fast!); (2) discrete Fourier transform; (3) linear algebra; (4) mathematical functions; (5) matrix library; (6) polynomials; (7) set routines; (8) sorting, searching and counting; (9) statistics
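A few of the routine families on slide 16 can be tried in a handful of lines. This is a small illustrative sketch, assuming the list refers to NumPy (slide 17 calls `numpy.linalg.eigh`); all input values are made up for the example.

```python
import numpy as np

# Multi-dimensional array manipulation
a = np.arange(6).reshape(2, 3)      # 2x3 array holding 0..5
col_sums = a.sum(axis=0)            # sum over rows, one value per column

# Discrete Fourier transform of a short made-up signal
signal = np.array([1.0, 0.0, -1.0, 0.0])
spectrum = np.fft.fft(signal)

# Linear algebra: solve m @ x = b
m = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 4.0])
x = np.linalg.solve(m, b)

# Polynomials: roots of x**2 - 3x + 2
roots = np.roots([1.0, -3.0, 2.0])

# Set routines
common = np.intersect1d([1, 2, 3], [2, 3, 4])

# Sorting, searching and counting
numbers = np.array([10, 4, 2, -1, 6])
order = np.argsort(numbers)         # indices that would sort the array

# Statistics
mean, std = numbers.mean(), numbers.std()

print(col_sums, x, sorted(roots), common, order, mean)
```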
  • 17. Solving the eigenvalue problem for a dynamical matrix (phonopy code): eigvals, eigvecs = numpy.linalg.eigh(dynmat)
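To make the call on slide 17 concrete, here is a minimal sketch that builds a toy dynamical matrix by hand and diagonalizes it. It is not phonopy's code: the model (a one-dimensional diatomic chain at zero wave vector), the spring constant `k`, and the masses `m1`, `m2` are assumptions chosen so that the result can be checked analytically, since the eigenvalues should be 0 (acoustic mode) and 2k(1/m1 + 1/m2) (optical mode).

```python
import numpy as np

# Assumed toy parameters in arbitrary units (not taken from the slides).
k, m1, m2 = 1.0, 1.0, 2.0

# Mass-weighted dynamical matrix of a 1D diatomic chain at q = 0.
# It is real and symmetric, so eigh (for Hermitian matrices) applies.
dynmat = np.array([
    [2.0 * k / m1, -2.0 * k / np.sqrt(m1 * m2)],
    [-2.0 * k / np.sqrt(m1 * m2), 2.0 * k / m2],
])

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(dynmat)

# Eigenvalues are squared frequencies. Taking sign(x) * sqrt(|x|) keeps tiny
# negative round-off values from producing NaN; in this sketch an unstable
# (imaginary) mode would show up as a negative number.
freqs = np.sign(eigvals) * np.sqrt(np.abs(eigvals))

print(eigvals)
print(freqs)
```

`eigh` is preferred over the general `eig` here because it exploits the Hermitian structure, guarantees real eigenvalues, and returns them sorted.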