Materials informatics

Published in: Technology, Education

  1. Materials informatics. Evgeny Blokhin, Max-Planck Institute for Solid State Research, Stuttgart, Germany. Chelyabinsk SUSU’2013 summer workshop.
  2. Outline: (1) data mining in materials science; (2) Blue Obelisk; (3) the Python programming language.
  3. What is data mining? Data mining draws on statistics, databases, information theory, machine learning, artificial intelligence, and optimization.
  4. Tasks of data mining: (1) classification; (2) prediction; (3) visualization; (4) reasoning; (5) analysis; (6) expert systems.
  5. Big data in materials science. Example: over roughly the last 4 years, my theoretician colleagues and I have produced over 9000 simulation output files and over 50 articles.
  6. New type of modeling software:
     1. Accelrys Pipeline Pilot and Materials Studio, http://accelrys.com/products
     2. AFLOW framework and Aflowlib.org repository, http://www.aflowlib.org
     3. AIDA, Bosch LLC
     4. Blue Obelisk Data Repository (XSLT, XML), http://bodr.sourceforge.net
     5. CCLib (Python), http://cclib.sf.net
     6. CDF (Python), http://kitchingroup.cheme.cmu.edu/cdf
     7. CMR (Python), https://wiki.fysik.dtu.dk/cmr
     8. Computational Chemistry Comparison and Benchmark Database, http://cccbdb.nist.gov
     9. cctbx: Computational Crystallography Toolbox, http://cctbx.sourceforge.net
     10. ESTEST (Python, XQuery), http://estest.ucdavis.edu
     11. J-ICE online viewer (based on Jmol, Java), http://j-ice.sourceforge.net
     12. Materials Project (Python), http://www.materialsproject.org
     13. PAULING FILE, the world's largest database for inorganic compounds, http://paulingfile.com
     14. Quixote, http://quixote.wikispot.org
     15. Scipio (Java), https://scipio.iciq.es
     16. WebMO: Web-based interface to computational chemistry packages (Java, Perl), http://webmo.net
  7. …and smart codes: ENCUT = 500; IBRION = 2; ISIF = 3; NSW = 20; IDIOT = 3; NELMIN = 5; EDIFF = 1.0e-08; EDIFFG = -1.0e-08; IALGO = 38; ISMEAR = 0; LREAL = .FALSE.; LWAVE = .FALSE. *** VASP MASTER: I AM SURE YOU KNOW WHAT YOU ARE DOING ***
  8. Example of inference over an ontology: d-metal oxides; band gap problem; standard DFT GGA approach; Hartree-Fock admixing; LCAO approximation; usage of Gaussian basis sets; good atomization energy.
  9. Open data, open standards, open source in chemistry
  10. Open data, open standards, open source in chemistry: (1) Elsevier, Wiley, Springer publishers are “evil”; (2) “The right to read is the right to mine”; (3) “Jailbreaking” the scientific data from PDFs: access, reuse, integrity; (4) Why is the level of collaboration so low?
  11. Materials Project. Prof. G. Ceder, MIT, Boston
  12. Python programming language. Guido van Rossum, Google, Dropbox. http://goo.gl/FtFS7h
  13. Advantages of Python: syntax (indentation-based blocks, syntactic sugar, speech-like, flexible, expressive); VERY fast prototyping; great popularity in the scientific community; 100% cross-platform and portable.
  14. Disadvantages of Python: relatively slow compared to compiled languages like C++ or Fortran; Global Interpreter Lock (GIL); historically not popular in some narrow scientific areas (“reigns” of Java).
  15. Two examples: (1) list = [x**2 for x in range(10)]; (2) numbers = [10, 4, 2, -1, 6] followed by filter(lambda x: x < 5, numbers)
  16. NumPy capabilities:
     1. Multi-dimensional array manipulation (fast!)
     2. Discrete Fourier transform
     3. Linear algebra
     4. Mathematical functions
     5. Matrix library
     6. Polynomials
     7. Set routines
     8. Sorting, searching and counting
     9. Statistics
  17. Solving the eigenvalue problem for a dynamical matrix (phonopy code): eigvals, eigvecs = numpy.linalg.eigh(dynmat)
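
The eigenvalue call on slide 17 can be tried on a toy input. The sketch below is not taken from phonopy: the 2x2 matrix `dynmat` is a hypothetical stand-in for a real dynamical matrix, and the frequency convention (square root of the eigenvalue magnitude, with the sign kept so that unstable modes show up as negative numbers) is an assumption made for illustration, without any unit conversion.

```python
import numpy as np

# Hypothetical 2x2 dynamical matrix, for illustration only (not phonopy data).
# A physical dynamical matrix is Hermitian; this one is real and symmetric.
dynmat = np.array([[2.0, -1.0],
                   [-1.0, 2.0]])

# eigh is meant for Hermitian/symmetric input: it returns real eigenvalues
# in ascending order and the corresponding eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(dynmat)

# Check the defining relation dynmat @ v_j = eigvals[j] * v_j for every column.
# Broadcasting scales column j of eigvecs by eigvals[j].
residual = dynmat @ eigvecs - eigvecs * eigvals

# Assumed convention: frequency = sign(eigval) * sqrt(|eigval|), so a negative
# eigenvalue (unstable mode) is reported as a negative frequency.
frequencies = np.sign(eigvals) * np.sqrt(np.abs(eigvals))

print(eigvals)
print(frequencies)
```

Using `eigh` rather than the general `eig` is the natural choice for symmetric or Hermitian matrices, since the eigenvalues come back real and sorted.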

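The two examples on slide 15 can be run as a short script. This is a minimal sketch assuming Python 3, where `filter()` returns a lazy iterator rather than a list; the names `squares`, `small` and `small_alt` are introduced here for illustration and do not appear on the slide.

```python
# List comprehension: squares of 0..9. The result is bound to `squares`
# rather than `list`, so the built-in list type is not shadowed.
squares = [x**2 for x in range(10)]

numbers = [10, 4, 2, -1, 6]

# In Python 3, filter() returns an iterator; wrapping it in list()
# materializes the values that satisfy the predicate.
small = list(filter(lambda x: x < 5, numbers))

# The same selection written as a list comprehension.
small_alt = [x for x in numbers if x < 5]

print(squares)
print(small)
```

Both forms keep the original order of `numbers`; the comprehension is often preferred when the predicate is a simple expression.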