Materials informatics
Published in: Technology, Education

Transcript

  • 1. Materials informatics. Evgeny Blokhin, Max-Planck Institute for Solid State Research, Stuttgart, Germany. SUSU’2013 summer workshop, Chelyabinsk.
  • 2. Outline: 1. Data mining in materials science 2. Blue Obelisk 3. Python programming language
  • 3. What is data mining? A field at the intersection of statistics, databases, information theory, machine learning, artificial intelligence, and optimization.
  • 4. Tasks of data mining: 1. Classification 2. Prediction 3. Visualization 4. Reasoning 5. Analysis 6. Expert systems
  • 5. Big data in materials science. EXAMPLE: over roughly the last 4 years, my theoretician colleagues and I have produced over 9000 simulation output files and over 50 articles.
  • 6. New type of modeling software:
       1. Accelrys Pipeline Pilot and Materials Studio, http://accelrys.com/products
       2. AFLOW framework and Aflowlib.org repository, http://www.aflowlib.org
       3. AIDA, Bosch LLC
       4. Blue Obelisk Data Repository (XSLT, XML), http://bodr.sourceforge.net
       5. CCLib (Python), http://cclib.sf.net
       6. CDF (Python), http://kitchingroup.cheme.cmu.edu/cdf
       7. CMR (Python), https://wiki.fysik.dtu.dk/cmr
       8. Comp. Chem. Comparison and Benchmark Database, http://cccbdb.nist.gov
       9. cctbx: Computational Crystallography Toolbox, http://cctbx.sourceforge.net
       10. ESTEST (Python, XQuery), http://estest.ucdavis.edu
       11. J-ICE online viewer (based on Jmol, Java), http://j-ice.sourceforge.net
       12. Materials Project (Python), http://www.materialsproject.org
       13. PAULING FILE, the world's largest database for inorganic compounds, http://paulingfile.com
       14. Quixote, http://quixote.wikispot.org
       15. Scipio (Java), https://scipio.iciq.es
       16. WebMO: Web-based interface to computational chemistry packages (Java, Perl), http://webmo.net
  • 7. …and smart codes
       ENCUT = 500
       IBRION = 2
       ISIF = 3
       NSW = 20
       IDIOT = 3
       NELMIN = 5
       EDIFF = 1.0e-08
       EDIFFG = -1.0e-08
       IALGO = 38
       ISMEAR = 0
       LREAL = .FALSE.
       LWAVE = .FALSE.
       *** VASP MASTER: I AM SURE YOU KNOW WHAT YOU ARE DOING ***
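A minimal sketch of how such input tags could be read into a Python dictionary for later data mining. The helper names `convert_value` and `parse_incar` are hypothetical and are not part of VASP or of any package named in these slides; the parser assumes one `TAG = value` pair per line and treats `#` and `!` as comment markers.

```python
def convert_value(value):
    """Convert a raw string to bool, int or float when possible; else keep it."""
    logical = {".TRUE.": True, ".FALSE.": False}
    if value.upper() in logical:
        return logical[value.upper()]
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_incar(text):
    """Parse 'TAG = value' lines into a dict (hypothetical helper)."""
    tags = {}
    for raw in text.splitlines():
        # Strip trailing comments, then surrounding whitespace
        line = raw.split("#", 1)[0].split("!", 1)[0].strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not a tag
        key, value = line.split("=", 1)
        tags[key.strip().upper()] = convert_value(value.strip())
    return tags


incar_text = """
ENCUT = 500
IBRION = 2
ISIF = 3
NSW = 20
IDIOT = 3
NELMIN = 5
EDIFF = 1.0e-08
EDIFFG = -1.0e-08
IALGO = 38
ISMEAR = 0
LREAL = .FALSE.
LWAVE = .FALSE.
"""

incar = parse_incar(incar_text)
print(incar["ENCUT"], incar["EDIFF"], incar["LREAL"])
```

Once the tags are in a dictionary, thousands of such files can be compared or filtered with ordinary Python tools.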
  • 8. Example of inference over an ontology (diagram nodes): d-metal oxides; band gap problem; standard DFT GGA approach; Hartree-Fock admixing; LCAO approximation; usage of Gaussian basis sets; good atomization energy
  • 9. Open data, open standards, open source in chemistry
  • 10. Open data, open standards, open source in chemistry: 1. Elsevier, Wiley, Springer publishers are “evil” 2. “The right to read is the right to mine” 3. “Jailbreaking” the scientific data from PDFs: access, reuse, integrity 4. Why is the level of collaboration so low?
  • 11. Materials Project: Prof. G. Ceder, MIT, Boston
  • 12. Python programming language: Guido van Rossum (Google, Dropbox), http://goo.gl/FtFS7h
  • 13. Advantages of Python: syntax (indentation-based blocks, syntactic sugar, speech-like readability, flexibility, expressiveness); VERY fast prototyping; great popularity in the scientific community; 100% cross-platform and portable
  • 14. Disadvantages of Python: relatively slow compared to compiled languages like C++ or Fortran; the Global Interpreter Lock (GIL); historically not popular in some narrow scientific areas (the “reign” of Java)
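  • 15. Two examples:
       # 1. List comprehension: squares of 0..9 (named "squares" to avoid shadowing the built-in list)
       squares = [x**2 for x in range(10)]
       # 2. Filtering with a lambda; in Python 3, filter() is lazy, so wrap it in list()
       numbers = [10, 4, 2, -1, 6]
       list(filter(lambda x: x < 5, numbers))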
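The two examples on this slide can be run as a short script. This sketch assumes Python 3, where `filter` returns a lazy iterator rather than a list, and adds an equivalent comprehension for comparison; the names `squares`, `small` and `small_comp` are illustrative only.

```python
# List comprehension: squares of the integers 0..9
squares = [x**2 for x in range(10)]

numbers = [10, 4, 2, -1, 6]

# filter() keeps the items for which the lambda returns True;
# list() materializes the lazy iterator returned in Python 3
small = list(filter(lambda x: x < 5, numbers))

# The same selection written as a comprehension with a condition
small_comp = [x for x in numbers if x < 5]

print(squares)     # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(small)       # [4, 2, -1]
print(small_comp)  # [4, 2, -1]
```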
  • 16. NumPy: 1. Multi-dimensional array manipulation (fast!) 2. Discrete Fourier transform 3. Linear algebra 4. Mathematical functions 5. Matrix library 6. Polynomials 7. Set routines 8. Sorting, searching and counting 9. Statistics
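The categories listed on this slide match routine groups in NumPy, the library used on the next slide. Below is a brief sketch touching several of them, assuming NumPy is installed; the input values are arbitrary illustrations, not data from the talk.

```python
import numpy as np

# Multi-dimensional array manipulation
a = np.arange(6).reshape(2, 3)      # rows: [0, 1, 2] and [3, 4, 5]
col_sums = a.sum(axis=0)            # sum over rows, one value per column

# Discrete Fourier transform of a constant signal
spectrum = np.fft.fft(np.ones(4))

# Linear algebra: solve the system A x = b
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

# Polynomials: roots of x^2 - 3x + 2, sorted ascending
roots = np.sort(np.roots([1, -3, 2]))

# Set routines: sorted common elements of two arrays
common = np.intersect1d([1, 2, 3], [2, 3, 4])

# Sorting: indices that would sort the array
order = np.argsort([3, 1, 2])

# Statistics
mean = np.mean([1, 2, 3, 4])
std = np.std([1, 2, 3, 4])

print(col_sums, x, roots, common, order, mean, std)
```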
  • 17. Solving the eigenvalue problem for a dynamical matrix (phonopy code): eigvals, eigvecs = numpy.linalg.eigh(dynmat)
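To make the call concrete, here is a toy sketch that is not taken from phonopy. It assumes a hypothetical system of two equal masses `m` joined by a single spring of stiffness `k`, whose mass-weighted dynamical matrix is symmetric, so `numpy.linalg.eigh` applies. The eigenvalues are squared angular frequencies.

```python
import numpy as np

k = 1.0  # spring constant (assumed toy value)
m = 1.0  # mass of each atom (assumed toy value)

# Mass-weighted dynamical matrix of two masses coupled by one spring
dynmat = (k / m) * np.array([[1.0, -1.0],
                             [-1.0, 1.0]])

# eigh is meant for symmetric/Hermitian matrices and returns
# eigenvalues in ascending order with eigenvectors as columns
eigvals, eigvecs = np.linalg.eigh(dynmat)

# Clip tiny negative round-off values before taking the square root
frequencies = np.sqrt(np.clip(eigvals, 0.0, None))

print(eigvals)      # approximately [0, 2]
print(frequencies)  # approximately [0, 1.414]
```

The zero eigenvalue corresponds to rigid translation of both masses, and the second to the stretching mode with squared frequency 2k/m.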