Data Science Solutions by Materials Scientists: The Early Case Studies


Improvements in algorithms, technology, and computation are directly changing how information is used in materials science. The 3 V’s of Big Data (volume, velocity, and variety) are becoming ever more apparent in all sectors of the field. Novel approaches will be required to confront the emerging data deluge and extract the richest knowledge from simulated and empirical information in complex, evolving 3-D spaces. Microstructure Informatics (μInformatics) is an emerging suite of signal processing techniques, advanced statistical tools, and data science methods tailored specifically for this new frontier. μInformatics curates and transforms large collections of materials science information using efficient workflows to extract knowledge of bidirectional structure-property/processing connections for most material classes.

In this talk, a few early case studies in data-driven methods for solving materials science problems will be explored. Emerging spatial statistics tools enable an objective comparison of static and evolving 3-D material volumes from molecular dynamics simulation, micro-CT, and scanning electron microscopy. The statistics also provide a foundation for improved bottom-up homogenization relationships in fuel cell materials. Lastly, applications of the Materials Knowledge System, a data-driven meta-model that creates top-down localization relationships, will be presented for phase field model and finite element model information.
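
As a rough illustration of the spatial statistics mentioned above, the sketch below computes two-point statistics for a voxelized microstructure with NumPy. It assumes periodic boundaries, a binary phase-indicator signal, and a 1/S normalization (S being the number of voxels); the function name `two_point_statistics` and the random test volume are hypothetical and not taken from the talk.

```python
import numpy as np


def two_point_statistics(m_h, m_hp):
    """Periodic two-point statistics f_t = (1/S) * sum_s m_h[s] * m_hp[s + t].

    m_h and m_hp are phase-indicator arrays of the same shape
    (1 where the phase is present, 0 elsewhere). Periodic boundaries
    and the 1/S normalization are assumptions of this sketch.
    """
    S = m_h.size
    F_h = np.fft.fftn(m_h)
    F_hp = np.fft.fftn(m_hp)
    # Cross-correlation theorem: the correlation over all shifts t
    # is the inverse FFT of conj(FFT(m_h)) * FFT(m_hp).
    return np.real(np.fft.ifftn(np.conj(F_h) * F_hp)) / S


# Hypothetical 16^3 two-phase volume standing in for a real dataset.
rng = np.random.default_rng(0)
micro = (rng.random((16, 16, 16)) < 0.3).astype(float)

# Autocorrelation of phase 1; the zero-shift entry is its volume fraction.
f_auto = two_point_statistics(micro, micro)
```

Because the statistics are computed over all shifts at once, translating the volume periodically leaves `f_auto` unchanged, which is what makes the encoding convenient for comparing datasets that have no common origin.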

Published in: Technology, Education

Data Science Solutions by Materials Scientists: The Early Case Studies

  1. Data Science Solutions by Materials Scientists: The Early Case Studies. Tony Fast, Materials Data Analyst, Materials Informatics for Engineering Design, Woodruff School of Mechanical Engineering, Georgia Institute of Technology. *Any MINED shield is a link to a resource.
  2. An Archival and Self-Describing Data Format using HDF5: data and metadata stored in one file, support in many languages, and ideal support for high-dimensional data. *MXADataModel – Archival Data Format – ONR/DARPA Dynamic 3-D Digital Structures Program
  3. HDF5 - The little zip file that could… One Dataset – 1.6GB – 4 Experiments – with 160 Datasets each … long term value.
  4. Volume, Variety, Velocity = Big Data. Materials Science: Polymer - MD, Titanium, Jacobs - GaTech, Frasier - OSU, Martensitic Steel, Gumbusch, SiC/SiC, Ritchie - LLNL, Bamboo, Wegst - Dartmouth, Al-Cu Solidification, Voorhees - NW. The velocity at which data is generated will rise, and the speed at which it is analyzed will decrease.
  5. β-Titanium. REDUCED OUTPUT: Grain size, Grain Faces, Number of Grains, Mean Curvature, Nearest Grain Analysis. 10 micron resolution with 4300 Grains. Compare with empirical models. Materials Science is a Big Data domain, but it is not treated that way. Rowenhorst, Lewis, Spanos, Acta Mat, 2010
  6. Example Databases: Harvard Clean Energy Project Database; AFLOW, Curtarolo Group
  7. STRUCTURE INFORMATICS WORKFLOW: INTELLIGENT DESIGN OF EXPERIMENTS, PHYSICS BASED MODELS, SIMULATION, EXPERIMENT, MICROSTRUCTURE (MATERIAL), SIGNAL PROCESSING, ADVANCED & OBJECTIVE STATISTICAL ENCODING, DATA SCIENCE MODULES, INNOVATION ACCOUNTING. Microstructure Informatics is a scalable, data-driven system to mine structure-property/processing connections from experimental and simulation materials science information, with structure as the independent variable. The system is agnostic to material system and length scale, objectively quantifiable, and iterates rapidly in fewer cycles for both materials improvement and discovery.
  8. DATA SCIENCE MODULES: Property, Microstructure, Material Structure, Processing. Data science modules are machine learning and statistical tools to extract rich bidirectional structure-property/processing linkages from encodings of materials and microstructure datasets. Mining modules create structure taxonomies, homogenization and localization relationships, ground truth comparison between simulation and experiment, materials discovery, and materials improvement.
  10. SPATIAL STATISTICS: Statistical correlations between random points in space/time which reveal systematic patterns in the microstructure. They contain the original microstructure (μS) to within a translation and inversion, and are an objective encoding for most materials datasets. Up to a normalization factor, $f_t^{hh'} = \sum_s m_s^h \, m_{s+t}^{h'}$.
  11. The fidelity of the spatial statistics is impacted by how the material structure is parameterized as a signal. CURRENT APPLICATIONS: metals, polymers, fuel cells, CMC, MD, & a bunch of other things. TYPES OF SIGNALS: sparse, experimental, simulation, heterogeneous, surface, bulk
  12. Objective Microstructure Classification of α-β Titanium: Images, Statistics, Mine with Principal Component Analysis
  13. Mechanical Deformation of Polymer Chains. Molecular Dynamics of Aluminum Atoms
  14. Bottom-up Homogenization Relationships: model, GDL, MPL, simulation, X-CT, Finite Element Modeling, Statistics. Regression to connect the statistics with diffusivity values from FEM
  15. Meta-modeling with Materials Knowledge Systems. Top-down localization relationships. FEM, ε=5e-4. $p_s = \sum_h \sum_t a_t^h \, m_{s+t}^h$. The MKS designs filters that capture the effect of the local arrangement of the microstructure on the response. The filters are learned from physics based models and can only be as accurate as the model, never better.
  16. Meta-modeling with Materials Knowledge Systems. Any Model. Top-down localization relationships. INPUT, OUTPUT, Control. $p_s = \sum_h \sum_t a_t^h \, m_{s+t}^h$. The MKS designs filters that capture the effect of the local arrangement of the microstructure on the response. The filters are learned from physics based models and can only be as accurate as the model, never better.
  17. Top-Down Localization Relationships for High Contrast Composites. The MKS is a scalable, parallel meta-model that learns from physics based models to enable rapid simulation at a cost in accuracy. N² vs. N log(N) complexity. It learns top-down localization relationships to extract extreme value events and enables multiscale integration. OTHER APPLICATIONS: Spinodal Decomposition, Grain Coarsening, Thermo-mechanical, Polycrystalline
  18. Objective parametric descriptors and data science enable integration of bidirectional structure-property/processing linkages. Structure-Property Homogenization, Structure-Processing MKS, Processing History, Structure-Property Localization
  19. Data enables bidirectional S-P/P, multiscale integration, and higher throughput. CORE TECHNOLOGIES TO FUEL THE DATA AGE OF MATERIALS SCIENCE: Open Access, Open Source Software, Scalable Databases, High-Statistical Throughput Simulation and Experiment, Image Segmentation, Machine Learning, Metadata Integration, Mobile Technology, Visualization, High Performance Computing, Cyberinfrastructure/Collaboratories, Collaboration & Sharing
  20. Selected Links. Any shield in this presentation is a link: HDF5, HDFView, MXADataModel, Curtarolo Group AFLOW, Harvard Clean Energy Project, Serial Sectioned Titanium, MATIN, Materials Genome Initiative
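
The localization relationship on slides 15 to 17, p_s = Σ_h Σ_t a_t^h m_{s+t}^h, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the MKS implementation: boundaries are assumed periodic, the filters `a` are random placeholders (in the talk they are learned from physics based models), and the function names are hypothetical. The direct sum visits every shift for every voxel (N² work), while the FFT version evaluates the same expression in N log(N), which is the complexity contrast noted on slide 17.

```python
import numpy as np


def localize_direct(a, m):
    """Direct evaluation of p[s] = sum_h sum_t a[h, t] * m[h, s + t] (periodic)."""
    spatial = m.shape[1:]
    axes = tuple(range(len(spatial)))
    p = np.zeros(spatial)
    for h in range(m.shape[0]):
        for t in np.ndindex(*spatial):
            # np.roll by -t places m[h, s + t] at position s.
            shifted = np.roll(m[h], shift=tuple(-ti for ti in t), axis=axes)
            p += a[(h,) + t] * shifted
    return p


def localize_fft(a, m):
    """Same sum evaluated with FFTs over the spatial axes."""
    axes = tuple(range(1, m.ndim))
    A = np.fft.fftn(a, axes=axes)
    M = np.fft.fftn(m, axes=axes)
    # Correlation theorem applied per local state h, then summed over h.
    return np.real(np.fft.ifftn(np.conj(A) * M, axes=axes)).sum(axis=0)


# Hypothetical two-phase 8x8 microstructure, one indicator array per phase.
rng = np.random.default_rng(0)
phase = rng.random((8, 8)) < 0.5
m = np.stack([phase, ~phase]).astype(float)

# Placeholder filters; real MKS filters would be calibrated against a model.
a = rng.normal(size=m.shape)

p_direct = localize_direct(a, m)
p_fft = localize_fft(a, m)
```

Both functions return a local response field with the same shape as the microstructure, and the two results agree to floating-point precision.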