Higher-Order Microstructure Statistics for
  Next Generation Materials Taxonomy

                              Tony Fast
   University of California Santa Barbara, Materials Engineering
           Olga Wodo, Baskar Ganapathysubramanian
          Iowa State University, Mechanical Engineering
                          Surya R. Kalidindi
            Drexel University, Mechanical Engineering
From Materials Selection to Microstructure (μS) Informatics…




    μS informatics distills rich spatial and temporal information into tractable, usable,
            and searchable bidirectional structure-property-processing (SPP) linkages
Effective statistics are contained in μS Informatics
   Statistical spatial distributions capture traditional effective statistical measures




    [Figure panels A, B, and C]

                               Benefits of Using n-Point Correlations
                               • Ground Truth
                               • Fit naturally in higher-order
                               homogenization and localization theories




 A. 2-pt Correlation Function – Statistical correlation between random points in space/time
 B. Chord Length Distribution – length and orientation of chords in a heterogeneous medium
 C. Interfacial Surface Distribution – the principal curvatures of surfaces in the μS


C. Kwon, Yongwoo, Morphology and topology of interfaces during coarsening via nonconserved and conserved dynamics,
Northwestern, Thesis, 2007.
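
As a rough illustration of the 2-pt correlation function in item A, the sketch below computes a periodic autocorrelation of a two-phase image with FFTs. The periodic boundary treatment, the function name `autocorrelation`, and the 4x4 example array are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def autocorrelation(phase):
    """Periodic 2-point autocorrelation of a phase indicator array.

    Entry [t] estimates the probability that two points separated by the
    vector t (with periodic wrap-around) both lie in the phase.
    """
    phase = np.asarray(phase, dtype=float)
    F = np.fft.fftn(phase)
    # Correlation in real space is a product with the conjugate in Fourier space.
    corr = np.fft.ifftn(F * np.conj(F)).real
    return corr / phase.size

# Hypothetical 4x4 two-phase microstructure (1 = phase of interest).
ms = np.array([[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
f2 = autocorrelation(ms)
volume_fraction = ms.mean()
```

The zero-separation entry `f2[0, 0]` recovers the volume fraction, which is one way the spatial statistics contain a traditional effective measure.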
The Microstructure is a stochastic process
Distributions provide a framework to effectively compare microstructures

     [Figure: two rows, each showing HT1 - HT2 = Difference]

     Microstructure:  Pixel-by-pixel comparison of μS is dubious because the images lack a common origin.
     Autocorrelation: The autocorrelation contains all of the information in its respective μS.
                                      Extremely Large Dimensional Spaces!
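
The origin problem on this slide can be checked numerically. The sketch below shifts a random two-phase image and compares the pixel-by-pixel difference with the difference between the two autocorrelations. The random image, the shift of (3, 5) pixels, and the periodic boundaries are all illustrative assumptions; with non-periodic images the invariance would only hold approximately.

```python
import numpy as np

def autocorrelation(phase):
    """Periodic 2-point autocorrelation computed with FFTs."""
    F = np.fft.fftn(np.asarray(phase, dtype=float))
    return np.fft.ifftn(F * np.conj(F)).real / np.size(phase)

rng = np.random.default_rng(0)                       # fixed seed, illustrative only
original = (rng.random((16, 16)) < 0.3).astype(float)
# Same structure, but with the origin displaced by (3, 5) pixels.
shifted = np.roll(original, shift=(3, 5), axis=(0, 1))

# Direct subtraction reports a large difference between identical structures.
pixel_difference = np.abs(original - shifted).sum()
# The autocorrelations agree up to floating-point round-off.
stat_difference = np.abs(autocorrelation(original) - autocorrelation(shifted)).max()
```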
μS informatics benefits from dimensional reduction
        Reducing the number of random variables for feature selection and extraction
        in discrete materials systems
     Principal Component Analysis: Reduced embedding into linearly uncorrelated variables
     ordered by decreasing variance, starting with the highest (D → d)
        Improve Empirical Fitting: ~1e6 Variables
        Porous Bi-layers in Fuel Cells
        [Figure axis: Vf]
        A. Çeçen, T. Fast, E. C. Kumbur, and S. R. Kalidindi, Data-driven Approaches to Establishing
        Microstructure-property Relationships: Application to Transport through Porous Structures,
        submitted, 2012.

        Microstructure Taxonomy: >6e6 Variables
        μS Mapping of α-β Titanium
        S. R. Kalidindi, S. R. Niezgoda, and A. A. Salem, Microstructure informatics using higher-order
        statistics and efficient data-mining protocols, JOM, 2011, 63(4): p. 34-41.
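
The principal component analysis described on this slide can be sketched with a singular value decomposition. This is a minimal sketch, not the implementation behind the cited studies: the function name `pca`, the synthetic data matrix, and its size (50 samples, 200 variables) are hypothetical stand-ins for flattened 2-pt statistics.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X (samples x variables) onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                          # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # coordinates in the reduced space
    explained_variance = S**2 / (X.shape[0] - 1)     # sorted from highest to lowest
    return scores, Vt[:n_components], explained_variance

rng = np.random.default_rng(0)
# Hypothetical data: two hidden factors mixed into 200 variables, plus small noise.
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 200))
X = latent @ mixing + 0.01 * rng.normal(size=(50, 200))

scores, components, variance = pca(X, n_components=2)
retained = variance[:2].sum() / variance.sum()
```

Here two components retain nearly all of the variance because the synthetic data were built from two factors; real microstructure statistics would need the retained fraction checked case by case.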
μS Taxonomy of Continuous Material Features
An Application to Organic Blends in Solar Cells

     Many Topologies                                   11 Distinct Topologies
     10% FAST PHASE SEPARATION, 90% SLOW GRAIN COARSENING, FINAL STRUCTURES

     [Figure: isosurfaces of atomic fraction, 1100 datasets]

     End Goal:   Use data-driven techniques to classify the final topology before the
                 simulation is complete, i.e. reduce redundancy and save time.
     First Goal: Develop microstructure taxonomies of the final structures,
                 i.e. build utilities for continuous materials features.



   Simulation data provided by Olga Wodo and Baskar Ganapathysubramanian at ISU.
μS function of continuous materials features
  Informatics benefit from a generalized higher-order microstructure description
     Primitive Basis Function:  $\sum_{h=1}^{H} m_s^h = 1$,  $0 \le m_s^h \le 1$

     Discrete
       First-Order:   $m_s^h = 0$ (white / solid) or $1$ (black / pore)
       Higher-Order:  local conformation of pixels,
                      $\tilde{m}_s^{\tilde{h}} = m_{s}^{h_0} \, m_{s+t_1}^{h_1} \cdots m_{s+t_N}^{h_N}$

     Continuous
       Higher-Order:  $\tilde{m}_s^{\tilde{h}} = m_s^{h_0} \, m_s^{h_1} \, m_s^{h_2}$

     [Figure: pixel $m_s^n$ surrounded by its neighbors $m_1^n$ to $m_6^n$]
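
One possible reading of the primitive basis on this slide is sketched below: a bounded continuous field is discretized into H bins, and each cell gets a one-hot local-state vector that sums to 1 over h. The indicator binning, the function names `primitive_basis` and `higher_order`, the bounds `lo` and `hi`, and the 2x2 example field are assumptions made for illustration. The higher-order term is shown only as the product of two local states separated by a vector t with periodic wrap-around.

```python
import numpy as np

def primitive_basis(field, n_bins, lo=0.0, hi=1.0):
    """Return m[h, ...] = 1 where the field value falls in bin h, else 0."""
    field = np.asarray(field, dtype=float)
    idx = np.floor((field - lo) / (hi - lo) * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)            # place field == hi in the last bin
    bins = np.arange(n_bins).reshape((-1,) + (1,) * field.ndim)
    return (idx[np.newaxis, ...] == bins).astype(float)

def higher_order(m, h0, h1, t):
    """Second-order term m_s^{h0} * m_{s+t}^{h1}, assuming periodic boundaries."""
    axes = tuple(range(m.ndim - 1))
    neighbor = np.roll(m[h1], shift=tuple(-c for c in t), axis=axes)
    return m[h0] * neighbor

# Hypothetical 2x2 field bounded in [0, 1], discretized into H = 2 bins.
field = np.array([[0.05, 0.45],
                  [0.55, 0.95]])
m = primitive_basis(field, n_bins=2)
pair = higher_order(m, h0=0, h1=1, t=(1, 0))
```

In this example `pair` is nonzero exactly where a cell in the low bin sits one row above a cell in the high bin.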
μS informatics workflow is a system
Microstructure Descriptor, Statistics, PCA, etc. are isolated modules

     [Workflow diagram: 1100 Datasets, HO Descriptors, NP Correlations, PCA/k-means, Future Work]




         Systems analysis allows one to prove the efficacy of methods
Reduced embedding of final topologies
PCA projection changes with the number of basis functions and the gradient

      Each point indicates a 21x21x21 μS, where each color is a different topology




 Hard clustering in the PCA space allows the final topologies to be classified qualitatively
Augmented embedding combines gradients
PCA embedding changes when different descriptors are combined
Data-mining with k-means clustering
 Automated topology recognition and quantitative metrics

k-Means Clustering: A data-mining approach that creates partitions based on the
means of clusters to automatically classify datapoints.
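
The k-means definition above can be sketched as a plain Lloyd iteration. This is a minimal sketch under stated assumptions, not the code used for the results in this deck: the function name `k_means`, the random initialization from data points, and the six example points are hypothetical.

```python
import numpy as np

def k_means(points, k, n_iter=100, seed=0):
    """Partition points into k clusters by alternating assignment and mean updates."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members (keep it if the cluster is empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical points in a 2D PCA space forming two well-separated groups.
pts = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centers = k_means(pts, k=2)
```

The cluster indices returned are arbitrary, so they have to be matched to known topology classes before any classification metric is computed.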


                                          Classification Cases
                                          TP – Correct classification
                                          TN – Correct nonclassification
                                          FP – Incorrect classification
                                          FN – Incorrect nonclassification


                                     Sensitivity – metric for accurate classification



                                     Specificity – metric for accurate nonclassification



                                                     Range = 0 to 1
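
Using the four classification cases above, the standard definitions are sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below evaluates them for one class at a time, treating every other class as "not this class". The function name and the example labels are hypothetical, and the predicted labels are assumed to be already mapped onto the true class names.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """Return (sensitivity, specificity) for one target class, one-vs-rest."""
    tp = tn = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == target and truth == target:
            tp += 1          # correct classification
        elif guess != target and truth != target:
            tn += 1          # correct nonclassification
        elif guess == target:
            fp += 1          # incorrect classification
        else:
            fn += 1          # incorrect nonclassification
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical true and predicted topology classes for six datasets.
truth = ["A", "A", "A", "B", "B", "C"]
guess = ["A", "A", "B", "B", "C", "C"]
sens_a, spec_a = sensitivity_specificity(truth, guess, "A")
sens_b, spec_b = sensitivity_specificity(truth, guess, "B")
```

Both values fall in the range 0 to 1, with 1 meaning no false negatives (sensitivity) or no false positives (specificity) for that class.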
Quantitative Measures of Clustering
Sensitivity and specificity analysis of PCA embedding

AF – Atomic Fraction
FG – First Gradient
SG – Second Gradient

[Figure: four panels, clockwise: FG,SG; AF,SG; AF,FG; AF,FG,SG]
Conclusions
Higher-Order Microstructure Statistics for Next Generation Materials Taxonomy


• An automated data-mining technique was successfully developed
  for 3D systems with continuous μS features.
• A generalized higher-order μS descriptor was developed using the
  primitive basis.
• Higher-order descriptors prove that higher-order terms play a strong
  role in developing structure-structure databases.
• This system naturally clusters in PCA, but other DR techniques
  show improvement.
• μS informatics are necessary to automatically disseminate structure-
  structure relationships of large collections of multi-dimensional
  datasets


Editor's Notes

  1. How can we classify and separate the microstructures of materials systems based on their structure-structure (SS), structure-property (SP), and structure-processing linkages?
  2. From materials informatics we can develop atlases that facilitate materials selection in rich design problems. Thanks to the incredible work of the materials science community, which is developing techniques capable of capturing rich heterogeneous 3D information, such as the TriBeam, Robo-Met, and micro-CT, we are able to generate rich microstructure information. On the opposite side of the coin, though, there is a data deluge that must be coped with to leverage the rich three- and four-dimensional information being accrued. Microstructure informatics is a suite of Big Data utilities that can cope with and address these emerging challenges. To realize these concepts we must modify the framework by which we look at material properties by looking statistically at the microstructure.
  3. Large dimensionality. Give an example of the dimensionality. Traditional microstructure measures.
  4. Discuss displacement in the image as changing the difference. Better comparative measure. Pixel-by-pixel difference.
  5. In our foray into dimensionality reduction we have been using a variety of tools, but to date the published work has used PCA decomposition. Which…. In the first example, we are observing a homogenization relationship for the diffusivity of porous bi-layers in fuel cells. The goal is to extract an accurate metamodel to reproduce the diffusivity. Previously, the relationship was established solely by volume fraction, which was unable to capture the dense MPL layer. By including various reductions of the 2-pt statistics, an empirical model was established for both layers with improved accuracy in the GDL layer. From these examples we have shown that there are strong benefits in using reduced-order representations of the 2-pt correlation functions in structure-structure connections and structure-property connections.
  6. Isosurface of Topologies
  7. The methods we use are largely derived from digital signal processing methods that rely on basis functions to decompose signals, in our case, into the "salient" microstructure features. The higher-order description has been shown to provide drastic improvements in developing localization metamodels. Continuous microstructures are defined using the primitive basis, wherein bounded microstructure features are discretized into bins and the microstructure function is defined by the above constraint. To extend the higher-order discrete description to continuous states we have a problem of scale. To circumvent this we have developed a generalized higher-order microstructure function that is applicable to both discrete and continuous states.
  8. In the grand scheme of microstructure informatics we are presenting a workflow to address the emerging big data concepts. Microstructure informatics is a workflow that operates on physical data to extract important features in structure-structure, structure-property, and structure-processing linkages.
  9. When we pass the 1100 datasets through the workflow, we get a PCA embedding wherein each point represents a 21x21x21 microstructure in the dataset. Mind you, originally these datasets were in 2*10*21^3 dimensional space.
  10. Strongly emphasize the differences between multiplicative and additive.
  11. For a datapoint, if the class is correctly identified then it is a true positive. If a class is not chosen and it is not in that class then it is a true negative. If the correct class is not found then it is a false negative. If an incorrect class is chosen then it is a false positive.
  12. Describe Verbally