Going Smart and Deep on Materials at ALCF

As we acquire large quantities of science data from experiment and simulation, it becomes possible to apply machine learning (ML) to those data to build predictive models and to guide future simulations and experiments. Leadership Computing Facilities need to make it easy to assemble such data collections and to develop, deploy, and run associated ML models.

We describe and demonstrate here how we are realizing such capabilities at the Argonne Leadership Computing Facility. In our demonstration, we use large quantities of time-dependent density functional theory (TDDFT) data on proton stopping power in various materials maintained in the Materials Data Facility (MDF) to build machine learning models, ranging from simple linear models to complex artificial neural networks, that are then employed to manage computations, improving their accuracy and reducing their cost. We highlight the use of new services being prototyped at Argonne to organize and assemble large data collections (MDF in this case), associate ML models with data collections, discover available data and models, work with these data and models in an interactive Jupyter environment, and launch new computations on ALCF resources.

Published in: Technology

  1. 1. Building an ALCF Data Service: Interactive, scalable, reproducible data science. Ian Foster; Rick Wagner, Nick Saint, Eric Blau; Kyle Chard, Yadu Nand Babuji; Logan Ward, Ben Blaiszik; Mike Papka; with André Schleife and Cheng-Wei Lee. Aiichiro Nakano (USC - ALCF INCITE 2017), Maria Chan (ANL - ALCF INCITE 2016), André Schleife (UIUC - ALCF INCITE 2016)
  4. 4. Overview • Leadership simulations produce data of great scientific value • We demonstrate how to: - Make data more accessible and useful by associating them with rich data lifecycle and analysis services - Leverage advanced data science and machine learning (ML) methods to reduce simulation costs and increase data quality and value (Diagram labels: Find, Analyze, Publish; Interactive, Scalable, Reproducible; Collect, Process, Represent, Learn.)
  5. 5. Interactive, scalable, reproducible data science PUBLISH: - Automate capture, publication, and indexing of results from ALCF projects - Enable creation of workspaces and reusable data objects to accelerate data analysis and promote replicability ANALYZE: - Combine ML approaches with ALCF HPC resources to extract more information from existing datasets and to guide future simulation campaigns FIND: - Unify search, discovery, and consumption of datasets, workspaces, and analysis results
  7. 7. Interactive, scalable, reproducible data science PARSL Data Movement Data Discovery Data Publication Automation Machine Learning HPC Data Interactivity Data Access ALCF Other services
  8. 8. Materials science as an initial testbed • Advanced materials are critical to economic security and competitiveness, national security, and human welfare. (MGI 2011 interagency effort DoD, DOE, NASA, NIST, and NSF) • Finding and understanding new materials is complex, expensive, and time consuming: often taking > 20 years from research to application • Materials scientists are key users of leadership class computing (20-30% at ALCF) • Community data tools and services to advance materials science are emerging Nicholas Brawand, University of Chicago; Larry Curtiss, Argonne National Laboratory
  9. 9. Modeling material stopping power Stopping power: a “drag” force experienced by high-speed protons, electrons, or positrons in a material. Areas of application: • Nuclear reactor safety • Magnetic confinement / inertial confinement for nuclear fusion • Solar cell surface adsorption • Medicine (e.g., proton therapy cancer treatment) • Critical to understanding material radiation damage. André Schleife and Cheng-Wei Lee (UIUC), 2016 ALCF INCITE Project “Electronic Response to Particle Radiation in Condensed Matter”. André Schleife, Yosuke Kanai, Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
  12. 12. Computing stopping power with TD-DFT Stopping power (SP) can be accurately calculated by time-dependent density functional theory (TD-DFT): - Excellent agreement with experiment - Can vary orientation, projectile, material - Highly parallelizable. But we need many results: - Direction dependence - Effect of defects - Many more materials. TD-DFT alone may not be sufficient. Potential solution: machine learning! (Figure labels: Experiment; TD-DFT; 16k CPU-Hr.) André Schleife, Yosuke Kanai, and Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
  13. 13. What is machine learning? What? Algorithms that generate computer programs. Why? Create software too complex to write manually. General task: given inputs, predict output, y = f(x). Common algorithms: linear regression (f(x) = mx + b), decision trees (e.g., a split on x < 4 with leaf values y = 2 and y = 6), neural networks (figure source: nature.com). Advantages: - Fast: 10^4-10^7 evaluations/CPU/sec - Adaptable: limited need to know underlying physics - Self-correcting: improves with more data - Parallelizable: can use large-scale resources
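The two simplest model families named on this slide can be sketched in a few lines. This is an illustrative example only: the toy data are made up, and the assignment of y = 2 to the x < 4 branch of the decision tree is an assumption about the slide's figure.

```python
import numpy as np

# Toy data (invented for illustration): y is roughly 2*x + 1 with small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Linear regression f(x) = m*x + b, fitted by least squares.
m, b = np.polyfit(x, y, deg=1)


def linear_model(x_new):
    """Evaluate the fitted line at x_new."""
    return m * np.asarray(x_new, dtype=float) + b


def decision_stump(x_new, threshold=4.0, left_value=2.0, right_value=6.0):
    """One-split decision tree: x < threshold gives left_value, else right_value.

    Which leaf belongs to which branch is assumed, not taken from the slide.
    """
    return np.where(np.asarray(x_new, dtype=float) < threshold, left_value, right_value)


print("fitted slope and intercept:", m, b)
print("stump predictions for x = 1 and x = 5:", decision_stump([1.0, 5.0]))
```

Both functions map an input x to a predicted y, which is the general task the slide describes; neural networks follow the same contract with a more flexible f.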
  14. 14. Computing stopping power with TD-DFT+ML We propose to use ML to create surrogate models for TD-DFT. How do we replace TD-DFT? First, consider what it does. Inputs: - Atomic-scale structure (atom types, positions) - Electronic structure of system. Outputs: - Energy of entire system - Forces on each atom - Time derivatives of electronic structure. (Slide annotation: allow prediction of future state.) If successful, we can use the ML model, not TD-DFT, to compute SP.
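One way to picture how a surrogate model could yield SP: if the model returns the force on the projectile at any position, the stopping power along a straight path can be estimated as the average force opposing the motion. The sketch below assumes exactly that definition; `toy_force` is a hypothetical stand-in for a trained model, not the authors' surrogate.

```python
import numpy as np


def stopping_power(force_fn, start, direction, path_length, n_samples=1000):
    """Estimate stopping power as the mean drag force along a straight path.

    force_fn is a hypothetical surrogate: it takes a 3-vector position and
    returns the 3-vector force on the projectile at that position.
    """
    direction = np.asarray(direction, dtype=float)
    unit = direction / np.linalg.norm(direction)
    distances = np.linspace(0.0, path_length, n_samples)
    positions = np.asarray(start, dtype=float) + distances[:, None] * unit
    # Component of the force along the direction of travel at each sample.
    f_parallel = np.array([np.dot(force_fn(p), unit) for p in positions])
    # Drag opposes the motion, so the stopping power is minus the mean.
    return -f_parallel.mean()


def toy_force(position):
    """Invented force field: constant drag of 0.2 plus a periodic lattice term."""
    return np.array([-0.2 + 0.05 * np.sin(2.0 * np.pi * position[0]), 0.0, 0.0])


sp = stopping_power(toy_force, start=[0.0, 0.0, 0.0], direction=[1.0, 0.0, 0.0],
                    path_length=10.0)
print("estimated stopping power:", sp)
```

Because each force evaluation is cheap compared with a TD-DFT step, many directions and start points can be averaged this way.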
  15. 15. Materials science and machine learning (Pipeline diagram: Collect, Process, Represent, Learn. Example table: inputs X = (3, 4) and (3, 5) with outputs y = ΔH_f = −1.0 and −0.5, learned as ΔH_f = f(Z_A, Z_B).)
  16. 16. Step 1: Data collection (Pipeline diagram: Collect, Process, Represent, Learn, with the example table X = (3, 4), (3, 5) and y = ΔH_f = −1.0, −0.5. Stage labels: Parsl; Cooley; AGNI fingerprints, ion-ion force, local charge density; linear models, ANNs, RNNs.)
  20. 20. Stopping Power prediction: Our data We have simulation results for H in face-centered cubic Al on a random trajectory. For each of multiple velocities, we have: 1) A simulated SP: one red point 2) A trajectory for that point: 3) A ground-state calculation for that trajectory’s starting point (About 6GB in total, mostly Qbox output files) André Schleife, Yosuke Kanai, and Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
  21. 21. Steps 2-3: Data processing/Representation (Pipeline diagram: Collect, Process, Represent, Learn, with the example table X = (3, 4), (3, 5) and y = ΔH_f = −1.0, −0.5. Stage labels: Parsl; Cooley; AGNI fingerprints, ion-ion force, local charge density; linear models, ANNs, RNNs.)
  22. 22. Designing a training set Key question: What are the inputs and outputs to our model? Consider those for TD-DFT. Inputs: - Atomic-scale structure (atom types, positions) - Electronic structure of system. Outputs: - Energy of entire system - Time derivatives of electronic structure - Forces on the projectile (rather than on each atom). Slide annotations: requires TD-DFT to compute; reliant on entire history (hard to predict); not needed to compute stopping power. Resulting choice: the input is the atomic structure, the output is the force on the particle. (Pipeline stage labels: Collect, Process, Represent, Learn.)
  23. 23. Selecting a representation Key questions: What determines force on projectile? How do we quantify it? Types of features: - Ion-ion repulsion: can be computed directly - Electronic interactions: we approximate with two feature types, local charge density (density of electrons at projectile position) and AGNI fingerprints* (describe the atom positions around projectile). Another need: history dependence. Approach: use charge density at fixed points ahead of/behind projectile. (Pipeline stage labels: Collect, Process, Represent, Learn.) * Botu, et al. J. Phys. Chem. C 121, 511–522 (2017).
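The "can be computed directly" feature and the fixed sample points for history dependence can be illustrated as follows. This is a sketch under stated assumptions: bare point-charge Coulomb repulsion in atomic units, no screening and no periodic images, and hypothetical helper names (`ion_ion_force`, `history_sample_points`) that do not come from the slides.

```python
import numpy as np


def ion_ion_force(projectile_pos, projectile_charge, atom_positions, atom_charges):
    """Coulomb force on the projectile from point-charge ions (atomic units).

    Assumes bare point charges with no screening and no periodic images.
    """
    proj = np.asarray(projectile_pos, dtype=float)
    atoms = np.asarray(atom_positions, dtype=float)
    charges = np.asarray(atom_charges, dtype=float)
    disp = proj - atoms                     # vectors from each atom to the projectile
    dist = np.linalg.norm(disp, axis=1)
    contributions = charges[:, None] * disp / dist[:, None] ** 3
    return projectile_charge * contributions.sum(axis=0)


def history_sample_points(projectile_pos, velocity, offsets):
    """Points at fixed signed distances along the direction of travel.

    Negative offsets lie behind the projectile, positive offsets ahead of it.
    A charge-density lookup at these points is left out of this sketch.
    """
    velocity = np.asarray(velocity, dtype=float)
    unit = velocity / np.linalg.norm(velocity)
    return np.asarray(projectile_pos, dtype=float) + np.outer(offsets, unit)


force = ion_ion_force([2.0, 0.0, 0.0], 1.0, [[0.0, 0.0, 0.0]], [13.0])
points = history_sample_points([0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [-1.0, 1.0])
print("force on projectile:", force)
print("sample points behind and ahead:", points)
```

In a real workflow the sample points would be passed to a charge-density interpolator, and the resulting values would join the repulsion force and AGNI fingerprints in the feature vector.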
  24. 24. Step 4: Machine learning (Pipeline diagram: Collect, Process, Represent, Learn, with the example table X = (3, 4), (3, 5) and y = ΔH_f = −1.0, −0.5. Stage labels: Parsl; Cooley; AGNI fingerprints, ion-ion force, local charge density; linear models, ANNs, RNNs.)
  25. 25. Selecting a machine learning algorithm Key criterion: prediction accuracy. Beyond accuracy, the algorithm should: - be feasible to train with >10^4 entries - be quick to evaluate - produce a differentiable model. Standard procedure: 1. Identify suitable algorithms (linear models, neural networks) 2. Evaluate performance using cross-validation 3. Validate the model vs. unseen data
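Step 2 of the standard procedure, cross-validation, can be sketched with NumPy alone. The example below uses a plain least-squares linear model and synthetic data invented for illustration; it is not the authors' evaluation code.

```python
import numpy as np


def k_fold_rmse(X, y, n_folds=5, seed=0):
    """Return the held-out RMSE of a linear least-squares model for each fold."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    folds = np.array_split(order, n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Append a column of ones so the model includes an intercept.
        A_train = np.column_stack([X[train_idx], np.ones(len(train_idx))])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
        residual = A_test @ coef - y[test_idx]
        scores.append(np.sqrt(np.mean(residual ** 2)))
    return np.array(scores)


# Synthetic data: a known linear relation plus noise with standard deviation 0.05.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 + rng.normal(scale=0.05, size=200)

scores = k_fold_rmse(X, y)
print("per-fold RMSE:", scores)
```

Each fold is scored on data the model did not see during fitting, so the spread of the per-fold errors gives a first estimate of how the model would behave in step 3, validation against unseen data.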
  26. 26. Live Demo You won’t see the live demo here. But it was cool. We located Schleife simulation data previously published to MDF; assembled a workspace comprising Aluminum data plus four Jupyter notebooks comprising data processing, ML training, and SP modeling methods; deployed the workspace to ALCF; and ran the notebooks to process data, train a model, and predict SP values for many directions.
  27. 27. Summary of analysis results We compared a variety of ML algorithms We computed SP for other trajectories We evaluated data needed for training We calculated Stopping Power for many trajectories
  28. 28. Materials science and machine learning (Pipeline diagram: Collect, Process, Represent, Learn, with the example table X = (3, 4), (3, 5) and y = ΔH_f = −1.0, −0.5. Stage labels: Parsl; Cooley; AGNI fingerprints, ion-ion force, local charge density; linear models, ANNs, RNNs.)
  29. 29. Materials Data Facility to discover data. Scale: 116 data sources, 3.4M records, 300 TB. Data publication service (Publish, via Web UI or API): • Mint DOIs • Associate metadata. Data discovery service (deep indexing, via Web UI, Forge, or REST API): • Query • Browse • Aggregate. (Diagram labels: four EP nodes; Distributed data storage; Databases, Datasets, APIs, LIMS, etc.)
  33. 33. Data ingest flow 1. Data are created at ALCF 2. Data are staged, published, and assigned a permanent identifier (DOI) 3. Results are indexed for easy discovery 4. Interactive analysis, modeling, and interrogation (Diagram labels: Data Publication, Data Storage, Indexing, Query, Fetch, Parsl.)
  34. 34. Data collection 1. Find data through search index 2. Create BDBags for data reusability, staging, and sharing 3. Stage data and launch interactive environment on ALCF computers 4. Analyze data!
  36. 36. Data staging 1. Find data through search index 2. Create BDBags for data reusability, staging, and sharing 3. Stage data and launch interactive environment on ALCF computers 4. Analyze data!
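The reusability idea behind step 2 can be illustrated with a checksum manifest in the style of BagIt, the packaging format that BDBag builds on. The sketch below uses only the Python standard library and does not call the BDBag API; the file name and contents are placeholders.

```python
import hashlib
import tempfile
from pathlib import Path


def write_manifest(bag_dir):
    """Write a BagIt-style manifest-sha256.txt covering every file under data/."""
    bag_dir = Path(bag_dir)
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n")
    return manifest


def verify_manifest(bag_dir):
    """Return True if every listed file still matches its recorded checksum."""
    bag_dir = Path(bag_dir)
    for line in (bag_dir / "manifest-sha256.txt").read_text().splitlines():
        digest, rel_path = line.split("  ", 1)
        if hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest() != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    payload = Path(tmp) / "data"
    payload.mkdir()
    (payload / "example.out").write_text("placeholder simulation output\n")
    write_manifest(tmp)
    ok_before = verify_manifest(tmp)       # payload untouched
    (payload / "example.out").write_text("modified\n")
    ok_after = verify_manifest(tmp)        # payload changed after bagging

print("valid before change:", ok_before)
print("valid after change:", ok_after)
```

A manifest like this lets a staged copy on ALCF storage be checked against what was originally published, which is what makes a bag safe to share and reuse.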
  38. 38. Interactive, scalable, reproducible data analysis Data science and learning applications require: - Interactivity - Scalability (you can’t run this on a desktop) - Reproducibility (publish code and documentation). Our solution: JupyterHub + Parsl (jupyter.org, parsl-project.org) - Interactive computing environment - Notebooks for publication - Can run on dedicated hardware. Parsl: • Python-based parallel scripting library • Tasks exposed as functions (Python or bash) • Python code used to glue functions together • Leverages Globus for auth and data movement. Code shown on the slide:
        @App('python', dfk)
        def compute_features(chunk):
            for f in featurizers:
                chunk = f.featurize_dataframe(chunk, 'atoms')
            return chunk

        chunks = [compute_features(chunk) for chunk in np.array_split(data, chunks)]
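The split-and-map pattern in the Parsl snippet on this slide can be tried without a Parsl installation. The sketch below swaps in the standard library's `concurrent.futures` as a stand-in executor; it is not Parsl, and the `compute_features` body is a hypothetical featurizer invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def compute_features(chunk):
    """Hypothetical featurizer: append each row's Euclidean norm as a new column."""
    norms = np.linalg.norm(chunk, axis=1, keepdims=True)
    return np.hstack([chunk, norms])


data = np.arange(24, dtype=float).reshape(8, 3)
n_chunks = 4

# Submit one task per chunk, then gather results in submission order.
with ThreadPoolExecutor(max_workers=n_chunks) as pool:
    futures = [pool.submit(compute_features, chunk)
               for chunk in np.array_split(data, n_chunks)]
    features = np.vstack([future.result() for future in futures])

print("feature matrix shape:", features.shape)
```

As in the slide's version, each call returns a future immediately and the results are collected afterwards; a workflow library such as Parsl applies the same idea across cluster nodes rather than local threads.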
  39. 39. Interactive, scalable, reproducible data science TD-DFT Calculations Machine Learning Direction-Dependent Stopping Power Existing Data ALCF Data Facility New Capabilities
  40. 40. Interactive, scalable, reproducible data science Results so far: • Indexed data from an ALCF INCITE project • Interactively built surrogate model using ALCF Data Service capabilities • Extended results to model SP direction dependence in aluminum. Next steps: 1. Model multiple velocities 2. Model more materials 3. Model direction dependence 4. Transfer learning. (Diagram labels: Existing Data, ALCF Data Facility, New Capabilities.)
  41. 41. Thanks to our sponsors! U.S. Department of Energy ALCF DF Parsl Globus IMaD
