Gladier: The Globus Architecture for Data Intensive Experimental Research (APS Workshop)

  1. 1. Gladier: The Globus Architecture for Data Intensive Experimental Research October 15, 2021
  2. 2. Agenda • Large scale experiments • Gladier • funcX • Globus Flows • Gladier toolkit • Building your own Gladier pipeline • Example deployments • Demo a real client
  3. 3. The brilliance arms race... K. Wille, The Physics of Particle Accelerators: An Introduction, Oxford University Press, Oxford, UK (2000). J. B. Parise and G. E. Brown, Jr., Elements, 2, 37-42 (2006).
  4. 4. Argonne Leadership Computing Facility Advanced Photon Source
  5. 5. Different facilities, different people • ALCF and APS have very distinct... – Research Statements – Entry Curve – Skill Requirements – Allocation System – Support Staff – Time Scales – Etc.
  6. 6. Canonical research automation flow for instruments. Diagram labels: Data Capture; Data Analysis / Model in the Loop; Publication; Data Staging; Metadata Extraction and Data Cataloging; Data Staging; Catalog; Feedback; Data Generation. Examples • Serial X-ray crystallography • X-ray photon correlation spectroscopy • High-energy diffraction microscopy • High-throughput ptychography • High-energy X-ray diffraction. Which is only as simple as the amount of data you acquire
  7. 7. Local vs. Distributed. Diagram labels: Acquire; Process; Visualize; Experiment Location. Normal experiments rely on having storage and processing units “close” to the acquisition machines. Distributed systems allow the beamline to focus only on the experimental apparatus.
  8. 8. Local vs. Distributed. Diagram labels: Acquire; Process; Visualize; Transfer Raw Data; Transfer Instructions; Remote Location; Transfer Results; Process; Experiment Location
  9. 9. Gladier: The Globus Architecture for Data-Intensive Experimental Research • Accelerate and simplify flow development and deployment • Combine tools into reliable, flexible, secure, distributed flows • Bridge instruments and computing facilities • Automate data collection and publication to create FAIR data
  10. 10. Gladier: Globus + ALCF framework for online, data-intensive, large-scale experimental science. Gladier is a framework for combining instruments, storage, and compute using loosely coupled services. Reference implementation to gather experiences • Globus: Remote data management • Flows: Workflows that span time and space • funcX: Remote (scalable) execution on diverse HPC-edge systems • ALCF Community Data Co-Op portals: Indexing and visualizing scientific data • ALCF Eagle: User-managed storage
  11. 11. Globus Services for Research Data Management. Diagram labels: Unified Data Access; Data Transfer; Platform as a Service; Auth; Transfer; Share; Search; …; Distributed Automation; Remote Execution; Data Publication
  12. 12. funcX: managed and federated FaaS • Cloud-hosted service for managing compute • Register and share compute endpoints • Register and share Python functions • Reliably, scalably, and securely execute functions on remote endpoints • Integrated with Globus Auth and data ecosystem
  13. 13. Transform laptops, clusters, clouds into function serving endpoints • Python-based agent, pip-installable locally or in Conda • Elastically provisions resources from local, cluster, or cloud system • Manages concurrent execution on provisioned resources • Optionally manages execution in Docker, Singularity, Shifter containers • Share endpoints with collaborators
       $ pip install funcx-endpoint
       $ funcx-endpoint configure myep
       $ funcx-endpoint start myep
  14. 14. Register and share functions • Create funcX client (and authn) • Define and register Python function
       def compute(input_args):
           # do something
           return results
  15. 15. funcX Demo: Try funcX on Binder https://funcx.org/binder
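
The register-then-invoke pattern on the slide above can be tried out without the hosted service. What follows is a minimal local stand-in, not the funcX SDK: `ToyRegistry`, `register_function`, and `run` are hypothetical names for this sketch, and everything executes in the current process. The real service would additionally authenticate the caller and dispatch the function to a remote endpoint.

```python
import uuid


class ToyRegistry:
    """In-memory stand-in for a function-serving service (hypothetical, not the funcX SDK)."""

    def __init__(self):
        self._functions = {}

    def register_function(self, func):
        """Store the function and return an opaque ID that callers can share."""
        func_id = str(uuid.uuid4())
        self._functions[func_id] = func
        return func_id

    def run(self, func_id, *args, **kwargs):
        """Look the function up by ID and execute it (locally, in this sketch)."""
        return self._functions[func_id](*args, **kwargs)


def compute(input_args):
    # Import inside the body so the function carries its own dependencies,
    # which matters once a function is shipped to another machine.
    import statistics

    return {"mean": statistics.mean(input_args), "n": len(input_args)}


registry = ToyRegistry()
compute_id = registry.register_function(compute)    # register once
result = registry.run(compute_id, [1.0, 2.0, 3.0, 6.0])  # invoke by ID
print(result)
```

The point of the ID is that the caller no longer needs the function's source, only the identifier and permission to use it.
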
  16. 16. Data (and compute) automation • Flows: A platform service for defining, applying, and sharing distributed research automation flows • Flows comprise Actions • Action Providers: Called by Flows to perform tasks • Triggers*: Start flows based on events * In development
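
As an illustration of "Flows comprise Actions", here is a sketch of a three-step flow written as a Python dictionary. It assumes the state-machine style of definition (`StartAt`, `States`, `Type`, `ActionUrl`, `Parameters`, `ResultPath`, `Next`, `End`) that Globus Flows uses to my understanding; the state names, parameter names, and `example.org` URLs are placeholders invented for this example. The helper `action_order` is hypothetical and only walks the definition locally.

```python
# Placeholder URLs, state names, and parameter names, chosen only for illustration.
flow_definition = {
    "Comment": "Move raw data, analyze it, and return the results",
    "StartAt": "TransferInput",
    "States": {
        "TransferInput": {
            "Type": "Action",
            "ActionUrl": "https://example.org/actions/transfer",
            "Parameters": {
                "source_path.$": "$.input.raw_path",
                "destination_path.$": "$.input.staging_path",
            },
            "ResultPath": "$.TransferInput",
            "Next": "Analyze",
        },
        "Analyze": {
            "Type": "Action",
            "ActionUrl": "https://example.org/actions/compute",
            "Parameters": {"data_path.$": "$.input.staging_path"},
            "ResultPath": "$.Analyze",
            "Next": "TransferResults",
        },
        "TransferResults": {
            "Type": "Action",
            "ActionUrl": "https://example.org/actions/transfer",
            "Parameters": {
                "source_path.$": "$.input.staging_path",
                "destination_path.$": "$.input.results_path",
            },
            "ResultPath": "$.TransferResults",
            "End": True,
        },
    },
}


def action_order(definition):
    """Follow the Next links from StartAt and return state names in execution order."""
    order = []
    name = definition["StartAt"]
    while name not in order:
        order.append(name)
        state = definition["States"][name]
        if state.get("End"):
            return order
        name = state["Next"]
    raise ValueError(f"flow loops back to state {name!r}")


print(" -> ".join(action_order(flow_definition)))
```

Each state names the Action Provider to call and where to store its result, so later states can read the output of earlier ones.
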
  17. 17. Extending the ecosystem: Action providers • Action Provider is a service endpoint – Run – Status – Cancel – Release – Resume • Action Provider Toolkit action-provider-tools.readthedocs.io/en/latest • Providers shown: Search, Transfer, Notification, ACLs, Identifier, Delete, Ingest, User Form, Describe, Xtract, funcX, Web Form (legend: Custom built, Globus Provided)
  18. 18. Applying the Globus platform to science at the APS. Diagram labels: Advanced Photon Source; Key: funcX agent, Globus Connect; Theta; Bebop Cluster; Argonne Leadership Computing Facility; Laboratory Computing Resource Center; Petrel store; APS Computing; Orthros Cluster; APS DM system; Portal server; Portal server; Cooley; Action 1; Action 2; Action 3; Action 4
  19. 19. Gladier Toolkit ● Function registration ● Flow registration ● Re-registration on file change ● Automate auth ● Input Validation ● Metadata Injection ● Interactive Progress Reporting ● Error handling. Gladier provides structure for running Actions in Globus Automate flows by wrapping them as reusable Tools. Actions can be funcX functions, Transfers, triggers, or any HTTP action provider. Our toolbox provides two things: a set of commonly used experimental tools and a Client to orchestrate how they run and interact with the experiments. - The Gladier Tools define the work to be done - The Gladier Base defines a collection of Gladier Tools and ensures all of the requirements for using them have been met. https://github.com/globus-gladier/gladier pip install gladier
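
The five operations listed on the slide above (Run, Status, Cancel, Release, Resume) can be pictured as a small state machine. The class below is an in-memory sketch under stated assumptions, not the Action Provider Toolkit: the class name, the status strings, and the `_pause` and `_finish` helpers are all hypothetical and exist only to make the lifecycle visible.

```python
import uuid


class ToyActionProvider:
    """In-memory sketch of the five operations; not the Action Provider Toolkit."""

    TERMINAL = ("SUCCEEDED", "FAILED")  # assumed status names

    def __init__(self):
        self._actions = {}

    def run(self, body):
        """Start an action and return its ID."""
        action_id = str(uuid.uuid4())
        self._actions[action_id] = {"status": "ACTIVE", "body": body, "details": None}
        return action_id

    def status(self, action_id):
        """Report the current state of an action."""
        return self._actions[action_id]["status"]

    def cancel(self, action_id):
        """Stop an action that has not finished yet."""
        action = self._actions[action_id]
        if action["status"] not in self.TERMINAL:
            action.update(status="FAILED", details="cancelled by caller")
        return action["status"]

    def resume(self, action_id):
        """Continue an action that was paused waiting on something external."""
        action = self._actions[action_id]
        if action["status"] == "INACTIVE":
            action["status"] = "ACTIVE"
        return action["status"]

    def release(self, action_id):
        """Forget a finished action and hand back its final record."""
        if self._actions[action_id]["status"] not in self.TERMINAL:
            raise RuntimeError("only finished actions can be released")
        return self._actions.pop(action_id)

    # Hypothetical helpers that simulate the work itself changing state.
    def _pause(self, action_id):
        self._actions[action_id]["status"] = "INACTIVE"

    def _finish(self, action_id, details):
        self._actions[action_id].update(status="SUCCEEDED", details=details)


provider = ToyActionProvider()
action = provider.run({"path": "/data/scan_001"})
provider._pause(action)            # e.g. waiting for user input
provider.resume(action)
provider._finish(action, {"frames": 128})
print(provider.status(action), provider.release(action)["details"])
```

A flow only needs this uniform interface, which is why custom-built and Globus-provided providers can be mixed in one flow.
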
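
To make the "Input Validation" bullet and the Tools/Base split concrete, here is a minimal sketch in plain Python. It is not the Gladier API: `Tool`, `Toolbox`, `required_input`, `defaults`, and `build_input` are hypothetical names for this example. It shows one plausible way a collection of tools could merge defaults with user input and refuse to start when a requirement is missing.

```python
class Tool:
    """Hypothetical stand-in for a tool: a unit of work plus its input requirements."""

    def __init__(self, name, required_input, defaults=None):
        self.name = name
        self.required_input = list(required_input)
        self.defaults = dict(defaults or {})


class Toolbox:
    """Hypothetical stand-in for a collection of tools that checks their requirements."""

    def __init__(self, tools):
        self.tools = list(tools)

    def build_input(self, user_input):
        """Merge tool defaults with user input and fail early if anything is missing."""
        merged = {}
        for tool in self.tools:
            merged.update(tool.defaults)
        merged.update(user_input)  # user values win over defaults
        missing = sorted(
            {key for tool in self.tools for key in tool.required_input if key not in merged}
        )
        if missing:
            raise ValueError("missing required input: " + ", ".join(missing))
        return merged


transfer = Tool("transfer", ["source_path", "destination_path"])
analyze = Tool("analyze", ["destination_path", "iterations"], defaults={"iterations": 100})
toolbox = Toolbox([transfer, analyze])

flow_input = toolbox.build_input(
    {"source_path": "/raw/scan_001", "destination_path": "/staging/scan_001"}
)
print(flow_input)
```

Checking requirements before anything is submitted means a typo in the input fails in seconds on the beamline workstation rather than after a transfer has already run.
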
  20. 20. Gladier Client Provides a concise configuration of Gladier tools to be used in a flow Tools • Automatically registers funcX functions • Automatically registers Automate flows • Watches for changes, and re-registers anything as needed
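
"Watches for changes, and re-registers anything as needed" can be approximated with a checksum cache. The sketch below is an assumption about how such a mechanism might work, not a description of the Gladier implementation: `function_checksum`, `RegistrationCache`, and `ensure_registered` are hypothetical names, and the checksum is computed from the compiled function (bytecode, constants, referenced names) rather than from a source file.

```python
import hashlib


def function_checksum(func):
    """Fingerprint a function from its bytecode, constants, and referenced names."""
    code = func.__code__
    payload = code.co_code + repr(code.co_consts).encode() + repr(code.co_names).encode()
    return hashlib.sha256(payload).hexdigest()


class RegistrationCache:
    """Call the (possibly slow, remote) register step only when a function changed."""

    def __init__(self, register):
        self._register = register  # callable that performs the actual registration
        self._entries = {}         # function name -> (checksum, registered ID)

    def ensure_registered(self, func):
        checksum = function_checksum(func)
        cached = self._entries.get(func.__name__)
        if cached is not None and cached[0] == checksum:
            return cached[1]       # unchanged: reuse the existing ID
        func_id = self._register(func)
        self._entries[func.__name__] = (checksum, func_id)
        return func_id


registrations = []


def fake_register(func):
    """Stand-in for a remote registration call; records each call it receives."""
    registrations.append(func.__name__)
    return f"id-{len(registrations)}"


cache = RegistrationCache(fake_register)


def analyze(x):
    return x + 1


first_id = cache.ensure_registered(analyze)
same_id = cache.ensure_registered(analyze)   # no second registration


def analyze(x):  # the function body was edited
    return x * 2


new_id = cache.ensure_registered(analyze)    # change detected, registered again
print(first_id, same_id, new_id, len(registrations))
```

In a real client the cache would have to persist between runs, for example in a configuration file, for the check to be useful.
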
  21. 21. Let's try it! https://jupyter.demo.globus.org/
  22. 22. XPCS (with Suresh Narayanan et al., APS Sector 8-ID). Diagram labels: ALCF Data Portal; Argonne JLSE; Argonne Leadership Computing Facility; APS; Publication (5); Imaging (1); Lab Server (1); Acquisition (2); Plot results (4); XPCS-Eigen (3); Science! (6)
  23. 23. XPCS. Diagram labels: Globus Search; Globus Transfer (Return results); Globus Transfer (Transfer input data); Catalog Results; Data Acquisition; High-quality FAIR data; funcX (Analyze images); funcX (Visualize results); Searchable Portal
  24. 24. Online XPCS • Integration with the APS DM system to trigger Globus Automate flow. • Flow moves data to ALCF, performs analysis, and publishes results • Metadata and plots are dynamically extracted and integrated into the ALCF portal, allowing users to monitor experiments and reprocess data
  25. 25. Serial Crystallography Automation With Andrzej Joachimiak, Darren Sherrell et al. APS Sector 19
  26. 26. Closing the loop
  27. 27. ALCF Data Services in the DOE COVID-19 Fight. “These data services have taken the time to solve a structure from weeks to days and now to hours” (Darren Sherrell, SBC beamline scientist, APS Sector 19). 4 structures available in PDB; scientific paper forthcoming. ALCF + APS capabilities were used to determine the room temperature structure of >4 viral surface proteins. Next steps: Develop Nature Methods paper, continue running flow, provide DOE highlights
  28. 28. Example: Rapid Training of Deep Neural Networks using Remote Resources (Zhengchun Liu, Jana Thayar, et al.) • DNN at the edge for fast processing, filtering, QC • Requires tight coupling with simulation and training with real-time data • Globus Flow: – Globus to rapidly move data for training – funcX for simulation and model training – Globus to move models to the edge – (Future) funcX for inference at the edge
  29. 29. Ptychography: Automated flows leveraging ThetaGPU for 2D and 3D reconstructions - Total size: 1.32 TB, 3082 scans - 100 iterations: 199 GB, 1602 scans - 500 iterations: 502 GB, 383 scans - 1000 iterations: 616 GB, 1097 scans - Inverse problem, ML iterations for reconstruction to converge run concurrently and faster processing sent to scientist immediately - Size depends on output frequency - Scans are reconstructed during ALCF reservation; however, additional data are acquired after the reservation expires - Opportunistic reconstruction via backfill and standard queues - 3082 workflows executed - Single workflow: 3 transfers + 1 funcX call to run reconstruction on ThetaGPU
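
The figures on the slide above are internally consistent, which a few lines of arithmetic confirm. The only assumption is decimal units (1 TB = 1000 GB); the action counts follow from the stated "3 transfers + 1 funcX call" per workflow.

```python
# Per-iteration-count figures as given on the slide.
runs = {
    100: {"scans": 1602, "size_gb": 199},
    500: {"scans": 383, "size_gb": 502},
    1000: {"scans": 1097, "size_gb": 616},
}

total_scans = sum(run["scans"] for run in runs.values())
total_gb = sum(run["size_gb"] for run in runs.values())
total_tb = total_gb / 1000  # assumes decimal units: 1 TB = 1000 GB

# One workflow per scan, each with 3 transfers and 1 funcX call.
transfers = 3 * total_scans
funcx_calls = 1 * total_scans

print(f"scans: {total_scans}")
print(f"size: {total_gb} GB = {total_tb:.2f} TB")
print(f"transfers: {transfers}, funcX calls: {funcx_calls}")
```

The per-group scan counts add up to the 3082 scans and 3082 workflows quoted, and the sizes add up to 1317 GB, which rounds to the quoted 1.32 TB.
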
  30. 30. ALCF High-Energy X-ray Diffraction Microscopy (Hemant Sharma, et al.). Requirement to select where MIDAS analyses are executed: APS Orthros, ALCF ThetaGPU, or ALCF Cooley ● MIDAS: tomography reconstruction, near-field and far-field diffraction analysis. Flow: ● Globus to transfer input data to destination ● funcX dynamically provisions resources and runs analysis at scale ● Deploy containers with MIDAS software to perform tasks ● Results assembled and returned to APS. Extending the system to allow users to run analysis at their home institution
  31. 31. Experiment integration with pyEPICS. In the future, Gladier resources can influence experiments by directly controlling local experiments. Diagram labels: Data Acquisition; Return results; Transfer input data; Analyze images; Experiment control; Return results; Search; Viz; Decision; Searchable Portal
  32. 32. A framework for multiplication • Each new Gladier repository lowers the entry barrier for the next one. • New experiments leveraging this work will allow us to scale the capabilities across the APS and other facilities. • Expertise and capabilities from this project allowed the team to play key roles in DOE’s COVID-19 response and the new National Virtual Biotechnology Lab (NVBL) • Newly funded DOE ASCR project (Braid) will allow new modular capabilities to be developed (e.g., a rule-based engine to support continuum computing concepts) and added to Gladier, allowing scientists to more easily access ALCF resources • Gladier can integrate with any Python-based experimental capability, e.g., pyEPICS, Bluesky, automation.
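
Choosing where an analysis runs, as required on the slide above, can be reduced to picking a row from a site table when the flow input is built. This is a sketch under stated assumptions: the endpoint IDs and staging directories are placeholders, and `SITES` and `build_flow_input` are hypothetical names, not part of Gladier or MIDAS.

```python
# Placeholder endpoint IDs and paths; a real deployment would use its own values.
SITES = {
    "orthros": {
        "facility": "APS",
        "compute_endpoint": "ORTHROS-ENDPOINT-ID",
        "staging_dir": "/orthros/staging",
    },
    "thetagpu": {
        "facility": "ALCF",
        "compute_endpoint": "THETAGPU-ENDPOINT-ID",
        "staging_dir": "/thetagpu/staging",
    },
    "cooley": {
        "facility": "ALCF",
        "compute_endpoint": "COOLEY-ENDPOINT-ID",
        "staging_dir": "/cooley/staging",
    },
}


def build_flow_input(site, dataset):
    """Return flow input that routes both the data and the analysis to one site."""
    try:
        target = SITES[site]
    except KeyError:
        raise ValueError(f"unknown site {site!r}; choose one of {sorted(SITES)}") from None
    return {
        "input": {
            "dataset": dataset,
            "compute_endpoint": target["compute_endpoint"],
            "destination": f"{target['staging_dir']}/{dataset}",
        }
    }


print(build_flow_input("thetagpu", "layer_042"))
```

Because the flow definition itself does not change, adding a home-institution cluster would only mean adding another row to the table.
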
  33. 33. Real life Gladier XPCS in real-time