Amy Walton - NSF’s Computational Ecosystem for 21st Century Science & Engineering

  1. NSF’s Computational Ecosystem for 21st Century Science and Engineering
     Amy Walton, Deputy Director, Office of Advanced Cyberinfrastructure, National Science Foundation
     Fourth National Research Platform (4NRP) Workshop, September 9, 2023
  2. Topics
     • Looking Back – The Pacific Research Platform
       • A Productive Experiment
       • Moving Targets
       • Acknowledgements
     • Looking Forward – A National Research Ecosystem
       • Challenges
       • Opportunities
       • Resources
  3. NSF 15-534: Data, Networking, and Innovation
     An initial (and productive) collaboration between two OAC programs:
     • Campus Cyberinfrastructure (CC*)
     • Data Infrastructure Building Blocks (DIBBs)
     Area 1: Multi-Campus/Multi-Institution Model Implementations
     Emphasis on integration of data and network infrastructure activities
     • Awards served as models for potential future national-scale, network-aware, data-focused cyberinfrastructure.
     • Expected to be science-driven, demonstrating a strong and credible connection to the multi-campus, multi-institutional, and/or regional scientific communities they serve.
     • Emphasized the value of sharing data beyond a specific institution to the wider science, engineering, and education communities.
  4. Pacific Research Platform: Then and Now
     • Goal: Expand the campus Science DMZ network systems model into a regional model for data-intensive science.
     • The PRP data-sharing architecture allowed region-wide virtual co-location of data with computing.
     • Endpoints of PRP sites, devices called Flash I/O Network Appliances (FIONAs), were incorporated into a Kubernetes cluster of FIONAs called Nautilus.
     • Data can traverse multiple, heterogeneous networks with minimal performance degradation.
     Now uses 11 major regional/national networks:
     • 737 namespaces (projects)
     • >2,100 users
     • Researchers at 94 US campuses in 39 states
  5. Not Mentioned in the Original Proposal:
     Technologies:
     • Kubernetes
     • Containers
     • Automation
     • Jupyter
     • Ceph
     These technologies emerged and were integrated into what became Nautilus during the period of the PRP grant.
     Applications:
     • Machine Learning
     • Artificial Intelligence
     • Neutrino Observatory
     • COVID
     • Wildfires
     While all applications listed in the original proposal were addressed, these applications became some of the largest consumers of PRP CPU/GPU resources.
  6. Acknowledgements: Many Contributors
     • CHASE-CI [CISE/CNS] 2100237 and 2120019: Additional GPU nodes, expand community
     • Expanse 1928224 (ACSS-I): NVIDIA GPUs, cloud integration, composable systems
     • Voyager 2005369 (ACSS-II): AI-focused hardware, Intel/Habana tools
     • Prototype NRP 2112167 (ACSS-II): Distributed across SDSC, U Nebraska – Lincoln, and MGHPCC
     • T-NRP 1826967 (CC*): Connect Quilt Regional Networks using CENIC and Internet2
     • CHASE-CI [CISE/CNS] 1713149: Cloud of GPUs for faculty to train AI algorithms
     PRP cyberinfrastructure has increased compute capacity through several sources:
     • Individual data-intensive research faculty at multiple campuses used their grant resources
     • This added ~1/4 of the total GPUs on Nautilus
     Today, Nautilus has nearly 20,000 CPU-cores and nearly 1,500 GPUs.
  7. Looking Forward: Cyberinfrastructure that Enables Research Across Science Disciplines
     Disciplines shown: Astronomy, Physics, Computational Bio, Material Science, Evolutionary Bio, Climatology
     Challenges:
     • Large instruments producing
     • Big data requiring
     • Big compute for
     • Highly collaborative scientists in
     • Different specializations across
     • Widely Distributed infrastructure that must be
     • Available, ensure
     • Workflow Integrity, and be
     • Easy to use while adhering to
     • Regulatory or policy requirements
  8. Data Cyberinfrastructure
     • Federal guidance on Open Science and Public Access presents new opportunities for an agile, scalable and equitable national data cyberinfrastructure to support data sharing.
     • Recent OAC CC* awards provided federated campus storage.
       • Required: Follow NSF data practices; sustainability plan; integrate into networks
     • Future Directions: How to capitalize on existing investments and achieve a national-scale data CI to support equitable access to and use of data using FAIR principles?
     • Our proposed solution: A loose federated approach of existing and new repositories and infrastructure which adhere to basic agreed principles.
     • Repositories and other data projects that join the network gain benefit from shared resources and services.
  9. CI Professionals
     • A significant barrier to use of national resources is access to CI professionals who can provide expertise and support that are responsive to local needs.
     • The new ACCESS Computational Science Support Network (CSSN) provides a framework for engaging, training/mentoring, and coordinating a network of CI professionals.
     • The new SCIPE Solicitation (NSF 23-574) supports CI professionals at the campus or regional level.
       • Enables engagement of CI professionals into ACCESS Computational Science Support Network
       • Requires: A plan for mentoring, professional development, and sustainability; and 20% of supported individual’s time be dedicated to national activities.
  10. NSF-supported Advanced CI Resources
      Categories: Leadership-class; Capacity Systems; Distributed Services; Cloud resources; Innovative Prototypes/Testbeds
      • Anvil: Purdue University
      • Bridges 2: Carnegie Mellon University
      • Delta: U of Illinois, Urbana-Champaign
      • Expanse: U of California, San Diego
      • Jetstream 2: Indiana University + Partners
      • Stampede 2: U of Texas, Austin
      • Frontera: U of Texas, Austin
      • Neocortex: Carnegie Mellon University
      • Voyager: U of California, San Diego
      • Ookami: Stony Brook University
      • NRP: U of California, San Diego
      • ACES: Texas A&M University
      • Cloudbank: U of California, San Diego
      • CloudLab: University of Utah
      • Chameleon: University of Chicago
      • PATh/OSG: U of Wisconsin, Madison
      • ACCESS: Several Partners
      Learn how to access resources at access-ci.org
  11. Democratizing Science through Cyberinfrastructure
      Broad, fair, and equitable access to advanced computing is essential to democratizing science in the 21st century
      • Significant barriers
        • Knowledge: Awareness, discovery, expertise, support
        • Technical: Allocation, access, on-ramps
        • Social: Awareness of the importance of access to CI, rewards structures
      • Complex tradeoffs / optimizations
        • Capacity vs. capability
        • Stability vs. innovation
        • Performance vs. ease of use
        • Expert vs. novice
      M. Parashar, "Democratizing Science Through Advanced Cyberinfrastructure," Computer, vol. 55, no. 9, pp. 79-84, 2022. doi:10.1109/MC.2022.3174928
  12. Advanced Computing Ecosystem as a Strategic National Asset
      National Strategic Computing Reserve (NSCR)
      • A coalition of experts and resource providers that could be mobilized quickly to provide critical computational resources in times of urgent need
      • Build on experiences from the COVID-19 HPC Consortium, responses to RFI
      • Aligns with the FACE Strategic plan
      https://www.whitehouse.gov/wp-content/uploads/2021/10/National-Strategic-Computing-Reserve-Blueprint-Oct2021.pdf
      NSF’s Advanced Cyberinfrastructure Ecosystem: Highly Accessible Computing
      • Network of advanced systems and services
      • Leadership and capacity systems, testbeds
      • Federation (PATh) and coordination services (ACCESS)
      • Scalable user support networks
      Democratized access to an advanced CI Ecosystem
  13. Realizing an Advanced CI Ecosystem for All
      • Integrated and user-friendly portals and gateways for discovering and accessing resources;
      • Access to local CI resources as part of a shared fabric of national CI resources reachable through high-speed frictionless data networking;
      • Diverse and flexible allocation and access modes that support a diversity of users and applications;
      • Agile, easily accessible, and scalable networks of experts providing embedded expertise and support that is responsive to local needs; and
      • Broadly accessible training targeting the spectrum of CI users and skills.
      The Missing Millions: Democratizing Computation and Data to Bridge Digital Divides and Increase Access to Science for Underrepresented Communities (A. Blatecky, EAGER)
      https://www.rti.org/publication/missing-millions/fulltext.pdf