TeraGrid and Physics Research


Published in: Technology

  1. TeraGrid and Physics Research. Ralph Roskies, Scientific Director, Pittsburgh Supercomputing Center, roskies@psc.edu. March 20, 2009
  2. High Performance Computing is Transforming Physics Research
     • TeraGrid ties together the high-end computational resources (supercomputing, storage, visualization, data collections, science gateways) provided by NSF for the nation’s researchers.
     • Supported by computing and technology experts, many of whom have science PhDs and speak the users’ language.
     • World-class facilities, on a much larger scale than ever before, present major new opportunities for physics researchers to carry out computations that would have been infeasible just a few years ago.
  3. TeraGrid Map
  4. Hardware must be heterogeneous
     • Different capabilities
     • Different vendors
     • Potential for great burden on people trying to use more than one system
  5. Integrated View for Users
     • Single sign-on
     • Single application form for access (it’s free; more later)
     • Single ticket system (especially useful for problems between systems)
     • Coordinated user support (find experts at any site)
     • Simplified data movement (e.g. compute in one place, analyze in another)
     • Makes data sharing easy
  6. Diversity of Resources (not exhaustive)
     • Very Powerful Tightly Coupled Distributed Memory
       – Track 2a, Texas (TACC): Ranger (62,976 cores, 579 Tflops, 123 TB memory)
       – Track 2b, Tennessee (NICS): Kraken (Cray XT5, 66,048 cores, 608 Tflops, over 1 Pflops later in 2009)
     • Shared Memory
       – NCSA: Cobalt, Altix, 8 Tflops, 3 TB shared memory
       – PSC: Pople, Altix, 5 Tflops, 1.5 TB shared memory
     • Clusters with Infiniband
       – NCSA: Abe, 90 Tflops
       – TACC: Lonestar, 61 Tflops
       – LONI: Queen Bee, 51 Tflops
     • Condor Pool (Loosely Coupled)
       – Purdue: up to 22,000 CPUs
     • Visualization Resources
       – Purdue: TeraDRE, 48-node, nVIDIA GPUs
       – TACC: Spur, 32 nVIDIA GPUs
     • Various Storage Resources
  7. Resources to come
     • Recognize that science is becoming increasingly data driven (LHC, LSST, …)
     • PSC: large shared memory system
     • Track 2D is being competed
       – A data-intensive HPC system
       – An experimental HPC system
       – A pool of loosely coupled grid computing resources
       – An experimental, high-performance grid test-bed
     • Track 1 system at NCSA: 10 Pflops peak, 1 Pflops sustained on serious applications, in 2011
  8. Some Example Impacts on Physics (not overlapping with the presentations to follow)
  9. Lattice QCD: MILC collaboration
     • Improved precision on “standard model” calculations, required to uncover new physics.
     • Need larger lattices, lighter quarks
     • Frequent algorithmic improvements
     • Use TeraGrid resources at NICS, PSC, NCSA, TACC; DOE resources at Argonne and NERSC; a specialized QCD machine at Brookhaven; and a cluster at Fermilab.
     • Store results with the International Lattice Data Grid (ILDG), an international organization which provides standards, services, methods and tools that facilitate the sharing and interchange of lattice QCD gauge configurations among scientific collaborations (US, UK, Japan, Germany, Italy, France, and Australia). http://www.usqcd.org/ildg/
  10. Astrophysics: Mike Norman et al., UCSD
      • Small (1 part in 10^5) spatial inhomogeneities 380,000 years after the Big Bang, as revealed by WMAP satellite data, get transformed by gravitation into the pattern of severe inhomogeneities (galaxies, stars, voids etc.) that we see today.
      • Uniform meshes won’t do; must zoom in on dense regions to capture the key physical processes: gravitation (including dark matter), shock heating and radiative cooling of gas. So need an adaptive mesh refinement scheme (they use 7 levels of mesh refinement).
      • The filamentary structure in this simulation, in a cube 1.5 billion light years on a side, is also seen in real-life observations such as the Sloan Digital Sky Survey.
  11. Astrophysics (cont’d)
      • Need large shared memory capabilities for generating initial conditions (adaptive mesh refinement is very hard to load-balance on distributed memory machines); then the largest distributed memory machines (Ranger & Kraken) for the simulation; shared memory again for data analysis and visualization; and long-term archival storage for configurations. So there is lots of data movement between sites.
      • TeraGrid helped make major improvements in the scaling and efficiency of the code (ENZO), and in the visualization tools, which are being stressed at these volumes.
  12. Nanoscale Electronic Structure (nanoHUB, Klimeck, Purdue)
      • Challenge of designing microprocessors and other devices with nanoscale components. Need quantum mechanics for quantum dots, resonant tunneling diodes, and nanowires.
      • Largest codes operate at the petascale (NEMO-3D, OMEN), using 32,768 cores of Ranger, and generally use resources at NCSA, PSC, IU, ORNL and Purdue.
      • Developing modeling and simulation tools and a simple user interface (Gateways) for non-expert users. In 2008, nanoHUB.org hosted more than 90 tools, had more than 6,200 users, ran more than 300,000 simulations, and supported 44 classes.
      • Will benefit from improved metascheduling capabilities to be implemented this year in TeraGrid, because users want interactive response for the simple calculations.
      • Communities develop the Gateways; TeraGrid helps interface them to TeraGrid resources.
  13. Aquaporins: Schulten group, UIUC
      • Aquaporins are proteins which conduct large volumes of water through cell membranes while filtering out charged particles like hydrogen ions.
      • Start with the known crystal structure, simulate 12 nanoseconds of molecular dynamics of over 100,000 atoms, using NAMD.
      • Water moves through aquaporin channels in single file. The oxygen atom leads the way in. At the most constricted point of the channel, the water molecule flips. Protons can’t do this.
  14. Aquaporin Mechanism
      • Animation pointed to by the 2003 Nobel chemistry prize announcement for structure of aquaporins (Peter Agre).
      • The simulation helped explain how the structure led to the function.
  15. Users and Usage
  16. 2008 TeraGrid Usage By Discipline
  17. If you’re not yet a TeraGrid user and are constraining your research to fit into your local capabilities…
      • Consider TeraGrid. Getting time is easy.
      • It’s free.
      • We’ll even help you with coding and optimization.
      • See www.teragrid.org/userinfo/getting_started.php?
      • Don’t be constrained by what appears possible today. Think about your problem and talk to us.
  18. Training (also free)
      • March 12-13, 2009: Parallel Optimization and Scientific Visualization for Ranger
      • March 19-20, 2009: OSG Grid Site Administrators Workshop
      • March 23-26, 2009: PSC/Intel Multi-core Programming and Performance Tuning Workshop
      • March 24, 2009: C Programming Basics for HPC (TACC)
      • April 13-16, 2009: 2009 Cray XT5 Quad-core Workshop (NICS)
      • April 21, 2009: Fortran 90/95 Programming for HPC (TACC)
      • June 22-26, 2009: TeraGrid '09
      For fuller schedule see: http://www.teragrid.org/eot/workshops.php
  19. Campus Champions Program
      • Campus advocate for TeraGrid and CI
      • TeraGrid ombudsman for local users
      • Training program for campus representatives
      • Quick start-up accounts for campus
      • TeraGrid contacts for problem resolution
      • Over 31 campuses signed on, more in discussions
      • We’re looking for interested campuses!
        – See Laura McGinnis
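
As a rough cross-check of the hardware figures quoted on slides 6 and 7, the following Python sketch divides each system's quoted peak by its core count and compares the Track 1 sustained and peak targets. It assumes the quoted peak is spread evenly over all cores. The function and variable names are illustrative and do not come from the talk.

```python
# Back-of-the-envelope checks on figures quoted in the slides.
# All names here are illustrative; only the numbers come from the talk.


def per_core_gflops(peak_tflops: float, cores: int) -> float:
    """Peak Gflops per core, assuming the peak is spread evenly over all cores."""
    return peak_tflops * 1000.0 / cores


def sustained_fraction(sustained_pflops: float, peak_pflops: float) -> float:
    """Fraction of peak performance that is expected to be sustained."""
    return sustained_pflops / peak_pflops


# (peak Tflops, core count) as quoted on slide 6.
systems = {
    "Ranger": (579.0, 62_976),
    "Kraken": (608.0, 66_048),
}

for name, (tflops, cores) in systems.items():
    print(f"{name}: {per_core_gflops(tflops, cores):.2f} Gflops per core")

# Track 1 system (slide 7): 1 Pflops sustained against 10 Pflops peak.
print(f"Track 1 sustained/peak: {sustained_fraction(1.0, 10.0):.0%}")
```

Both Track 2 systems come out near 9.2 Gflops per core, so their headline difference is mostly a matter of core count, while the Track 1 target corresponds to sustaining about one tenth of peak on real applications.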