NICS, Adaptive Computing, and Intel: Leadership in HPC


In this video from Moabcon 2013, Troy Baer presents: NICS, Adaptive Computing, and Intel: Leadership in HPC.

“An Appro Xtreme-X Supercomputer named Beacon, deployed by the National Institute for Computational Sciences (NICS) of the University of Tennessee, tops the current Green500 list, which ranks the world’s fastest supercomputers based on their power efficiency. To earn its number-one ranking, the supercomputer employed Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors to produce 112.2 trillion calculations per second using only 44.89 kW of power, resulting in world-record efficiency of 2.499 billion floating point operations per second per watt.”
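The efficiency figure follows directly from the two measurements quoted above. As a minimal sketch of that arithmetic (the function name `gflops_per_watt` is illustrative and not part of any Green500 tooling), dividing sustained performance by power draw gives:

```python
def gflops_per_watt(tflops: float, kilowatts: float) -> float:
    """Convert sustained TFLOP/s and power in kW into GFLOP/s per watt."""
    gflops = tflops * 1_000.0    # 1 TFLOP/s = 1,000 GFLOP/s
    watts = kilowatts * 1_000.0  # 1 kW = 1,000 W
    return gflops / watts


# Figures quoted for Beacon: 112.2 TFLOP/s at 44.89 kW.
beacon_efficiency = gflops_per_watt(112.2, 44.89)
print(f"{beacon_efficiency:.3f} GFLOP/s per watt")  # prints "2.499 GFLOP/s per watt"
```

Because both unit conversions scale by the same factor of 1,000, TFLOP/s per kW and GFLOP/s per watt are numerically identical.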

Published in: Technology, Business

  1. NICS, Adaptive Computing, and Intel: Leadership in HPC
     Troy Baer, Senior HPC System Administrator, NICS
  2. Overview
     • Introduction to NICS
     • NICS and Adaptive Computing
     • NICS and Intel
       – SC12 Green 500 effort
     • Going Forward
  3. National Institute for Computational Sciences: A University of Tennessee / ORNL Partnership
     • NICS is an NSF-funded HPC center
       – Founded in 2007
       – Operated by the University of Tennessee, located at ORNL
       – XSEDE Partner and Service Provider
     • XSEDE Systems
       – Kraken (Cray XT5, 112,984 Opteron cores)
       – Nautilus (SGI UV, 1,152 Nehalem cores + 16 M2070 GPUs)
       – Keeneland final system (HP GPU cluster, 4,224 Sandy Bridge cores + 792 M2090 GPUs) in conjunction with Georgia Tech
  4. Other Systems and Projects at NICS
     • Non-XSEDE Systems
       – Keeneland initial delivery system (HP GPU cluster) in conjunction with Georgia Tech
       – Ares (Cray XE/XK6)
       – Beacon (Appro/Cray cluster; more on this later...)
       – Darter (Cray XC30; more on this later...)
     • Associated Centers and Projects
       – Application Acceleration Center of Excellence (AACE)
         • Parent project for Beacon
       – Remote Data Analysis and Visualization (RDAV) project
         • Parent project for Nautilus
  5. NICS and Adaptive Computing
     • NICS and Adaptive have been working together literally since the founding of the center
     • Achievements
       – Kraken: 90-95% utilization on a petaflop-class system for 3 years and counting!
         • Over 3 billion core-hours delivered in total, 965 million delivered in CY2012
         • Delivering ~65% of all XSEDE computing cycles until very recently
         • Bi-modal scheduling for capability vs. capacity
       – Athena (Cray XT4): Dedicated access for COLA climate modeling group for ~6 months
       – Kraken/Athena: Annual OU CAPS Spring Experiment (storm forecasting)
       – Nautilus: NUMA+GPU scheduling
       – KIDS and KFS: GPU scheduling test bed
  6. NICS and Intel
     • AACE was born of conversations between NICS, ORNL, and Intel in early 2011
     • Beacon project
       – Application readiness for Intel Xeon Phi
       – NSF STCI award provided people funding and initial hardware
         • 8 funded science teams
         • Open call for more science teams just ended
       – Second phase of hardware funded by the University of Tennessee system and the state of Tennessee
         • Data-intensive computing
         • Power efficiency research
  7. BEACON

     |                             | Phase 1                | Phase 2                     |
     |-----------------------------|------------------------|-----------------------------|
     | Compute Nodes               | 16 Appro Grizzly Pass  | 48 Appro GreenBlade GBN814N |
     | Node Processor              | 2x 8-core Sandy Bridge | 2x 8-core Sandy Bridge      |
     | Memory/Node (GB)            | 64                     | 256                         |
     | SSD/Node (GB)               | 160                    | 960                         |
     | Xeon Phis/Node              | 2                      | 4                           |
     | Interconnect                | QDR InfiniBand         | FDR InfiniBand              |
     | Bandwidth to Storage (GB/s) | ~2.5                   | ~15                         |
     | OS                          | CentOS 6.2             | CentOS 6.2                  |
     | Installation                | NFS-root               | Diskful                     |
     | Batch Environment           | TORQUE/Moab            | TORQUE/Moab                 |
  8. SC12 Green 500 Effort
     • In the run-up to the Supercomputing 2012 conference, NICS, Intel, and Appro (now Cray) decided to take a shot at #1 on the Green 500 list
     • People worked on the system literally around the clock in Tennessee, California, India, and Germany for a month to make this happen!
     • Result: New record of 112.2 TF/s @ 44.89 kW (i.e. 2.499 GF/W)
  9. Stupid Phi Tricks
     • Xeon Phis have a number of programming models
       – Offload (like GPUs)
       – Reverse offload (i.e. Phis offloading to the host)
       – Native mode (i.e. running MPI ranks on Phis)
       – Various hybrids thereof
     • Xeon Phis are basically embedded x86_64 Linux boxes, complete with SSH, NFS, etc., which allows you to do all sorts of clever and/or hilarious things in job prologues and epilogues
       – NFS-export Lustre and/or local scratch from host to Phis
         • The Phis' BusyBox NFS client currently doesn't support NFS v3 locking (Intel is working on this)
       – Provision the job owner's uid (and only the job owner's uid) on MICs at job start
       – Reboot Phis between jobs
         • A bit slower than one might like (Intel is working on this as well)
  10. Going Forward
      • New systems
        – Beacon Phase 2 (just accepted)
        – Darter (Cray XC30, just received and accepted)
        – Hopefully more in the future...
      • New architectures make for interesting challenges WRT allocations and accounting
        – With GPUs and MICs becoming more commonplace, the notion of a “CPU-hour” or “core-hour” is even less meaningful than it was before.
        – Should the new accounting unit be the “node-hour”?
      • Growing gap between capability/hero users and capacity/canned-code users needs to be addressed somehow
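
The accounting question on the final slide can be made concrete with a small comparison. The sketch below is illustrative only: the `Job` class and both charging functions are hypothetical and do not represent NICS or XSEDE accounting policy. The only figure taken from the slides is the 16 host cores per Beacon node (2x 8-core Sandy Bridge). It compares a job that drives the Xeon Phis from a single host core per node with a job that uses every host core.

```python
from dataclasses import dataclass

HOST_CORES_PER_NODE = 16  # 2x 8-core Sandy Bridge, per the Beacon table


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; the fields are assumptions for illustration."""
    nodes: int
    wall_hours: float
    host_cores_used_per_node: int


def core_hours(job: Job) -> float:
    """Charge by host cores actually used; coprocessors are invisible here."""
    return job.nodes * job.host_cores_used_per_node * job.wall_hours


def node_hours(job: Job) -> float:
    """Charge by whole nodes occupied, whatever runs on them."""
    return job.nodes * job.wall_hours


# An offload-style job: one host core per node feeds the Xeon Phis.
offload_job = Job(nodes=4, wall_hours=10.0, host_cores_used_per_node=1)
# A host-only job: all 16 host cores busy, Xeon Phis idle.
host_job = Job(nodes=4, wall_hours=10.0,
               host_cores_used_per_node=HOST_CORES_PER_NODE)

for name, job in (("offload", offload_job), ("host-only", host_job)):
    print(f"{name}: {core_hours(job):.0f} core-hours, "
          f"{node_hours(job):.0f} node-hours")
```

Under this sketch both jobs occupy the same four nodes for the same ten hours, yet the core-hour charge differs by a factor of 16 while the node-hour charge is identical, which is one way to read the slide's point that the core-hour loses meaning once accelerators do most of the work.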