Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

TOP500 List November 2014

3,082 views

Published on

Slides from the presentation of the 44th TOP500 List during the SC14 conference in New Orleans.

Published in: Software
  • Be the first to comment

TOP500 List November 2014

  1. 1. Highlights of the 44th TOP500 List SC14, New Orleans, November 17, 2014 Erich Strohmaier
  2. 2. # Site M4an1ufaSctTurerLIST: TCHompEute rTOP10 Country Cores Rmax [Pflops] Power [MW] 1 National University of Defense Technology NUDT Tianhe-2 NUDT TH-IVB-FEP, Xeon 12C 2.2GHz, IntelXeon Phi China 3,120,000 33.9 17.8 2 Oak Ridge National Laboratory Cray Titan Cray XK7, Opteron 16C 2.2GHz, Gemini, NVIDIA K20x USA 560,640 17.6 8.21 3 Lawrence Livermore National Laboratory IBM Sequoia BlueGene/Q, Power BQC 16C 1.6GHz, Custom USA 1,572,864 17.2 7.89 4 RIKEN Advanced Institute for Computational Science Fujitsu K Computer SPARC64 VIIIfx 2.0GHz, Tofu Interconnect Japan 795,024 10.5 12.7 5 Argonne National Laboratory IBM Mira BlueGene/Q, Power BQC 16C 1.6GHz, Custom USA 786,432 8.59 3.95 6 Swiss National Supercomputing Centre (CSCS) Cray Piz Daint Cray XC30, Xeon E5 8C 2.6GHz, Aries, NVIDIA K20x Switzer-land 115,984 6.27 2.33 7 Texas Advanced Computing Center/UT Dell Stampede PowerEdge C8220, Xeon E5 8C 2.7GHz, Intel Xeon Phi USA 462,462 5.17 4.51 8 Forschungszentrum Juelich (FZJ) IBM JuQUEEN BlueGene/Q, Power BQC 16C 1.6GHz, Custom Germany 458,752 5.01 2.30 9 Lawrence Livermore National Laboratory IBM Vulcan BlueGene/Q, Power BQC 16C 1.6GHz, Custom USA 393,216 4.29 1.97 10 Government Cray Cray CS-Storm, Xeon E5 10C 2.2GHz, I-FDR, NVIDIDA K40 USA 72,800 3.58 1.50
  3. 3. 50 78 0 100 150 200 250 300 350 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 REPLACEMENT RATE
  4. 4. ANNUAL PERFORMANCE INCREASE OF THE TOP500 2.6 2.4 2.2 2 1.8 1.6 1.4 1.2 1 − TOP500 Trend − Moore’s Law 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
  5. 5. AVERAGE SYSTEM AGE 0 0.5 1 1.5 2 2.5 3 3.5 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 Age [Years] 1.27 years
  6. 6. PERFORMANCE DEVELOPMENT 1 Eflop/s 1E+09 100 Pflop/s 100000000 10 Pflop/s 10000000 1 Pflop/s 1000000 100 Tflop/s 100000 10 Tflop/s 10000 1 Tflop/s 1000 100 Gflop/s 100 10 Gflop/s 10 1 0.1 1.17 TFlop/s 59.7 GFlop/s 400 MFlop/s 309 PFlop/s 33.9 PFlop/s 153 TFlop/s SUM N=1 N=500 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 1 Gflop/s 100 Mflop/s
  7. 7. PERFORMANCE DEVELOPMENT 1 Eflop/s 1E+09 100 Pflop/s 100000000 10 Pflop/s 10000000 1 Pflop/s 1000000 100 Tflop/s 100000 10 Tflop/s 10000 1 Tflop/s 1000 100 Gflop/s 100 10 Gflop/s 10 1 0.1 1.17 TFlop/s 59.7 GFlop/s 400 MFlop/s 309 PFlop/s 33.9 PFlop/s 153 TFlop/s SUM N=1 N=500 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 1 Gflop/s 100 Mflop/s June 2008
  8. 8. PERFORMANCE DEVELOPMENT 1 Eflop/s 1E+09 100 Pflop/s 100000000 10 Pflop/s 10000000 1 Pflop/s 1000000 100 Tflop/s 100000 10 Tflop/s 10000 1 Tflop/s 1000 100 Gflop/s 100 10 Gflop/s 10 1 0.1 1.17 TFlop/s 59.7 GFlop/s 400 MFlop/s 309 PFlop/s 33.9 PFlop/s 153 TFlop/s SUM N=1 N=500 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 1 Gflop/s 100 Mflop/s June 2008 June 2013
  9. 9. PROJECTED PERFORMANCE DEVELOPMENT 1E+11 1E+10 1 Eflop/s 1E+09 100 Pflop/s 100000000 10 Pflop/s 10000000 1 Pflop/s 1000000 100 Tflop/s 100000 10 Tflop/s 10000 1 Tflop/s 1000 100 Gflop/s 100 10 Gflop/s 10 1 0.1 SUM N=1 N=500 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020 1 Gflop/s 100 Mflop/s
  10. 10. PERFORMANCE FRACTION OF THE TOP5 SYSTEMS 18% 16% 14% 12% 10% 8% 6% 4% 2% 0% 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 1 2 3 4 5
  11. 11. TOTAL PERFORMANCE IS ACCUMULATED 100 90 80 70 60 50 40 30 20 10 0 RANK AT WHICH HALF OF 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
  12. 12. BELL’S LAW (1972) “Bell's Law of Computer Class formation was discovered about 1972. It states that technology advances in semiconductors, storage, user interface and networking advance every decade enable a new, usually lower priced computing platform to form. Once formed, each class is maintained as a quite independent industry structure. This explains mainframes, minicomputers, workstations and Personal computers, the web, emerging web services, palm and mobile devices, and ubiquitous interconnected networks. We can expect home and body area networks to follow this path.” From Gordon Bell (2007), http://research.microsoft.com/~GBell/Pubs.htm
  13. 13. HPC COMPUTER CLASSES AND BELL’S LAW • Bell’s Law states, that: • Important classes of computer architectures come in cycles of about 10 years. • It takes about a decade for each phase – Early research – Early adoption and maturation – Prime usage – Phase out past its prime • Can we use Bell’s Law to classify computer architectures in the TOP500?
  14. 14. 500 450 400 350 300 250 200 150 100 50 0 Accelerated/E mbedded Commodity Cluster Custom Scalar Vector/SIMD BELL’S LAW
  15. 15. BELL’S LAW HPC Computer Classes Class Early Adoption starts: Prime Use starts: Past Prime Usage starts: Data Parallel Systems Mid 70’s Mid 80’s Mid 90’s Custom Scalar Systems Mid 80’s Mid 90’s Mid 2000’s Commodity Cluster Mid 90’s Mid 2000’s Mid 2010’s ??? Accelerators or Embedded Proc Mid 2000’s Mid 2010’s ?! Mid 2020’s ???
  16. 16. 80 70 60 50 40 30 20 10 0 2006 2007 2008 2009 2010 2011 2012 2013 2014 Systems Kepler/Phi Intel Xeon Phi Clearspeed IBM Cell ATI Radeon Nvidia Kepler Nvidia Fermi ACCELERATORS
  17. 17. PERFORMANCE OF ACCELERATORS 120 100 80 60 40 20 0 2006 2007 2008 2009 2010 2011 2012 2013 2014 Total Performance [Pflop/s] Kepler/Phi Clearspeed ATI Radeon IBM Cell Intel Xeon Phi Nvidia Fermi Nvidia Kepler
  18. 18. PERFORMANCE SHARE OF ACCELERATORS 40% 35% 30% 25% 20% 15% 10% 5% 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 Fraction of Total TOP500 Performance
  19. 19. CORES PER SOCKET 500 450 400 350 300 250 200 150 100 50 0 2002 2004 2006 2008 2010 2012 2014 16 14 12 10 9 8 6 4 2 1
  20. 20. 17 56 231 57 88 7 44 4 cores 6 cores 8 cores 10 cores 12 cores 14 cores 16 cores CORES PER SOCKET
  21. 21. 1000000 100000 10000 1000 100 10 1 0.1 TECHNOLOGICAL TRENDS Average of Rmax Average of Sockets Average of Rmax/socket Average of Cores per Socket Average of Rmax/core
  22. 22. LINPACK EFFICIENCY 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% 0 100 200 300 400 500 Linpack Efficiency
  23. 23. United States 46% China 12% Korea, South Australia France 6% 2% United Kingdom Japan 6% 6% Germany 5% India 2% Russia 2% 2% Others, 53, 11% United States China Japan United Kingdom France Germany India Russia Australia Korea, South Others COUNTRIES / SYSTEM SHARE
  24. 24. 100,000 10,000 1,000 100 10 1 0 2000 2002 2004 2006 2008 2010 2012 2014 Total Performance [Tflop/s] US EU Japan China PERFORMANCE OF COUNTRIES
  25. 25. 0 20 40 60 80 100 120 140 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 Others India China Korea, South Japan ASIAN COUNTRIES
  26. 26. 200 180 160 140 120 100 80 60 40 20 0 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 Systems Others Poland Russia Spain Sweden Switzerland Netherlands Italy France United Kingdom Germany EUROPEAN COUNTRIES
  27. 27. HP, 180, 36% IBM, 153, 31% NUDT, 5, 1% Fujitsu, 8, 2% Dell, 9, 2% Cray Inc., 62, 12% Bull, 17, 3% SGI, 23, 4% Others, 43, 9% HP IBM Cray Inc. SGI Bull Dell Fujitsu NUDT Others VENDORS / SYSTEM SHARE
  28. 28. Cray, 16, 32% IBM, 13, 26% NUDT, 3, 6% SGI, 4, 8% Bull, 5, 10% Fujitsu, 3, 6% Others, 6, 12% Cray IBM Bull SGI Fujitsu NUDT Others VENDORS (TOP50) / SYSTEM SHARE
  29. 29. POWER CONSUMPTION 8 7 6 5 4 3 2 1 0 2008 2009 2010 2011 2012 2013 2014 Power [MW] TOP10 TOP50 TOP500 3.0 x in 5 y 2.6 x in 5 y 2.7 x in 5 y
  30. 30. POWER CONSUMPTION 13 12 11 10 9 8 7 6 5 4 3 2 1 0 Power [MW] TOP10 TOP50 TOP500
  31. 31. POWER EFFICIENCY 2,500 2,000 1,500 1,000 500 0 2008 2009 2010 2011 2012 2013 2014 Linpack/Power [Gflops/kW] TOP10 TOP50 TOP500
  32. 32. POWER EFFICIENCY 4,500 4,000 3,500 3,000 2,500 2,000 1,500 1,000 500 0 AMD FirePro 2008 2009 2010 2011 2012 2013 2014 Linpack/Power [Gflops/kW] TOP10 TOP50 TOP500 Max-Efficiency BlueGene/Q Cell Mic Tsubame KFC NVIDIA K20x
  33. 33. MOST POWER EFFICIENT ARCHITECTURES Computer Rmax/ Power Tsubame KFC, NEC, Xeon 6C 2.1GHz, Infiniband FDR, NVIDIA K20x 4,272 Cray CS-Storm, Xeon 10C 2.2GHz, Infiniband FDR, NVIDIA K40m 3,879 iDataPlex DX360M4, IBM, Xeon 10C 2.8GHz, Infiniband, NVIDIA K20x 3,543 Cartesius, Bull B515, Xeon 8C 2.5GHz, InfiniBand 4×FDR, Nvidia K40m 3,459 Romeo, Bull Cluster, Xeon 8C 2.6GHz, Infiniband FDR, NVIDIA K20x 3,131 HA-PACS TCA, Cray Cluster, Xeon 10C 2.8GHz, QDX, NVIDIA K20x 2,980 SANAM, Adtech, ASUS, Xeon 8C 2.0GHz, Infiniband FDR, AMD FirePro 2,973 Piz Daint, Cray XC30, Xeon 8C 2.6GHz, Aries, NVIDIA K20x 2,697 iDataPlex DX360, Xeon 10C 2.8GHz, Infiniband FDR, NVIDIA K20x 2,629 Cypress, Dell, Xeon 10C 2.8GHz, 40Gb Ethernet, Xeon Phi 7120P 2,541 Shadow, Cray CS300-LC, Xeon 10C 2.8GHz, Infiniband FDR, Xeon Phi 5110P 2,495 [Mflops/Watt]

×