Why we need Exascale and why we won't get there by 2020
Horst Simon
Lawrence Berkeley National Laboratory
Optical Interconnects Conference
Santa Fe, New Mexico
May 6, 2013
Overview
• Current state of HPC: petaflops firmly established
• Why we won’t get to exaflops (in 2020)
• The case for exascale
  – Science (old)
  – Science (new)
  – Technology and economic competitiveness
  – Discovery
Thank you to the DOE Hexlab group, the TOP500 team, John Shalf, Kathy Yelick and many others.
Current State of HPC: petaflops firmly established
Technology Paths to Exascale
• Leading Technology Paths (Swim Lanes)
  – Multicore: Maintain complex cores, and replicate (x86, SPARC, Power7) [#3, 6, and 10]
  – Manycore/Embedded: Use many simpler, low-power cores from embedded (BlueGene) [#2, 4, 5, and 9]
  – GPU/Accelerator: Use highly specialized processors from gaming/graphics market space (NVIDIA Fermi, Cell, Intel Phi (MIC)) [#1, 7, and 8]
Events of 2011/2012 and the Big Questions for the Next Three Years until 2015
• Multicore: IBM cancels “Blue Waters” contract. Is this an indication that multicore with complex cores is nearing the end of the line?
• Manycore/Embedded: Blue Gene is the last of the line. Will there be no more large scale embedded multicore machines?
• GPU/Accelerator: Intel bought Cray’s interconnect technology and WhamCloud. Will Intel develop complete systems?
✔ Prediction: All Top 10 systems in 2015 will be GPU/Accelerator based
Titan System at Oak Ridge National Laboratory (accelerator)
• Upgrade of Jaguar from Cray XT5 to XK6
• Cray Linux Environment operating system
• Gemini interconnect
  – 3-D Torus
  – Globally addressable memory
  – Advanced synchronization features
• AMD Opteron 6274 processors (Interlagos)
• New accelerated node design using NVIDIA multi-core accelerators
  – 2011: 960 NVIDIA x2090 “Fermi” GPUs
  – 2012: 14,592 NVIDIA “Kepler” GPUs
• 20+ PFlops peak system performance
• 600 TB DDR3 mem. + 88 TB GDDR5 mem
Sequoia at Lawrence Livermore National Laboratory (Manycore/embedded)
• Fully deployed in June 2012
• 16.32475 petaflops Rmax and 20.13266 petaflops Rpeak
• 7.89 MW power consumption
• IBM Blue Gene/Q
• 98,304 compute nodes
• 1.6 million processor cores
• 1.6 PB of memory
• 8 or 16 core Power Architecture processors built on a 45 nm fabrication method
• Lustre Parallel File System
K Computer at RIKEN Advanced Institute for Computational Science (Multicore)
• #3 on the TOP500 list
• Distributed memory architecture
• Manufactured by Fujitsu
• Performance: 10.51 PFLOPS Rmax and 11.2804 PFLOPS Rpeak
• 12.65989 MW power consumption, 824.6 GFLOPS/kW
• Annual running costs: $10 million
• 864 cabinets:
  – 88,128 8-core SPARC64 VIIIfx processors @ 2.0 GHz
  – 705,024 cores
  – 96 computing nodes + 6 I/O nodes per cabinet
• Water cooling system minimizes failure rate and power consumption
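The Rmax, Rpeak, and power figures quoted for Sequoia and the K Computer allow two derived numbers to be checked: HPL efficiency (Rmax/Rpeak) and power efficiency (Rmax per kW). The Python sketch below recomputes both using only the slide values. The dictionary layout and function names are illustrative choices, not part of the talk. A GFLOPS/kW figure quoted on a slide may come from a different measurement run than the quoted Rmax and power, so the recomputed value need not match it exactly.

```python
# Figures as quoted on the Sequoia and K Computer slides.
SYSTEMS = {
    "Sequoia": {"rmax_pflops": 16.32475, "rpeak_pflops": 20.13266, "power_mw": 7.89},
    "K Computer": {"rmax_pflops": 10.51, "rpeak_pflops": 11.2804, "power_mw": 12.65989},
}


def hpl_efficiency(rmax_pflops, rpeak_pflops):
    """Fraction of theoretical peak performance achieved on HPL."""
    return rmax_pflops / rpeak_pflops


def gflops_per_kw(rmax_pflops, power_mw):
    """Power efficiency in GFLOPS/kW (1 PFLOPS = 1e6 GFLOPS, 1 MW = 1e3 kW)."""
    return (rmax_pflops * 1e6) / (power_mw * 1e3)


def summarize(systems):
    """Return {name: (hpl_efficiency, gflops_per_kw)} for each system."""
    return {
        name: (
            hpl_efficiency(s["rmax_pflops"], s["rpeak_pflops"]),
            gflops_per_kw(s["rmax_pflops"], s["power_mw"]),
        )
        for name, s in systems.items()
    }


if __name__ == "__main__":
    for name, (eff, gpkw) in summarize(SYSTEMS).items():
        print(f"{name}: HPL efficiency {eff:.1%}, {gpkw:,.0f} GFLOPS/kW")
```

With these inputs Sequoia comes out near 81% HPL efficiency and roughly 2,070 GFLOPS/kW, and the K Computer near 93% and roughly 830 GFLOPS/kW.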
Choice of Swim Lanes
• Swim Lane choices – currently all three are equally represented
• Risk in Swim Lane switch for large facilities
  – Select too soon: Users cannot follow
  – Select too late: Fall behind performance curve
  – Select incorrectly: Subject users to multiple disruptive technology changes
Edison is the next production computing system dedicated to DOE Office of Science
• First Cray Cascade System
  – Developed for the DARPA HPCS Program
  – First Cray system with Intel processors
  – New Aries interconnect
• Energy efficient design
  – Warm water cooling
  – Power capping
  – ~1.1 MW per PF with conventional CPUs only
5200 compute nodes
> 100K processing cores
333 Terabytes memory
> 2 petaflops peak
Events of 2011/2012 and the Big Question for the Next Three Years until 2016
• IBM cancels “Blue Waters” contract. Is this an indication that multicore with complex cores is nearing the end of the line?
• Blue Gene Q is the last of the line. Will there be no more large scale embedded multicore machines?
• Intel bought Cray’s interconnect technology and WhamCloud. Will Intel develop complete systems?
• Will there be only manycore and accelerator based systems among the TOP10 in 2016?
State of Supercomputing in 2013
• Pflops computing fully established with more than 20 machines
• Three technology “swim lanes” are thriving
• Interest in supercomputing is now worldwide, and growing in many new markets
• Exascale projects in many countries and regions
• Rapid growth predicted by IDC for the next three years
The Power and Clock Inflection Point in 2004
[Figure: plot not recoverable from the extracted text]
Source: Kogge and Shalf, IEEE CISE
Towards Exascale
• Exascale has been discussed in numerous workshops, conferences, planning meetings for about six years
• Exascale projects have been started in the U.S. and many other countries and regions
• Key challenges to exascale remain
The Exascale Challenge
• There is a lot of vagueness in the exascale discussion and what it means to reach exascale. Let me propose a concrete and measurable goal.
• Build a system before 2020 that will be #1 on the TOP500 list with an Rmax > 1 exaflops
• Personal bet (with Thomas Lippert, Jülich, Germany) that we will not reach this goal by November 2019 (for $2,000 or €2,000)
• Unfortunately, I think that I will win the bet. But I would rather see an exaflops before the end of the decade and lose my bet.
At Exascale, even LINPACK is a Challenge
[Figure: HPL results for #1 System on the TOP500 (15 machines in that club)]
Source: Jack Dongarra, ISC’12
At Exascale HPL will take 5.8 days
Source: Jack Dongarra, ISC’12
Power Efficiency over Time
[Figure: Linpack/Power in Gflops/kW (0 to 3,000), 2008 to 2012, for TOP10, TOP50, and TOP500; annotation: Accelerator and BG multicore]
Data from: TOP500 November 2012
Power Efficiency over Time
[Figure: Linpack/Power in Gflops/kW (0 to 3,000), 2008 to 2012, for TOP10, TOP50, and TOP500; annotation: One time technology improvement, not a change in trend rate]
Data from: TOP500 November 2012
The Problem with Wires: Energy to move data proportional to distance
• Cost to move a bit on copper wire:
  – energy = bitrate * Length^2 / cross-section area
• Wire data capacity constant as feature size shrinks
• Cost to move bit proportional to distance
• ~1 TByte/sec max feasible off-chip BW (10 GHz/pin)
• Photonics reduces distance-dependence of bandwidth
Copper requires signal amplification even for on-chip connections
Photonics requires no redrive, and passive switching uses little power
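The copper-wire scaling rule on this slide can be read as a ratio calculation. The sketch below is a direct transcription of "energy is proportional to bitrate times length squared over cross-section area" into Python. It contains no physical constants, so it only answers "how many times more energy than a baseline wire", not absolute joules. The function name and its parameters are hypothetical, chosen for illustration.

```python
def relative_wire_energy(bitrate_ratio=1.0, length_ratio=1.0, area_ratio=1.0):
    """Energy relative to a baseline copper wire.

    Each argument is the new value divided by the baseline value.
    Implements energy ~ bitrate * length**2 / cross_section_area.
    """
    if min(bitrate_ratio, length_ratio, area_ratio) <= 0:
        raise ValueError("all ratios must be positive")
    return bitrate_ratio * length_ratio ** 2 / area_ratio


if __name__ == "__main__":
    # Doubling the wire length at fixed bitrate and cross-section.
    print(relative_wire_energy(length_ratio=2.0))
    # Halving the cross-section area at fixed bitrate and length.
    print(relative_wire_energy(area_ratio=0.5))
```

Under this rule, doubling the length quadruples the energy, and halving the cross-section doubles it, which is why shrinking feature sizes do not make wires cheaper to drive.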
The Cost of Data Movement
[Figure: energy in picojoules/bit, log scale from 1 to 10,000, "now"; labeled regions: SMP, MPI]
The Cost of Data Movement
[Figure: energy in picojoules/bit, log scale from 1 to 10,000, "now"; labeled regions: CMP, SMP, MPI; reference line: Cost of a FLOP]
The Cost of Data Movement in 2018
[Figure: energy in picojoules/bit, log scale; remaining labels not recoverable from the extracted text]
• FLOPs will cost less than on-chip data movement! (NUMA)
• None of the FastForward projects addresses this challenge
It’s the End of the World as We Know It!
Summary Trends
[Figure: plot not recoverable from the extracted text]
Source: Kogge and Shalf, IEEE CISE
Why I believe that I will win my bet
1. HPL at Exaflop/s level will take about a week to run – challenge for center directors and new systems.
2. We realized a one time gain in power efficiency by switching to accelerator/manycore. This is not a sustainable trend in the absence of other new technology.
3. Data movement will cost more than flops (even on chip).
4. Limited amount of memory, low memory/flop ratios (processing is free)
The Logical Conclusion
If FLOPS are free, then why do we need an “exaflops” initiative?
“Exaflops”, “Exascale”, “Exa”-anything has become a bad brand
• Associated with buying big machines for the labs
• Associated with “old” HPC
• Sets up the community for “failure,” if “goal” can’t be met
It is not just “exaflops” – we are changing the whole computational model
Current programming systems have WRONG optimization targets
Old Constraints
• Peak clock frequency as primary limiter for performance improvement
• Cost: FLOPs are biggest cost for system: optimize for compute
• Concurrency: Modest growth of parallelism by adding nodes
• Memory scaling: maintain byte per flop capacity and bandwidth
• Locality: MPI+X model (uniform costs within node & between nodes)
• Uniformity: Assume uniform system performance
• Reliability: It’s the hardware’s problem
New Constraints
• Power is primary design constraint for future HPC system design
• Cost: Data movement dominates: optimize to minimize data movement
• Concurrency: Exponential growth of parallelism within chips
• Memory Scaling: Compute growing 2x faster than capacity or bandwidth
• Locality: must reason about data locality and possibly topology
• Heterogeneity: Architectural and performance non-uniformity increase
• Reliability: Cannot count on hardware protection alone
Fundamentally breaks our current programming paradigm and computing ecosystem
Climate change analysis
Simulations
• Cloud resolution, quantifying uncertainty, understanding tipping points, etc., will drive climate to exascale platforms
• New math, models, and systems support will be needed
Extreme data
• “Reanalysis” projects need 100× more computing to analyze observations
• Machine learning and other analytics are needed today for petabyte data sets
• Combined simulation/observation will empower policy makers and scientists
Qualitative Improvement of Simulation with Higher Resolution (2011)
Michael Wehner, Prabhat, Chris Algieri, Fuyu Li, Bill Collins, Lawrence Berkeley National Laboratory; Kevin Reed, University of Michigan; Andrew Gettelman, Julio Bacmeister, Richard Neale, National Center for Atmospheric Research
Exascale combustion simulations
• Goal: 50% improvement in engine efficiency
• Center for Exascale Simulation of Combustion in Turbulence (ExaCT)
  – Combines M&S and experimentation
  – Uses new algorithms, programming models, and computer science
Warhead assessment and certification of a smaller nuclear stockpile
Assessment process: All aspects will require exascale computing to support commitment never to return to underground testing
Modeling molten tantalum: 10M atoms required; modeling molten plutonium: 1000× more computing power
Self-steering (a data analysis challenge): Essential to minimize number of simulations
Predicting with confidence requires precise understanding of aging and individual life-extended weapons
• High accuracy in individual weapons calculations:
  – Advanced physical models
  – 3-D to resolve features and phenomena
  – Extreme resolution
• Uncertainty quantification:
  – Massive numbers of high-resolution 3-D simulations to explore impacts of small variations in individual devices
Materials genome
Computing 1000× today
• Key to DOE’s Energy Storage Hub
• Tens of thousands of simulations used to screen potential materials
• Need more simulations and fidelity for new classes of materials, studies in extreme environments, etc.
Data services for industry and science
• Results from tens of thousands of simulations web-searchable
• Materials Project launched in October 2012, now has >3,000 registered users
• Increase U.S. competitiveness; cut in half the 18-year time from discovery to market
[Figure labels: Today’s batteries; Voltage limit; Interesting materials]
By 2018:
• Increase energy density (70 miles → 350 miles)
• Reduce battery cost per mile ($150 → $30)
DOE Systems Biology Knowledgebase: KBase
Microbes, Communities, Plants
• Integration and modeling for predictive biology
• Knowledgebase enabling predictive systems biology
  – Powerful modeling framework
  – Community-driven, extensible and scalable open-source software and application system
  – Infrastructure for integration and reconciliation of algorithms and data sources
  – Framework for standardization, search, and association of data
  – Resource to enable experimental design and interpretation of results
Obama Announces BRAIN Initiative – Proposes $100 million in his FY2014 budget
• Brain Research through Advancing Innovative Neurotechnologies
• Create real-time traffic maps to provide new insights into brain disorders
• “There is this enormous mystery waiting to be unlocked, and the BRAIN Initiative will change that by giving scientists the tools they need to get a dynamic picture of the brain in action and better understand how we think and how we learn and how we remember,” Barack Obama
See Alivisatos et al., Neuron 74, June 2012, pg 970
The Cat is out of the Bag: Cortical Simulations with 10^9 neurons, 10^13 synapses
Modha Group at IBM Almaden
N: neurons, S: synapses
Species   N            S
Mouse     16 x 10^6    128 x 10^9
Rat       56 x 10^6    448 x 10^9
Cat       763 x 10^6   6.1 x 10^12
Monkey    2 x 10^9     20 x 10^12
Human     22 x 10^9    220 x 10^12
Systems shown: Almaden BG/L (December 2006); Watson BG/L (April 2007); Watson Shaheen BG/P (March 2009); LLNL Dawn BG/P (May 2009); LLNL Sequoia BG/Q (June 2012)
New results for SC12: Latest simulations in 2012 achieve unprecedented scale of 65 x 10^9 neurons and 16 x 10^12 synapses
Towards Exascale – the Power Conundrum
• Straightforward extrapolation results in a real-time human brain scale simulation at about 1 - 10 Exaflop/s with 4 PB of memory
• Current predictions envision Exascale computers in 2020 with a power consumption of at best 20 - 30 MW
• The human brain takes 20 W
• Even under best assumptions in 2020 our brain will still be a million times more power efficient
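The "million times" figure follows directly from the numbers on this slide, and the same numbers imply a target power efficiency for an exaflops machine. The Python sketch below works through both. It assumes, as the slide does, that the exascale machine and the brain are compared on the same task (a real-time brain-scale simulation), and the constant and function names are illustrative rather than taken from the talk.

```python
EXAFLOPS = 1e18          # floating point operations per second
BRAIN_POWER_W = 20.0     # human brain power draw quoted on the slide


def required_gflops_per_watt(power_mw, flops=EXAFLOPS):
    """Efficiency a machine needs to deliver `flops` within `power_mw` megawatts."""
    watts = power_mw * 1e6
    return flops / watts / 1e9


def brain_efficiency_advantage(machine_power_mw, brain_power_w=BRAIN_POWER_W):
    """How many times less power the brain uses than the machine."""
    return machine_power_mw * 1e6 / brain_power_w


if __name__ == "__main__":
    for mw in (20.0, 30.0):
        print(
            f"{mw:.0f} MW: needs {required_gflops_per_watt(mw):.1f} GFLOPS/W, "
            f"brain uses {brain_efficiency_advantage(mw):,.0f}x less power"
        )
```

At 20 MW the machine must sustain 50 GFLOPS/W (50,000 GFLOPS/kW) and draws 1,000,000 times the brain's power; at 30 MW the requirement relaxes to about 33 GFLOPS/W while the power gap grows to 1.5 million.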
Exascale is Critical to the Nation to Develop Future Technologies and Maintain Economic Competitiveness
U.S. competitive advantage demands exascale resources
• Digital design and prototyping at exascale enable rapid delivery of new products to market by minimizing the need for expensive, dangerous, and/or inaccessible testing
• Potential key differentiator for American competitiveness
• Strategic partnerships between DOE labs and industrial partners to develop and scale applications to exascale levels
Exascale computing is key for national security
“I think that what we’ve seen is that they may be somewhat further ahead in the development of that aircraft than our intelligence had earlier predicted.”
Defense Secretary Robert M. Gates (The New York Times, Jan. 9, 2011)
Potential adversaries are not unaware of this
Exascale technologies are the foundation for future leadership in computing
• The U.S. is not the only country capable of achieving exascale computing.
• The country that is first to exascale will have significant competitive intellectual, technological, and economic advantages.
• Achievement of the power efficiency and reliability goals needed for exascale will have enormous positive impacts on consumer electronics and business information technologies and facilities.
[Figure: The gap in available supercomputing capacity between the United States and the rest of the world has narrowed, with China gaining the most ground.]
Data from TOP500.org; Figure from Science; Vol 335; 27 January 2012
Admiral Zheng He’s Voyages in 1416
http://en.wikipedia.org/wiki/Zheng_He
The Rise of the West
“Like the Apollo moon missions, Zheng He’s voyages had been a formidable demonstration of wealth and technological sophistication. Landing … on the East African coast in 1416 was in many ways … comparable to landing an American astronaut on the moon in 1969. By abruptly canceling oceanic exploration, Yongle’s successors ensured that the economic benefits remained negligible.”
Exascale is the next step in a long voyage of discovery; we should not give up now and retreat
Exascale rationale
• Exascale computing means performing computer simulations on a machine capable of 10 to the 18th power floating point operations per second, about 1,000× the current state of the art.
• The U.S. is capable of reaching exascale simulation capability within ten years with cost-effective, efficient computers. An accelerated program to achieve exascale is required to meet this goal, and doing so will enable the U.S. to lead in key areas of science and engineering.
• The U.S. is not the only country capable of achieving exascale computing. The country that is first to exascale will have significant competitive intellectual, technological, and economic advantages.
• Exascale systems and applications will improve predictive understanding in complex systems (e.g. climate, energy, materials, nuclear stockpile, emerging national security threats, biology, …)
• Tightly coupled exascale systems make possible fundamentally new approaches to uncertainty quantification and data analytics – uncertainty quantification is the essence of predictive simulation.
• American industry is increasingly dependent on HPC for design and testing in a race to get products to market. Exascale computing and partnerships with DOE laboratories could be THE key differentiator, a winning U.S. strategy for advanced manufacturing and job creation.
• The scale of computing developed for modeling and simulation will have a revolutionary impact on data analysis and data modeling. Much of the R&D required is common to big data and exascale computing.
• Exascale provides a ten-year goal to drive modeling and simulation developments forward and to attract young scientific talent to DOE in a variety of scientific and technology fields of critical importance to the future of the U.S.
• Achievement of the power efficiency and reliability goals needed for exascale will have enormous positive impacts on consumer electronics and business information technologies and facilities.
Summary
• There is progress in Exascale with many projects now focused and on their way, e.g. FastForward, Xstack, and Co-Design Centers in the U.S.
• HPC has moved to low power processing, and the processor growth curves in energy efficiency could get us in the range of exascale feasibility
• Memory and data movement are still more open challenges
• Programming model needs to address heterogeneous, massively parallel environment, as well as data locality
• Exascale applications will be a challenge just because of their sheer size and the memory limitations
Sustained Petaflops Performance a reality – congratulations to NCSA!