2010 in Review: Nvidia Tesla GPU
Tesla – Year in Review 2010
The growth of GPU Computing in HPC has continued unabated this year with many new milestones achieved. Hard to believe that it’s only been three and a half years since Tesla launched.
At the end of last year, we talked about how it felt like we had reached a “tipping point” with Tesla, a level at which momentum for change seemed unstoppable. If I had to find two words to summarize this year, I would say that it feels like Tesla has reached “escape velocity”, the speed one needs to break free of a gravitational field, or in the case of Tesla, a stage of momentum where the question on many of our customers’ lips is no longer “if” we deploy GPUs, it’s “when”.
Our own GPU Technology Conference and this year’s SC’10 conference really cemented our position and gave us an epic end to an incredible year.
These are our Top 10 takeaways for the year:
1. CUDA by the numbers. There are a lot of metrics we use internally to track the progress of CUDA, but however you cut it, we’ve seen stellar growth across the board this year in terms of developer adoption, education and community momentum.
Metric                                          2009       2010       % Increase
Attendees at GPU Technology Conference (GTC)    1423       2166       52% (industry avg. = ~20%)
Universities Teaching CUDA                      270        350        30%
CUDA related videos on YouTube                  800        1250       56%
Submissions to CUDA Zone                        670        1235       85%
Cumulative downloads of CUDA SDK                293,000    668,000    127%
CUDA-related citations on Google Scholar        2700       7000+      160%
Submissions to speak at GTC                     67         334        398%
2. The Computational Laboratory - In January, we launched a new initiative for the bioinformatics and computational chemistry community, called Tesla Bio Workbench. The initiative brought together more than 20 prominent computational research codes, such as AMBER, VMD and LAMMPS, enabling scientists who rely on these codes to turn their standard PCs into “computational laboratories” capable of doing science 10 to 20 times faster through the use of Tesla.
In the case of AMBER, one of the most widely used applications for biochemists, performance increases of up to 100X are being seen and, more importantly, critical research that once required a supercomputer can now be done on a desktop workstation. The Tesla Bio Workbench site saw more than 10,000 visitors in the first two weeks alone, and since then more than half of the 150,000+ visitors have clicked through to the specific pages belonging to the research codes.
3. “Build it and they will come” – When I wrote this recap last year, there was 1 OEM with a Tesla SKU as part of its line-up. Today, this number is up to 9, with a total of 19 Tesla-specific SKUs now available, many using the Tesla M2050 GPU Computing Module. The list includes all the major players such as Cray, Dell, HP and SGI, but perhaps most notable is IBM, which in May became the first major OEM to offer a Tesla-based server solution in its iDataPlex line. For IBM, it was a sign that GPU Computing was mature enough to warrant its entry into the space. Dave Turek, IBM VP of Deep Computing, said: “I think what’s changed is that customers have been experimenting for a long time and now they’re getting ready to buy. It wasn’t the technology that drove us to do this. It was the maturation of the marketplace and the attitude toward using this technology. It’s as simple as that.”
4. To the Nebulae and Beyond - At the International Supercomputing Conference in June, the world’s first Tesla GPU-enabled petaflop supercomputer made its debut. Equipped with 4640 Tesla “Fermi” GPUs, Nebulae, at the National Supercomputer Center in Shenzhen, China, made its mark on the Top500 by entering at number 2, with sustained performance of 1.27 petaflops. Another system, from the Chinese Academy of Sciences, also entered the chart at number 19. This marked the beginning of what was to be an impressive year for China. As a relative newcomer to the supercomputing space, China is unrestricted by the need to support legacy software and systems, so it has been fearless in its adoption of GPU computing. The country has shown that it understands the significance of supercomputing, as it seeks to evolve from being a manufacturing powerhouse to become a global leader in science and technology.
5. The Beginning of the Race for Better Science - Following the June list of the Top500, the Undersecretary for Science at the DOE, Steve Koonin, wrote an OpEd for the San Francisco Chronicle.
In this piece he voiced his concern about Nebulae, stating that “these challenges to U.S. leadership in supercomputing and chip design threaten our country’s economic future.” Undersecretary Koonin’s concern is that without the latest technologies, the U.S. will fall behind the rest of the world in critical areas of industry, such as simulation for product design. Leadership here enables the U.S. to continue to push the envelope in terms of technology while encouraging innovation. The sentiment was echoed by others, such as Senator Mark Warner and NVIDIA’s own Andy Keane, whose piece on AllThingsD encouraged a lot of lively discussion, such as this comment from insideHPC: “I agree with Andy on this one; the Senate should get behind Senator Mark Warner (D-VA) and his amendment to the reauthorization of the America Competes Act. If we as an HPC community, or as a country for that matter, aren’t agile enough to adapt, we could find ourselves being trounced by our own inventions.”
The use of GPUs to further science was a topic covered in a recent pilot of a documentary series that NVIDIA produced, entitled “The 3rd Pillar of Science”. In this pilot, we spoke to leading medical experts who are using GPUs for ground-breaking medical methods, such as advanced cancer treatment and real-time open heart surgery.
6. 2200 Geniuses and a Self-Driving Car - After the success of last year’s GPU Technology Conference, we were pretty excited to host our 2nd event in September this year. Our attendee numbers grew more than 50%, well above average for a technical conference, and submissions from eager CUDA developers wanting to present their work grew nearly 400%. In fact, we had so many that we doubled the number of sessions at the conference to 280, all of which are online for your viewing and listening pleasure.
It was pretty interesting to see the difference in the show since last year. The sheer breadth of topics covered made the show unlike any other – from astrophysics to video processing, from computational fluid dynamics to neuroscience and from energy exploration to designing autonomous cars. Tables were filled with engineers, scientists, developers, students and researchers, all sharing experiences and ideas. We’ll be staying in San Jose, California for GTC 2011, and we hope to see you all there.
Here are a few of my favorite quotes from members of the press that attended:
“Absolutely one of the best - and most important conferences in the technology and advanced computing sector” - The Exascale Report
“What we are seeing here is like going from propellers to jet engines.” – insideHPC
“…GTC is growing even as it specializes on just one aspect of NVIDIA’s business, the CUDA platform for GPU computing. That’s just one of many signals that point to an undeniable trend: the use of GPUs for non-graphics computation is on the rise, led largely by NVIDIA’s efforts.” - Tech Report
“NVIDIA’s GTC is a blast. The demos, keynotes, exhibits, technical papers, and emerging companies’ presentations are first class, interesting and informative. Well worth the price of admission. There was no heavy product messaging, no call to action to buy something other than the idea that parallel processing is here and it’s important – and by our observations it was mission accomplished.” - Tech Watch
7. Turbocharged Tools - This year we saw GPU-enabled production releases of some of the most important applications in the technical and scientific computing space. ACUSIM Software launched a GPU-enabled version of its CFD software AcuSolve, delivering double the performance for its users. Tom Lange, director of Modeling and Simulation at P&G, said: “GPU-accelerated CFD allows for more realism, helping us replace slow and expensive physical learning cycles with virtual ones. This transforms engineering analysis from the study of failure to true virtual trial and error, and design optimization.”
ANSYS released performance data on its CUDA implementation of ANSYS Mechanical, revealing that CUDA helps cut turnaround times for complex simulations in half. Wolfram Research released the latest version of Mathematica, delivering for its users, in some cases, speed increases of more than 100X from within the familiar confines of the Mathematica programming environment. Check out the video of their demo from earlier this year at SIGGRAPH. And finally, NVIDIA and MathWorks collaborated on the latest release of MATLAB, R2010b, to include support for GPU acceleration for users of Parallel Computing Toolbox and MATLAB Distributed Computing Server.
8. Cloudy with a chance of GPUs – This year saw the first GPU deployments to the Cloud, from Peer1 in July and Amazon Web Services (AWS) in November. Developing for the CUDA architecture of NVIDIA GPUs already offers the lowest cost of entry for any HPC architecture, but with these new services, you don’t even need to buy the hardware yourself. Through AWS, for example, you can now get access to 2 Tesla 20-series GPUs and 2 CPUs for just $2.10 an hour. Businesses of all sizes can now run heavy-duty simulations and more with simple on-demand pricing, and no large up-front capital investment. GigaOm Pro had this to say about the announcement: “Performance (of Amazon’s Cluster Compute Instances) was high already, and the addition of GPUs just ups the octane level. According to a benchmark test by HPC cloud-resource middleman Cycle Computing, GPU Instances outperform in-house GPU clusters in certain cases.” Amazon’s CTO, Werner Vogels, published an interesting blog post, and one of Amazon’s technical specialists, Jeff Barr, gave a great technical overview of the new service.
9. Lean, Mean & Green – The year ended with a bang for the Tesla business at SC’10 in New Orleans. The final Top500 and Green500 lists of the year were announced and Tesla had its best showing yet. Just prior to SC’10 commencing, the National Supercomputer Center in Tianjin announced Tianhe-1A which, with a Linpack score of 2.57 petaflops, secured the #1 spot on the list. Two other Tesla GPU-enabled systems made the Top 5: the aforementioned Nebulae, and Tsubame 2.0 from Tokyo Tech. Tsubame 2.0 was ranked at #2 in the Green500, but more notably it was the only petaflop system in the entire Top 10. Equipped with 4200 Tesla GPUs, yet consuming just 1.340 megawatts, it is, by far, the most power-efficient petaflop system the world has ever seen and is an incredible achievement from Prof. Satoshi Matsuoka and his team.
NVIDIA and its customers were also recognized in a number of industry awards at the show. GPUs were highlighted in two Gordon Bell awards. The best student paper award went to Tokyo and Purdue Universities, who collaborated on a new interface to make parallel programming on the GPU even more accessible. And perhaps most exciting, we saw some major organizations receiving honors for their work with GPUs, including Citadel Investment Group, Schlumberger and Weta Digital.
10. Some Years in Review for my Year in Review – While writing this, a couple of other year-in-review articles caught my eye, and included some quotes that I thought would make a fitting end to this recap. HPCwire released their biggest trends of the year podcast last week and pronounced GPU Computing the #1 Trend of the Year. They commented: “This year, it (GPU Computing) hit the mainstream, deployed by all the major vendors.” They also added that: “If NVIDIA hadn’t been there, this wouldn’t have happened. AMD was only lukewarm about this. NVIDIA put energy and money into it. They changed the trajectory of GPU computing, without a doubt. NVIDIA CUDA made this possible.”
Another article recently appeared on O’Reilly Media, who produce a wealth of books, online services, magazines, research, and conferences for the technical computing community. Their summary was that GPUs coupled with CPUs are the architecture of choice for the processing of computationally heavy data: “You won’t get the processing power you need at a price you want just by enabling traditional multicore CPUs; you need the dedicated computational units that GPUs provide.” We couldn’t agree more.
And so 2011 is upon us. From everyone in the NVIDIA Tesla and CUDA teams, we wish you a happy and successful New Year.