AMD Future of GPUs - Campus Party



  1. 1. AMD Presentations<br />January 19, 10:00–12:00: The Future of GPUs<br />January 20, 10:00–12:00: Accelerated Computing<br />Roberto Brandão<br />AMD Latin America<br />
  2. 2. The Future of GPU<br />Roberto Brandão<br />AMD Latin America<br />
  3. 3. Today’s GPUs focused on<br />GAMING<br />ENTERTAINMENT<br />PRODUCTIVITY<br />
  5. 5. DirectX® 11 Tessellation<br />DirectX® 10<br />DirectX® 11<br />No Tessellation<br />Tessellation<br />Images courtesy of Unigine Corp.<br />
  6. 6. DirectX® 11 Multi-Threading<br /><ul><li>Application, DirectX runtime, and DirectX driver can each run in separate threads
  7. 7. Tasks like loading a texture or compiling a shader can execute in parallel with the main rendering thread</li></ul>DirectX® 10<br />DirectX® 11<br />
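The overlap described on this slide can be illustrated outside DirectX. What follows is a minimal Python analogy, not DirectX code: `load_texture` and `render_frame` are hypothetical stand-ins, and the thread pool only shows a slow resource load running alongside a render loop that keeps producing frames.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_texture(name):
    """Hypothetical stand-in for a slow resource load (disk read, decode)."""
    time.sleep(0.05)
    return f"texture:{name}"


def render_frame(index, texture):
    """Hypothetical stand-in for drawing one frame with whatever is ready."""
    return (index, texture if texture is not None else "placeholder")


def run(frame_count=200):
    frames = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the load on a worker thread; the main loop keeps rendering.
        pending = pool.submit(load_texture, "rock")
        texture = None
        for i in range(frame_count):
            if texture is None and pending.done():
                texture = pending.result()
            frames.append(render_frame(i, texture))
            time.sleep(0.001)  # pretend each frame takes about 1 ms
    return frames
```

The early frames are drawn with a placeholder and later frames pick up the loaded texture, without the render loop ever blocking on the load.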
  10. 10. Order Independent Transparency (OIT)<br /><ul><li>Efficient rendering of many overlapping transparent objects
  11. 11. Smoke, fire, hair, foliage, fences, water, glass
  12. 12. Rendering transparent objects correctly requires sorting
  13. 13. Blending is an order dependent operation
  14. 14. DirectCompute 11 simplifies OIT by sorting transparent pixels in one shader pass
  15. 15. Uses atomic operations and append buffers</li></ul>
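Why blending depends on order, and why sorting fixes it, can be shown with a few lines of arithmetic. This is a CPU-side sketch in Python using the standard "over" blend; it is not the DirectCompute shader pass from the slide, and the fragment tuples `(depth, color, alpha)` are an assumed layout chosen for illustration.

```python
def blend_over(dst, src_color, src_alpha):
    """Standard 'over' alpha blend of one fragment onto the current color."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_color, dst))


def composite(background, fragments):
    """Blend fragments in the order given (each is depth, color, alpha)."""
    color = background
    for _depth, frag_color, frag_alpha in fragments:
        color = blend_over(color, frag_color, frag_alpha)
    return color


def composite_sorted(background, fragments):
    """Sort far-to-near first, so submission order no longer matters."""
    ordered = sorted(fragments, key=lambda f: f[0], reverse=True)
    return composite(background, ordered)


background = (0.0, 0.0, 0.0)
red_near = (1.0, (1.0, 0.0, 0.0), 0.5)   # depth 1.0, closer to the camera
blue_far = (5.0, (0.0, 0.0, 1.0), 0.5)   # depth 5.0, farther away

unsorted_a = composite(background, [red_near, blue_far])
unsorted_b = composite(background, [blue_far, red_near])
sorted_a = composite_sorted(background, [red_near, blue_far])
sorted_b = composite_sorted(background, [blue_far, red_near])
```

The two unsorted results differ, while the two sorted results agree regardless of submission order.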
  16. 16. DirectX® 11 OIT in Action<br />Skeleton exposed<br />Arm bleeds through body<br />Order-Independent Transparency<br />Simple Alpha Blending<br />
  17. 17. Render Post-Processing<br />Apply filter kernel to every pixel in rendered image<br />Depth of field, motion blur, tone mapping, edge detection, smoothing, sharpening<br />Requires data from neighbouring pixels<br />Example: constant time filter spreading<br />Accurately simulates certain lens effects such as depth of field<br />Novel processing technique developed at AMD in conjunction with UC Berkeley<br />DirectCompute greatly simplifies implementation while increasing performance and visual fidelity<br />Alpha buffer tricks no longer needed: fewer artifacts<br />Shared memory optimizations: better performance<br />
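To make "apply a filter kernel to every pixel" concrete, here is a minimal Python sketch of a 3x3 box blur. It is a plain gather filter, not the constant time filter spreading technique named on the slide, and the border clamping is an assumption. It does show why each output pixel needs data from its neighbours.

```python
def box_blur(image):
    """3x3 box filter over a 2D list of floats, clamping at the borders."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            # Gather the pixel and its eight neighbours.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    total += image[ny][nx]
            out[y][x] = total / 9.0
    return out


# A single bright pixel in the middle of a dark 5x5 image.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 9.0
blurred = box_blur(impulse)
```

The bright pixel is spread evenly over its 3x3 neighbourhood, and pixels farther away stay dark.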
  18. 18. DirectX® 11 Depth of Field in Action<br />Noticeable halos<br />Hard silhouette<br />Filter Spreading<br />Legacy Method<br />
  19. 19. Shadow Rendering<br />HDAO (High Definition Ambient Occlusion)<br />Detects “valleys” in scene geometry and darkens them according to depth<br />Contact hardened shadows<br />Sharpens shadow edges where they contact the casting object and makes edges increasingly blurry as they get farther away<br />
  20. 20. DirectX® 11 Shadows in Action<br />DirectX 10.1 Shadows<br />DirectX 11<br />Contact Hardened Shadows<br />Images from S.T.A.L.K.E.R.: Call of Pripyat (GSC Game World)<br />
  21. 21. Lighting Post Effects<br />Realistic Nighttime lighting<br />HDR bloom<br />Lens flare<br />Atmospheric scattering<br />Light trails<br />3D color grading<br />Motion blur<br />
  22. 22. Anti-Aliasing<br />No AA<br />Smooths jagged edges around objects<br />Jagged edges are more obvious in moving images and at lower resolutions<br />Takes multiple samples of image<br />More samples = higher quality, but also much more work<br />Radeon products support 2x, 4x, and 6x sample modes<br />2x AA<br />4x AA<br />6x AA<br />
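The trade-off between sample count and quality can be sketched by estimating how much of one pixel an edge covers. This Python example assumes a regular n x n sample grid, which is simpler than the sample patterns real hardware uses, and `below_diagonal` is a hypothetical test shape whose true coverage of the pixel is 0.5.

```python
def coverage(px, py, inside, n):
    """Fraction of an n x n sample grid inside the shape, for pixel (px, py)."""
    hits = 0
    for j in range(n):
        for i in range(n):
            # Sample positions are centred within each sub-cell of the pixel.
            sx = px + (i + 0.5) / n
            sy = py + (j + 0.5) / n
            if inside(sx, sy):
                hits += 1
    return hits / (n * n)


def below_diagonal(x, y):
    """Example shape: the half-plane under the line y = x."""
    return y < x


# Estimates for the pixel at (0, 0) with 1, 4, 16, and 64 samples.
estimates = [coverage(0, 0, below_diagonal, n) for n in (1, 2, 4, 8)]
```

Each step up in sample count moves the estimate closer to the true coverage, at the cost of evaluating four times as many samples.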
  23. 23. EQAA Modes<br />= Color Sample Location<br />= Coverage Sample Location<br />2x MSAA<br />4x MSAA<br />8x MSAA<br />= Pixel Boundary<br />No AA<br />2x EQAA<br />4 coverage samples<br />4x EQAA<br />8 coverage samples<br />8x EQAA<br />16 coverage samples<br />
  24. 24. Tessellating the Right Way<br />Can add significant detail to a scene while effectively compressing geometry<br />But excessive use of tessellation can be inefficient for today’s GPUs<br />Poor utilization of rasterizers<br />Overshading<br />Too many polygon edges for MSAA<br />Brute force approach is wasteful<br />[Chart: overshade per pixel (scale 1 to 8) for 25-, 15-, 5-, and 1-pixel triangles]<br />16-pixel triangle: 100% rasterizer utilization<br />1-pixel triangle: 6.25% rasterizer utilization<br />
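The two utilization figures on this slide are consistent with a rasterizer that can emit 16 pixels per triangle per clock. That throughput is an assumption inferred from the numbers, not something the slide states, and `rasterizer_utilization` is a hypothetical helper. Under that assumption the arithmetic is:

```python
def rasterizer_utilization(triangle_pixels, pixels_per_clock=16):
    """Share of rasterizer output slots that one triangle fills in one clock.

    pixels_per_clock=16 is an assumption chosen to match the slide's figures.
    """
    return min(triangle_pixels, pixels_per_clock) / pixels_per_clock


large = rasterizer_utilization(16)   # 16-pixel triangle
tiny = rasterizer_utilization(1)     # 1-pixel triangle
```

A 16-pixel triangle fills every slot, while a 1-pixel triangle fills one slot in sixteen, which is the 6.25% quoted above.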
  25. 25. Morphological Anti-Aliasing<br />No AA<br />Morphological AA<br />Post-process filtering technique accelerated with DirectCompute<br />Delivers full-scene anti-aliasing<br />Not limited to polygon edges, alpha-tested surfaces, etc.<br />Faster than super-sampling<br />Performance similar to edge-detect CFAA, but applies to all edges<br />Compatible with any DirectX® 9/10/11 application<br />Including games with no AA support<br />Enabled via AMD Catalyst Control Center™<br />Images captured from Aliens vs. Predator by Rebellion<br />
  26. 26. Morphological Anti-Aliasing<br />MLAA<br />No AA<br />4xMSAA<br />MSAA + MLAA<br />
  27. 27. Today’s GPUs focused on<br />GAMING<br />ENTERTAINMENT<br />PRODUCTIVITY<br />
  28. 28. Power savings improvement<br />
  29. 29. We are visual beings<br />Visual perception<br />Verbal perception<br />Words are processed at only 150 words per minute<br />Pictures and video are processed 400 to 2000 times faster<br /><ul><li>Consumers are looking for a better visual experience in an environment with variable content
  30. 30. Content formats and sources have more diversity than ever
  31. 31. New applications will demand computing power that is impossible on today’s hardware</li></ul>
  32. 32. Enhanced Multimedia Capabilities<br />Enhanced UVD 2<br />Hardware-accelerated decode of dual 1080p HD video streams<br />Windows® Aero Mode<br />Playback of HD videos in high quality with Windows® Aero mode enabled<br />Video Gamma<br />Independent from Windows® desktop for a superior user experience<br />Brighter Whites<br />“Blue Stretch” processing increases the blue value of white colors for brighter videos<br />Dynamic Video Range<br />Control of levels of black and white during playback<br />Power Management<br />Enables new customers for all levels of graphics<br />
  33. 33. Superior HDMI Audio and Video Features<br />Enhanced Home Theatre Audio Experience<br /><ul><li>HDMI 1.3a Dolby TrueHD & DTS-HD Master Audio
  34. 34. Full support for premium Blu-ray audio formats Dolby TrueHD, DTS-HD Master Audio, AC-3 and DTS
  35. 35. High quality surround sound: up to 8 channels of 192kHz / 24-bit audio</li></ul>Advanced Display Quality<br /><ul><li>HDMI 1.3a Deep Color & x.v.Color
  36. 36. Over 1 billion colors output through HDMI: 12-bpc output, 10-bpc (4:4:4) meaningfully derived
  37. 37. Wide range of colors: full support for wide-gamut x.v.Color video signals</li></ul>
  38. 38. Improvements already reached consumers<br />ATI <br />Stream<br />Processor utilization<br />Adobe Flash plugin used by<br /><ul><li> Better image quality and video smoothness
  39. 39. Lower processor usage</li></ul>Convert your DVD videos into near HD quality with DVD Upscaling<br />Designed to help dramatically improve the quality of your movies<br />Take Your DVDs to Near HD Quality<br />
  40. 40. Better video quality from a DVD (DVD Upscaling)<br />Better definition and sharpness of video streams based on MPEG-2 (DVD) for high definition displays<br />DVD<br />Upscaled DVD<br />
  41. 41. Dramatically Improve Online Video Quality<br />Watch online videos with smooth playback and sharper, vibrant image quality<br />Make online video come to life!<br />
  42. 42. Today’s GPUs focused on<br />GAMING<br />ENTERTAINMENT<br />PRODUCTIVITY<br />
  43. 43. Introducing Next-Gen Desktop Configurations for …<br />CAD<br />Image courtesy Todd Daniele<br />DCC<br />Image courtesy of StudioGPU<br />Driver version 8.66 (ATI Catalyst™ 9.10) or above is required to support ATI Eyefinity technology. To enable a third display requires one panel with a DisplayPort connector. <br />
  44. 44. Introducing Next-Gen Desktop Configurations for …<br />Oil & Gas<br />Medical<br />Image courtesy Barco Medical Systems<br />Driver version 8.66 (ATI Catalyst™ 9.10) or above is required to support ATI Eyefinity technology. To enable a third display requires one panel with a DisplayPort connector. <br />
  45. 45. Maximum Flexibility in Display Configuration*<br />3x1 Landscape Display Group<br />1x3 Portrait Display Group<br />Screen Images courtesy Todd Daniele<br />6x1 Portrait Display Group<br />Screen Image courtesy Todd Daniele<br />3x1 Landscape Display Group Plus 3 Extended<br />Screen Images courtesy University of Hertfordshire<br />3x1 Display Group Plus 1 Extended<br />Image courtesy University of Hertfordshire<br />Screen Images courtesy Todd Daniele<br />
  46. 46. Single GPU 4K Output for CAD and DCC*<br />Image courtesy University of Hertfordshire, D.Atkins<br />*Planned features, specifications, and/or capabilities of top sku of upcoming ATI FirePro™ professional graphics cards.  Subject to change without notice.<br />Image courtesy Todd Daniele<br />
  47. 47. Distinctive Features - High Quality Rendering<br />Images courtesy Studio GPU<br />Full 30-bit display pipeline produces more than one billion colors and enables you to see more of your data*<br />Images courtesy Barco Medical Systems<br />Up to 1600 Stream Processors enable you to push visual effects farther than ever before<br />* Requires 30-bit monitor for true 30-bit color display. <br />
  48. 48. AMD Support for 30-bit Color* in Adobe® Photoshop®<br />8-bit per color component<br />16.7 million colors**<br />10-bit per color component<br />Over 1 billion colors**<br />* Requires 30-bit monitor for true 30-bit color display. **Simulated images.<br />
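The two color counts on this slide follow directly from the bit depth: three color components, each with 2 to the power of its bit depth levels. A short Python check, where `color_count` is a hypothetical helper name:

```python
def color_count(bits_per_component, components=3):
    """Number of distinct colors for a given per-component bit depth."""
    return 2 ** (bits_per_component * components)


colors_24_bit = color_count(8)    # 8 bits per component, 24-bit color
colors_30_bit = color_count(10)   # 10 bits per component, 30-bit color
```

This gives 16,777,216 colors for 8 bits per component (the "16.7 million" figure) and 1,073,741,824 for 10 bits per component (the "over 1 billion" figure), a factor of 64 between them.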
  49. 49. AMD Stream Technology<br />Using the GPU to Enhance the Notebook PC Experience<br />Performance and Battery Life<br />Massively parallel, programmable GPU architecture enables dramatic performance and power efficiency<br />Open Standards<br />Industry-standard OpenCL™ and DirectCompute 11 enable cross-platform development<br />Balanced Platform<br />Developers leverage AMD GPUs and CPUs for enhanced application performance and user experience<br />Gaming<br />Productivity<br />Entertainment<br />* ATI Stream technology requires both enabled graphics and an enabled application<br />
  50. 50. ATI Stream-Enabled Applications & Games<br />SimHD™ Plug-in for TotalMedia Theatre<br />Roxio Creator™ 2010<br />Roxio Creator™ 2010 Pro<br />Aliens vs. Predator<br />STALKER Call of Pripyat<br />DiRT 2<br />MediaShow 5<br />MediaShow Espresso<br />PowerDirector 8<br />PowerDirector 7<br />
  51. 51. Video Transcoding Sample: No GPU Acceleration<br />CPU Usage: 100%<br />Frames<br />Using four CPU cores<br />GPU Usage: 1%<br />
  52. 52. Video Transcoding Sample: ATI GPU Acceleration<br />CPU Usage: 45%<br />Control<br />Frames<br />GPU Usage: 35%<br />Using hundreds of Stream Processors<br />
  53. 53. CONNECTIVITY<br />
  54. 54. AMD Radeon™ HD 6000 Series Graphics<br />2 x DVI (DL-DVI + SL-DVI)<br />2 x miniDP<br />Designed for DisplayPort 1.2<br />HDMI 1.4a<br />Get amazing Eye-Definition graphics with DirectX® 11<br />Get fast applications and incredible video with AMD EyeSpeed technology<br />Get immersed with AMD Eyefinity technology<br />
  55. 55.
  56. 56. POWER MANAGEMENT<br />
  57. 57. Performance per watt<br /><ul><li> US datacenters consume more power than five 1,000-megawatt nuclear power plants, at a cost of almost $3 billion
  58. 58. This is 150% more than the consumption in 2001</li></ul>Power savings improvement<br />
  59. 59. AMD PowerTune Technology<br />Clamps GPU TDP to a pre-determined level<br />Integrated control processor monitors GPU activity in real time<br />GPU includes counters across all blocks, which are monitored and fed to an algorithm that infers power draw<br />Dynamically adjusts clock to enforce TDP<br />Provides direct control over GPU power draw (as opposed to indirect control via clock/voltage tweaks)<br />Algorithmic approach guarantees consistent performance across each product variant<br />No longer need to constrain default clock speeds to allow for outlier applications<br />User controllable via AMD OverDrive Utility<br />
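The control loop described on this slide (activity counters, an inferred power figure, and a clock adjusted to hold a TDP cap) can be sketched roughly as follows. This is an illustrative Python model and not AMD's algorithm: the block names, weights, clock limits, step size, and the linear power-versus-clock assumption are all hypothetical.

```python
def estimate_power(counters, weights):
    """Infer power draw (watts) as a weighted sum of block activity counters."""
    return sum(weights[block] * activity for block, activity in counters.items())


def adjust_clock(clock_mhz, estimated_watts, tdp_watts,
                 min_mhz=500, max_mhz=880, step_mhz=10):
    """Step the clock down when over the cap, back up when under it."""
    if estimated_watts > tdp_watts:
        return max(min_mhz, clock_mhz - step_mhz)
    return min(max_mhz, clock_mhz + step_mhz)


def simulate(activity_trace, weights, tdp_watts, max_mhz=880):
    """Run the loop over a trace of counter readings; return (watts, clock)."""
    clock = max_mhz
    history = []
    for counters in activity_trace:
        # Assume power scales linearly with clock (a simplification).
        watts = estimate_power(counters, weights) * clock / max_mhz
        clock = adjust_clock(clock, watts, tdp_watts, max_mhz=max_mhz)
        history.append((watts, clock))
    return history


# Hypothetical per-block weights (watts at full activity) and workloads.
WEIGHTS = {"shader": 150.0, "memory": 60.0}
GAME = {"shader": 0.7, "memory": 0.5}
OUTLIER = {"shader": 1.0, "memory": 1.0}

game_history = simulate([GAME] * 5, WEIGHTS, tdp_watts=200.0)
outlier_history = simulate([OUTLIER] * 20, WEIGHTS, tdp_watts=200.0)
```

In this model the game workload stays under the cap and keeps the full clock, while the outlier workload is throttled until its estimated draw hovers around the cap. That mirrors the slide's point that default clocks no longer have to be set for the worst case.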
  60. 60. PowerTune – Game Power Draw<br />Games consistently operate at lower power than peak apps<br />With PowerTune, each product variant is tuned to maximize game performance<br />Outlier applications are still handled gracefully<br />Accommodates future application power draw<br />
  61. 61. AMD PowerTune Technology<br />
  62. 62. THE FUTURE OF GPUs<br />
  63. 63. The future of GPUs<br />More performance<br />Better power management<br />GPU Everywhere<br />
  64. 64. One Design, Fewer Watts, Massive Capability <br />“Zacate” AMD Fusion APU <br />Discrete-level DirectX® 11 GPU <br />Dual-Core CPU<br />+<br />+<br />=<br />Northbridge<br /><ul><li>75 sq. mm
  65. 65. 18 watts
  66. 66. 59 sq. mm
  67. 67. 8 watts
  68. 68. 66 sq. mm
  69. 69. 13 watts
  70. 70. 117 sq. mm
  71. 71. 25 watts </li></ul>Graphics and Media Processing Efficiency Improvements<br />Graphics requires memory bandwidth to bring full capabilities to life<br />2010 IGP-based Platform (CPU chip)<br />Bandwidth pinch points and latency hold back the GPU capabilities<br />2011 APU-based Platform (APU chip)<br />3X bandwidth between GPU and memory<br />Even the same-sized GPU is substantially more effective in this configuration<br />Eliminates latency and power associated with the extra chip crossing<br />Substantially smaller physical footprint<br />[Diagram labels: CPU cores, DDR3 DIMM memory, UNB, MC, GPU, UVD, PCIe, SB functions; bandwidth labels: ~17 GB/sec, ~7 GB/sec, ~27 GB/sec]<br />
  72. 72. “Ontario” & “Zacate” Architecture<br />APU<br /><ul><li>2 x86 CPU Cores (40nm “Bobcat” core – 1 MB L2, 64-bit FPU)
  73. 73. C6 and power gating
  74. 74. Array of SIMD Engines
  75. 75. DX11 graphics performance
  76. 76. Industry leading 3D and graphics processing
  77. 77. 3rd Generation Unified Video Decoder
  78. 78. H.264, VC-1, DivX/Xvid formats
  79. 79. DDR3 800-1066, 2 DIMMs, 64 bit channel
  80. 80. BGA package</li></ul>Display and I/O<br /><ul><li>Two dedicated digital display interfaces
  81. 81. Configurable externally as HDMI, DVI, and/or Display Port
  82. 82. Also supports a single link LVDS for internal panels
  83. 83. Integrated VGA
  84. 84. 5x8 PCIe®
  85. 85. “Hudson” Fusion Controller Hub</li></ul>Summary<br />More realistic graphics<br />Perfect power management and energy efficiency<br />Used by all kinds of applications<br />Everywhere<br />
  86. 86.<br /><br />Thank you!<br />