
AMD Heterogeneous Uniform Memory Access

  • Instead, it fell to AMD to take the logical step and simply extend the x86 architecture to include 64-bit instructions, resulting in what the firm called AMD64. At a stroke, AMD was offering businesses the chance to give a performance boost to existing applications, with a seamless upgrade path to 64-bit software when this became available. Until that point, multiple processor chips simply shared a bus connection with the rest of the system. This arrangement created a memory bandwidth bottleneck that effectively made PC servers with more than four sockets impractical without a specialist chipset. AMD solved this problem by giving each Opteron chip its own directly connected pool of memory, and introduced a high-speed point-to-point interconnect called HyperTransport to link the chips to each other and the rest of the system in a switched fabric. "By moving the memory controller onto the processor die, it runs at core frequency, and each processor added [to the system] adds another memory controller, so memory bandwidth scales with the number of processors," explained Mark Tellez, then AMD's server/workstation marketing manager.
  • HSA will empower software developers to easily innovate and unleash new levels of performance and functionality on all your modern devices and lead to powerful new experiences such as visually rich, intuitive, human-like interactivity.   
  • Power efficient – running parallel code on the GPU; eliminate copies. Easy to program – high level languages, task parallel runtimes. Ready for tomorrow’s workloads, increasingly dominated by parallel processing. Built from established technology and operating principles. We take the architecture that has worked for SMP and multicore systems and extend it to the GPU. HSA is open. This is so important. Multivendor architectures spawn large ecosystems. Broadly supported across multiple vendors. HSA is an architecture that spans from the smartphone to the supercomputer. Phones, tablets, fanless notebooks, desktops, all-in-ones, workstations, cloud servers, HPC. Go to hsafoundation.com to see the list of industry leaders who have joined us already.
  • In 2012, there were 2012 billion smart connected devices. HSA Foundation members made up approx. 800 million of those, Intel about 300 million and NVIDIA around 100 million or less. If you assume similar member share by these companies in 2016, HSA Foundation member companies will be in over 1.6 billion devices.
  • $37 billion for existing PC markets and $10 billion for new markets. (Game consoles represent $1 billion.) Profit pool analysis highlights market attractiveness by contrasting processor revenue TAM with near-term estimated market growth rates. In the near term, existing profit pool markets are estimated at 2X as large as that of new entrants, but caution should be exercised as the new market entrants are enjoying fast adoption and growth, highlighting the need to investigate strategic choices and enhancements. Question: Even though ARM-based SOCs for tablets/smartphones are BSOs, should we make a trade-off of core market investments to deliver one? Why? If we had $50M-100M incremental R&D dollars, what would the next priority SOC be? ARM? Client-X86? Server?

    1. AMD heterogeneous Uniform Memory Access. PHIL ROGERS, CORPORATE FELLOW; JOE MACRI, CORPORATE VICE PRESIDENT & PRODUCT CTO; SASA MARINKOVIC, SENIOR MANAGER, PRODUCT MARKETING. AMD Confidential, under embargo until Apr 30, 12:01 AM EST
    2. ABOUT HSA
    3. 10 YEARS AGO… Memory controller on the chip; HyperTransport; 64-bit extensions; AMD Opteron
    4. HOW DO WE UNLOCK THIS PERFORMANCE? [Chart: CPU GFLOPS and GPU GFLOPS by year, 2002 to 2012; vertical axis from 0 to 4500.] GPU COMPUTE CAPABILITY IS MORE THAN 10X THAT OF THE CPU. See slide 24 for details.
    5. WHAT IS HSA? SERIAL WORKLOADS; PARALLEL WORKLOADS; hUMA (MEMORY); APU (ACCELERATED PROCESSING UNIT). An intelligent computing architecture that enables CPU, GPU and other processors to work in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element
    6. HSA EVOLUTION
        Capabilities: Uniform memory access for CPU and GPU; GPU can access CPU memory; Integrate CPU and GPU in silicon
        Benefits: Simplified data sharing; Improved compute efficiency; Unified power efficiency
    7. WHAT IS hUMA? heterogeneous UNIFORM MEMORY ACCESS
    8. UNDERSTANDING UMA
        Original meaning of UMA is Uniform Memory Access
          • Refers to how processing cores in a system view and access memory
          • All processing cores in a true UMA system share a single memory address space
        Introduction of GPU compute created systems with Non-Uniform Memory Access (NUMA)
          • Require data to be managed across multiple heaps with different address spaces
          • Add programming complexity due to frequent copies, synchronization, and address translation
        HSA restores the GPU to Uniform Memory Access
          • Heterogeneous computing replaces GPU computing
    9. INTRODUCING hUMA
        [Diagram: three configurations. CPU (UMA): four CPU cores share one memory. APU (NUMA): four CPU cores use CPU memory and four GPU cores use a separate GPU memory. APU with HSA (hUMA): four CPU cores and four GPU cores share one memory.]
    10. hUMA KEY FEATURES
          • BI-DIRECTIONAL COHERENT MEMORY: Any updates made by one processing element will be seen by all other processing elements, GPU or CPU
          • PAGEABLE MEMORY: GPU can take page faults, and is no longer restricted to page locked memory
          • ENTIRE MEMORY SPACE: CPU and GPU processes can dynamically allocate memory from the entire memory space
    11. hUMA KEY FEATURES
        [Diagram labels: CPU and GPU, each with a cache; HW coherency; virtual memory; physical memory.]
          • Entire memory space: Both CPU and GPU can access and allocate any location in the system’s virtual memory space
          • Coherent memory: Ensures CPU and GPU caches both see an up-to-date view of data
          • Pageable memory: The GPU can seamlessly access virtual memory addresses that are not (yet) present in physical memory
    12. WITHOUT POINTERS* AND DATA SHARING
        [Diagram: CPU with CPU memory and GPU with GPU memory, shown as separate blocks.]
        Without hUMA:
          • CPU explicitly copies data to GPU memory
          • GPU completes computation
          • CPU explicitly copies result back to CPU memory
        Only the data array can be copied since GPU cannot follow embedded data-structure links.
        *A pointer is a named variable that holds a memory address. It makes it easy to reference data or code segments by a name and eliminates the need for the developer to know the actual address in memory. Pointers can be manipulated by the same expressions used to operate on any other variable.
    13. WITH POINTERS* AND DATA SHARING
        [Diagram: CPU and GPU attached to a single block labeled CPU / GPU Uniform Memory.]
        With hUMA:
          • CPU simply passes a pointer to GPU
          • GPU completes computation
          • CPU can read the result directly – no copying needed!
        CPU can pass a pointer to entire data structure since the GPU can now follow embedded links.
        *A pointer is a named variable that holds a memory address. It makes it easy to reference data or code segments by a name and eliminates the need for the developer to know the actual address in memory. Pointers can be manipulated by the same expressions used to operate on any other variable.
    14. TOP 10 REASONS TO GO FULLY HARDWARE COHERENT ON GPU/APU
          1. Much easier for programmers
          2. No need for special APIs
          3. Move CPU multi-core algorithms to the GPU without recoding for absence of coherency
          4. Allow finer grained data sharing than software coherency
          5. Implement coherency once in hardware, rather than N times in different software stacks
          6. Prevent hard to debug errors in application software
          7. Operating systems prefer hardware coherency – they do not want the bug reports to the platform
          8. Probe filters and directories will maintain power efficiency
          9. Full coherency opens the doors to single source, native and managed code programming for heterogeneous platforms
          10. Optimal architecture for heterogeneous computing on APUs and SOCs
    15. hUMA FEATURES: Access to entire memory space; Pageable memory; Bi-directional coherency; Fast GPU access to system memory; Dynamic memory allocation
    16. hUMA BENEFITS
    17. BENEFITS OF HSA
    18. UNIFORM MEMORY BENEFITS TO DEVELOPERS
          • EASE AND SIMPLICITY OF PROGRAMMING: Single, standard computing environments
          • LOWER DEVELOPMENT COST: More efficient architecture enables fewer people to do the same work
          • SUPPORT FOR MAINSTREAM PROGRAMMING LANGUAGES: Python, C++, Java
    19. BENEFITS TO CONSUMERS
          • BETTER EXPERIENCES: Radically different user experiences
          • LONGER BATTERY LIFE: Less power at the same performance
          • MORE PERFORMANCE: Getting more performance from the same form factor
    20. SUPPORT FROM MAJOR INDUSTRY PLAYERS. For more information go to: http://hsafoundation.com/ Source: http://pinterest.com/pin/193021534001931884/
    21. HSA. Nov 11 – 14, 2013, San Jose McEnery Convention Center. 14 different tracks with over 140 individual presentations
    22. THANK YOU
    23. GFLOPS

        Year  CPU                     CPU GFLOPS  GPU (RADEON)          GPU GFLOPS
        2002  Pentium 4 (Northwood)        12.24  9700 Pro                    31.2
        2003  Pentium 4 (Northwood)        12.8   9800 XT                     36.48
        2004  Pentium 4 (Prescott)         15.2   X850 XT                    103.68
        2005                               15.2   X1800 XT                   134.4
        2006  Core 2 Duo                   23.44  X1950                      375
        2007  Core 2 Quad                  48     HD 2900 XT                 473.6
        2008  Q9650                        96     HD 4870                   1200
        2009  Core i7 960                 102.4   HD 5870                   2720
        2010  Core i7 970                 153.6   HD 6970                   2703
        2011  Core i7 3960X               316.8   HD 7970                   3789
        2012  Core i7 3970X               336     HD 7970 GHz Edition       4301
    24. POTENTIAL MARKET IS HUGE: Notebooks, Servers, Desktops, Embedded, Game Consoles, Tablets
    25. DISCLAIMER
        The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
        The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
        AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
        AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
        ATTRIBUTION
        © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names and logos are used for informational purposes only and may be trademarks of their respective owners.
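
Slides 12 and 13 contrast two data-sharing flows: explicit copies of a flattened data array versus passing a single pointer that the GPU can follow. What follows is a CPU-only Python analogy of that contrast. It is not GPU code and does not use any HSA or AMD API. Python lists stand in for the two separate memories, object references stand in for pointers, and the names `Node`, `build_list`, `to_values`, `scale_with_copies`, and `scale_shared` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One element of a linked data structure (hypothetical example type)."""
    value: float
    next: Optional["Node"] = None


def build_list(values):
    """Build a singly linked list and return its head (None if empty)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_values(head):
    """Walk the links and collect the values into a plain Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def scale_with_copies(head, factor):
    """Analogy for slide 12: flatten, copy in, compute, copy back."""
    host_array = to_values(head)          # CPU flattens the links into an array
    device_array = list(host_array)       # explicit copy into "GPU memory"
    device_array = [x * factor for x in device_array]  # "GPU" works on its copy
    result = list(device_array)           # explicit copy back to "CPU memory"
    node = head
    for x in result:                      # CPU writes results into the nodes
        node.value = x
        node = node.next
    return 2                              # bulk copies performed


def scale_shared(head, factor):
    """Analogy for slide 13: pass one reference, follow the links in place."""
    node = head
    while node is not None:               # the "GPU" follows embedded links
        node.value *= factor
        node = node.next
    return 0                              # no bulk copies needed


a = build_list([1.0, 2.0, 3.0])
b = build_list([1.0, 2.0, 3.0])
copies_a = scale_with_copies(a, 2.0)
copies_b = scale_shared(b, 2.0)
print(to_values(a), "bulk copies:", copies_a)
print(to_values(b), "bulk copies:", copies_b)
```

Both paths produce the same values. The difference in this toy model is that the copy-based path must flatten the structure and move it twice, while the shared path works directly on the original nodes.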
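
The "more than 10X" claim on slide 4 can be checked against the GFLOPS figures listed on slide 23. The short Python sketch below divides GPU GFLOPS by CPU GFLOPS for each year; the numbers are transcribed from the slide, while the names `GFLOPS` and `ratios` are assumptions made for this illustration. For the 2012 row, 4301 / 336 is roughly 12.8.

```python
# (year, CPU GFLOPS, GPU GFLOPS), transcribed from the slide 23 table.
GFLOPS = [
    (2002, 12.24, 31.2),
    (2003, 12.8, 36.48),
    (2004, 15.2, 103.68),
    (2005, 15.2, 134.4),
    (2006, 23.44, 375.0),
    (2007, 48.0, 473.6),
    (2008, 96.0, 1200.0),
    (2009, 102.4, 2720.0),
    (2010, 153.6, 2703.0),
    (2011, 316.8, 3789.0),
    (2012, 336.0, 4301.0),
]

# GPU-to-CPU throughput ratio for each year.
ratios = {year: gpu / cpu for year, cpu, gpu in GFLOPS}

for year, ratio in ratios.items():
    print(f"{year}: GPU is {ratio:.1f}x the CPU")
```

These are peak figures for the specific parts named on the slide, so the ratios describe that comparison only, not application-level speedups.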