FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

9th International SoC Conference 2011

  1. FUSION APU AND TRENDS/CHALLENGES IN FUTURE SOC (PROCESSOR) DESIGN
     Pankaj Singh
     Acknowledgement: Denis Foley, Sr. Fellow, AMD
     9th International SoC Conference, 2nd & 3rd November 2011
  2. TODAY'S TOPICS
     • Trends:
       - Three Eras of Processor Performance
       - Evolution of Heterogeneous Computing
     • FSA and Open Standard:
       - Why Fusion?
       - Open Standard, OpenCL
     • Power, Performance
     • High Speed, Scalable Interconnect: NoCs
     • 3-D Stacking
     • SoC Trends & Challenges:
       - Verification Effort
       - IP Integration
       - TLM, RTL Co-simulation challenges
  3. TRENDS: THREE ERAS OF PROCESSOR PERFORMANCE
     • Single-Core Era (chart: single-thread performance vs. time)
       - Enabled by: Moore's Law, voltage scaling, microarchitecture
       - Constrained by: power, complexity
     • Multi-Core Era (chart: throughput performance vs. time, as number of processors)
       - Enabled by: Moore's Law, desire for throughput, 20 years of SMP architecture
       - Constrained by: power, parallel SW availability, scalability
     • Heterogeneous Systems Era (chart: targeted application performance vs. time, as data-parallel exploitation)
       - Enabled by: Moore's Law, abundant data parallelism, power-efficient GPUs
       - Currently constrained by: programming models, communication overheads
     Each chart carries a "we are here" marker; the single-core chart also carries a "?".
  4. TRENDS: EVOLUTION OF HETEROGENEOUS COMPUTING
     (chart: architecture maturity & programmer accessibility, from poor to excellent, over time)
     • Proprietary Drivers Era, 2002 - 2008 (graphics & proprietary driver-based APIs)
       - "Adventurous" programmers
       - Exploit early programmable "shader cores" in the GPU
       - Make your program look like "graphics" to the GPU
       - CUDA™, Brook+, etc.
     • Standards Drivers Era, 2009 - 2011 (OpenCL™, DirectCompute driver-based APIs)
       - Expert programmers
       - C and C++ subsets
       - Compute-centric APIs, data types
       - Multiple address spaces with explicit data movement
       - Specialized work-queue-based structures
       - Kernel mode dispatch
     • Architected Era, 2012 - 2020 (Fusion™ System Architecture, GPU Peer Processor)
       - Mainstream programmers
       - Full C++
       - GPU as a co-processor
       - Unified coherent address space
       - Task parallel runtimes
       - Nested data parallel programs
       - User mode dispatch
       - Pre-emption and context switching
     More up-to-date information on FSA: http://developer.amd.com/afds/pages/keynote.aspx#/Dev_AFDS_Reb_2
  5. FSA & OPEN STANDARD: ENTER FUSION
     (diagram blocks: Dual Core CPU, Northbridge, DirectX®11 GPU)
     FUSION APU (Accelerated Processing Unit): heterogeneous compute engine combining x86 compute and the parallel processing capabilities of the GPU on a single die
  6. FSA & OPEN STANDARD: WHY FUSION?
     • Integrating CPUs, Northbridge and GPU enables:
       - Unified memory
       - High-bandwidth, low-latency access by GPU
       - Savings on interface power and PHY area
       - Shared power control and TDP envelope
     (diagram callouts: "Potential bandwidth bottleneck", "Relatively long memory latency")
  7. COMMITTED TO OPEN STANDARDS
     • AMD drives open and de-facto standards
       - Compete on the best implementation
     • Open standards are the basis for large ecosystems
     • Open standards always win over time
       - SW developers want their applications to run on multiple platforms from multiple hardware vendors
     (logo: DirectX®)
  8. OPENCL™ AND FSA
     • FSA is an optimized platform architecture for OpenCL™
       - Not an alternative to OpenCL™
     • OpenCL™ on FSA will benefit from:
       - Avoidance of wasteful copies
       - Low-latency dispatch
       - Improved memory model
       - Shared pointers
     • FSA also exposes a lower-level programming interface, for those who want the ultimate in control and performance
     • Optimized libraries may choose the lower-level interface
  9. POWER & PERFORMANCE
  10. POWER-THERMAL EFFECTS IN SYSTEMS ON CHIPS
      (diagram labels: "Local failures!", "Part not working")
      • Complex SoCs: high power density
      • Non-uniform power dissipation: hotspots
      • Spatial gradients: cause malfunctions
      • High on-chip temperatures cause malfunctions affecting reliability
      • Power consumption depends on frequency
      • Setting frequencies to control power and temperature
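Slide 10 notes that power consumption depends on frequency and that frequencies are set to control power and temperature. The sketch below is one way to picture that, assuming the standard CMOS switching-power relation P = activity · C · V² · f (not stated on the slide). The operating-point table, the effective capacitance and the function names are hypothetical values chosen for illustration, not AMD data.

```python
# Illustrative only: the operating points and capacitance below are made-up
# values, not measurements from any AMD part.

def dynamic_power(c_eff_farads, voltage, freq_hz, activity=1.0):
    """Standard CMOS switching-power estimate: P = activity * C * V^2 * f."""
    return activity * c_eff_farads * voltage ** 2 * freq_hz


# Hypothetical (voltage in V, frequency in Hz) pairs, ordered slowest to fastest.
OPERATING_POINTS = [(0.80, 0.8e9), (0.95, 1.6e9), (1.10, 2.4e9), (1.25, 3.0e9)]


def pick_operating_point(power_budget_watts, c_eff_farads=1.0e-9):
    """Return the fastest (V, f) pair whose estimated power fits the budget,
    or None when even the slowest point exceeds it."""
    best = None
    for voltage, freq in OPERATING_POINTS:
        if dynamic_power(c_eff_farads, voltage, freq) <= power_budget_watts:
            best = (voltage, freq)
    return best
```

With these made-up numbers, a 2.0 W budget selects the 0.95 V / 1.6 GHz point, and a 0.4 W budget selects nothing. A real power manager would also account for leakage and measured temperature, which this sketch ignores.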
  11. OPTIONS FOR POWER SAVINGS
      • Convergence of performance and low power
        - Notebook -> Netbook -> Tablet <- Smartphone
  12. PERFORMANCE AND POWER
      (chart: "APU Power vs. Use Case"; use cases range from S3 and idle through static screen, MM07 and media playback to full compute; series: Performance, Power)
      • Performance versus power efficiency
      • Power management versus power reduction
      • Performance & Thermal Design Power
  13. HIGH SPEED, SCALABLE INTERCONNECT
  14. NOCS: FROM BUSES TO NETWORKS [Friedman Harel:10]
      Note: This slide presents industry-specific information and does not relate to AMD NoC status.
  15. NOC CHALLENGES: CAD TOOLS
      • Capturing application traffic
      • Which topology?
      • Mapping? Routes to use?
      • Fixing communication architecture: parameters
      • Verification for correctness, performance
      • Build models
      • QoS under unreliable conditions
      Key to success: automate and integrate the steps.
      Topology choices:
      • Mesh topology: homogeneous systems, with regular tiles
      • Customized topology: heterogeneous systems, with different cores and irregular FP
      Layers covered by the CAD tools (diagram):
      • Software services: mapping, QoS, middleware...
      • Architecture: packeting, buffering, flow control...
      • Physical implementation: synchronization, wires, power...
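The "Mapping? Routes to use?" and "Capturing application traffic" questions on slide 15 can be made concrete with a small sketch. It assumes a regular 2D mesh and uses dimension-ordered (XY) routing, a common textbook choice that the slides do not name. The function names and the flow format are hypothetical.

```python
def xy_route(src, dst):
    """Dimension-ordered (XY) routing on a 2D mesh: travel along X first,
    then along Y. Nodes are (x, y) tile coordinates."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def link_loads(flows):
    """Count how many flows cross each directed link.
    `flows` is a list of (src, dst) tile pairs, a stand-in for captured
    application traffic."""
    loads = {}
    for src, dst in flows:
        path = xy_route(src, dst)
        for a, b in zip(path, path[1:]):
            loads[(a, b)] = loads.get((a, b), 0) + 1
    return loads
```

Feeding a candidate core-to-tile mapping through `link_loads` shows which links become hot, which is the kind of feedback an automated topology and mapping flow would need. Bandwidth weights, QoS classes and irregular topologies are left out.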
  16. NOC CHALLENGES: BEYOND GLOBAL SYNCHRONY
      • A complete spectrum of approaches to system timing exists [Mullins06-07]
      (diagram: spectrum from Synchronous to Delay Insensitive; timing assumptions range from Global to None; other labels: "Less Detection", "Local Clocks, Interaction with data (becoming aperiodic)")
  17. 3-D STACKING
  18. 3-D STACKING
      • Supporting heterogeneous computing: high density, high performance, high memory bandwidth requirement
      • 3-D NoCs option
      • Futuristic view: integrating bio-sensor
      Note: This slide presents industry-specific information and does not relate to AMD 3-D stacking status.
  19. SOC TRENDS & CHALLENGES:
      1. VERIFICATION EFFORT
      2. IP INTEGRATION
      3. TLM-RTL CO-SIMULATION
  20. WHAT'S NEW IN SOC DESIGN?
      • Larger and more complex chips with heavy use of pre-existing cores
      • Heavy use of multi-core processors and DSPs
      • Complex interconnect
      • Shorter time to market and smaller design teams
      • … and software
      • Leads to:
        - Increased verification effort: debugging is harder
        - Integration is more difficult
        - Need for scalable and high-speed interconnect
        - SW/HW co-simulation is a major issue
        - Power-performance challenge
        - How do we treat the system software?
  21. VERIFICATION EFFORT
      • Debugging
        - Seamless debug across h/w and software [especially SW]
      • Testbench development:
        - Several methodologies: VMM, OVM, UVM
      • New developments [unified strategy]
        - UCIS, UVM TLM2.0
        - Coverage trend: address gaps in VHDL, SystemC coverage
  22. VERIFICATION EFFORT
      • Creating/running testcases:
        - Directed & random
        - Run time improvement:
          save-restore; "verification cycles per second" instead of cycles per second, by configuring the environment to dynamically select the relevant design/core; alternate options
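The save-restore and random-test points on slide 22 can be pictured with a toy model. The sketch assumes a hypothetical `ToyDut` class standing in for a simulated design; it is not a real simulator API, and the cycle counts are illustrative. The expensive bring-up runs once, its state is checkpointed, and every seeded random test starts from the restored checkpoint.

```python
import copy
import random


class ToyDut:
    """Hypothetical stand-in for a simulated design; not a real simulator API."""

    def __init__(self):
        self.cycles = 0
        self.regs = {}

    def reset_and_init(self, init_cycles=10_000):
        # Models a long bring-up sequence that is identical for every test.
        self.cycles += init_cycles
        self.regs["ready"] = 1

    def write(self, addr, data):
        self.cycles += 1
        self.regs[addr] = data


def random_test(dut, seed, n_writes=100):
    """Seeded random stimulus, so a failing run can be reproduced exactly."""
    rng = random.Random(seed)
    for _ in range(n_writes):
        dut.write(rng.randrange(16), rng.randrange(256))


def run_regression(seeds):
    """Return the total number of simulated cycles when using save-restore."""
    base = ToyDut()
    base.reset_and_init()                 # pay the bring-up cost once
    checkpoint = copy.deepcopy(base)      # "save"
    total_simulated = base.cycles
    for seed in seeds:
        dut = copy.deepcopy(checkpoint)   # "restore"
        before = dut.cycles
        random_test(dut, seed)
        total_simulated += dut.cycles - before
    return total_simulated
```

For three seeds this simulates 10,300 cycles, against 30,300 if each test repeated the bring-up. Commercial simulators provide their own checkpoint mechanisms; the `deepcopy` here only mimics the idea.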
  23. VERIFICATION EFFORT
      • Alternate options
      Emulation focus areas:
      1. Tests/regressions with long run times
      2. Corner-case bugs that may escape traditional verification
      3. Replicating system-level scenarios
      Ongoing initiatives/needs:
      1. Seamless support for assertions
      2. Improve portability between simulation and emulation
      3. Common model from TLM-HDL-Emulation
  24. IP INTEGRATION CHALLENGE
      • Integration of IP:
        - Multiple IPs, various configurations, design languages
        - IPs to be in sync: macros, libraries
        - Complexity increases with mixed-language designs
      (diagram: SystemC, SVLOG, Verilog, VHDL; surrounding labels: unique strengths of languages, diversity of design teams, importing existing IP, legacy testbench environment)
  25. IP INTEGRATION CHALLENGE: COMPARISON OF CHOICES

                              | Direct        | SV Bind      | SystemC         | SCV-         | SC-DPI
                              | Instantiation | Construct    | Control/Observe | Connect()    |
      ------------------------+---------------+--------------+-----------------+--------------+----------
      Source Code Available   | Yes           | Yes          | Yes             | Yes          | Yes
      One IP Compiled         | Yes           | Yes          | Yes             | Yes          | Yes
      Both IP Compiled        | No            | Yes          | No              | No           | No
      Performance             | ++++ (3)      | +++ (2)      | + (1)           | + (1)        | +++++ (4)
      Delta Delay             | Yes           | Yes          | No              | No           | No
      Languages Supported     | SV, SC, VHDL  | SV, SC, VHDL | SC + SV/VHDL    | SC + SV/VHDL | SC + SV

      Gap: no standardized automated methodology for integration.
      Recommended approach:
      • Understand IP blocks: language, source code availability
      • Understand connection: 1-1, distributed, method port
      • Option for optimized solution to quickly build a system
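The recommended approach on slide 25 (first understand each IP's language and whether it is precompiled, then choose a hookup option) can be sketched as a lookup over the comparison table. The encoding below is an assumption: "SV, SC, VHDL" is read as any pair of those three languages, and "SC + SV/VHDL" as SystemC on one side with SV or VHDL on the other. The performance rank is the number shown in parentheses in the table, and the function name `candidates` is hypothetical.

```python
# Language pairs each option can connect (one reading of the table, see above).
ANY3 = {frozenset(p) for p in (("SV", "SC"), ("SV", "VHDL"), ("SC", "VHDL"))}
SC_PLUS = {frozenset(("SC", "SV")), frozenset(("SC", "VHDL"))}
SC_SV = {frozenset(("SC", "SV"))}

# name: (supported language pairs, usable when both IPs are precompiled,
#        performance rank from the table; higher is faster)
OPTIONS = {
    "Direct Instantiation":    (ANY3,    False, 3),
    "SV Bind Construct":       (ANY3,    True,  2),
    "SystemC Control/Observe": (SC_PLUS, False, 1),
    "SCV-Connect()":           (SC_PLUS, False, 1),
    "SC-DPI":                  (SC_SV,   False, 4),
}


def candidates(lang_a, lang_b, both_compiled=False):
    """List the options that can join two IPs, fastest first."""
    pair = frozenset((lang_a, lang_b))
    hits = [
        (rank, name)
        for name, (pairs, ok_both_compiled, rank) in OPTIONS.items()
        if pair in pairs and (ok_both_compiled or not both_compiled)
    ]
    return [name for rank, name in sorted(hits, key=lambda h: -h[0])]
```

For example, `candidates("SV", "VHDL", both_compiled=True)` leaves only the SV bind construct, matching the "Both IP Compiled" row. Same-language pairs are outside this sketch and return an empty list.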
  26. IP INTEGRATION CHALLENGE: GAPS WITH ANALOG IP INTEGRATION IN SOC
      Table 1. Gaps with Analog IP Integration in SoC (gap: root causes)
      • Testchip setup:
        - Testchip scenario is different
        - Tester used for testchip differs
      • Inbuilt debug:
        - Incomplete inbuilt SoC test/debug capability or derisk option for basic functionality such as PLL clock
      • IP I/F verification:
        - Incomplete test setup
      • Review process:
        - No common detailed review process between IP and SoC teams; incorrect assumptions based on past analog IP working silicon
      • IP modelling:
        - Mismatch in version between IP simulation model and SPICE netlist
        - Limitations of behavioral model to replicate actual analog IP functionality
        - Timing issue
        - DFT issue
      • EDA tools:
        - Gaps in analog and digital simulation environment
  27. IP INTEGRATION CHALLENGE
      • Verification environment bring-up
        - Automated assertions for early checks
        - Review forces, tie-offs and relevant checkers from IP to SoC
        - Bottleneck for SoC team to get started with verification: option to use a fake model for initial bring-up; usage of system model
        - Super Block concept: pre-verified IP blocks at similar frequency & interface
      • Requirement: (diagram: IP Block1 and IP Block2; labels "Minimum Manual Effort", "Hookup Using ICU", "No BUGS!")
      • Current solution: in-house methodology and process. No clear solution from EDA vendors.
  28. TLM, RTL Co-simulation
      • Traditional use of system-level models: architecture profiling & performance analysis
      • Increasing demand for co-simulation: tradeoff between accuracy and performance
      • Open challenges
        - Different levels of abstraction
        - Need for improvement in integration methodology and testbench development
        - Seamless debug and coverage methodology
      • Using system-level model for HDL generation
        - Legacy system models not written with conversion in mind
        - Current limitation: incomplete translation
        - Lack of reliable equivalence check tool
      • Need: merge top-down (SystemC) and bottom-up (SystemVerilog) methodology/flow
      • Gaps/work to do: how to do power analysis
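The accuracy-versus-performance tradeoff on slide 28 comes from the difference in abstraction level. The sketch below contrasts two toy memory models under that assumption: one is evaluated once per transaction with an annotated latency, the other once per clock cycle. The class names, the 4-cycle latency and the `run` helper are hypothetical; real SystemC TLM-2.0 and RTL models are far richer.

```python
class TlmMemory:
    """Transaction-level model: one evaluation per transfer, with an
    annotated latency instead of simulated clock edges."""

    def __init__(self, latency_cycles=4):
        self.data = {}
        self.latency_cycles = latency_cycles
        self.events = 0           # number of model evaluations
        self.elapsed_cycles = 0   # annotated (approximate) time

    def write(self, addr, value):
        self.events += 1
        self.elapsed_cycles += self.latency_cycles
        self.data[addr] = value


class CycleMemory:
    """Cycle-stepped model: evaluated on every clock, as an RTL simulation is."""

    def __init__(self, latency_cycles=4):
        self.data = {}
        self.latency_cycles = latency_cycles
        self.events = 0
        self.elapsed_cycles = 0

    def write(self, addr, value):
        for cycle in range(self.latency_cycles):
            self.events += 1              # one evaluation per clock edge
            self.elapsed_cycles += 1
            if cycle == self.latency_cycles - 1:
                self.data[addr] = value   # data lands on the final cycle


def run(model, n=1000):
    """Issue n writes and return the model for inspection."""
    for i in range(n):
        model.write(i % 64, i)
    return model
```

Both models end with the same memory contents and the same elapsed time, but the cycle-stepped one needs four times as many evaluations. In exchange it can expose per-cycle behavior that the transaction-level model abstracts away, and a co-simulation flow has to bridge that gap.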
  29. THANK YOU!
  30. REFERENCES
      [1] Wilson Research Group-MGC study blog 2011.
      [2] AMD Coolchip2011 presentation. Denis Foley, AMD Sr. Fellow.
      [3] Fusion Processors and HPC-2011. Chuck Moore, AMD Corporate Fellow & Technology Group CTO.
      [3] AMD Fusion Developer Summit 2011. Phil Rogers, AMD Corporate Fellow.
      [4] Fully Asynchronous framework for GALS network on chip. Friedman H.
      [5] Future of EE, NoCs presentation. Dr. Srinivasan Murali.
      [6] Analog IP integration in SoC, IP reuse'09. Mixed language IP integration, DVCon 2010. Extending Functional coverage to SystemC, VHDL-IP'10. Pankaj S.
  31. GLOSSARY
      • GPU: Graphics Processing Unit
      • APU: Accelerated Processing Unit
      • OpenCL: Open Computing Language
      • TDP: Thermal Design Power, a measure of a design infrastructure's ability to cool a device
      • NoC: Network on Chip
      • TLM: Transaction Level Modeling
      • Turbo Core: AMD boost mechanism
      • QoS: Quality of Service
      • UVM: Universal Verification Methodology
      • UCIS: Unified Coverage Interoperability Standard
  32. BACKUP
  33. Disclaimer
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD makes no representations or warranties with respect to the contents hereof and assumes no responsibility for any inaccuracies, errors or omissions that appear in this information. AMD specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. In no event will AMD be liable to any person for any direct, indirect, special or other consequential damages arising from the use of any information contained herein, even if AMD is expressly advised of the possibility of such damages.
      Trademark Attribution
      AMD, the AMD Arrow logo, AMD Athlon, AMD Phenom, AMD Turion, AMD Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. Microsoft, Windows and DirectX are registered trademarks of Microsoft Corporation in the United States and/or other jurisdictions. PCIe is a registered trademark of PCI-SIG. Other names used in this presentation are for identification purposes only and may be trademarks of their respective owners. ©2011 Advanced Micro Devices, Inc. All rights reserved.
