AMD Hot Chips Bulldozer & Bobcat Presentation
AMD will be revealing details of two new core architectures - Bulldozer and Bobcat - at Hot Chips 22

Published in: Technology
  • Before we start: There is a lot of technical detail available beneath what we are about to show you; this presentation is intended to give you a high-level overview of both designs and AMD’s expectations for each. The engineering detail will be presented by the two chief architects for the designs at the upcoming Hot Chips conference on the Stanford campus next week. Please feel free to ask detailed questions along the way if you would like to hear more about a specific feature or operation. At a higher level, this shows innovation at AMD remains alive and well. Please think of these core architectures within the context of the new, revitalized AMD built around our focus as a design company since the spin-off of GlobalFoundries, our new VISION platforms and marketing program, and our Fusion APU strategy. “Bobcat” and “Bulldozer” are the latest chapters in that story and form a solid foundation for AMD products for years to come.
  • The two cores, although both x86 compatible, are completely different for a reason. The workloads, end-equipment markets and usage scenarios require different approaches, and that’s what AMD recognized at the outset of this effort. Think of “Bulldozer”, just as the name implies, as the heavy lifter. It will appear in server, as well as mainstream and high-performance client products. “Bobcat” is small and highly efficient. It utilizes those characteristics to address the highly portable netbook / notebook markets. So, two different designs, with different goals in mind.
  • So starting with Bulldozer, here’s a block diagram that shows its distinguishing features. We are taking two of the most frequently used parts of a processor, the integer cores, and adding a hefty, shared floating point capability to deliver two robust threads much more efficiently than Hyper-Threading, where a single integer core is used. We have also added a number of instruction set extensions to increase the design’s capabilities and done extensive work on power management to improve performance per watt even further. The 32nm process technology delivers additional savings in terms of area and power consumption; this is our first process technology to utilize high-K metal gate.
  • The previous slide hinted at a key differentiator of Bulldozer that bears more explanation. A big conversation in the industry these last few years is how to continue to increase processing performance as we reach plateaus in clock speed. Essentially there have been two approaches used: SMT, which stands for Simultaneous Multi-Threading, and CMP, which stands for Chip Multi-Processing. CMP is probably the easiest to understand, because it can be described as “if one core is good, two must be better”, and it is. So CMP architectures take a complete core and replicate it. SMT is a little more complex to picture, but because of the way instructions are decoded and executed, it’s possible to have two tasks running concurrently on a single core. Bulldozer takes a third approach.
  • On the first Bulldozer slide we mentioned “true core functionality”, so what exactly does that mean? There are two complete integer units in the Bulldozer design for the most common type of compute tasks, so it functions like a dual-core design, allowing maximum performance rather than pushing two threads through a single core. However, we don’t replicate everything on the core like a CMP either. Floating point operations on Bulldozer use a shared scheduler and two 128-bit Multiply and Accumulate Units. Extensive research went into analyzing workloads ahead of this design, so we feel the division between shared and discrete components is the right one. And by the way, the idea of sharing hardware is hardly new, right? Shared cache, the Northbridge, etc. have been shared across multi-core designs for years already.
  • You can see that larger view of shared hardware components here as we raise our view up to the chip level. On an 8-core Bulldozer design you can see how Bulldozer “modules” are grouped together to share L3 cache and Northbridge, and combined with a memory controller and Northbridge controller to form the major components of the chip. And again, the OS and applications see true cores; the shared floating point components and L2 cache are transparent to the code.
  • So that covers Bulldozer; now let’s cover AMD’s new core designed specifically for the low-power x86 market. “Bobcat” is small and highly efficient. It utilizes those characteristics to address the highly portable netbook / notebook markets.
  • Bobcat is a little more straightforward to understand than Bulldozer, but it, too, has some highly differentiated features. And these were stated from the very beginning because of AMD’s understanding of the final products’ requirements.
  • So those were the goals. Where did we end up? Bobcat can operate below one watt (with a resulting reduction in performance). That’s not a statement about any resulting products, but it does give you some sense of the core’s power envelope. The next bullets here are critical: out-of-order execution means higher performance than an in-order execution core like Atom, pure and simple. Synthesizable means it uses few custom logic arrays, which depend more heavily on the specifics of the underlying manufacturing technology for optimal performance, and that it can be more easily integrated into SoC designs for faster turnaround of new variations. No limitations on the instruction set either, including support for virtualization. AMD estimates 90% of today’s mainstream CPU performance in less than half the silicon area and a fraction of the power. It will appear early next year in Ontario, which is ahead of schedule.
  • Technical details if needed.
  • The need for an optimal, energy-efficient balance of CPU and GPU represents the beginning of a new era of computing in 2011, the era of the accelerated processing unit or APU, which combines both on a single piece of silicon. The Fusion of CPU and GPU compute power is what the next chapter in visual computing requires: a powerful visual computing experience at home or on the go without compromise. Our AMD Fusion™ design is driven by mobility and is based on a low-power visual compute architecture that will enhance active and resting battery life while increasing both CPU and GPU performance. This is the culmination of the vision of ‘One AMD’, and only AMD can deliver the GPU and CPU combination that will be the future of computing.
  • Transcript

    • 1. “Bulldozer” and “Bobcat”
      AMD’s Latest x86 Core Innovations
      Hot Chips 22
    • 2. Two x86 Cores Tuned for Target Markets
      “Bulldozer”: Mainstream Client and Server Markets
      Performance & Scalability
      “Bobcat”: Low Power Markets, Small Die Area, Cloud Clients Optimized
      Flexible, Low Power & Small
    • 3. The Bulldozer Architecture
      “Bulldozer”
      An innovative design that delivers true core functionality by pairing two integer execution cores with components that can be shared as needed
      Instruction Set extensions to increase capability of the design
      Extensive new power efficiency innovations
      Manufactured on the latest 32nm SOI technology
      [Figure: Bulldozer module block diagram. Labels: Fetch, Decode, two Integer Schedulers, FP Scheduler, eight Pipelines, two 128-bit FMACs, two L1 DCaches, Shared L2 Cache.]
    • 4. Approaches for Supporting Multiple Threads
      SMT
      • Force two threads into one core
      • Threads compete for resources
      • Relies on under-utilization
      CMP
      • Dedicated cores for each thread
      • Traditional brute force approach
      • Each core is over-provisioned
      However, there is another way . . .
    • 9. Bulldozer: Two Strong Threads
      [Figure: two block diagrams compared side by side, “Hyperthreaded, single-core chip” and “Bulldozer”. Labels: Fetch, Decode, Integer Scheduler, FP Scheduler, Pipeline, CORE 1, 128-bit FMAC, L1 DCache, L2 Cache, Shared L2 Cache.]
    • 10. Sharing Resources
      The Bulldozer architecture has shared and dedicated components
      The shared components:
      • Help reduce power consumption
      • Help reduce die space (cost)
      The dedicated components:
      • Help increase performance and scalability
      Bulldozer dynamically switches between shared and dedicated components to maximize performance per watt
      [Figure: Bulldozer module block diagram with legend: Dedicated Components; Shared at the module level; Shared at the chip level. Labels: Fetch, Decode, FP Scheduler, two Int Schedulers, Core 1, Core 2, two L1 DCaches, two 128-bit FMACs, eight Pipelines, Shared L2 Cache, Shared L3 Cache and NB.]
    • 11. Building a Bulldozer-Based Chip
      [Figure: chip-level block diagram of Bulldozer modules. Labels: Fetch, Decode, two Int Schedulers, FP Scheduler, Shared L3 Cache and NB, Integrated Memory Controller, Integrated Northbridge Controller.]
      Each chip is composed of multiple Bulldozer modules
      Module divisions are transparent to shared hardware, operating system or application
      The modular architecture speeds chip development and increases product flexibility
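The module layout described on this slide can be pictured with a small data model. This is only a sketch under stated assumptions: the class names `Module` and `Chip` and the helper `build_chip` are hypothetical, and treating Fetch and Decode as module-shared is a reading of the block diagram labels rather than something the slides state in words. The speaker notes do say that the FP components and L2 cache are shared within a module, that L3 cache and Northbridge are shared across modules, and that the OS sees true cores.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical model of one Bulldozer-style module."""
    module_id: int
    integer_cores: int = 2  # two dedicated integer cores per module (from the slides)
    # Assumption: Fetch and Decode appear once per module in the diagram,
    # so they are listed as module-shared here alongside the FP scheduler and L2.
    shared: tuple = ("Fetch", "Decode", "FP Scheduler", "L2 Cache")


@dataclass
class Chip:
    """Hypothetical model of a chip built from several modules."""
    modules: list
    chip_shared: tuple = ("L3 Cache", "Northbridge")  # shared at the chip level

    def visible_cores(self):
        """Return (module_id, core_index) pairs, one per integer core.

        Software would only see the flat list of cores; the module
        boundaries are kept in the tuple purely for illustration.
        """
        return [
            (m.module_id, core)
            for m in self.modules
            for core in range(m.integer_cores)
        ]


def build_chip(num_modules):
    """Assemble a chip from identical modules."""
    return Chip(modules=[Module(i) for i in range(num_modules)])


chip = build_chip(4)
print(len(chip.visible_cores()))  # 4 modules x 2 integer cores = 8
```

In this model, an 8-core design is simply four modules, which mirrors the point that product variants can be assembled by changing the module count.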
    • 12. Bulldozer Summary
      “Bulldozer”
      Bulldozer is the next generation of AMD high-performance processor core technology
      This new core is a completely new design from the ground up
      Bulldozer will be utilized in client and server designs in 2011
      AMD delivers 33% more cores and an estimated 50% increase in throughput in the same power envelope as Magny-Cours*
      [Figure: Bulldozer module block diagram. Labels: Fetch, Decode, two Integer Schedulers, FP Scheduler, eight Pipelines, two 128-bit FMACs, two L1 DCaches, Shared L2 Cache.]
      *Based on internal AMD modeling using benchmark simulations
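As a rough check on what the summary figures imply, the two percentages can be turned into ratios. This is a back-of-the-envelope sketch, not AMD's methodology: the function name `implied_ratios` is hypothetical, and it assumes "same power envelope" means a power ratio of exactly 1.0.

```python
def implied_ratios(core_gain=0.33, throughput_gain=0.50):
    """Derive per-core and per-watt ratios from the slide's two estimates.

    core_gain and throughput_gain are the fractional increases quoted on
    the slide (33% more cores, an estimated 50% more throughput).
    """
    cores = 1.0 + core_gain
    throughput = 1.0 + throughput_gain
    per_core = throughput / cores   # throughput per core relative to the baseline
    per_watt = throughput / 1.0     # assumption: power ratio is exactly 1.0
    return per_core, per_watt


per_core, per_watt = implied_ratios()
print(f"throughput per core: {per_core:.3f}x")  # 1.5 / 1.33
print(f"throughput per watt: {per_watt:.2f}x")
```

Under these assumptions the estimate implies roughly a 13% gain in throughput per core, with the rest of the 50% coming from the larger core count.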
    • 13. Two x86 Cores Tuned for Target Markets
      “Bulldozer”: Mainstream Client and Server Markets
      Performance & Scalability
      “Bobcat”: Low Power Markets, Small Die Area, Cloud Clients Optimized
      Flexible, Low Power & Small
    • 14. Bobcat Design Goals
      A small, efficient, low power x86 core
      Excellent performance
      Synthesizable with small number of custom arrays
      Easily Portable across process technologies
    • 15. “Bobcat” x86 Core: Small, Efficient and Strong
      “Bobcat” Core
      • Sub one-watt capable core
      • Out-of-order execution engine
      • Synthesizable / Easy to Reuse
      • Complete ISA support
      • SSE1-3 and virtualization
      • Estimated 90% of today’s mainstream performance in less than half of the silicon area*
      • 2011 / notebook APU / “Ontario”
      [Figure: “Bobcat” core block diagram. Labels: L1 Icache, Fetch, Decode, Int Scheduler, FP Scheduler, two I-Pipes, Ld-Pipe, St-Pipe, A-Pipe, M-Pipe, L1 DCache, L2 Cache.]
      *Based on internal AMD modeling using benchmark simulations
    • 22. Bobcat Core Overview
      Advanced Micro-architecture
      Dual x86 Decode
      Advanced Branch Predictor
      Full OOO instruction execution
      Full OOO load/store engine
      High Performance Floating Point
      AMD64 64-bit ISA
      SSE1,2,3, SSSE3 ISA
      Secure Virtualization
      32KB L1s
      Low Power Design
      Power Optimized Execution
      Micro-architecture that minimizes data movement and unnecessary reads
      Clock gating, Power gating
      System Low Power States
      Small Core
      Area efficient balance of high performance and low power
      [Figure: “Bobcat Low Power Core” block diagram. Labels: ICACHE, L2, Fetch, BU, Decode, FP Scheduler, Address Scheduler, Integer Scheduler, A Pipe, M Pipe, two I Pipes, Store Pipe, Load Pipe, DCACHE.]
    • 23. Entering the AMD Fusion Processor Era
      • Bobcat is the CPU on “Ontario”, AMD’s first APU
      APU:
      • Combination of CPU and programmable GPU architectures for high-performance heterogeneous compute capability
      • High-speed bus architecture
      • Shared, low-latency memory model
      • Single die design
      [Figure: APU block diagram. Labels: System Memory, SIMD Engine Array, x86 CPU Cores, High Performance Bus & Memory Controller, Unified Video Decoder, Platform Interfaces.]
    • 27. Bobcat Summary
      Bobcat is the CPU engine for AMD’s first APU
      Estimated 90% of the performance of AMD’s current mainstream notebook CPU in less than half the area and a fraction of the power*
      Highly portable across designs and manufacturing technologies
      Sub-one watt capable core
      *Based on internal AMD modeling using benchmark simulations
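The area claim in this summary can be restated as a lower bound on performance per unit of silicon area. This is a sketch only: the function name `perf_per_area_lower_bound` is hypothetical, and "less than half the area" is modeled as an upper bound of 0.5 on the area ratio. The power saving is described only as "a fraction", so it is left out.

```python
def perf_per_area_lower_bound(perf_ratio=0.90, area_ratio_upper=0.50):
    """Lower bound on performance per area relative to the baseline CPU.

    perf_ratio: estimated performance relative to the baseline (90%).
    area_ratio_upper: upper bound on relative silicon area (less than half).
    Because the real area ratio is below the bound, the real
    performance-per-area ratio is above the returned value.
    """
    return perf_ratio / area_ratio_upper


bound = perf_per_area_lower_bound()
print(f"performance per area: more than {bound:.1f}x the baseline")
```

With the slide's numbers the bound works out to 0.90 / 0.50, so the estimate implies more than 1.8 times the baseline's performance per unit area.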
    • 28. Disclaimer & Attribution
      DISCLAIMER
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
      The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to changes to the AMD Fusion Partner Program. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
      AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
      ATTRIBUTION
      © 2010 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, ATI, the ATI logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names are for informational purposes only and may be trademarks of their respective owners.
