The past and the next 20 years? Scalable computing as a key evolution

Published in: Technology

  1. The Past & The Next 20 Years. Scalable Computing As A Key Evolution
      Haydn Povey, Director Product Marketing, Processor Division, ARM
  2. 1991
  3. ARM Founded 27th Nov 1990
      • A barn, some energy, experience and belief: “We’re going to be the Global Standard”
      • “I gave ARM two things for success – no staff and no money” – Sir Robin Saxby
      • Originally 12 employees
      • Two decades of Partnership success
      • 8 Partners at first Partner meeting
      • >500 Partners at 2011 Partner meeting
  4. A 1991 View of the Industry
  5. The Early Market for 32-Bit
      • [Chart: Embedded Control Revenue in $M, 1989 to 1995]
      • 32-bit growth >45% per annum
      • Early ARM design win: ACORN Archimedes
      • Polygon Pushing at ARM
  6. The 20 Year Journey
      • ARM1: 3 µm, 6k gates, 7mm x 7mm = 49 mm²
      • December 2010: Cortex-M0, 20nm, 8k gates, 0.07mm x 0.07mm
      • Cortex-M0: 1/10,000th the size
      • Cortex-M0 Subsystem
      • Phenomenal Power, Performance & Area Improvements
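The "1/10,000th size" figure on this slide follows directly from the two die dimensions quoted. A minimal sketch of that arithmetic, using only the slide's numbers (the variable names are illustrative, and exact fractions are used so that 0.07 does not pick up floating-point rounding):

```python
from fractions import Fraction

# Side lengths in millimetres, as quoted on the slide.
arm1_side_mm = Fraction(7)
m0_side_mm = Fraction(7, 100)  # 0.07 mm

# Square dies, so area is side squared.
arm1_area_mm2 = arm1_side_mm ** 2
m0_area_mm2 = m0_side_mm ** 2

# How many Cortex-M0 footprints fit in one ARM1 footprint.
area_ratio = arm1_area_mm2 / m0_area_mm2

print(f"ARM1 area:      {float(arm1_area_mm2)} mm^2")
print(f"Cortex-M0 area: {float(m0_area_mm2)} mm^2")
print(f"Area ratio:     {area_ratio}")
```

The ratio depends only on the side lengths: shrinking each side by 100x shrinks the area by 100 x 100.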
  7. 2011
  8. Our Increasingly Connected World
      • Faster data rates can increase complexity, power and cost
      • Devices are becoming more multi-purpose, open, general computing platforms
      • All devices are becoming energy constrained
  9. Increasing Demands On Chip Design
      • Hardware and software reuse
      • Power efficient processing
      • Optimized implementation
      • Heterogeneous design
      • Simplified software integration
      • Today The Chip Is The System
  10. Power efficiency and Performance
      • Mobile SoCs have experience balancing power with performance
      • Optimized processing units designed for specific tasks
      • Today’s SoC contains many diverse components
  11. The Chip is the System
      • Heterogeneous hardware:
          • Optimum power efficiency requires HW perfect for each task
          • Implies a demand for multiple HW accelerators
          • Leads to a proliferation of engines, more levels of parallelism
          • Benefits from HW coherency
      • Homogeneous software:
          • Application software and OS efficiency will increasingly rely on a unified memory model
          • Aligning memory systems (page tables, address spaces, coherency) between the different units becomes critical for high performance
  12. Cortex-A15: The New Market Standard
      • Performance enables new product types
          • Large-screen, connected, slim-profile, light
      • All your compute needs in a superphone
          • Expect innovative mobile MP platforms
      • Advanced capabilities
          • Support for OS virtualization, larger memory
      • [Chart: Relative performance. Cortex-A15 measurements on equivalent system. Frequency varies dependent on process.]
      • Cortex-A15 available now
          • First silicon undergoing test
          • Linux and browsing optimizations reviewed and upstreamed
          • Optimized tool-chains available
  13. Cortex-A7: Redefining Energy-Efficiency
      • Most energy-efficient applications processor
          • 5x the energy efficiency of mainstream phones
      • Performance to handle common workloads
          • >2x the performance of mainstream phones
      • Feature set and software compliant with Cortex-A15
          • Full backward compatibility
          • Scalable and extensible
      • [Chart: Browsing workload comparison against today’s dual-core high-end smartphones. Axes: Relative Performance, Energy Efficiency. Labels: 1 GHz, 1.2 GHz, 1.2 GHz; 45nm, 28nm.]
  14. Introducing big.LITTLE Processing
      • Uses the right processor for the right job
      • Up to 70% energy savings on common workloads
      • Flexible and transparent to apps – importance of seamless software handover
      • [Diagram: big cluster (Cortex-A15 MPCore, CPUs, L2 Cache) and LITTLE cluster (Cortex-A7 MPCore, CPUs, L2 Cache), joined by the CCI-400 Coherent Interconnect, with Interrupt Control]
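The slide does not say how software decides which cluster is "the right processor for the right job". As a rough illustration only, the sketch below picks a cluster from a normalized load value, with two thresholds so that small load fluctuations do not cause repeated handovers. The class name, the `load` input and both threshold values are assumptions made for this example, not ARM's switching policy or figures.

```python
from dataclasses import dataclass

# Hypothetical thresholds on a normalized load in [0, 1]; not ARM figures.
UP_THRESHOLD = 0.85    # above this, hand over from LITTLE to big
DOWN_THRESHOLD = 0.30  # below this, hand back from big to LITTLE


@dataclass
class ClusterSelector:
    """Toy cluster chooser with hysteresis (illustrative only)."""
    current: str = "LITTLE"

    def update(self, load: float) -> str:
        """Return the cluster to use for the given load sample."""
        if self.current == "LITTLE" and load > UP_THRESHOLD:
            self.current = "big"
        elif self.current == "big" and load < DOWN_THRESHOLD:
            self.current = "LITTLE"
        return self.current


selector = ClusterSelector()
samples = (0.10, 0.50, 0.90, 0.60, 0.20)
trace = [selector.update(load) for load in samples]
print(trace)
```

The gap between the two thresholds is the design choice worth noting: a sample of 0.60 keeps the work on whichever cluster it is already using, which limits how often a handover happens.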
  15. Performance AND Energy efficiency
      • Cortex-A7 (LITTLE): most energy-efficient applications processor from ARM
          • Simple, in-order, 8 stage pipeline
          • Performance better than today’s mainstream, high-volume smartphones
      • Cortex-A15 (big): highest performance in mobile power envelope
          • Complex, out-of-order, multi-issue pipeline
          • Up to 5x the performance of today’s mainstream, high-volume smartphones
      • [Pipeline diagram labels: Queue, Issue, Integer]
  16. The Right Processor for the Right Job
      • [Chart: Processing energy saved versus today’s high-end multicore phones*]
      • Cortex-A15 provides the high-end performance
      • Cortex-A7 is ideal for low to mid-range tasks
      • LITTLE cluster activity dominates: Cortex-A7
      • big cluster activity dominates: Cortex-A15
      * Dual Cortex-A15 + Dual Cortex-A7 big.LITTLE system estimate in 32/28nm compared with a dual-Cortex-A9 system estimate in 40nm
  17. Cortex™-A Series: Optimum Performance
      Scalable performance with low power for broad application scope
      • Application areas: Mobile Internet, Smart TV, Automotive Infotainment, Network Infrastructure, Servers
      • Cortex-A5 MPCore
          • Low-cost internet
          • Migration from classic ARM
      • Cortex-A7 MPCore
          • Most efficient ARM processor
          • big.LITTLE with Cortex-A15
      • Cortex-A8
          • First superscalar design
          • Market proven, wide adoption
      • Cortex-A9 MPCore
          • High-efficiency multicore
          • High-performance hard macro
      • Cortex-A15 MPCore
          • Unprecedented performance
          • Broad application capability
      • ARMv8-A Architecture
          • 64-bit architecture
          • ARMv7-A compatibility
      • Cortex-A: Wide Application Range, High Performance, Scalable, Efficient
      • Broad portfolio, widely adopted, market proven
  18. Bringing Visual Computing to Life
      • Visual & graphical expectations continue to grow
          • Scaling to all resolutions from VGA to 1080p
      • Devices shown: Samsung Galaxy SII, Hardkernel ODROID-A, WinAccord PTT 1026, Ramos W10, Samsung Smart TV, Skyworth Smart TV, TomTom GO LIVE 1000
  19. Evolving Processing Demands
      • OpenGL® ES ‘Halti’ and Microsoft® DirectX® 11 enabling advanced content
          • Content keeps advancing, look at history to predict the future
      • GPU computing: OpenCL™, Renderscript, DirectCompute
          • Expectations of a common user experience across any consumer product leading to ever-higher performance demands in low-power portable devices
      • 25x increase in complexity
      • Examples shown: Polarbit – Raging Thunder, Unigine Corp – Heaven, Unity – Sixits ExoVerse
  20. System Design Scalability: 400 Series
      • [Diagram labels: Coherency, Virtualization, External Memory Subsystem, Rest of SoC Interconnect]
      • CCI-400
          • big.LITTLE coherency
          • I/O coherency
          • Prioritization and utilization
      • MMU-400
          • OS level virtualization
      • GIC-400
          • Virtual interrupts
          • Multicore support
      • External Memory Subsystem
      • DMC-400
          • DDR utilization
          • PHY integration
      • NIC-400
          • Routing efficiency
  21. Advanced Physical IP
      • 14nm - 32nm: 8 Physical IP Platforms
      • 40nm - 65nm: 15 Physical IP Platforms
      • 90nm - 250nm: 69 Physical IP Platforms
  22. The Next 20 Years
      • 2010s: Mobiles
      • Soon: Pervasive Devices
      • Ubiquitous Environments
      • Heterogeneous Compute Engines
      • [Figure labels: Functionality; Energy × $; Functionality; Available Energy × $; Functionality; $]
      • Breakthroughs? Silicon technology, non-volatile memory tech, battery technology, charging speed, ?
  23. Enabling Scalability - From 1mm³ to 1km³
      • 8.75mm³ platform: solar cell, 0.18 µm Cortex™-M3, 12 µAh Li-ion battery (University of Michigan)
      • 1mm³ platform
      • 1km³ platform: 4200 ARM Neutrino Detectors
          • 70 bore holes 2.5km deep
          • 60 detectors per bore hole
          • Supported by the National Science Foundation and University of Wisconsin-Madison
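To put the range on this slide in numbers, the short sketch below works out how many 1mm³ volumes fit in 1km³ and checks the detector count against the bore-hole figures quoted. It uses only values from the slide; the variable names are illustrative.

```python
# 1 km = 1,000,000 mm, so the volume ratio is that factor cubed.
MM_PER_KM = 1_000_000
volume_ratio = MM_PER_KM ** 3  # number of 1 mm^3 volumes in 1 km^3

# Detector count from the figures on the slide.
bore_holes = 70
detectors_per_bore_hole = 60
total_detectors = bore_holes * detectors_per_bore_hole

print(f"1 km^3 = {volume_ratio:.0e} mm^3")
print(f"Detectors: {total_detectors}")
```

The product of 70 bore holes and 60 detectors per hole matches the 4200 detectors stated on the slide, and the two platforms differ in volume by eighteen orders of magnitude.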
  24. Thank You
