Q4.11: ARM Technology Update Plenary


Resource: Q4.11
Name: ARM Technology Update Plenary
Date: 28-11-2011
Speaker: David Brash

Published in: Technology

  1. Linaro/UDS plenary, Orlando, 03-Nov-2011. David Brash: ARM Technology Update
  2. Agenda
     • ARMv7-A update
     • Cortex-A7 announcement
     • Energy efficient processing
     • big.LITTLE: Cortex-A15 & Cortex-A7
     • Eco-system development
     • The architecture roadmap: ARMv7 => ARMv8
     • ARMv8-A announcement at TechCon 2011
  3. ARM Cortex-A15 Momentum
     • Expanding list of ARM Partners with designs in progress
     • …and 5 other ARM partners
     • Products expected in 2012
  4. Introducing the Cortex-A7
     • A highly efficient core for future smartphones
     • Entry-level, some mainstream workloads ...and more
     • Redefines mobile computing
     • big.LITTLE processing model
     [Figure: power vs. performance curves for Cortex-A15 and Cortex-A7, marking the lowest and highest operating point of each core and the overdrive condition. Cortex-A7 is ~1/6th the power, but half the performance, at the nominal operating point.]
     • Full backward compatibility with Cortex-A processors
     • Feature set and software compliant with Cortex-A15
       - Virtualization
       - Large Address Extensions
     • Scalable and Extensible
       - Multi-processor
       - System Coherency
     • Small
       - <0.5mm² in 28nm process
     ARM Cortex-A7 RTL available now
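
The slide 4 figures (about 1/6th the power at half the performance) can be turned into an energy-per-task estimate. The sketch below is only a back-of-envelope illustration: it assumes a fixed-size task, both cores at their nominal operating points, and it ignores idle power, leakage, and the rest of the system. The function name `relative_energy` is hypothetical, not something from the presentation.

```python
from fractions import Fraction


def relative_energy(power_ratio: Fraction, performance_ratio: Fraction) -> Fraction:
    """Energy ratio for a fixed task, assuming energy = power * time.

    Run time scales as 1 / performance, so the energy ratio is
    power_ratio * (1 / performance_ratio).
    """
    return power_ratio / performance_ratio


# Slide 4 figures: Cortex-A7 relative to Cortex-A15 at the nominal point.
a7_power = Fraction(1, 6)        # ~1/6th the power
a7_performance = Fraction(1, 2)  # half the performance

a7_energy = relative_energy(a7_power, a7_performance)
print(f"Cortex-A7 energy per task relative to Cortex-A15: {a7_energy}")
```

Under these assumptions the same task takes twice as long on the Cortex-A7 but still costs about a third of the energy, which is the motivation for running light workloads on the LITTLE core.
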
  5. Cortex-A15/7 big.LITTLE Processing
     [Diagram: a Cortex-A15 MPCore cluster (CPUs, L2 Cache) and a Cortex-A7 MPCore cluster (CPUs, L2 Cache) connected through the CCI-400 Coherent Interconnect, with Interrupt Control]
     • Uses the right processor for the right job
     • Up to 70% energy savings on common workloads
     • Flexible and transparent to apps – importance of seamless software handover
     • big: “Demanding tasks”
     • LITTLE: “Always on, always connected tasks”
  6. Performance and Energy-Efficiency
     • Cortex-A7 (LITTLE)
       - Simple, in-order, 8-stage pipeline
       - Performance better than today’s mainstream, high-volume smartphones
       - Most energy-efficient applications processor from ARM
     • Cortex-A15 (big)
       - Complex, out-of-order, multi-issue pipeline
       - Up to 5x the performance of today’s mainstream, high-volume smartphones
       - Highest performance in mobile power envelope
     [Pipeline diagrams; legible stage labels: Queue, Issue, Integer]
  7. big.LITTLE Cluster Migration Mechanics
     [Flow diagram; the step order shown here is inferred from the extracted labels]
     • Stimulus from OS/Virtualizer via system firmware interface: Migration Stimulus Received
     • Inbound Processor(s), Cluster A: Power On & Reset → Cache Invalidate → Enable Snooping → Ready for migration → Restore State → Normal Operation
     • Outbound Processor(s), Cluster B: Normal Operation → Save State → Snooping Allowed → Clean Cache → Disable Snooping → Power Down → Outbound Processor OFF
     • Switch State (Snoop Outbound Processor)
     • Timing annotations: Less than 100 cycles; Less than 20 micro-seconds
     • This is the “critical period” where no work is being done on either cluster
     • Cycle count is OS dependent
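
The migration steps named on slide 7 can be walked through as a small simulation. This is a sketch under stated assumptions, not ARM's firmware: the step order is inferred from the slide labels (the transcript does not preserve the diagram's arrows), and the `Cluster` class, the `migrate` function, and the "system" log tag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    powered: bool
    snooping: bool
    running: bool


def migrate(outbound: Cluster, inbound: Cluster) -> list[str]:
    """Hand work from the outbound cluster to the inbound one; return the step log."""
    log: list[str] = []

    def step(who: str, label: str) -> None:
        log.append(f"{who}: {label}")

    step("system", "Migration Stimulus Received")

    # The inbound cluster is prepared while the outbound one keeps working.
    inbound.powered = True
    step(inbound.name, "Power On & Reset")
    step(inbound.name, "Cache Invalidate")
    inbound.snooping = True
    step(inbound.name, "Enable Snooping")
    step(inbound.name, "Ready for migration")

    # Critical period: between save and restore, neither cluster does work.
    outbound.running = False
    step(outbound.name, "Save State")
    assert not outbound.running and not inbound.running
    inbound.running = True
    step(inbound.name, "Restore State")
    step(inbound.name, "Normal Operation")

    # The outbound cache stays snoopable until it has been cleaned.
    assert outbound.snooping and inbound.snooping
    step(outbound.name, "Clean Cache")
    outbound.snooping = False
    step(outbound.name, "Disable Snooping")
    outbound.powered = False
    step(outbound.name, "Power Down")
    return log


cluster_a = Cluster("Cluster A", powered=False, snooping=False, running=False)
cluster_b = Cluster("Cluster B", powered=True, snooping=True, running=True)
trace = migrate(outbound=cluster_b, inbound=cluster_a)
print("\n".join(trace))
```

The two assertions encode the ordering constraints the slide implies: work stops on both clusters only during the save/restore window, and the outbound cluster keeps snooping enabled until its cache has been cleaned.
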
  8. Leading Software Ecosystem
     • Broad support for Cortex-A processors
     • 100,000s of apps already optimized
     • Increasing ARM focus on the platform
     • 1TB of physical address space (Cortex-A7/A15 systems)
     • meets a wide spectrum of developer needs
     • a vehicle for software development and sharing
     • Linaro key to Linux and other open-source software and tools deployment
     [Software stack diagram labels: Virtualization and Firmware, OS, Power Management Software, Applications and Middleware]
     Many ARMv7-A software developments logically extend into ARMv8-A
  9. Cortex-A15/A7/MALI platform
     • Focus for ARM system and software development
       - Cortex-A15 cluster
       - Cortex-A7 cluster
       - Mali graphics support + Memory, IO, debug etc...
     • Increasing use of “models-first”: processor, memory & IO
     [Block diagram, 2012 Compute Subsystem: Cortex-A15 Cluster (CPU 0 to CPU 3, L2 Cache); Cortex-A7 Cluster (CPU 0 to CPU 3, L2 Cache); Mali T600 series GPU (Shader Core 0 to Shader Core 3, L2 Cache); Cache Coherent Interconnect (CCI-400); LPDDR2/DDR3 Controller DMC-400; DDR PHY or DDR Memory; NIC 400 (two instances); SMMU; CoreSight Debug & Trace (JTAG & Trace); System Power and Resources Mgt (PMIC/APB Bus); AMBA Extensions Interface (Master and Slave); On-Chip Memories (RAM, ROM); Base Peripheral]
  10. ARMv8-A (announced 27-Oct-2011)
  11. What is ARMv8?
     • Next version of the ARM architecture
     • First release covers the Applications profile only: ARMv8-A
     • Addition of a 64-bit operating capability
     • Introduction of new 64-bit execution state – AArch64
     • Maintain low power heritage – critique features against PPA* impact
     • ARMv7-A compatibility a critical consideration – AArch32
     • Interprocessing: defined relationship between 32- and 64-bit execution
     • Maintain ARMv7-A (AArch32) momentum alongside AArch64
     • Strong compatibility plus ongoing evolution
     *PPA: Power, Performance, Area
  12. ARMv8-A – Context
     • ARMv8
     • A-profile only (at this time)
     • 64-bit architecture support
  13. AArch64 - registers
     • X0 to X30: 64-bit registers (* X30 is the procedure LR)
     • V0 to V31: {32-bit SP, 64-bit DP} scalar FP / 128-bit vectors
     • Per exception level:

                                                EL0      EL1       EL2       EL3
       Stack Ptr                                SP_EL0   SP_EL1    SP_EL2    SP_EL3
       Exception Link Register                  (PC)     ELR_EL1   ELR_EL2   ELR_EL3
       Saved/Current Process Status Register    (CPSR)   SPSR_EL1  SPSR_EL2  SPSR_EL3
  14. Exception model overview
     [Diagram: exception levels EL0 to EL3 in Secure and Non-secure states, shown for AArch64 and AArch32]
     • AArch64: EL0; EL1h, EL1t; EL2h, EL2t; EL3h, EL3t (“h”andler & “t”hread stack options)
     • AArch32 (ARMv7-A compatibility): modes User, Svc, Abt, Und, FIQ, IRQ, Sys, Hyp, Mon
     • The Secure-side AArch32 mode layout is drawn for two cases: “IF EL3 is 64-bit” (Svc, Abt, Und, FIQ, IRQ, Sys) and “IF EL3 is 32-bit” (Svc, Abt, Und, FIQ, IRQ, Sys, Mon)
     • Interprocessing:
       - EL3: Secure Monitor => EL2: Hypervisor => EL1: OS => EL0: Application
       - AArch64 → AArch32 transition can occur on a transition down the hierarchy (EL3 → EL0)
       - AArch32 → AArch64 transition can occur on a transition up the hierarchy (EL0 → EL3)
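
The two interprocessing rules on slide 14 can be written as a small predicate. This is a minimal sketch that encodes only what the slide states (execution state may change from AArch64 to AArch32 when moving down the exception-level hierarchy, and from AArch32 to AArch64 when moving up); the full architecture has further constraints that the slide does not cover. The function name `state_change_allowed` is hypothetical.

```python
STATES = ("AArch32", "AArch64")


def state_change_allowed(from_el: int, to_el: int,
                         from_state: str, to_state: str) -> bool:
    """Check an execution-state change against the slide 14 rules only."""
    for el in (from_el, to_el):
        if el not in (0, 1, 2, 3):
            raise ValueError(f"unknown exception level: EL{el}")
    for state in (from_state, to_state):
        if state not in STATES:
            raise ValueError(f"unknown execution state: {state}")

    if from_state == to_state:
        return True  # no change of execution state involved
    if from_state == "AArch64":
        return to_el < from_el  # 64 -> 32 only when moving down (towards EL0)
    return to_el > from_el      # 32 -> 64 only when moving up (towards EL3)


# A 64-bit OS at EL1 returning to a 32-bit application at EL0.
print(state_change_allowed(1, 0, "AArch64", "AArch32"))
# A 32-bit application at EL0 taking an exception to a 64-bit OS at EL1.
print(state_change_allowed(0, 1, "AArch32", "AArch64"))
```

A consequence of these rules is that the execution state cannot change while staying at the same exception level, which is why the predicate rejects a state change when `from_el == to_el`.
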
  15. Interprocessing & AArch32 save/restore
     [Diagram: the AArch32 banked register file (R0 to R12, R13 (SP), R14 (LR), PC, A/CPSR; banked SP/LR for svc, irq, und, fiq, abt, mon; SP_hyp, ELR_hyp; R8_fiq to R12_fiq; SPSR_svc, SPSR_abt, SPSR_und, SPSR_irq, SPSR_hyp, SPSR_fiq, SPSR_mon) shown next to the AArch64 registers]
     • AArch64 register → AArch32 register:
       X0 → R0, X1 → R1, X2 → R2, X3 → R3, X4 → R4, X5 → R5, X6 → R6, X7 → R7
       X8 → R8usr, X9 → R9usr, X10 → R10usr, X11 → R11usr, X12 → R12usr, X13 → R13usr, X14 → R14usr
       X15 → R13_hyp
       X16 → R14_irq, X17 → R13_irq
       X18 → R14_svc, X19 → R13_svc
       X20 → R14_abt, X21 → R13_abt
       X22 → R14_und, X23 → R13_und
       X24 → R8_fiq, X25 → R9_fiq, X26 → R10_fiq, X27 → R11_fiq, X28 → R12_fiq, X29 → R13_fiq, X30 → R14_fiq
     • AArch64: SP_EL0, SP_EL1-3, ELR_EL1-3, SPSR_EL1-3, PSTATE, PC
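
The register correspondence on slide 15 is regular enough to build as a lookup table. The sketch below reproduces the pairs listed on the slide, using the slide's own register names; the function names `build_aarch32_view` and `x_register_for` are hypothetical helpers introduced for illustration.

```python
def build_aarch32_view() -> dict[str, str]:
    """Map each AArch64 X register to the AArch32 register paired with it on slide 15."""
    view: dict[str, str] = {}
    for n in range(8):                      # X0..X7   -> R0..R7
        view[f"X{n}"] = f"R{n}"
    for n in range(8, 15):                  # X8..X14  -> R8usr..R14usr
        view[f"X{n}"] = f"R{n}usr"
    view["X15"] = "R13_hyp"
    for i, mode in enumerate(["irq", "svc", "abt", "und"]):
        view[f"X{16 + 2 * i}"] = f"R14_{mode}"   # banked LR of the mode
        view[f"X{17 + 2 * i}"] = f"R13_{mode}"   # banked SP of the mode
    for n in range(8, 14):                  # X24..X29 -> R8_fiq..R13_fiq
        view[f"X{n + 16}"] = f"R{n}_fiq"
    view["X30"] = "R14_fiq"
    return view


AARCH32_VIEW = build_aarch32_view()


def x_register_for(aarch32_name: str) -> str:
    """Reverse lookup: which X register holds the given AArch32 register."""
    for x_name, r_name in AARCH32_VIEW.items():
        if r_name == aarch32_name:
            return x_name
    raise KeyError(aarch32_name)


print(AARCH32_VIEW["X19"], x_register_for("R14_fiq"))
```

A 64-bit OS or hypervisor that saves and restores X0 to X30 therefore also covers the banked general-purpose state of a 32-bit guest; the slide lists the SPSRs and ELR_hyp separately from this mapping.
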
  16. Summary
     • Cortex-A7 a highly efficient application processor
     • Cortex-A7 enables big.LITTLE Processing to expand performance and battery-life
     • Seamless and transparent to application software
     • ARM increasing its platform software investments
     • A catalyst for many activities
     • The ARM architecture roadmap is now clearer
     • ARMv8-A architecture development is well advanced (Specification release expected 2H-2012)