LCE12: Handling big.LITTLE Core and Cluster Shutdown on ARM



Resource: LCE12
Name: Handling big.LITTLE Core and Cluster Shutdown on ARM
Date: 29-10-2012
Speaker: Nicolas Pitre / Dave Martin

Published in: Technology


  1. Handling big.LITTLE Core and Cluster Shutdowns on ARM
     Nicolas Pitre, Dave Martin
     Linaro Connect Q4.12, October 2012
  2. Topics
     ● Why
     ● Problems
     ● Solutions
     ● Implementation
  3. big.LITTLE Activities
     ● Current big.LITTLE projects:
       ● big.LITTLE switcher
       ● big.LITTLE “full MP”
     ● Goal: optimize performance and save power on big.LITTLE SoCs
  4. Power Saving
     ● Save power by:
       ● turning off individual CPUs;
       ● shutting down a whole cluster.
     ● Opportunistic cluster shutdown is key.
     ● Much more complex than it may seem at first glance.
  5. Typical Hardware System
     [Diagram] Two clusters (Cluster0 and Cluster1), each containing CPU0, CPU1, CPU2, ... and a cluster cache, attached to a cache-coherent interconnect (CCI) together with memory and peripherals.
  6. CPU Life-Cycle
     ● up: powered on, running normally
     ● going down: shutdown in progress
     ● down: powered off
     ● coming up: powered on, setup in progress
     [Diagram] Cycle of four states: up, going down, down, coming up.
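
The four-state cycle above can be written down as a small transition check. This is only a sketch: the enum and function names are hypothetical and do not come from the slides, and it assumes the states follow one another strictly in the order up, going down, down, coming up.

```c
#include <stdbool.h>

/* Hypothetical names; the slide only lists the four states. */
enum cpu_state {
    CPU_DOWN,        /* powered off */
    CPU_COMING_UP,   /* powered on, setup in progress */
    CPU_UP,          /* powered on, running normally */
    CPU_GOING_DOWN   /* shutdown in progress */
};

/*
 * Return true if 'from' -> 'to' is one step along the assumed cycle
 * down -> coming up -> up -> going down -> down.
 */
static bool cpu_transition_ok(enum cpu_state from, enum cpu_state to)
{
    switch (from) {
    case CPU_DOWN:       return to == CPU_COMING_UP;
    case CPU_COMING_UP:  return to == CPU_UP;
    case CPU_UP:         return to == CPU_GOING_DOWN;
    case CPU_GOING_DOWN: return to == CPU_DOWN;
    }
    return false;
}
```

A check like this is mainly useful as a debugging aid: a CPU that jumps directly from up to down has skipped the cache maintenance that the going-down phase exists for.
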
  7. Cluster Shutdown
     ● Every CPU shutting down must:
       1) disable allocation into L1
       2) flush dirty L1 content
       3) disable CPU-level coherency
       4) power itself down
     ● When all CPUs are shut down, we can shut down the cluster:
       ● The Last Man must perform steps 1-3, and:
         5) flush cluster-level (L2) cache
         6) disable CCI snooping for the cluster
         7) power the cluster down.
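
The ordering on this slide can be sketched as a single routine that records each step instead of touching hardware. Everything here is an illustrative assumption: the step names, the trace buffer and `power_down_sequence()` are hypothetical, and real code would issue cache maintenance and power-controller operations where this sketch only logs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the actions listed on the slide. */
enum step {
    L1_ALLOC_OFF,       /* disable allocation into L1 */
    L1_FLUSH,           /* flush dirty L1 content */
    CPU_COHERENCY_OFF,  /* disable CPU-level coherency */
    CPU_POWER_OFF,      /* ordinary CPU: power itself down */
    L2_FLUSH,           /* last man: flush cluster-level (L2) cache */
    CCI_SNOOP_OFF,      /* last man: disable CCI snooping */
    CLUSTER_POWER_OFF   /* last man: power the cluster down */
};

#define MAX_STEPS 8
static enum step trace[MAX_STEPS];  /* steps in the order performed */
static size_t trace_len;

/* Stand-in for the real hardware operation: just log the step. */
static void do_step(enum step s)
{
    if (trace_len < MAX_STEPS)
        trace[trace_len++] = s;
}

/* Run the shutdown sequence for one CPU, as ordered on the slide. */
static void power_down_sequence(bool last_man)
{
    trace_len = 0;
    do_step(L1_ALLOC_OFF);
    do_step(L1_FLUSH);
    do_step(CPU_COHERENCY_OFF);
    if (last_man) {
        do_step(L2_FLUSH);
        do_step(CCI_SNOOP_OFF);
        do_step(CLUSTER_POWER_OFF);
    } else {
        do_step(CPU_POWER_OFF);
    }
}
```

The sketch deliberately ignores the hard part, which the following slides address: deciding who the last man is and keeping other CPUs from interfering while it works.
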
  8. Last Man Challenges
     ● Last Man has to perform a sequence of actions without interference from other CPUs.
     ● Problems:
       ● Other CPUs can be at various stages of shutdown.
       ● CPUs might wake up at any time.
       ● Flushing L2 can take quite some time.
       ● LDREX and STREX only work with cached memory.
       ● Concurrency is a hard problem.
  9. ...and yet more challenges
     ● Concurrency:
       ● Which CPU is the Last Man?
       ● How does the Last Man know the other CPUs are really down?
       ● How to avoid races with one or more incoming CPUs?
       ● How does an incoming CPU know whether the cluster needs to be set up?
     ● Races are everywhere!
       ● Last Man can't flush L2 until all the other CPUs are done flushing their L1 caches.
       ● Incoming CPUs might power up at any time.
       ● Incoming CPUs can’t proceed safely if CCI snooping is disabled.
       ● Memory might be cached on some CPUs and uncached on others...
  10. Cluster Life-Cycle (simplified)
      ● Similar to CPU life-cycle, but...
      ● Need to manage cluster caches etc. safely
      ● Cluster power-down may be preempted
      ● Need to avoid races when tracking cluster state.
      [Diagram] Cycle of four states: up, going down, down, coming up.
  11. Actual cluster life-cycle
      [Diagram] Six states:
      ● down, not coming up
      ● up, not coming up
      ● going down, not coming up
      ● going down, coming up
      ● up, coming up
      ● down, coming up
      Legend: actions taken by last man during cluster shutdown; actions taken by first man during cluster wake-up. The diagram also carries a “(preempt)” label.
  12. Platform Code Helper Functions
      ● void __bL_cpu_going_down(unsigned int cpu, unsigned int cluster)
        Signal that the CPU is shutting down.
      ● bool __bL_outbound_enter_critical(unsigned int this_cpu, unsigned int cluster)
        Safely begin cluster shutdown, ensuring all other CPUs are down (last man only).
      ● void __bL_outbound_leave_critical(unsigned int cluster, int state)
        End cluster shutdown (last man only).
      ● void __bL_cpu_down(unsigned int cpu, unsigned int cluster)
        Signal that the CPU has finished shutting down.
      ● Fast Models example code in arch/arm/mach-vexpress/dcscb.c.
      ● Equivalent operations for CPU and cluster start-up are handled by common code in arch/arm/common/bL_head.S.
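
A platform power-down routine might call these helpers in roughly the following order. This is a single-threaded model, not the code in dcscb.c: the `sim_*` functions are hypothetical stand-ins for the four `__bL_*` helpers, and the use counter, the state constants and the `inbound_coming_up` flag are assumptions made for illustration.

```c
#include <stdbool.h>

#define CPUS_PER_CLUSTER 4
enum { CLUSTER_DOWN, CLUSTER_UP, CLUSTER_GOING_DOWN };  /* assumed values */

static int  cpus_in_use = CPUS_PER_CLUSTER; /* hypothetical use count */
static int  cluster_state = CLUSTER_UP;
static bool inbound_coming_up;              /* set when a CPU is waking up */
static int  l2_flushes;                     /* counts full-cluster teardowns */

/* Stand-in for __bL_cpu_going_down(): nothing to model here. */
static void sim_cpu_going_down(unsigned int cpu, unsigned int cluster)
{
    (void)cpu; (void)cluster;
}

/* Stand-in for __bL_outbound_enter_critical(): false means "abort". */
static bool sim_outbound_enter_critical(unsigned int cpu, unsigned int cluster)
{
    (void)cpu; (void)cluster;
    cluster_state = CLUSTER_GOING_DOWN;
    if (inbound_coming_up) {        /* preempted by an incoming CPU */
        cluster_state = CLUSTER_UP;
        return false;
    }
    return true;
}

/* Stand-in for __bL_outbound_leave_critical(). */
static void sim_outbound_leave_critical(unsigned int cluster, int state)
{
    (void)cluster;
    cluster_state = state;
}

/* Stand-in for __bL_cpu_down(): nothing to model here. */
static void sim_cpu_down(unsigned int cpu, unsigned int cluster)
{
    (void)cpu; (void)cluster;
}

static void platform_power_down(unsigned int cpu, unsigned int cluster)
{
    sim_cpu_going_down(cpu, cluster);
    /* Real code would take a lock around this count; the model is serial. */
    bool last_man = (--cpus_in_use == 0);

    if (last_man && sim_outbound_enter_critical(cpu, cluster)) {
        l2_flushes++;   /* flush L1 and L2, disable CCI snooping */
        sim_outbound_leave_critical(cluster, CLUSTER_DOWN);
    } else {
        /* flush L1 only and leave coherency */
    }
    sim_cpu_down(cpu, cluster);
    /* Real code would now power the CPU off and never return. */
}
```

The point of the shape is that only the last man pays for the L2 flush, and even it falls back to the plain per-CPU path when entering the critical section fails.
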
  13. Managing Cluster Start-Up
      ● When powering up, the “First Man” must:
        ● invalidate cluster-level (L2) cache (if needed),
        ● enable CCI snooping for the cluster,
        ● resume execution of the kernel.
      ● Other CPUs must:
        ● wait until the first man has set up the cluster,
        ● resume execution of the kernel.
      The kernel deals with local CPU setup.
  14. Choosing the First Man
      ● Lightweight mutual exclusion using “vlocks”
      ● A CPU “votes” for itself by storing its ID to a common location: STR cpu_id, [ballot_box]
      ● Memory atomicity ensures a single winner.
      ● The winner sets up the cluster.
      [Flowchart] Nodes: power-on; election started? (yes/no); submit vote; election in progress; election finished? (yes/no); did I win? (yes: set up cluster; no: wait for winner to set up cluster); boot or resume OS.
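
The voting scheme can be sketched in C as follows. This is a model under stated assumptions, not the talk's implementation: it uses C11 sequentially consistent atomics in place of the plain stores to memory that the slide's `STR` stands for, and the names `vlock_trylock`, `vlock_unlock`, `currently_voting` and `last_vote` are chosen for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NO_VOTE (-1)

/* One "I am in the middle of voting" flag per CPU, all initially 0. */
static atomic_int currently_voting[NR_CPUS];
/* The ballot box: ID of the last CPU that voted, or NO_VOTE. */
static atomic_int last_vote = NO_VOTE;

/* Returns true if this_cpu won the election. */
static bool vlock_trylock(int this_cpu)
{
    /* Announce that we are about to vote. */
    atomic_store(&currently_voting[this_cpu], 1);

    if (atomic_load(&last_vote) != NO_VOTE) {
        /* Someone has already voted: the election is under way. */
        atomic_store(&currently_voting[this_cpu], 0);
        return false;
    }

    /* Vote for ourselves; a later voter may overwrite this. */
    atomic_store(&last_vote, this_cpu);
    atomic_store(&currently_voting[this_cpu], 0);

    /* Wait until every CPU that started voting has finished. */
    for (int i = 0; i < NR_CPUS; i++)
        while (atomic_load(&currently_voting[i]) != 0)
            ;   /* spin */

    /* Whoever's ID survived in the ballot box is the winner. */
    return atomic_load(&last_vote) == this_cpu;
}

/* Called by the winner once the protected work is done. */
static void vlock_unlock(void)
{
    atomic_store(&last_vote, NO_VOTE);
}
```

In terms of the flowchart, a CPU for which `vlock_trylock()` returns true sets up the cluster; the others wait for the winner to finish. The scheme needs only single-word loads and stores, which is why it remains usable where LDREX and STREX are not.
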
  15. Kernel API
      A convenient interface is provided to hide hardware specifics from the kernel.
      ● Make given CPU in given cluster runnable: bL_cpu_power_up(int cpu, int cluster)
      ● Power the calling CPU down: bL_cpu_power_down(void)
      ● For self housekeeping: bL_cpu_powered_up(void)
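
To show how a caller such as CPU hotplug might sit on top of this interface, here is a small model. The slide gives only the function names and arguments, so the return types, the error value, the `powered` table and the callers `boot_secondary()`, `secondary_start()` and `cpu_die()` are all assumptions for illustration; the three `bL_*` bodies are stubs, not the real implementation.

```c
#include <stdbool.h>

#define MAX_CPUS     4
#define MAX_CLUSTERS 2

static bool powered[MAX_CLUSTERS][MAX_CPUS]; /* modelled power state */
static int  this_cpu, this_cluster;          /* identity of the "calling" CPU */
static int  housekeeping_calls;

/* Stub: return type and error value are assumed, not from the slides. */
static int bL_cpu_power_up(int cpu, int cluster)
{
    if (cpu < 0 || cpu >= MAX_CPUS || cluster < 0 || cluster >= MAX_CLUSTERS)
        return -1;
    powered[cluster][cpu] = true;
    return 0;
}

/* Stub: the real function would not return to its caller. */
static void bL_cpu_power_down(void)
{
    powered[this_cluster][this_cpu] = false;
}

/* Stub: run by a CPU on itself once it is up again. */
static void bL_cpu_powered_up(void)
{
    housekeeping_calls++;
}

/* Hypothetical callers in the style of hotplug / secondary boot code. */
static int boot_secondary(int cpu, int cluster)
{
    return bL_cpu_power_up(cpu, cluster);
}

static void secondary_start(void)
{
    bL_cpu_powered_up();
}

static void cpu_die(void)
{
    bL_cpu_power_down();
}
```

Note the asymmetry the slide implies: power-up names a target CPU and cluster, while power-down and the housekeeping call always act on the calling CPU.
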
  16. Targeted Users
      ● the in-kernel switcher module (IKS)
      ● the cpuidle driver
      ● CPU hotplug
      ● secondary CPU booting.
  17. Code Availability
      ● http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/bL_cluster_pm
      ● Example implementation for ARM Fast Model
      ● Still validating on ARM TC2 hardware.
      ● Should be headed upstream soon...
  18. Questions?
      Thanks for listening
