Nicolas Pitre
Dave Martin
Linaro Connect Q4.12
October 2012
Handling big.LITTLE Core and
Cluster Shutdowns on ARM
Topics
● Why
● Problems
● Solutions
● Implementation
big.LITTLE Activities
● Current big.LITTLE projects:
● big.LITTLE switcher
● big.LITTLE “full MP”
● Goal: optimize performance and save
power on big.LITTLE SoCs
Power Saving
● Save power by:
● turning off individual CPUs;
● shutting down a whole cluster
● Opportunistic cluster shutdown is key.
● Much more complex than it may seem at
first glance.
Typical Hardware System
[Diagram: two clusters (Cluster0 and Cluster1), each with CPU0, CPU1, CPU2, ... and a cluster cache, connected through the cache-coherent interconnect (CCI) to memory and peripherals.]
CPU Life-Cycle
● up: powered on,
running normally
● going down:
shutdown in
progress
● down: powered off
● coming up:
powered on, setup
in progress
[Diagram: CPU state cycle: up -> going down -> down -> coming up -> up.]
Cluster Shutdown
● Each CPU shutting down must:
1) disable allocation into L1
2) flush dirty L1 content
3) disable CPU-level coherency
4) power itself down
● When all CPUs are shut down, we can shut down the cluster:
● The Last Man must perform steps 1-3, and:
5) flush cluster-level (L2) cache
6) disable CCI snooping for the cluster
7) power the cluster down.
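As a rough illustration, the ordering above can be modelled in plain C. This is only a sketch: the enum and function names are hypothetical, the hardware operations are reduced to log entries, and the last man is assumed to lose power together with the cluster rather than powering itself off separately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step identifiers mirroring the slide; names are illustrative. */
enum pm_step {
    STEP_L1_NO_ALLOC,        /* disable allocation into L1 */
    STEP_L1_FLUSH,           /* flush dirty L1 content */
    STEP_CPU_COHERENCY_OFF,  /* disable CPU-level coherency */
    STEP_CPU_POWER_OFF,      /* power the CPU down */
    STEP_L2_FLUSH,           /* flush cluster-level (L2) cache */
    STEP_CCI_SNOOP_OFF,      /* disable CCI snooping for the cluster */
    STEP_CLUSTER_POWER_OFF   /* power the cluster down */
};

#define MAX_STEPS 8

struct step_log {
    enum pm_step steps[MAX_STEPS];
    size_t count;
};

/* Append one step to the log, silently ignoring overflow. */
static void record(struct step_log *log, enum pm_step step)
{
    if (log->count < MAX_STEPS)
        log->steps[log->count++] = step;
}

/* Record the steps one CPU performs; last_man is nonzero for the last CPU. */
void shutdown_sequence(struct step_log *log, int last_man)
{
    assert(log != NULL);
    log->count = 0;

    /* Every outgoing CPU performs the first three steps. */
    record(log, STEP_L1_NO_ALLOC);
    record(log, STEP_L1_FLUSH);
    record(log, STEP_CPU_COHERENCY_OFF);

    if (!last_man) {
        record(log, STEP_CPU_POWER_OFF);
        return;
    }

    /* The last man continues with the cluster-level steps. */
    record(log, STEP_L2_FLUSH);
    record(log, STEP_CCI_SNOOP_OFF);
    record(log, STEP_CLUSTER_POWER_OFF);
}
```

Calling `shutdown_sequence(&log, 0)` records the four per-CPU steps; calling it with `last_man` set records the three common steps followed by the three cluster-level ones.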
Last Man Challenges
● Last Man has to perform a sequence of actions without
interference from other CPUs.
● Problems:
● Other CPUs can be at various stages of shutdown.
● CPUs might wake up at any time.
● Flushing L2 can take quite some time.
● LDREX and STREX only work with cached memory.
● Concurrency is a hard problem.
...and yet more challenges
● Concurrency:
● Which CPU is the Last Man?
● How does the Last Man know the other CPUs are really down?
● How to avoid races with one or more incoming CPUs?
● How does the incoming CPU know whether the cluster needs to be set up?
● Races are everywhere!
● Last Man can't flush L2 until all the other CPUs are done flushing
their L1 caches.
● Incoming CPUs might power up at any time.
● Incoming CPUs can’t proceed safely if CCI snooping is disabled.
● Memory might be cached on some CPUs and uncached on others...
Cluster Life-Cycle (simplified)
● Similar to CPU life-cycle,
but...
● Need to manage cluster
caches etc. safely
● Cluster power-down may
be preempted
● Need to avoid races
when tracking cluster
state.
[Diagram: simplified cluster state cycle: up -> going down -> down -> coming up -> up.]
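Actual cluster life-cycle
[Diagram: cluster state machine with six states, pairing the cluster state (up, going down, down) with the inbound state (coming up, not coming up): "down, not coming up"; "up, not coming up"; "going down, not coming up"; "going down, coming up"; "up, coming up"; "down, coming up". One transition is labelled "(preempt)". Legend: actions taken by last man during cluster shutdown; actions taken by first man during cluster wake-up.]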
Platform Code Helper Functions
● void __bL_cpu_going_down(unsigned int cpu, unsigned int cluster)
Signal that the CPU is shutting down.
● bool __bL_outbound_enter_critical(unsigned int this_cpu, unsigned int cluster)
Safely begin cluster shutdown, ensuring all other CPUs are down (last man only).
● void __bL_outbound_leave_critical(unsigned int cluster, int state)
End cluster shutdown (last man only).
● void __bL_cpu_down(unsigned int cpu, unsigned int cluster)
Signal that the CPU has finished shutting down.
● Fast Models example code in arch/arm/mach-vexpress/dcscb.c.
● Equivalent operations for CPU and cluster start-up are handled by common code in arch/arm/common/bL_head.S.
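To make the calling order of these helpers concrete, here is a single-threaded user-space model. It is a sketch under stated assumptions, not the kernel code: the `model_*` functions are hypothetical stand-ins for the `__bL_*` helpers, the state names and the abort-on-incoming-CPU behaviour are assumed from the life-cycle slides, the "all other CPUs are down" test is a made-up platform policy for picking the last man, and all cache and CCI operations are omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_CLUSTER 4  /* assumed for illustration */

enum cpu_state { CPU_DOWN, CPU_COMING_UP, CPU_UP, CPU_GOING_DOWN };
enum cluster_state { CLUSTER_DOWN, CLUSTER_UP, CLUSTER_GOING_DOWN };
enum inbound_state { INBOUND_NOT_COMING_UP, INBOUND_COMING_UP };

struct cluster_model {
    enum cpu_state cpu[CPUS_PER_CLUSTER];
    enum cluster_state cluster;
    enum inbound_state inbound;
};

/* Stand-in for __bL_cpu_going_down(): the CPU starts shutting down. */
void model_cpu_going_down(struct cluster_model *c, unsigned int cpu)
{
    assert(cpu < CPUS_PER_CLUSTER);
    c->cpu[cpu] = CPU_GOING_DOWN;
}

/* Stand-in for __bL_cpu_down(): the CPU has finished shutting down. */
void model_cpu_down(struct cluster_model *c, unsigned int cpu)
{
    assert(cpu < CPUS_PER_CLUSTER);
    c->cpu[cpu] = CPU_DOWN;
}

/* Stand-in for __bL_outbound_leave_critical(): publish the final state. */
void model_outbound_leave_critical(struct cluster_model *c,
                                   enum cluster_state state)
{
    c->cluster = state;
}

/* Stand-in for __bL_outbound_enter_critical(); false means "aborted". */
bool model_outbound_enter_critical(struct cluster_model *c,
                                   unsigned int this_cpu)
{
    unsigned int i;

    c->cluster = CLUSTER_GOING_DOWN;        /* warn incoming CPUs */
    if (c->inbound == INBOUND_COMING_UP)
        goto abort;                         /* preempted by a first man */

    for (i = 0; i < CPUS_PER_CLUSTER; i++) {
        if (i == this_cpu)
            continue;
        /* Real code would wait for CPUs still going down; the model only checks. */
        if (c->cpu[i] != CPU_DOWN)
            goto abort;
    }
    return true;

abort:
    model_outbound_leave_critical(c, CLUSTER_UP);
    return false;
}

/* Hypothetical last-man policy: every other CPU is already down. */
static bool others_all_down(const struct cluster_model *c, unsigned int cpu)
{
    unsigned int i;

    for (i = 0; i < CPUS_PER_CLUSTER; i++)
        if (i != cpu && c->cpu[i] != CPU_DOWN)
            return false;
    return true;
}

/* Power one CPU down; returns true if the whole cluster went down too. */
bool model_power_down(struct cluster_model *c, unsigned int cpu)
{
    bool cluster_off = false;

    model_cpu_going_down(c, cpu);
    if (others_all_down(c, cpu) && model_outbound_enter_critical(c, cpu)) {
        /* Last man: flush L1 and L2, disable CCI snooping (omitted). */
        model_outbound_leave_critical(c, CLUSTER_DOWN);
        cluster_off = true;
    }
    /* Otherwise only the L1 flush would happen here (omitted). */
    model_cpu_down(c, cpu);
    return cluster_off;
}
```

In this model a CPU that is not the last man never enters the critical section, and a last man that finds an incoming CPU backs out and leaves the cluster marked as up.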
Managing Cluster Start-Up
● When powering up, the “First Man” must:
● invalidate cluster-level (L2) cache (if needed),
● enable CCI snooping for the cluster,
● resume execution of the kernel.
● Other CPUs must:
● wait until the first man has set up the cluster,
● resume execution of the kernel.
The kernel deals with local CPU setup.
Choosing the First Man
● Lightweight mutual
exclusion using “vlocks”
● A CPU “votes” for itself by
storing its ID to a common
location:
STR cpu_id, [ballot_box]
● Memory atomicity ensures
a single winner.
● The winner sets up the
cluster.
[Flowchart: power-on -> election started? -> submit vote -> election in progress -> election finished? -> did I win? -> set up cluster (winner) or wait for winner to set up cluster (others) -> boot or resume OS.]
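The voting scheme can be sketched in user-space C. This is a minimal sketch, not the boot code itself: the slide shows only the single `STR` into the ballot box, so the per-CPU "currently voting" flags, the waiting loop, and all names here are assumptions about how such a voting lock is usually completed, and C11 sequentially consistent atomics stand in for whatever memory accesses the early start-up code actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_VOTERS 4      /* assumed number of CPUs in the cluster */
#define NO_VOTE   (-1)   /* ballot box is empty: no election started */

struct vlock {
    atomic_int voting[NR_VOTERS];  /* nonzero while a CPU is casting a vote */
    atomic_int last_vote;          /* the "ballot box": last CPU ID stored */
};

void vlock_init(struct vlock *v)
{
    for (int i = 0; i < NR_VOTERS; i++)
        atomic_init(&v->voting[i], 0);
    atomic_init(&v->last_vote, NO_VOTE);
}

/* Returns true if this_cpu won the election and must set up the cluster. */
bool vlock_trylock(struct vlock *v, int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NR_VOTERS);

    atomic_store(&v->voting[this_cpu], 1);      /* announce intent to vote */
    if (atomic_load(&v->last_vote) != NO_VOTE) {
        /* Election already started: withdraw and wait for the winner. */
        atomic_store(&v->voting[this_cpu], 0);
        return false;
    }

    atomic_store(&v->last_vote, this_cpu);      /* submit vote */
    atomic_store(&v->voting[this_cpu], 0);

    /* Election finishes once every CPU that began voting is done. */
    for (int i = 0; i < NR_VOTERS; i++)
        while (atomic_load(&v->voting[i]) != 0)
            ;                                   /* spin */

    /* The last store to the ballot box decides the single winner. */
    return atomic_load(&v->last_vote) == this_cpu;
}

/* Empty the ballot box so a later election can take place. */
void vlock_unlock(struct vlock *v)
{
    atomic_store(&v->last_vote, NO_VOTE);
}
```

A CPU for which `vlock_trylock()` returns true plays the First Man and sets up the cluster; the others wait for it to finish before resuming the kernel. No read-modify-write instruction is needed, only plain loads and stores.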
Kernel API
A convenient interface is provided to hide
hardware specifics from the kernel.
● Make given CPU in given cluster runnable:
bL_cpu_power_up(int cpu, int cluster)
● Power the calling CPU down:
bL_cpu_power_down(void)
● For self housekeeping:
bL_cpu_powered_up(void)
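One way a client such as the switcher might use these three calls is sketched below as a user-space model. Everything beyond the three function names and their parameters is an assumption: the return types, the `powered` table, the notion of a "calling CPU" held in two variables, and the hypothetical `model_switch_cluster()` helper. On real hardware `bL_cpu_power_down()` would not hand control back to its caller on success; the model simply returns.

```c
#include <assert.h>

#define NR_CLUSTERS      2
#define CPUS_PER_CLUSTER 4   /* assumed topology */

/* Model state: which CPUs are powered, and which CPU is "calling". */
static int powered[NR_CLUSTERS][CPUS_PER_CLUSTER];
static int this_cpu, this_cluster;

/* Stand-in: make the given CPU in the given cluster runnable. */
int bL_cpu_power_up(int cpu, int cluster)
{
    if (cpu < 0 || cpu >= CPUS_PER_CLUSTER ||
        cluster < 0 || cluster >= NR_CLUSTERS)
        return -1;
    powered[cluster][cpu] = 1;
    return 0;
}

/* Stand-in: power the calling CPU down (the model returns afterwards). */
void bL_cpu_power_down(void)
{
    powered[this_cluster][this_cpu] = 0;
}

/* Stand-in: housekeeping done by a CPU once it is running again. */
int bL_cpu_powered_up(void)
{
    assert(powered[this_cluster][this_cpu]);
    return 0;
}

/* Hypothetical switcher-style move of the calling CPU to the other cluster. */
int model_switch_cluster(void)
{
    int cpu = this_cpu;
    int to = !this_cluster;

    if (bL_cpu_power_up(cpu, to) != 0)   /* wake the inbound counterpart */
        return -1;
    bL_cpu_power_down();                 /* outbound CPU goes away */
    this_cluster = to;                   /* execution continues inbound */
    return bL_cpu_powered_up();
}
```

The point of the interface is visible in `model_switch_cluster()`: the caller names only CPU and cluster numbers and never touches power controllers, caches, or the CCI directly.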
Targeted Users
● the in-kernel switcher module (IKS)
● the cpuidle driver
● CPU hotplug
● secondary CPU booting.
Code Availability
● http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/bL_cluster_pm
● example implementation for ARM Fast Model
● Still validating on ARM TC2 hardware.
● Should be headed upstream soon...
Questions?
Thanks for listening

More Related Content

Viewers also liked

LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLinaro
 
LCE12: LCE12 ARMv8 Plenary
LCE12: LCE12 ARMv8 PlenaryLCE12: LCE12 ARMv8 Plenary
LCE12: LCE12 ARMv8 PlenaryLinaro
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingLinaro
 
The Linux Scheduler: a Decade of Wasted Cores
The Linux Scheduler: a Decade of Wasted CoresThe Linux Scheduler: a Decade of Wasted Cores
The Linux Scheduler: a Decade of Wasted Coresyeokm1
 
Q4.11: Sched_mc on dual / quad cores
Q4.11: Sched_mc on dual / quad coresQ4.11: Sched_mc on dual / quad cores
Q4.11: Sched_mc on dual / quad coresLinaro
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement Linaro
 

Viewers also liked (6)

LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
 
LCE12: LCE12 ARMv8 Plenary
LCE12: LCE12 ARMv8 PlenaryLCE12: LCE12 ARMv8 Plenary
LCE12: LCE12 ARMv8 Plenary
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP Scheduling
 
The Linux Scheduler: a Decade of Wasted Cores
The Linux Scheduler: a Decade of Wasted CoresThe Linux Scheduler: a Decade of Wasted Cores
The Linux Scheduler: a Decade of Wasted Cores
 
Q4.11: Sched_mc on dual / quad cores
Q4.11: Sched_mc on dual / quad coresQ4.11: Sched_mc on dual / quad cores
Q4.11: Sched_mc on dual / quad cores
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement
 

More from Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 

More from Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 

Recently uploaded

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

LCE12: Handling bigLITTLE Core and Cluster Shutdown on ARM

  • 1. Nicolas Pitre Dave Martin Linaro Connect Q4.12 October 2012 Nicolas Pitre Dave Martin Linaro Connect Q4.12 October 2012 Handling big.LITTLE Core and Cluster Shutdowns on ARM Handling big.LITTLE Core and Cluster Shutdowns on ARM
  • 2. TopicsTopics ● Why ● Problems ● Solutions ● Implementation ● Why ● Problems ● Solutions ● Implementation
  • 3. big.LITTLE Activities
    ● Current big.LITTLE projects:
      ● big.LITTLE switcher
      ● big.LITTLE “full MP”
    ● Goal: optimize performance and save power on big.LITTLE SoCs
  • 4. Power Saving
    ● Save power by:
      ● turning off individual CPUs;
      ● shutting down a whole cluster.
    ● Opportunistic cluster shutdown is key.
    ● Much more complex than it may seem at first glance.
  • 5. Typical Hardware System
    [Diagram: Cluster0 and Cluster1, each containing CPU0, CPU1, CPU2, ... and a cache; a cache-coherent interconnect (CCI); memory and peripherals.]
  • 6. CPU Life-Cycle
    ● up: powered on, running normally
    ● going down: shutdown in progress
    ● down: powered off
    ● coming up: powered on, setup in progress
    [Diagram: cycle through the states up, going down, down, coming up.]
  • 7. Cluster Shutdown
    ● Each CPU shutting down must:
      1) disable allocation into L1
      2) flush dirty L1 content
      3) disable CPU-level coherency
      4) power itself down
    ● When all CPUs are shut down, we can shut down the cluster:
      ● The Last Man must perform steps 1-3, and:
        5) flush cluster-level (L2) cache
        6) disable CCI snooping for the cluster
        7) power the cluster down.
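The ordering on this slide can be sketched as a small single-threaded model. This is only an illustration under stated assumptions: `struct cluster_model` and `model_cpu_shutdown()` are hypothetical names, no hardware is touched, and the races discussed on the following slides are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_CLUSTER 3

/* Hypothetical software model of one cluster (not kernel code). */
struct cluster_model {
    bool cpu_up[CPUS_PER_CLUSTER];
    bool l1_dirty[CPUS_PER_CLUSTER];
    bool l2_dirty;
    bool cci_snoop_enabled;
    bool powered;
};

static void cluster_model_init(struct cluster_model *c)
{
    for (int i = 0; i < CPUS_PER_CLUSTER; i++) {
        c->cpu_up[i] = true;
        c->l1_dirty[i] = true;
    }
    c->l2_dirty = true;
    c->cci_snoop_enabled = true;
    c->powered = true;
}

static int cpus_still_up(const struct cluster_model *c)
{
    int n = 0;
    for (int i = 0; i < CPUS_PER_CLUSTER; i++)
        n += c->cpu_up[i] ? 1 : 0;
    return n;
}

/* Shut down one CPU; returns true if it acted as the last man. */
static bool model_cpu_shutdown(struct cluster_model *c, int cpu)
{
    assert(cpu >= 0 && cpu < CPUS_PER_CLUSTER && c->cpu_up[cpu]);
    bool last_man = (cpus_still_up(c) == 1);

    /* Every outgoing CPU: stop allocating into L1, flush dirty L1
     * content, then leave CPU-level coherency. */
    c->l1_dirty[cpu] = false;

    if (last_man) {
        /* Last man only: flush L2, disable CCI snooping for the
         * cluster, then power the whole cluster down. */
        c->l2_dirty = false;
        c->cci_snoop_enabled = false;
        c->powered = false;
    }
    c->cpu_up[cpu] = false; /* the CPU itself is now off */
    return last_man;
}
```

In this model only the third call to `model_cpu_shutdown()` reaches the cluster-level steps; the earlier callers stop after powering themselves down.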
  • 8. Last Man Challenges
    ● Last Man has to perform a sequence of actions without interference from other CPUs.
    ● Problems:
      ● Other CPUs can be at various stages of shutdown.
      ● CPUs might wake up at any time.
      ● Flushing L2 can take quite some time.
      ● LDREX and STREX only work with cached memory.
    ● Concurrency is a hard problem.
  • 9. ...and yet more challenges
    ● Concurrency:
      ● Which CPU is the Last Man?
      ● How does the Last Man know the other CPUs are really down?
      ● How to avoid races with one or more incoming CPUs?
      ● How does the incoming CPU know whether the cluster needs to be set up?
    ● Races are everywhere!
      ● Last Man can't flush L2 until all the other CPUs are done flushing their L1 caches.
      ● Incoming CPUs might power up at any time.
      ● Incoming CPUs can’t proceed safely if CCI snooping is disabled.
      ● Memory might be cached on some CPUs and uncached on others...
  • 10. Cluster Life-Cycle (simplified)
    ● Similar to CPU life-cycle, but...
      ● Need to manage cluster caches etc. safely
      ● Cluster power-down may be preempted
      ● Need to avoid races when tracking cluster state.
    [Diagram: cycle through the states up, going down, down, coming up.]
  • 11. Actual cluster life-cycle
    [State diagram with six states: “down, not coming up”; “down, coming up”; “going down, not coming up”; “going down, coming up”; “up, coming up”; “up, not coming up”. One transition is labelled “(preempt)”. Legend: actions taken by last man during cluster shutdown; actions taken by first man during cluster wake-up.]
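One way to read the six states is as the product of an outbound state (down, going down, up) and an inbound flag (coming up or not). The sketch below models that reading in plain C. All type and function names are hypothetical, and a real implementation would also need memory barriers and cache maintenance around every access, which this single-threaded model omits.

```c
#include <stdbool.h>

/* Hypothetical names; the two fields together give the six states. */
enum outbound_state { CLUSTER_DOWN, CLUSTER_GOING_DOWN, CLUSTER_UP };
enum inbound_state  { INBOUND_NOT_COMING_UP, INBOUND_COMING_UP };

struct cluster_state {
    enum outbound_state cluster;
    enum inbound_state inbound;
};

/* Last man: try to begin shutdown. Returns false if an incoming
 * CPU has preempted the shutdown, in which case the cluster stays up. */
static bool last_man_enter(struct cluster_state *s)
{
    s->cluster = CLUSTER_GOING_DOWN;
    if (s->inbound == INBOUND_COMING_UP) {
        s->cluster = CLUSTER_UP; /* abort: someone wants the cluster */
        return false;
    }
    return true;
}

/* Last man: publish the final outcome of the shutdown attempt. */
static void last_man_leave(struct cluster_state *s, enum outbound_state result)
{
    s->cluster = result;
}

/* First man: announce that a CPU is coming up in this cluster. */
static void first_man_enter(struct cluster_state *s)
{
    s->inbound = INBOUND_COMING_UP;
}

/* First man: returns false while the last man is still mid-shutdown
 * (the caller must retry); otherwise sets the cluster up if needed. */
static bool first_man_setup(struct cluster_state *s)
{
    if (s->cluster == CLUSTER_GOING_DOWN)
        return false;
    /* If the cluster was down, cluster-level setup would happen here. */
    s->cluster = CLUSTER_UP;
    s->inbound = INBOUND_NOT_COMING_UP;
    return true;
}
```

Keeping the inbound flag separate from the outbound state is what lets the last man notice a wake-up request that arrives part-way through its shutdown sequence.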
  • 12. Platform Code Helper Functions
    ● void __bL_cpu_going_down(unsigned int cpu, unsigned int cluster)
      Signal that the CPU is shutting down.
    ● bool __bL_outbound_enter_critical(unsigned int this_cpu, unsigned int cluster)
      Safely begin cluster shutdown, ensuring all other CPUs are down (last man only).
    ● void __bL_outbound_leave_critical(unsigned int cluster, int state)
      End cluster shutdown (last man only).
    ● void __bL_cpu_down(unsigned int cpu, unsigned int cluster)
      Signal that the CPU has finished shutting down.
    ● Fast Models example code is in arch/arm/mach-vexpress/dcscb.c.
    ● Equivalent operations for CPU and cluster start-up are handled by common code in arch/arm/common/bL_head.S.
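A plausible calling order for these four helpers in a platform power-down path is sketched below. The `mock_*` functions are stand-ins written for this example (they only record that they were called), `sketch_power_down()` is a hypothetical name, and the `0` passed as the state argument is a placeholder; none of this is taken from dcscb.c.

```c
#include <stdbool.h>
#include <string.h>

/* Records the order of helper calls: G, E, L, D. */
static char call_log[16];
/* Lets a caller simulate a preempted cluster shutdown. */
static bool mock_enter_result = true;

static void log_call(char c)
{
    size_t n = strlen(call_log);
    if (n + 1 < sizeof(call_log)) {
        call_log[n] = c;
        call_log[n + 1] = '\0';
    }
}

/* Mock stand-ins mirroring the prototypes listed on the slide. */
static void mock_cpu_going_down(unsigned int cpu, unsigned int cluster)
{
    (void)cpu; (void)cluster;
    log_call('G');
}

static bool mock_outbound_enter_critical(unsigned int this_cpu, unsigned int cluster)
{
    (void)this_cpu; (void)cluster;
    log_call('E');
    return mock_enter_result;
}

static void mock_outbound_leave_critical(unsigned int cluster, int state)
{
    (void)cluster; (void)state;
    log_call('L');
}

static void mock_cpu_down(unsigned int cpu, unsigned int cluster)
{
    (void)cpu; (void)cluster;
    log_call('D');
}

/* Assumed ordering: announce, optionally do cluster teardown, finish. */
static void sketch_power_down(unsigned int cpu, unsigned int cluster, bool last_man)
{
    call_log[0] = '\0';
    mock_cpu_going_down(cpu, cluster);
    if (last_man && mock_outbound_enter_critical(cpu, cluster)) {
        /* Cluster-level teardown would go here: flush L1 and L2,
         * disable CCI snooping for the cluster. */
        mock_outbound_leave_critical(cluster, 0 /* placeholder state */);
    } else {
        /* CPU-level teardown only: flush L1, leave coherency. */
    }
    mock_cpu_down(cpu, cluster);
}
```

The point of the sketch is the bracketing: the enter/leave pair surrounds only the cluster-level work, and a failed enter falls back to the ordinary per-CPU path.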
  • 13. Managing Cluster Start-Up
    ● When powering up, the “First Man” must:
      ● invalidate cluster-level (L2) cache (if needed),
      ● enable CCI snooping for the cluster,
      ● resume execution of the kernel.
    ● Other CPUs must:
      ● wait until the first man has set up the cluster,
      ● resume execution of the kernel.
    The kernel deals with local CPU setup.
  • 14. Choosing the First Man
    ● Lightweight mutual exclusion using “vlocks”
    ● A CPU “votes” for itself by storing its ID to a common location:
      STR cpu_id, [ballot_box]
    ● Memory atomicity ensures a single winner.
    ● The winner sets up the cluster.
    [Flowchart nodes: power-on; election in progress; election started?; submit vote; election finished?; did I win?; set up cluster (yes); wait for winner to set up cluster (no); boot or resume OS.]
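The voting idea can be sketched in C as follows. This is a simplified model, not the code from the tree referenced in this deck: `vlock_trylock()`, `vlock_unlock()`, `currently_voting` and `last_vote` are names assumed for illustration, and C11 sequentially consistent atomics stand in for the plain stores and barriers that early boot code would use.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NO_VOTE (-1)

/* Per-CPU "I am in the middle of voting" flags (zero-initialized). */
static atomic_int currently_voting[NR_CPUS];
/* The ballot box: ID of the most recent voter, or NO_VOTE. */
static atomic_int last_vote = NO_VOTE;

/* Returns true if this_cpu won the election and must set up the cluster. */
static bool vlock_trylock(int this_cpu)
{
    /* Signal our intention to vote. */
    atomic_store(&currently_voting[this_cpu], 1);

    if (atomic_load(&last_vote) != NO_VOTE) {
        /* Someone has already voted: the election is under way. */
        atomic_store(&currently_voting[this_cpu], 0);
        return false;
    }

    /* Vote for ourselves with a single plain store. */
    atomic_store(&last_vote, this_cpu);
    atomic_store(&currently_voting[this_cpu], 0);

    /* Wait until every CPU that started voting has finished. */
    for (int i = 0; i < NR_CPUS; i++)
        while (atomic_load(&currently_voting[i]) != 0)
            ; /* spin */

    /* Whichever ID survived in the ballot box is the single winner. */
    return atomic_load(&last_vote) == this_cpu;
}

/* Release the lock so that a later election can take place. */
static void vlock_unlock(void)
{
    atomic_store(&last_vote, NO_VOTE);
}
```

Because each store to the ballot box is a single-word write, concurrent voters overwrite one another and exactly one ID remains once everyone has finished voting. No read-modify-write instruction is needed, which matters here because exclusive loads and stores are not usable before caches and coherency are set up.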
  • 15. Kernel API
    A convenient interface is provided to hide hardware specifics from the kernel.
    ● Make given CPU in given cluster runnable:
      bL_cpu_power_up(int cpu, int cluster)
    ● Power the calling CPU down:
      bL_cpu_power_down(void)
    ● For self housekeeping:
      bL_cpu_powered_up(void)
  • 16. Targeted Users
    ● the in-kernel switcher module (IKS)
    ● the cpuidle driver
    ● CPU hotplug
    ● secondary CPU booting.
  • 17. Code Availability
    ● http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/bL_cluster_pm
    ● example implementation for ARM Fast Model
    ● Still validating on ARM TC2 hardware.
    ● Should be headed upstream soon...