4. 4
Symptom II
• The system does not crash, but one or more CPUs
are locked up in kernel mode (CPU usage is very
high, and users get no response even from the
keyboard).
• The NMI watchdog prints a kernel message such as
[ 816.032003] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [mdadm:2126]
• Other programs are starved of CPU time, which
leads to odd secondary problems.
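A minimal sketch for pulling the key fields out of such a message when scanning a saved console log. The function name parse_soft_lockup is hypothetical, and the pattern assumes the exact message layout shown above.

```shell
# Hypothetical helper: reads log lines on stdin and prints
# "<cpu> <seconds> <task> <pid>" for every soft lockup report.
# Assumes the message format quoted on this slide.
parse_soft_lockup() {
    sed -n 's/.*BUG: soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s! \[\(.*\):\([0-9][0-9]*\)]$/\1 \2 \3 \4/p'
}
```

Possible usage: dmesg | parse_soft_lockup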
6. 6
Process scheduling
• Time-sharing system: the kernel switches from one
process to another within a very short time frame.
nice()/setpriority()
• Scheduling policy
sched_setscheduler()
• Process switch
schedule()
• Timer interrupt/Process preemption
(TIF_NEED_RESCHED thread flag)
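As a rough userspace illustration of process priority, the nice command changes the niceness of the command it starts (the same value that nice()/setpriority() adjust). The sketch below assumes a nice utility that prints the current niceness when called without a command, as coreutils does; the variable names are illustrative only.

```shell
# Print the niceness of this shell, then of a child started with +5.
base=$(nice)              # niceness of the current shell
child=$(nice -n 5 nice)   # inner "nice" reports its own niceness
echo "parent niceness: $base, child niceness: $child"
```

Raising the niceness (lowering the priority) needs no special privileges; lowering it usually requires root.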
7. 7
Improper kernel programming
• Causes the kernel to loop in kernel mode for more than
20 seconds with process preemption disabled, so no other
task gets a chance to run, aka softlockup.
e.g. holding a spin_lock for too long.
• Causes the CPU to loop in kernel mode for more than
10 seconds without letting other interrupts have a
chance to run, aka hardlockup.
e.g. keeping local interrupts disabled for too long.
8. 8
Softlockup and hardlockup detector
(nmi_watchdog)
• A periodic hrtimer runs to generate interrupts and kick the
watchdog task. An NMI perf event is generated every
"watchdog_thresh" (compile-time initialized to 10 and configurable
through sysctl of the same name) seconds to check for
hardlockups. If any CPU in the system does not receive any
hrtimer interrupt during that time the 'hardlockup detector' (the
handler for the NMI perf event) will generate a kernel warning or
call panic, depending on the configuration.
• The watchdog task is a high priority kernel thread that updates a
timestamp every time it is scheduled. If that timestamp is not
updated for 2*watchdog_thresh seconds (the softlockup
threshold) the 'softlockup detector' (coded inside the hrtimer
callback function) will dump useful debug information to the
system log, after which it will call panic if it was instructed to do
so or resume execution of other kernel code.
• Code: /usr/src/linux/kernel/watchdog.c
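The relationship between the two thresholds can be sketched in a few lines of shell. This is only an illustration of the arithmetic described above, not the logic in kernel/watchdog.c; the function names lockup_thresholds and is_soft_lockup are hypothetical.

```shell
# Hypothetical helper: derive both thresholds from watchdog_thresh.
lockup_thresholds() {
    thresh=$1
    echo "hardlockup=${thresh}s softlockup=$((2 * thresh))s"
}

# Hypothetical helper: exit 0 when the watchdog timestamp has not been
# updated for more than 2*thresh seconds (all arguments in seconds).
is_soft_lockup() {
    now=$1; last_touch=$2; thresh=$3
    [ $((now - last_touch)) -gt $((2 * thresh)) ]
}
```

Possible usage on a live system: lockup_thresholds "$(cat /proc/sys/kernel/watchdog_thresh)"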
12. 12
Bug 1049126
• Cluster md: cluster node hangs after complain leaving
the lockspace group
• https://bugzilla.suse.com/show_bug.cgi?id=1049126
• Test script:
13. 13
Setup System Kdump
• Edit /boot/grub2/grub.cfg and add crashkernel= to the kernel command line
linux /boot/vmlinuz-4.4.70-2-default root=UUID=83f8e4f0-e145-4019-8834-ae5a4fe1b64e
resume=/dev/vda1 splash=silent quiet showopts crashkernel=117M,high
• Enable kdump service
# systemctl enable kdump.service
• Reboot the system for the change to take effect
# reboot
• Test if kdump works
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
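Before triggering a test crash, it can help to confirm that the crashkernel reservation actually reached the running kernel. A minimal sketch, with a hypothetical function name, that extracts the value from a kernel command line string:

```shell
# Hypothetical helper: print the crashkernel= value found in the given
# kernel command line, or return 1 if the parameter is absent.
crashkernel_param() {
    for word in $1; do
        case "$word" in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

Possible usage after the reboot: crashkernel_param "$(cat /proc/cmdline)"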
14. 14
Setup Serial Console
• Edit /boot/grub2/grub.cfg and add console= to the kernel command line
linux /boot/vmlinuz-4.4.70-2-default root=UUID=83f8e4f0-e145-4019-8834-ae5a4fe1b64e
resume=/dev/vda1 splash=silent quiet showopts crashkernel=117M,high console=tty0
console=ttyS0,115200
• Add a serial device to the VM
# virt-manager
• Reboot the system for the change to take effect
# reboot
• Log in to the VM through the serial port
# virsh console 'VM domain'
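To check which consoles the kernel was booted with, the console= entries can be listed from the command line. A small sketch; the function name console_params is hypothetical.

```shell
# Hypothetical helper: print every console= value from the given
# kernel command line, one per line, in the order they appear.
console_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^console=//p'
}
```

Possible usage inside the VM: console_params "$(cat /proc/cmdline)"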
15. 15
Run the test script
• The system hangs on one node
• Check the output on the serial console
[ 784.032004] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mdadm:2126]
[ 816.032003] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [mdadm:2126]
[ 844.032003] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mdadm:2126]
… …
• Reboot the system and set the kernel to panic on a
soft lockup
# echo 1 > /proc/sys/kernel/softlockup_panic
• Run the test script to reproduce the problem
• Capture the kernel dump file
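The softlockup_panic step can be wrapped in a small guarded helper so that a missing or read-only knob is reported instead of failing silently. This is a sketch only; the function name is hypothetical, and the optional path argument exists purely so the helper can be tried against an ordinary file.

```shell
# Hypothetical helper: write 1 to the softlockup_panic knob and print
# the value read back. Defaults to the real procfs path.
enable_softlockup_panic() {
    knob=${1:-/proc/sys/kernel/softlockup_panic}
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (run as root?)" >&2
        return 1
    fi
    echo 1 > "$knob"
    cat "$knob"
}
```

The setting does not survive a reboot, so it has to be applied again before each reproduction run.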