Analyzing Linux kernel crash dumps
Marian Marinov <mm@1h.com>
hackman @ irc.freenode.net
https://github.com/hackman
Who am I?
What will I cover?
➢ How to gather crash data
➢ How to analyze crash dumps
Ways to gather crash data
➢ Serial console, netconsole
➢ Kmsg dumpers: ramoops, mtdoops
➢ Kdump: core dump of the whole kernel
➢ Pstore: persistent store filesystem
➢ NVRAM: Non-Volatile RAM (in progress)
➢ MCE: hardware errors
Gather the OOPS
➢ Serial console
➢ it is not widespread
➢ it is limited to a few meters from the machine
➢ Netconsole
➢ allows for sending oopses over the network
➢ if compiled as a module, allows reconfiguration
➢ relies on UDP
➢ if the network is broken or the network stack is the
one experiencing issues - IT DOES NOT WORK :)
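A minimal sketch of how the netconsole module parameter can be put together. The layout (source port, source IP, device, then target port, target IP, target MAC) follows the kernel's netconsole documentation; the helper name netconsole_param and all addresses are made up for illustration.

```sh
# Build the value for the netconsole= module parameter.
# Layout: <src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example with hypothetical addresses, run as root on the monitored machine:
#   modprobe netconsole "netconsole=$(netconsole_param 6665 192.168.1.10 eth0 \
#       6666 192.168.1.20 00:11:22:33:44:55)"
# On the receiving host any UDP listener bound to the target port will do.
```

Because the transport is plain UDP, nothing confirms delivery; it is worth checking that test messages arrive before relying on it.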
Pstore
➢ Pstore: persistent store filesystem
➢ Relies on APEI or UEFI
➢ ACPI Platform Error Interface (APEI)
➢ Provides a generic filesystem layer over lower-level persistent storage
➢ Relies on platform drivers
➢ Available since 2010
# dmesg|grep persistent
pstore: Registered erst as persistent store backend
# mount -t pstore none /sys/fs/pstore
Pstore
# ls -l /sys/fs/pstore
total 0
-r--r--r--. 1 root root 1016 May 13 07:46 dmesg-efi-1
-r--r--r--. 1 root root 1012 May 13 07:46 dmesg-efi-10
-r--r--r--. 1 root root 948 May 13 07:46 dmesg-efi-11
-r--r--r--. 1 root root 943 May 13 07:46 dmesg-efi-2
-r--r--r--. 1 root root 677 May 13 07:46 dmesg-efi-3
-r--r--r--. 1 root root 993 May 13 07:46 dmesg-efi-4
-r--r--r--. 1 root root 1010 May 13 07:46 dmesg-efi-5
-r--r--r--. 1 root root 999 May 13 07:46 dmesg-efi-6
-r--r--r--. 1 root root 976 May 13 07:46 dmesg-efi-7
-r--r--r--. 1 root root 1006 May 13 07:46 dmesg-efi-8
-r--r--r--. 1 root root 949 May 13 07:46 dmesg-efi-9
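Records in pstore occupy scarce firmware storage, so it can be useful to copy them elsewhere soon after boot. The following is only a sketch: the function name save_pstore and the archive path are assumptions, not part of any tool.

```sh
# Copy every record from a pstore mount point into an archive directory.
# Usage: save_pstore /sys/fs/pstore /var/log/pstore-archive
save_pstore() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue      # nothing to do if the directory is empty
        cp -p "$f" "$dst/" || return 1
    done
}
```

Once a record has been saved, removing the file from the pstore mount lets the backend reclaim the space for the next crash.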
Pstore
# cat /sys/fs/pstore/dmesg-efi-4
Panic#2 Part4
<1>[ 306.271891] IP: [<ffffffff813ba3e6>] sysrq_handle_crash+0x16/0x20
<4>[ 306.271917] PGD 80a98c067 PUD 807e8e067 PMD 0
<4>[ 306.271937] Oops: 0002 [#1] SMP
<4>[ 306.271952] Modules linked in:
tcp_lp rfcomm fuse xt_CHECKSUM nf_conntrack_netbios_ns
nf_conntrack_broadcast ipt_MASQUERADE ........
(sysrq_handle_crash, on the IP: line, is the function that triggered the crash)
Kmsg dumpers
➢ ramoops
➢ utilizes the pstore for storing oopses and panics
➢ since 2011
➢ mtdoops
➢ utilizes Memory Technology Devices (MTD) found on some SoCs
➢ available since 2007
➢ NVRAM
➢ still not widely available
➢ MCE - mainly EDAC
➢ Error Detection And Correction
Kdump
➢ No dependencies, theoretically ideal, but...
➢ Based on kexec
➢ Not all architectures support kexec
➢ Not easy to set up
➢ Boots a second kernel to retrieve the crash vmcore
➢ Almost useless in cases of HW failure
➢ Needs assistance of other tools for analysis
Kdump
➢ A second kernel needs to be started when
crashing
➢ Not all drivers work fine in the second kernel
➢ Very limited memory for the second kernel
➢ We need to construct a new initrd for the
second kernel
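Before relying on kdump it helps to verify the two preconditions mentioned above: memory reserved for the second kernel and a crash kernel actually loaded. This is a sketch under assumptions: the helper names are invented, and the kexec command line in the comment uses placeholder paths and options taken from common kdump setups rather than from these slides.

```sh
# Succeed if the kernel command line (default: /proc/cmdline) reserves
# memory for the second kernel with a crashkernel= option.
has_crashkernel() {
    grep -Eq '(^|[[:space:]])crashkernel=' "${1:-/proc/cmdline}"
}

# Succeed if a crash kernel is currently loaded.
# /sys/kernel/kexec_crash_loaded contains 1 when one is loaded.
crash_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}

# Loading the second kernel by hand looks roughly like this (placeholder paths):
#   kexec -p /boot/vmlinuz-kdump --initrd=/boot/initrd-kdump.img \
#         --append="root=/dev/sda1 irqpoll maxcpus=1 reset_devices"
```

Both helpers accept an alternative file path so they can be exercised without touching the running system.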
Analyzing the crashed kernel
general protection fault: 0000 [#2] SMP
Modules linked in: module list here
CPU: 4 PID: 6839 Comm: iceweasel Tainted: G D 3.16-2-amd64 #1 Debian 3.16.3-2
Hardware name: Gigabyte 990FXA-UD5, BIOS FB 01/23/2013
task: ffff88009c063370 ti: ffff8801f7c94000 task.ti: ffff8801f7c94000
RIP: 0010:[<ffffffff811bcd08>] [<ffffffff811bcd08>] __d_lookup_rcu+0xc8/0x160
RSP: 0018:ffff8801f7c97cb0 EFLAGS: 00010212
RAX: 0000000000000015 RBX: ffff8800984a2b60 RCX: 000000000000000c
RDX: ffff0800984a2b90 RSI: ffff8801f7c97e10 RDI: 6461657262757065
RBP: ffff8800984a2cd8 R08: ffff88009c19308c R09: ffff88009c19308c
R10: 0000000000000015 R11: ffffffffffffffff R12: ffff8800984a2b58
R13: 00000015067b0bda R14: ffff8801f7c97e10 R15: ffff8801f7c97d0c
FS: 00007f4f52f7d740(0000) GS:ffff88023fd00000(0000) knlGS:00000000f55ffb40
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4f37200018 CR3: 0000000206380000 CR4: 00000000000007e0
Analyzing the crashed kernel
Stack:
ffff88009c063370 ffff8801f4e60c10 ffff88009c063370 ffff8801f7c97d78
ffff8801f7c97d68 0000000000000041 ffff8802356b80a0 ffff8800984a2cd8
ffff8801f7c97e00 ffffffff811aedde 02ffff8000000000 0000000200000000
Call Trace:
[<ffffffff811aedde>] ? lookup_fast+0x3e/0x2b0
[<ffffffff811b0865>] ? path_lookupat+0x155/0x780
[<ffffffffa0536c8a>] ? jfs_readdir+0x1ba/0xf90 [jfs]
[<ffffffff811b0eb6>] ? filename_lookup+0x26/0xc0
[<ffffffff811b4fa4>] ? user_path_at_empty+0x54/0x90
[<ffffffff810e908e>] ? from_kgid_munged+0xe/0x20
[<ffffffff811a9f0a>] ? cp_new_stat+0x13a/0x160
[<ffffffff811a9ab6>] ? vfs_fstatat+0x46/0x90
[<ffffffff811a9f4a>] ? SYSC_newstat+0x1a/0x40
[<ffffffff8150c26d>] ? system_call_fast_compare_end+0x10/0x15
Code: 6b 18 75 cf 41 89 07 4d 89 c8 48 8b 53 20 44 89 d0 eb 12 48 39 fe 75 bb 48 83
c2 08 49 83 c0 08 83 e8 08 74 26 49 8b 38 83 f8 07 <48> 8b 32 77 e3 8d 0c c5 00 00
00 00 4c 89 d8 48 31 fe 48 d3 e0
RIP [<ffffffff811bcd08>] __d_lookup_rcu+0xc8/0x160
RSP <ffff8801f7c97cb0>
---[ end trace d7e9304af4a09ee6 ]---
Analyzing the crashed kernel
➢ Try using ksymoops on the collected
oops/panic
➢ System.map - kernel function addresses
➢ /proc/kallsyms (formerly /proc/ksyms) - list of kernel symbols
➢ /proc/kcore - the system memory
➢ vmlinux - the uncompressed kernel, can be
disassembled using objdump
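As a small illustration of what System.map is good for, the sketch below finds the symbol that contains a given address. It assumes the usual System.map layout (address, type, name), an address-sorted file, and fixed-width lowercase hex addresses; the function name sym_for_addr is hypothetical.

```sh
# Print the symbol from a System.map-style file that contains the address.
# Fixed-width lowercase hex strings sort the same way as the numbers they
# represent, so a plain string comparison is enough.
sym_for_addr() {
    map=$1
    addr=$2
    LC_ALL=C awk -v a="$addr" '
        ($1 "") <= (a "") { sym = $3 }   # last symbol starting at or below addr
        END { if (sym != "") print sym }
    ' "$map"
}

# Example:
#   sym_for_addr /boot/System.map-$(uname -r) ffffffff811bcd08
# With a vmlinux that carries debug info, addr2line gives file and line too:
#   addr2line -f -e vmlinux ffffffff811bcd08
```

The map must come from exactly the kernel build that crashed, otherwise the addresses will not line up.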
Analyzing the crashed kernel
# gdb namei.o
(gdb) list *(lookup_fast+0x3e)
0x48fe is in lookup_fast (fs/namei.c:1551).
1546 * going to fall back to non-racy lookup.
1547 */
1548 if (nd->flags & LOOKUP_RCU) {
1549 unsigned seq;
1550 bool negative;
1551 dentry = __d_lookup_rcu(parent, &nd->last, &seq);
1552 if (unlikely(!dentry)) {
1553 if (unlazy_walk(nd, NULL, 0))
1554 return -ECHILD;
1555 return 0;
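The gdb lookup above needs the function+offset part of a call trace line. A possible way to pull it out of saved oops text is sketched here; the function name sym_offset is an assumption, and the pattern only covers the trace format shown in these slides.

```sh
# Read call trace lines on stdin and print "function+0xoffset" for each,
# dropping the address, the "?" marker, the function size and the module name.
sym_offset() {
    sed -n -E 's/.*[[:space:]]([A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+)\/0x[0-9a-f]+.*/\1/p'
}

# Example: feed the result to gdb in batch mode (object file path as above):
#   echo '[<ffffffff811aedde>] ? lookup_fast+0x3e/0x2b0' | sym_offset
#   gdb -batch -ex 'list *(lookup_fast+0x3e)' namei.o
```

The object file has to be built with debug information for gdb to map the offset back to a source line.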
Analyzing the crashed kernel
# cd /usr/src/kernels/KERNEL
# grep -r lookup_fast
.....
fs/namei.c:static int lookup_fast(struct nameidata *nd,
.....
Using the crash utility
➢ Download and build the latest version
➢ https://github.com/crash-utility/crash
➢ Run it on the same architecture
➢ mixing 32-bit and 64-bit architectures is not supported
➢ Check out the help information
➢ http://people.redhat.com/anderson/help.html
Using the crash utility
➢ Most used commands:
➢ bt - backtrace
➢ log - print the kernel buffer
➢ ps - list all processes
➢ files - list all file descriptors related to task/PID
➢ whatis - gives you data or type information
Analyzing the crashed kernel
➢ Usually you would manually examine the crash
➢ But a small automation may be nice:
# cat extract-basic-info
bt
log
ps
exit
# crash vmlinux vmcore < extract-basic-info > report
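The same idea can be extended to a whole directory of dumps. This is a sketch, assuming the crash binary is in PATH and that every vmcore found was produced by the same kernel build; the function names and the report file naming are invented for the example.

```sh
# Write the command file used above.
write_crash_cmds() {
    printf '%s\n' bt log ps exit > "$1"
}

# Run the command file against every vmcore below a directory and store
# one report next to each dump.
report_all() {
    vmlinux=$1
    dumpdir=$2
    cmds=$3
    find "$dumpdir" -type f -name vmcore | while IFS= read -r core; do
        crash "$vmlinux" "$core" < "$cmds" > "$core.report" 2>&1
    done
}

# Example (placeholder paths):
#   write_crash_cmds /tmp/extract-basic-info
#   report_all /usr/lib/debug/vmlinux /var/crash /tmp/extract-basic-info
```

Keeping the trailing exit in the command file matters, since without it crash would wait for more input.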
Analyzing the crashed kernel
RIP: 0010:[<ffffffff9bd376d0>] [<ffffffff9bd376d0>] __list_del_entry+0x0/0xb0
RSP: 0018:ffff88002a4e3d20 EFLAGS: 00010006
RAX: dead000000000100 RBX: dead000000000100 RCX: 0000000000000001
RDX: 0000000000000101 RSI: 0000000000000001 RDI: dead000000000100
RBP: ffff88006a6e6028 R08: 0000000000000101 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88006a6e6008
R13: 0000000000000246 R14: deacffffffffff18 R15: ffff880036a22098
FS: 00007f2970ff9700(0000) GS:ffff88006fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efdc457c000 CR3: 0000000069a27000 CR4: 00000000000006f0
Analyzing the crashed kernel
➢ Code Segment(CS): 0010
➢ if the rightmost digit is even (low two bits are 0, i.e. ring 0)
➢ kernel space
➢ if it is odd
➢ user space
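The even/odd rule works because the two lowest bits of the CS selector hold the privilege level. A tiny sketch of that check, with a made-up helper name:

```sh
# Print the privilege level (ring) encoded in a CS value from an oops.
# The argument is the hex selector without the 0x prefix, e.g. 0010.
cs_ring() {
    echo $(( 0x$1 & 3 ))
}

# Example:
#   cs_ring 0010    # selector from the register dump above
```

A result of 0 means the CPU was executing kernel code when it faulted; 3 means user space.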
Analyzing the crashed kernel
[4005105.249407] Kernel panic - not syncing: Hard LOCKUP
[4005105.249409] CPU: 16 PID: 18891 Comm: kworker/u96:0 Tainted: G O 4.4.14-clouder2 #1
[4005105.249411] Workqueue: ipoib_wq ipoib_mcast_join_task [ib_ipoib]
[4005105.249412] 0000000000000000 ffff883fff285b10 ffffffff812f4269 ffffffff81a05545
[4005105.249414] ffff883fff285ba0 ffff883fff285b90 ffffffff8112738d ffffffff00000008
[4005105.249416] ffff883fff285ba0 ffff883fff285b38 0000000000000000 0000000000000046
[4005105.249417] Call Trace:
[4005105.249418] <NMI> [<ffffffff812f4269>] dump_stack+0x67/0x9e
[4005105.249422] [<ffffffff8112738d>] panic+0xc5/0x20b
[4005105.249424] [<ffffffff810e1dcd>] watchdog_overflow_callback+0xdd/0xe0
[4005105.249426] [<ffffffff8111f5f8>] __perf_event_overflow+0x88/0x250
[4005105.249427] [<ffffffff81120174>] perf_event_overflow+0x14/0x20
[4005105.249429] [<ffffffff8101e228>] intel_pmu_handle_irq+0x1c8/0x430
[4005105.249432] [<ffffffff81165bc6>] ? vunmap_page_range+0x1a6/0x310
[4005105.249434] [<ffffffff81165d41>] ? unmap_kernel_range_noflush+0x11/0x20
[4005105.249436] [<ffffffff81382ab8>] ? ghes_copy_tofrom_phys+0x118/0x1e0
[4005105.249437] [<ffffffff81034cff>] ? native_apic_wait_icr_idle+0x1f/0x30
[4005105.249439] [<ffffffff8100a275>] ? arch_irq_work_raise+0x35/0x40
[4005105.249441] [<ffffffff81016b48>] perf_event_nmi_handler+0x28/0x50
[4005105.249443] [<ffffffff81008efd>] nmi_handle+0x6d/0x140
[4005105.249445] [<ffffffff81009480>] default_do_nmi+0x40/0x100
[4005105.249446] [<ffffffff81009641>] do_nmi+0x101/0x150
[4005105.249447] [<ffffffff81616687>] end_repeat_nmi+0x1a/0x1e
[4005105.249450] [<ffffffffa02dd7fc>] ? ipoib_mcast_join_task+0x14c/0x330 [ib_ipoib]
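Entries marked with "?" are addresses that were found on the stack but could not be confirmed as part of the call chain, so they may be stale. When reading a long trace it can help to hide them first. A sketch, with a hypothetical function name:

```sh
# Read a call trace on stdin and drop the frames marked with "?".
reliable_frames() {
    grep -v -F '] ? '
}

# Example:
#   dmesg | reliable_frames
```

In the trace above this leaves the NMI watchdog path (do_nmi up to panic), while the interrupted ipoib_mcast_join_task frame is itself marked "?", so the unfiltered trace is still needed to see what the CPU was doing.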
Analyzing the crashed kernel
[4005105.249450] [<ffffffffa02dd7fc>] ? ipoib_mcast_join_task+0x14c/0x330 [ib_ipoib]
crash> list *(ipoib_mcast_join_task+0x14c)
list: invalid argument: *(ipoib_mcast_join_task+0x14c)
crash>
# grep -r ipoib_mcast_join_task
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:void ipoib_mcast_join_task(struct work_struct *work)
Analyzing the crashed kernel
# grep -r ipoib_mcast_join_task
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:void ipoib_mcast_join_task(struct work_struct *work)
# gdb drivers/infiniband/ulp/ipoib/ipoib_multicast.o
(gdb) list *(ipoib_mcast_join_task+0x14c)
0xffc is in ipoib_mcast_join_task (drivers/infiniband/ulp/ipoib/ipoib_multicast.c:641)
636 }
637 } else if (!delay_until ||
638 time_before(mcast->delay_until, delay_until))
639 delay_until = mcast->delay_until;
640 }
641 }
642
643 mcast = NULL;
644 ipoib_dbg_mcast(priv, "successfully started all multicast joins\n");
645
Analyzing the crashed kernel
# crash vmlinux vmcore
crash> ps
PID PPID CPU TASK ST %MEM VSZ RSS COMM
....
5093 1 4 ffff880856d30c80 IN 0.0 36276 6340 hald
....
crash> whatis ffff880856d30c80
unsigned long
crash> p ffff880856d30c80
$1 = 18446612168130628736
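When the ps output has been saved to a report file, the task address for a PID can be looked up outside of crash. This sketch assumes the column layout shown above (PID, PPID, CPU, TASK, ...) and that active tasks are prefixed with a ">" marker; the function name is hypothetical.

```sh
# Print the task_struct address for a PID from saved crash "ps" output.
# Usage: task_for_pid report-file pid
task_for_pid() {
    awk -v pid="$2" '
        $1 == ">" { $1 = ""; $0 = $0 }   # drop the active-task marker, re-split
        $1 == pid { print $4 }
    ' "$1"
}

# Example:
#   task_for_pid report 5093
```

The address it prints is the one to pass to crash commands that accept a task pointer.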
Analyzing the crashed kernel
crash> task 5093
PID: 5093 TASK: ffff880856d30c80 CPU: 4 COMMAND: "hald"
struct task_struct {
state = 1,
stack = 0xffff8808429b0000,
flags = 4211008,
ptrace = 0,
real_cred = 0xffff880845a2ec00,
cred = 0xffff880845a2ec00,
on_cpu = 0,
prio = 120,
static_prio = 120,
normal_prio = 120,
Analyzing the crashed kernel
crash> whatis task_struct.cred
struct task_struct {
[1456] const struct cred *cred;
}
crash> whatis struct cred
struct cred {
kuid_t uid;
kuid_t suid;
kuid_t euid;
kuid_t fsuid;
kernel_cap_t cap_inheritable;
kernel_cap_t cap_permitted;
kernel_cap_t cap_effective;
struct user_struct *user;
struct user_namespace *user_ns;
struct group_info *group_info;
Analyzing the crashed kernel
crash> struct cred 0xffff880845a2ec00
struct cred {
usage = {
counter = 48
},
uid = {
val = 1849
},
gid = {
val = 1845
},
suid = {
val = 1849
},
sgid = {
val = 1845
},
Using the crash utility
➢ Most used commands:
➢ sys - show the system information stored in the
crash
➢ ipcs - show the shared memory segments
➢ vm - examine the virtual memory in the crash dump
➢ dev - list all devices