Debugging Linux

Published in: Technology

  1. Andrea Righi - andrea@betterlinux.com: Debugging techniques in the Linux kernel
  2. Agenda
     ● Overview (kernel programming)
     ● Kernel crash taxonomy
     ● Debugging techniques
     ● Example(s)
     ● Q/A
  3. What's a kernel?
     ● The kernel provides an abstraction layer that lets applications use the physical hardware resources
     ● Basic kernel facilities
       ● Process management
       ● Memory management
       ● Device management
       ● System call interface
  4. User space
     ● Good for debugging (gdb)
     ● Lots of user-space libraries available
     ● Unpredictable latency (context switches, scheduler, syscalls, ...)
     ● Overhead
     ● Cannot fully interact with interrupt routines
     ● Cannot access certain memory addresses
     ● Harder to share certain features with other drivers
     ● Reliability: user processes can be terminated on critical system events (OOM, filesystem errors, etc.)
  5. Kernel space
     ● Written in C and assembly
     ● No ordinary debugger; the alternatives are kgdb, UML, ...
     ● Bugs can hang the entire system
     ● User memory is swappable; kernel memory can't be swapped out
     ● Kernel stack size is small (8K / 4K, see THREAD_SIZE_ORDER)
     ● Floating point is forbidden
     ● User-space libraries are not available
     ● The Linux kernel must be portable (important if you plan to contribute to mainline)
     ● Closed-source kernel modules taint the kernel
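The 8K / 4K stack sizes on the kernel-space slide come from simple arithmetic: on x86 the kernel headers define THREAD_SIZE as PAGE_SIZE shifted left by THREAD_SIZE_ORDER. The following is a minimal user-space sketch of that calculation, not kernel code; the function name `thread_size` and its parameters are made up for illustration, and the real values are fixed by the architecture and kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space illustration only: mirrors the arithmetic
 * THREAD_SIZE = PAGE_SIZE << THREAD_SIZE_ORDER.
 * thread_size() is a hypothetical helper, not a kernel API.
 */
size_t thread_size(size_t page_size, unsigned int order)
{
        /* Keep the sketch within a sane range of inputs. */
        assert(page_size > 0 && order < 8);
        return page_size << order;
}
```

With 4096-byte pages, an order of 1 gives the 8K stack (two pages) mentioned in the slides and an order of 0 gives the 4K variant.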
  6. Example kernel module

     #include <linux/init.h>
     #include <linux/module.h>

     /* Module constructor */
     static int __init hello_init(void)
     {
             printk(KERN_INFO "Hello, world!\n");
             return 0;
     }

     /* Module destructor */
     static void __exit hello_exit(void)
     {
             printk(KERN_INFO "Goodbye\n");
     }

     module_init(hello_init);
     module_exit(hello_exit);

     MODULE_LICENSE("GPL");
     MODULE_AUTHOR("Andrea Righi <andrea@betterlinux.com>");
     MODULE_DESCRIPTION("BetterEmbedded hello world example");
  7. Kernel problems
     ● Kernel panic (fatal error for the system)
     ● Kernel oops (non-fatal error)
     ● Wrong result (fatal from the user's perspective)
  8. Kernel panic
     ● No recovery is possible
     ● Example: an exception in an atomic context (e.g., in an interrupt handler)
     ● Typically results in a system reboot (after N seconds with the panic=N boot option), a blinking LED, or just a hang
  9. Kernel panic (example log)

     [ 165.552280] general protection fault: 0000 [#1] PREEMPT SMP
     [ 165.553055] Modules linked in: crashtest(O) [last unloaded: crashtest]
     [ 165.553092] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.10.0-rc7+ #535
     [ 165.553092] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
     [ 165.553092] task: ffff88003d90a2c0 ti: ffff88003d92e000 task.ti: ffff88003d92e000
     [ 165.553092] RIP: 0010:[<ffffffff811ab0e5>] [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
     [ 165.553092] RSP: 0018:ffff88003e003988 EFLAGS: 00010206
     [ 165.553092] RAX: 0000000000000000 RBX: ffff88003e1d6a20 RCX: 00000000000be841
     [ 165.553092] RDX: 00000000000be801 RSI: 0000000000000000 RDI: 0000000000000001
     [ 165.553092] RBP: ffff88003e0039c8 R08: 00000000001d6a20 R09: 0000000000000000
     [ 165.553092] R10: 0000000000000000 R11: 0000000000000001 R12: 7878787878787878
     [ 165.553092] R13: 0000000000010220 R14: 0000000000000240 R15: ffff88003d801780
     [ 165.553092] FS: 0000000000000000(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
     [ 165.553092] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
     [ 165.553092] CR2: 00000000081ab008 CR3: 0000000037dc8000 CR4: 00000000000006e0
     [ 165.553092] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     [ 165.553092] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
     [ 165.553092] Stack:
     [ 165.553092] 00000000000be801 ffff88003d92ffd8 ffffffff8161683d ffff880034e3f300
     [ 165.553092] ffff88003e003a17 0000000000000020 0000000000000240 0000000000000000
     [ 165.553092] ffff88003e003a00 ffffffff8161433c ffff880034e3f300 0000000000000020
     ... ... ...
  10. Kernel panic (example log, continued)

      ...
      [ 165.553092] Call Trace:
      [ 165.553092] <IRQ>
      [ 165.553092] [<ffffffff8161683d>] ? __alloc_skb+0x7d/0x290
      [ 165.553092] [<ffffffff8161433c>] __kmalloc_reserve.isra.52+0x3c/0xa0
      [ 165.553092] [<ffffffff8161683d>] __alloc_skb+0x7d/0x290
      [ 165.553092] [<ffffffff81677e5b>] tcp_send_ack+0x3b/0xf0
      [ 165.553092] [<ffffffff8166ab1e>] __tcp_ack_snd_check+0x5e/0xa0
      [ 165.553092] [<ffffffff81671c64>] tcp_rcv_established+0x204/0x6f0
      [ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
      [ 165.553092] [<ffffffff8167c681>] tcp_v4_do_rcv+0x161/0x360
      [ 165.553092] [<ffffffff816fea39>] ? _raw_spin_lock_nested+0x79/0x90
      [ 165.553092] [<ffffffff8167dc91>] tcp_v4_rcv+0x731/0x980
      [ 165.553092] [<ffffffff810e706f>] ? __lock_is_held+0x5f/0x80
      [ 165.553092] [<ffffffff816563d8>] ip_local_deliver_finish+0xc8/0x2f0
      [ 165.553092] [<ffffffff8165635a>] ? ip_local_deliver_finish+0x4a/0x2f0
      [ 165.553092] [<ffffffff81656e77>] ip_local_deliver+0x47/0x80
      [ 165.553092] [<ffffffff81656740>] ip_rcv_finish+0x140/0x5e0
      [ 165.553092] [<ffffffff816570e3>] ip_rcv+0x233/0x380
      [ 165.553092] [<ffffffff81626062>] __netif_receive_skb_core+0x6a2/0x970
      [ 165.553092] [<ffffffff81625a10>] ? __netif_receive_skb_core+0x50/0x970
      [ 165.553092] [<ffffffff81626351>] __netif_receive_skb+0x21/0x70
      [ 165.553092] [<ffffffff81626563>] netif_receive_skb+0x23/0x1f0
      [ 165.553092] [<ffffffff81627448>] napi_gro_receive+0x98/0xd0
      [ 165.553092] [<ffffffff81565c5a>] e1000_clean_rx_irq+0x18a/0x520
      [ 165.553092] [<ffffffff81567451>] e1000_clean+0x251/0x910
      [ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
      [ 165.553092] [<ffffffff810e6df4>] ? lock_release_holdtime.part.27+0xd4/0x160
      [ 165.553092] [<ffffffff81627015>] net_rx_action+0xd5/0x2e0
      [ 165.553092] [<ffffffff81088d17>] __do_softirq+0xf7/0x420
      [ 165.553092] [<ffffffff810891d5>] irq_exit+0xb5/0xc0
      [ 165.553092] [<ffffffff81709303>] do_IRQ+0x63/0xd0
      [ 165.553092] Code: c8 48 8b 55 c0 48 8b 81 38 e0 ff ff a8 08 0f 85 5f 01 00 00 4c 8b 23 4d 85 e4 0f 84 15 01 00 00 49 63 47 20 48 8d 4a 40 4d 8b 07 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 97 49 63
      [ 165.553092] RIP [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
      [ 165.553092] RSP <ffff88003e003988>
      [ 165.553092] ---[ end trace baac76a23c6da73c ]---
      [ 165.553092] Kernel panic - not syncing: Fatal exception in interrupt
  11. Kernel oops
      ● A message is written to the log when a recoverable error has occurred in kernel space
      ● Example: access to a bad address (e.g., a NULL pointer dereference)
      ● An oops does not mean the system has crashed
      ● The current process is killed
      ● The oops message is displayed along with a register dump and a stack trace
  12. Kernel oops (example log)

      [ 75.962412] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 75.963046] IP: [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] PGD 3a78d067 PUD 362be067 PMD 0
      [ 75.963046] Oops: 0002 [#1] PREEMPT SMP
      [ 75.963046] Modules linked in: crashtest(O)
      [ 75.963046] CPU: 0 PID: 1587 Comm: bash Tainted: G O 3.10.0-rc7+ #535
      [ 75.963046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 75.963046] task: ffff88003a7ec580 ti: ffff8800362f6000 task.ti: ffff8800362f6000
      [ 75.963046] RIP: 0010:[<ffffffffa00003c6>] [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] RSP: 0018:ffff8800362f7e78 EFLAGS: 00010297
      [ 75.963046] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000004e
      [ 75.963046] RDX: 0000000000000000 RSI: ffffffffa0000469 RDI: ffff8800362f7eaa
      [ 75.963046] RBP: ffff8800362f7ee0 R08: 0000000000000000 R09: 0000000000000000
      [ 75.963046] R10: ffff88003a7ec580 R11: 0000000000000000 R12: 0000000000000003
      [ 75.963046] R13: 000000000000000a R14: ffff8800362f7f50 R15: 0000000000000000
      [ 75.963046] FS: 0000000000000000(0000) GS:ffff88003de00000(0063) knlGS:00000000f75f76c0
      [ 75.963046] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
      [ 75.963046] CR2: 0000000000000000 CR3: 0000000036209000 CR4: 00000000000006f0
      [ 75.963046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 75.963046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 75.963046] Stack:
      [ 75.963046] ffffffff811b66cb 0000000000000000 0000000000000000 ffff88003a7ec580
      [ 75.963046] ffff8800362f7ec8 4f49545045435845 000000000000004e 0000000000000000
      [ 75.963046] 0000000000000000 00000000463b9fa0 ffff8800362fd300 000000000000000a
      [ 75.963046] Call Trace:
      [ 75.963046] [<ffffffff811b66cb>] ? vfs_write+0x1bb/0x1f0
      [ 75.963046] [<ffffffff8121a86d>] proc_reg_write+0x3d/0x80
      [ 75.963046] [<ffffffff811b65d8>] vfs_write+0xc8/0x1f0
      [ 75.963046] [<ffffffff811b6ad5>] SyS_write+0x55/0xa0
      [ 75.963046] [<ffffffff81708ce5>] sysenter_dispatch+0x7/0x1f
      [ 75.963046] [<ffffffff813c50ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 75.963046] Code: e1 f3 6f e1 48 c7 c7 60 09 00 a0 e8 d5 f3 6f e1 e9 e2 fd ff ff c7 45 d0 78 56 34 12 e9 d6 fd ff ff e8 bf fc ff ff e9 cc fd ff ff <c7> 04 25 00 00 00 00 00 00 00 00 e9 bc fd ff ff eb fe 66 c7 07
      [ 75.963046] RIP [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] RSP <ffff8800362f7e78>
      [ 75.963046] CR2: 0000000000000000
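Entries such as procfs_write+0x2d6/0x320 in the oops above follow the pattern symbol+offset/size: the faulting address lies 0x2d6 bytes into procfs_write, a function that is 0x320 bytes long. As a rough user-space sketch of how such an entry can be split into its parts (the struct and function names here are hypothetical helpers for illustration, not part of any kernel tool):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One decoded "symbol+0xoffset/0xsize" entry from an oops or call trace. */
struct trace_sym {
        char name[64];
        unsigned long offset; /* bytes from the start of the function */
        unsigned long size;   /* total size of the function in bytes */
};

/* Returns 0 on success, -1 if the text does not have the expected shape. */
int parse_trace_sym(const char *text, struct trace_sym *out)
{
        assert(text != NULL && out != NULL);
        memset(out, 0, sizeof(*out));
        /* Symbol name up to '+', then two hexadecimal numbers. */
        if (sscanf(text, "%63[^+]+0x%lx/0x%lx",
                   out->name, &out->offset, &out->size) != 3)
                return -1;
        return 0;
}
```

The offset is the number to feed to a disassembler or to gdb (for example `list *(procfs_write+0x2d6)` on a module built with debug information) to find the offending source line.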
  13. Taxonomy of kernel faults
      ● panic("have a nice day... ;-)")
      ● BUG() / BUG_ON(condition)
      ● exception (e.g., invalid opcode, division by zero, ...)
      ● memory corruption
        ● stack overflow/underflow
          – NOTE: in kernel space the stack size is limited to 2 pages (8K on almost all architectures)
        ● write after free
        ● write to a bad address
        ● concurrent access without protection (locks, etc.)
      ● soft lockup
        ● code hogs a CPU without giving other tasks a chance to run
      ● hard lockup
        ● code hogs a CPU without giving other tasks or interrupts a chance to run
      ● hung task: a task stays blocked in uninterruptible sleep for more than N seconds
      ● scheduling while atomic
      ● deadlock
      ● use of FPU registers in kernel space
  14. Useful kernel debugging options
      ● Under the "Kernel hacking" menu:
        ● CONFIG_KALLSYMS_ALL: print function names instead of addresses in kernel messages
        ● CONFIG_FRAME_POINTER: get useful stack info in case of kernel bugs
        ● CONFIG_DEBUG_ATOMIC_SLEEP: check for sleeping inside atomic sections (e.g., sleeping in an interrupt handler, sleeping while holding a spinlock, etc.)
        ● CONFIG_LOCKUP_DETECTOR: detect hard and soft lockups
        ● CONFIG_LOCKDEP: lock dependency engine (deadlock detection)
        ● CONFIG_DYNAMIC_FTRACE: enable individual function tracing dynamically (via debugfs, /sys/kernel/debug/tracing)
  15. Debugging techniques
      ● blinking LED
      ● printk()
      ● procfs
      ● SysRq key (Documentation/sysrq.txt)
      ● function instrumentation (kprobes)
      ● dynamic ftrace (CONFIG_DYNAMIC_FTRACE)
      ● debugger (kgdb)
  16. printk()
      ● Advantages
        ● easy to use
        ● needs no other system support
      ● Disadvantages
        ● you have to modify and rebuild the kernel/modules
        ● no interactive debugging
  17. printk(): levels
      ● printk levels
        ● KERN_EMERG: system is unusable
        ● KERN_ALERT: action must be taken immediately
        ● KERN_CRIT: critical condition
        ● KERN_ERR: error condition
        ● KERN_WARNING: warning condition
        ● KERN_NOTICE: normal but significant condition
        ● KERN_INFO: informational
        ● KERN_DEBUG: debug message
      ● Show kernel messages:
          # dmesg
      ● Redirect all kernel messages to the console:
          # echo 8 > /proc/sys/kernel/printk
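Why does writing 8 send everything to the console? The printk levels map to the numbers 0 (KERN_EMERG) through 7 (KERN_DEBUG), and a message reaches the console only when its level is numerically lower than the console log level, which is the first value in /proc/sys/kernel/printk. The sketch below models that rule in user space; the KLOG_* names and the `shown_on_console` function are invented for this illustration and are not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric values behind the KERN_* prefixes (0 is the most severe).
 * The KLOG_* names are hypothetical stand-ins for this sketch. */
enum klog_level {
        KLOG_EMERG,   /* 0 */
        KLOG_ALERT,   /* 1 */
        KLOG_CRIT,    /* 2 */
        KLOG_ERR,     /* 3 */
        KLOG_WARNING, /* 4 */
        KLOG_NOTICE,  /* 5 */
        KLOG_INFO,    /* 6 */
        KLOG_DEBUG    /* 7 */
};

/* A message is printed on the console only if its level is
 * numerically lower than the current console log level. */
bool shown_on_console(int msg_level, int console_loglevel)
{
        assert(msg_level >= KLOG_EMERG && msg_level <= KLOG_DEBUG);
        return msg_level < console_loglevel;
}
```

With a console log level of 8 even KERN_DEBUG (7) passes the check; with a level of 4, only messages more severe than KERN_WARNING do. Messages that fail the check are still stored in the kernel log and remain visible through dmesg.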
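  18. procfs

      #include <linux/module.h>
      #include <linux/proc_fs.h>
      #include <linux/seq_file.h>

      /* Called through seq_read() to produce the file contents */
      static int procfs_read(struct seq_file *m, void *v) { ... }

      static ssize_t procfs_write(struct file *file, const char __user *ubuf,
                                  size_t count, loff_t *pos) { ... }

      static int procfs_open(struct inode *inode, struct file *file)
      {
              return single_open(file, procfs_read, NULL);
      }

      static int procfs_release(struct inode *inode, struct file *file)
      {
              /* single_open() must be paired with single_release() */
              return single_release(inode, file);
      }

      /* file_operations matches the 3.10-era API used in these slides;
       * since Linux 5.6 proc_create() takes a struct proc_ops instead */
      static const struct file_operations procfs_fops = {
              .open = procfs_open,
              .read = seq_read,
              .write = procfs_write,
              .llseek = seq_lseek,
              .release = procfs_release,
      };

      static int __init myproc_init(void)
      {
              if (!proc_create("myproc", 0666, NULL, &procfs_fops))
                      return -ENOMEM;
              return 0;
      }

      static void __exit myproc_exit(void)
      {
              remove_proc_entry("myproc", NULL);
      }

      module_init(myproc_init);
      module_exit(myproc_exit);
      MODULE_LICENSE("GPL");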
  19. Kprobes (Kernel probes)
      ● Kprobes let you dynamically break into almost any kernel routine and collect debugging and performance information (CONFIG_KPROBES=y)
      ● You can trap at almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit
      ● How does it work?
        ● Kprobes makes a copy of the probed instruction and replaces the original instruction with a breakpoint instruction (int3 on x86)
        ● When the breakpoint is hit, a trap occurs, the CPU registers are saved, and control passes to the Kprobes pre-handler
        ● The saved instruction is executed in single-step mode
        ● The Kprobes post-handler is executed
        ● The rest of the original function is executed
  20. Kprobes (example)

      #include <linux/kprobes.h>
      #include <linux/module.h>

      /* Pre-handler: runs just before the probed instruction executes */
      static int my_handler(struct kprobe *p, struct pt_regs *regs)
      {
              /* Do something here... */
              return 0;
      }

      static struct kprobe my_kp = {
              .pre_handler = my_handler,
              .symbol_name = "schedule_timeout",
      };

      static int __init my_kprobe_init(void)
      {
              int ret;

              ret = register_kprobe(&my_kp);
              if (ret < 0) {
                      printk(KERN_INFO "%s: error %d\n", __func__, ret);
                      return ret;
              }
              return 0;
      }

      static void __exit my_kprobe_exit(void)
      {
              unregister_kprobe(&my_kp);
      }

      module_init(my_kprobe_init);
      module_exit(my_kprobe_exit);
      MODULE_LICENSE("GPL");
  21. Dump a stack trace

      #include <linux/kprobes.h>
      #include <linux/module.h>
      #include <linux/sched.h>

      static const char function_name[] = "schedule_timeout";

      /* Pre-handler: dump the caller's stack each time the probe fires */
      static int my_handler(struct kprobe *p, struct pt_regs *regs)
      {
              dump_stack();
              /* On x86-64 the first function argument is passed in %rdi */
              printk(KERN_INFO "%s called %s(%d)\n",
                     current->comm, function_name, (int)regs->di);
              return 0;
      }

      static struct kprobe my_kp = {
              .pre_handler = my_handler,
              .symbol_name = function_name,
      };

      static int __init my_kprobe_init(void)
      {
              int ret;

              ret = register_kprobe(&my_kp);
              if (ret < 0) {
                      printk(KERN_INFO "%s: error %d\n", __func__, ret);
                      return ret;
              }
              return 0;
      }

      static void __exit my_kprobe_exit(void)
      {
              unregister_kprobe(&my_kp);
      }

      module_init(my_kprobe_init);
      module_exit(my_kprobe_exit);
      MODULE_LICENSE("GPL");
  22. Dynamic ftrace

      # mount -t debugfs none /sys/kernel/debug
      # cd /sys/kernel/debug/tracing
      # echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter
      # echo function > current_tracer
      # echo 1 > tracing_on
      # usleep 1
      # echo 0 > tracing_on
      # cat trace
      # tracer: function
      #
      # entries-in-buffer/entries-written: 5/5   #P:4
      #
      #                              _-----=> irqs-off
      #                             / _----=> need-resched
      #                            | / _---=> hardirq/softirq
      #                            || / _--=> preempt-depth
      #                            ||| /     delay
      #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
      #              | |       |   ||||       |         |
                usleep-2665  [001] ....  4186.475355: sys_nanosleep <-system_call_fastpath
                <idle>-0     [001] d.h1  4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt
                usleep-2665  [001] d.h1  4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
                <idle>-0     [003] d.h1  4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
                <idle>-0     [002] d.h1  4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt
  23. KGDB + QEMU
      ● Setting up kgdb using kvm/qemu

      $ kvm -m 1024 -smp 4 -drive file=debian-6-i386.img -vnc :1 \
            -redir tcp:5190:10.0.2.15:22 \
            -kernel /src/linux/arch/x86/boot/bzImage \
            -append "root=/dev/sda1 kgdbwait kgdboc=ttyS0" \
            -serial pty
      char device redirected to /dev/pts/3 (label serial0)
      $ gdb vmlinux
      (gdb) target remote /dev/pts/3
  24. Debugging workqueues
      ● workqueue: asynchronous process execution context
      ● kworkers are going crazy (using too much CPU)?
        ● Something is being scheduled in rapid succession
        ● A single work item consumes a lot of CPU cycles
      ● How to debug?
        ● kernel tracepoints:
          – echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
        ● kworker stack trace:
          – cat /proc/THE_OFFENDING_KWORKER/stack

      root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
      root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
      root 5673 0.0 0.0 0 0 ? S 12:12 0:00 [kworker/0:0]
      root 5674 0.0 0.0 0 0 ? S 12:13 0:00 [kworker/1:0]
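Once the workqueue_queue_work tracepoint is enabled, the trace buffer records one line per queued work item, and the interesting part is which work function keeps being queued. The following user-space sketch pulls that name out of a trace line. It assumes the line carries a "function=NAME" field, which is an assumption about the tracepoint's text format and may differ between kernel versions; `work_function` is a hypothetical helper, not an existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the "function=" field of a trace line into out.
 * The field name is an assumption about the tracepoint's text format.
 * Returns 0 on success, -1 if the field is missing or does not fit.
 */
int work_function(const char *line, char *out, size_t out_len)
{
        static const char key[] = "function=";
        const char *start;
        size_t len;

        assert(line != NULL && out != NULL && out_len > 0);

        start = strstr(line, key);
        if (start == NULL)
                return -1;
        start += sizeof(key) - 1;        /* skip past "function=" */

        len = strcspn(start, " \t\n");   /* value ends at whitespace */
        if (len == 0 || len >= out_len)
                return -1;

        memcpy(out, start, len);
        out[len] = '\0';
        return 0;
}
```

Feeding it the lines read from /sys/kernel/debug/tracing/trace_pipe and counting how often each name appears would point at the work item that is being queued in rapid succession.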
  25. References
      ● J. Corbet, A. Rubini, G. Kroah-Hartman: Linux Device Drivers, 3rd Edition
      ● Linux documentation
        ● http://lxr.linux.no/linux/Documentation/trace
        ● http://lxr.linux.no/linux/Documentation/kprobes.txt
      ● Linux Weekly News: http://lwn.net
  26. Q/A
      ● You're very welcome!
      ● Twitter
        ● @arighi
        ● #bem2013
