The true story_of_hello_world
 


The story behind Long Lele's hello world


Usage Rights

© All Rights Reserved

    Presentation Transcript

    • Preview: The True Story of Hello World? Are you kidding? Web Cookie: Google Native Client, native code for web apps. Native Cookie: Structure and Interpretation of Computer Programs
    • Related topics 1. Loader and Linker 2. ELF 3. LLVM 4. Library 5. More behind the scenes
    • Wrenches: readelf, objdump, nm, file, strings, strace, strip, ld. NASA did this after Bush cancelled the back-to-the-moon plan
    • news updated
    • You can interrupt me at any time (rather than waiting for the Q&A section); this makes it easier to refer back to the context and avoids repetition during Q&A. I am still learning, so please correct any mistakes. Please read the notes that accompany the slides before the lecture; if you have not, and you have a computer at hand, you can follow along during the talk.
    • The True Story of Hello World (or at least a good part of it). Original author: Antônio Augusto M. Fröhlich (original story; zh_CN translation)
    • Off-topic: about the author. Prof. Dr. Antônio Augusto Fröhlich, LISHA (Software/Hardware Integration Lab), Federal University of Santa Catarina. Courses taught: Object-Oriented Programming, System Programming, Operating Systems, Computational Biology (parallel programming)
    • Hello World in C
      #include <stdio.h>                 /* header file directive */
      int main(int argc, char *argv[])   /* the so-called entry point */
      {
          printf("Hello world!\n");      /* C library call; printf is buffered */
          return 0;                      /* tell the OS that everything is OK */
      }
    • Let's compile, link, and run it as a beginner
      # compile: -c produces an object file (Optimization in GCC and here)
      % gcc -Os -c hello.c
      # link
      % ld --dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtn.o -lc hello.o -o hello
      # run
      % ./hello
    • How about LLVM?
      # compile the C file into a native executable:
      % llvm-gcc hello.c -o hello
      # compile the C file into an LLVM bitcode file:
      % llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc
      # run the program using the just-in-time compiler:
      % lli hello.bc
      Links: LLVM Getting Started; Writing your own toy compiler; zh_CN: Writing your own compiler with Flex, Bison and LLVM
    • % objdump -hrt hello.o
      hello.o:     file format elf32-i386
      Sections:
      Idx Name             Size      VMA       LMA       File off  Algn
        0 .text            00000027  00000000  00000000  00000034  2**2
                           CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
        1 .data            00000000  00000000  00000000  0000005c  2**2
                           CONTENTS, ALLOC, LOAD, DATA
        2 .bss             00000000  00000000  00000000  0000005c  2**2
                           ALLOC
        3 .rodata.str1.1   0000000e  00000000  00000000  0000005c  2**0
                           CONTENTS, ALLOC, LOAD, READONLY, DATA
        4 .comment         00000024  00000000  00000000  0000006a  2**0
                           CONTENTS, READONLY
        5 .note.GNU-stack  00000000  00000000  00000000  0000008e  2**0
                           CONTENTS, READONLY
      SYMBOL TABLE:
      00000000 l    df *ABS*            00000000 hello.c
      00000000 l    d  .text            00000000 .text
      00000000 l    d  .data            00000000 .data
      00000000 l    d  .bss             00000000 .bss
      00000000 l    d  .rodata.str1.1   00000000 .rodata.str1.1
      00000000 l    d  .note.GNU-stack  00000000 .note.GNU-stack
      00000000 l    d  .comment         00000000 .comment
      00000000 g     F .text            00000027 main
      00000000         *UND*            00000000 __printf_chk
      RELOCATION RECORDS FOR [.text]:
      OFFSET   TYPE        VALUE
      00000012 R_386_32    .rodata.str1.1
      00000019 R_386_PC32  __printf_chk
    • % readelf -l hello
      Elf file type is EXEC (Executable file)
      Entry point 0x80482e0
      There are 7 program headers, starting at offset 52
      Program Headers:
        Type       Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
        PHDR       0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
        INTERP     0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
            [Requesting program interpreter: /lib/ld-linux.so.2]
        LOAD       0x000000 0x08048000 0x08048000 0x003d6 0x003d6 R E 0x1000
        LOAD       0x0003d8 0x080493d8 0x080493d8 0x000e8 0x000e8 RW  0x1000
        DYNAMIC    0x0003d8 0x080493d8 0x080493d8 0x000c8 0x000c8 RW  0x4
        NOTE       0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
        GNU_STACK  0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
      Section to Segment mapping:
        Segment Sections...
         00
         01     .interp
         02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata
         03     .dynamic .got .got.plt .data
         04     .dynamic
         05     .note.ABI-tag
         06
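A running program can inspect the same program headers that readelf -l prints. The sketch below is an illustration added here, not taken from the slides: it assumes Linux with a C library that provides getauxval() (glibc 2.16 or later, or musl), and the function name `count_segments` is hypothetical.

```c
#include <elf.h>       /* PT_LOAD, AT_PHDR, AT_PHNUM */
#include <link.h>      /* ElfW() picks Elf32_* or Elf64_* for this machine */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/* Counts the program headers of the given type (for example PT_LOAD)
 * in the running executable. Returns -1 if the kernel did not pass
 * the program header address in the auxiliary vector. */
int count_segments(unsigned int type)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);
    int count = 0;

    if (phdr == NULL)
        return -1;

    for (unsigned long i = 0; i < phnum; i++) {
        if (phdr[i].p_type == type)
            count++;
    }
    return count;
}
```

For the hello binary in the listing above, a call such as count_segments(PT_LOAD) would correspond to the two LOAD rows; other toolchains may lay out a different number of loadable segments.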
    • A more detailed explanation
    • An example of an LKM with various ELF sections
    • Executable and Linking Format (ELF)
      % file hello
      ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped
      Other platforms:
      1. AMD-64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), not stripped
      2. RISC_6000: executable (RISC System/6000 V3.1) or obj module not stripped
      3. UltraSPARC: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
      4. Loongson: ELF 32-bit LSB executable, MIPS, MIPS-I version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), not stripped
      The file format data above was collected from unix-center.net
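The first words of each file(1) result ("ELF 32-bit LSB", "ELF 64-bit LSB", "ELF 32-bit MSB") come from the first bytes of the ELF header, the e_ident array. As a rough sketch of that lookup, assuming the constants from the Linux <elf.h> header (the function name `describe_elf_ident` is hypothetical, and this is not how file(1) itself is implemented):

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Writes a summary such as "ELF 32-bit LSB" into out.
 * Returns 0 on success, -1 if the bytes are not a recognizable ELF ident. */
int describe_elf_ident(const unsigned char *ident, size_t len,
                       char *out, size_t outlen)
{
    const char *bits;
    const char *order;

    /* Magic number: 0x7f 'E' 'L' 'F' */
    if (len < EI_NIDENT || memcmp(ident, ELFMAG, SELFMAG) != 0)
        return -1;

    switch (ident[EI_CLASS]) {          /* word size */
    case ELFCLASS32: bits = "32-bit"; break;
    case ELFCLASS64: bits = "64-bit"; break;
    default:         return -1;
    }

    switch (ident[EI_DATA]) {           /* byte order */
    case ELFDATA2LSB: order = "LSB"; break;
    case ELFDATA2MSB: order = "MSB"; break;
    default:          return -1;
    }

    snprintf(out, outlen, "ELF %s %s", bits, order);
    return 0;
}
```

To try it on a real file, read the first EI_NIDENT (16) bytes of the file into a buffer and pass that buffer in.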
    • Sections and Segments
    • Hello World in assembly
      % gcc -Os -S hello.c -o -
              .file   "hello.c"
              .section        .rodata.str1.1,"aMS",@progbits,1
      .LC0:
              .string "Hello world!\n"
              .text
      .globl main
              .type   main, @function
      main:
              leal    4(%esp), %ecx
              andl    $-16, %esp
              pushl   -4(%ecx)
              pushl   %ebp
              movl    %esp, %ebp
              pushl   %ecx
              subl    $12, %esp
              pushl   $.LC0
              pushl   $1
              call    __printf_chk
              movl    -4(%ebp), %ecx
              xorl    %eax, %eax
              leave
              leal    -4(%ecx), %esp
              ret
              .size   main, .-main
              .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
              .section        .note.GNU-stack,"",@progbits
    • Look into the contents
      % objdump -s hello
      Contents of section .interp:
       8048114 2f6c6962 2f6c642d 6c696e75 782e736f  /lib/ld-linux.so
       8048124 2e3200                               .2.
      Contents of section .dynstr:
       80481c0 006c6962 632e736f 2e36005f 494f5f73  .libc.so.6._IO_s
       80481d0 7464696e 5f757365 64005f5f 6c696263  tdin_used.__libc
       80481e0 5f737461 72745f6d 61696e00 5f5f7072  _start_main.__pr
       80481f0 696e7466 5f63686b 005f5f67 6d6f6e5f  intf_chk.__gmon_
       8048200 73746172 745f5f00 474c4942 435f322e  start__.GLIBC_2.
       8048210 3000474c 4942435f 322e332e 3400      0.GLIBC_2.3.4.
       .......
      Contents of section .rodata:
       80483c0 03000000 01000200 48656c6c 6f20776f  ........Hello wo
       80483d0 726c6421 0a00                        rld!..
      Contents of section .comment:
       0000 4743433a 20285562 756e7475 20342e34  GCC: (Ubuntu 4.4
       0010 2e332d34 7562756e 74753529 20342e34  .3-4ubuntu5) 4.4
       0020 2e3300                               .3.
    • Entry point
      % ld -o hello_ld hello.o -lc
      ld: warning: cannot find entry symbol _start; defaulting to 0000000008048074
      % ./hello_ld
      bash: ./hello_ld: No such file or directory
      % gcc -nostdlib -o hello-nostdlib hello.c
      /usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000080480b8
      /tmp/ccmKwryA.o: In function `main':
      hello.c:(.text+0x11): undefined reference to `puts'
      collect2: ld returned 1 exit status
      So what does ld forget here?
    • Difference between hello and hello-ld
      % nm hello
      080493d8 d _DYNAMIC
      080494a4 d _GLOBAL_OFFSET_TABLE_
      080483c4 R _IO_stdin_used
      080494c0 A __bss_start
      080494bc D __data_start
               w __gmon_start__
      0804837a T __i686.get_pc_thunk.bx
      080493d6 d __init_array_end
      080493d6 d __init_array_start
      08048310 T __libc_csu_fini
      08048320 T __libc_csu_init
               U __libc_start_main@@GLIBC_2.0
               U __printf_chk@@GLIBC_2.3.4
      080494c0 A _edata
      080494c0 A _end
      080483a8 T _fini
      080483c0 R _fp_hw
      08048278 T _init
      080482e0 T _start
      080494bc W data_start
      08048380 T main
      % nm hello-ld
      080491e4 d _DYNAMIC
      08049284 d _GLOBAL_OFFSET_TABLE_
      08049294 A __bss_start
               U __printf_chk@@GLIBC_2.3.4
      08049294 A _edata
      08049294 A _end
               U _start
      080481ac T main
    • % strace ./hello
      execve("./hello", ["./hello"], [/* 43 vars */]) = 0
      brk(0) = 0x848a000
      uname({sys="Linux", node="lele-laptop", ...}) = 0
      access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7757000
      access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
      open("/etc/ld.so.cache", O_RDONLY) = 3
      fstat64(3, {st_mode=S_IFREG|0644, st_size=170053, ...}) = 0
      mmap2(NULL, 170053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb772d000
      close(3) = 0
      access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
      read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000m\1\0004\0\0\0"..., 512) = 512
      fstat64(3, {st_mode=S_IFREG|0755, st_size=1405508, ...}) = 0
      mmap2(NULL, 1415592, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb84000
      mprotect(0xcd7000, 4096, PROT_NONE) = 0
      mmap2(0xcd8000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x153) = 0xcd8000
      mmap2(0xcdb000, 10664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xcdb000
      close(3) = 0
      mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb772c000
      set_thread_area({entry_number:-1 -> 6, base_addr:0xb772c6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      mprotect(0xcd8000, 8192, PROT_READ) = 0
      mprotect(0xea6000, 4096, PROT_READ) = 0
      munmap(0xb772d000, 170053) = 0
      fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
      mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7756000
      write(1, "Hello world!\n", 13) = 13
      exit_group(0) = ?
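One way to see that main is not the first code to run: the startup objects named in the earlier link command (crt1.o, crti.o, crtn.o) supply _start and the initialization hooks, and those hooks run before main is called. The sketch below relies on the GCC/Clang `constructor` attribute, which is a compiler extension rather than standard C; the names `early_hook` and `startup_code_ran` are hypothetical.

```c
/* Set by early_hook(); stays 0 if the startup code never ran the hook. */
static int ran_before_main = 0;

/* GCC/Clang extension: ask the toolchain to call this function
 * during program startup, before main() is entered. */
__attribute__((constructor))
static void early_hook(void)
{
    ran_before_main = 1;
}

/* Lets main() check whether the startup code already ran early_hook(). */
int startup_code_ran(void)
{
    return ran_before_main;
}
```

In a normally linked program, calling startup_code_ran() as the first statement of main should already report that the hook has run. A binary linked by hand without the startup objects has no _start to perform this work, which matches the ld warning on the slide.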
    • Layout of Hello World
    • Internal Symbols
      extern char etext, edata, end, __executable_start;   /* defined by the linker */
      printf("(end of text)etext:%p\n", &etext);
      printf("(end of data)edata:%p\n", &edata);
      printf("(end of segments)end:%p\n", &end);
      printf("(__executable_start):%p\n", &__executable_start);
      Output:
      (end of text)etext:0x80486b8
      (end of data)edata:0x804a028
      (end of segments)end:0x804c060
      (__executable_start):0x8048000
    • Child and Parent (wait4)
      % strace -e trace=process -f sh -c "hello" > /dev/null
      execve("/bin/sh", ["sh", "-c", "hello"], [/* 43 vars */]) = 0
      clone(Process 6067 attached
      child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb781a938) = 6067
      [pid 6066] wait4(-1, Process 6066 suspended
       <unfinished ...>
      [pid 6067] execve("/usr/bin/hello", ["hello"], [/* 43 vars */]) = 0
      [pid 6067] exit_group(0) = ?
      Process 6066 resumed
      Process 6067 detached
      <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6067
      --- SIGCHLD (Child exited) @ 0 (0) ---
      exit_group(0)
    • Can we do it without the C library?
      % more hello-nostd.c
      void _start()
      {
          /* exit system call: eax = 1 (exit), ebx = 0 (status) */
          asm("movl $1,%eax;"
              "xorl %ebx,%ebx;"
              "int $0x80");
      }
      int main(int argc, char *argv[])
      {
          char *str = "Hello world!\n";   /* never printed: _start exits without calling main */
          return 0;
      }
      % gcc -nostdlib hello-nostd.c -o hello-nostd
      asm(call main): http://blog.ksplice.com/2010/03/libc-free-world/
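The addresses printed on this slide follow the usual layout: text first, then initialized data, then the end of the image. A minimal sketch of checking that ordering from inside a program, assuming a toolchain whose linker defines the etext, edata and end symbols described in the end(3) manual page (the function name `layout_is_ordered` is hypothetical):

```c
#include <stdint.h>

/* Defined by the linker, not by any C source file; see end(3). */
extern char etext, edata, end;

/* Returns 1 if the symbols appear in the order
 * end of text <= end of initialized data <= end of the image. */
int layout_is_ordered(void)
{
    uintptr_t text_end = (uintptr_t)&etext;
    uintptr_t data_end = (uintptr_t)&edata;
    uintptr_t image_end = (uintptr_t)&end;

    return text_end <= data_end && data_end <= image_end;
}
```

Only the addresses of these symbols are meaningful; reading the char values stored at them is not.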
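What the shell does in this trace can be approximated in a few lines of C. This is a sketch assuming a POSIX system with /bin/sh, and the function name `run_and_wait` is hypothetical. With glibc on Linux, fork() and waitpid() typically show up in strace as the clone and wait4 calls seen above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c command` in a child process and returns its exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_and_wait(const char *command)
{
    pid_t pid = fork();               /* parent and child continue from here */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                   /* reached only if execl failed */
    }

    /* Parent: sleep until the child terminates, like wait4 in the trace. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The WIFEXITED and WEXITSTATUS macros are the same ones strace uses to decode the status in the "wait4 resumed" line.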
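The slide's program exits without ever printing. As a hedged sketch of the missing piece, the function below issues the write system call directly, with no stdio involved. Assumptions: the inline assembly targets x86-64 Linux (the slides use 32-bit int $0x80 instead); on other platforms the code falls back to the C library's generic syscall() wrapper, so that path is not libc-free. The name `raw_write` is hypothetical.

```c
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() for the fallback path */

/* Writes len bytes from buf to fd. On x86-64 Linux the kernel's raw result
 * is returned: the byte count, or a negative errno value on failure. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;
#if defined(__linux__) && defined(__x86_64__)
    /* x86-64 convention: rax = syscall number, rdi, rsi, rdx = arguments.
     * The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
#else
    /* Fallback: generic wrapper from the C library (returns -1 on error). */
    ret = syscall(SYS_write, fd, buf, len);
#endif
    return ret;
}
```

A call such as raw_write(1, "Hello world!\n", 13) sends the same 13 bytes to standard output that the write line in the strace listing shows. A truly libc-free program would also need its own _start, as on the slide.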
    • Freak out 1. gcc -v 2. ld --verbose 3. objdump -d (man objdump)
    • Web References 1. A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux 2. Hello from a libc-free world!(part one and part two) 3. bash: ./hello-ld: No such file or directory 4. Linkers and Loaders (Linux Journal) 5. Structure and Interpretation of Computer Programs 6. Linux Standard Base (LSB)
    • Books 1. Linkers and Loaders, John R. Levine, 2000 2. 程序员的自我修养: 链接、装载与库 (A Programmer's Self-Cultivation: Linking, Loading and Libraries), 俞甲子 / 石凡 / 潘爱民, 2009 3. Computer Systems: A Programmer's Perspective, Randal E. Bryant / David O'Hallaron, 2003