
Dynamic Linker


This presentation discusses the dynamic linker as implemented in Linux and its porting to the Solaris ARM platform. It starts from the very basics of the linking process.

Published in: Technology

Dynamic Linker

  1. 1. Dynamic Linker: An Overview. Sanjiv Malik
  2. 2. Type of Linking Static Linking: All symbols are resolved at link time. For example, gcc's -static flag links the program statically. Disadvantage: large program size. Advantage: fast processing. Dynamic Linking: Symbols are resolved when the program is executed. gcc links programs dynamically by default. The program size is small, at the cost of some runtime overhead for resolving symbols at load time or on first call. 2 2009/1/26
  3. 3. Example linking process m.c and a.c are each passed through the translators, producing the separately compiled relocatable object files m.o and a.o. The linker (ld) combines them into the executable object file p, which contains code and data for all functions defined in m.c and a.c.
  4. 4. What does a linker do? Merges object files: merges multiple relocatable (.o) object files into a single executable object file that can be loaded and executed by the loader. Resolves external references: as part of the merging process, resolves external references. An external reference is a reference to a symbol defined in another object file. Relocates symbols: relocates symbols from their relative locations in the .o files to new absolute positions in the executable, and updates all references to these symbols to reflect their new positions. References can be in either code or data. Code: a(); /* ref to symbol a */ Data: int *xp=&x; /* ref to symbol x */
  5. 5. Executable and linkable format (ELF) Standard binary format for object files. Derives from AT&T System V Unix; later adopted by BSD Unix variants and Linux. One unified format for relocatable object files (.o), executable object files, and shared object files (.so); generic name: ELF binaries. Better support for shared libraries than the older a.out format.
  6. 6. ELF object file format File layout, from offset 0: ELF header; program header table (required for executables); .text section; .data section; .bss section; .symtab; .rel.text; .debug; section header table (required for relocatables). ELF header: magic number, type (.o, exec, .so), machine, byte ordering, etc. Program header table: page size, virtual addresses for memory segments (sections), segment sizes. .text section: code. .data section: initialized (static) data. .bss section: uninitialized (static) data; the name stands for "Block Started by Symbol" (mnemonic: "Better Save Space"); it has a section header but occupies no space.
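The header fields listed above can be read directly from the first bytes of a file. The following is a minimal sketch, not taken from the slides: the function name elf_kind is hypothetical, and the offsets and constants (magic in bytes 0 to 3, EI_DATA at byte 5, e_type at byte 16) are those of the generic ELF header layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets and values from the generic ELF header layout. */
enum { EI_DATA = 5, ELFDATA2LSB = 1, ELFDATA2MSB = 2, E_TYPE_OFF = 16 };
enum { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };

/* Hypothetical helper: returns a label for the object type stored in buf,
 * or NULL if buf does not start with a valid ELF identification. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    if (len < (size_t)E_TYPE_OFF + 2)
        return NULL;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return NULL;                        /* bad magic number */

    uint16_t type;
    if (buf[EI_DATA] == ELFDATA2LSB)        /* little-endian file */
        type = (uint16_t)(buf[E_TYPE_OFF] | (buf[E_TYPE_OFF + 1] << 8));
    else if (buf[EI_DATA] == ELFDATA2MSB)   /* big-endian file */
        type = (uint16_t)((buf[E_TYPE_OFF] << 8) | buf[E_TYPE_OFF + 1]);
    else
        return NULL;                        /* unknown byte ordering */

    switch (type) {
    case ET_REL:  return "relocatable (.o)";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object (.so)";
    default:      return "other";
    }
}
```

Passing the first 18 bytes of m.o to elf_kind would be expected to yield the relocatable label, while the linked program p would be reported as an executable.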
  7. 7. ELF object file format (continued) .symtab section: symbol table; procedure and static variable names; section names and locations. .rel.text section: relocation info for the .text section; addresses of instructions that will need to be modified in the executable; instructions for modifying. .rel.data section: relocation info for the .data section; addresses of pointer data that will need to be modified in the merged executable. .debug section: info for symbolic debugging (gcc -g).
  8. 8. Example C program
      m.c:
          int e=7;
          int main() {
            int r = a();
            exit(0);
          }
      a.c:
          extern int e;
          int *ep=&e;
          int x=15;
          int y;
          int a() {
            return *ep+x+y;
          }
  9. 9. Merging .o files into an executable Relocatable object files: system code (.text) and system data (.data and .bss); m.o with main() (.text) and int e = 7 (.data); a.o with a() (.text), int *ep = &e and int x = 15 (.data), and int y (.bss). Executable object file, from offset 0: headers; .text holding system code, main(), a() and more system code; .data holding system data, int e = 7, int *ep = &e and int x = 15; .bss holding uninitialized data; .symtab; .debug.
  10. 10. Relocating symbols and resolving external references Symbols are lexical entities that name functions and variables. Each symbol has a value (typically a memory address). Code consists of symbol definitions and references. References can be either local or external. In m.c: int e=7; is the definition of local symbol e; the call a() is a reference to external symbol a; the call exit(0) is a reference to external symbol exit, which is defined in neither file. In a.c: &e is a reference to external symbol e; int *ep=&e; is the definition of local symbol ep; int x=15; and int y; are the definitions of local symbols x and y; int a() is the definition of local symbol a; return *ep+x+y; contains references to local symbols ep, x and y.
  11. 11. m.o relocation info (source: objdump)
      m.c:
          int e=7;
          int main() {
            int r = a();
            exit(0);
          }
      Disassembly of section .text:
      00000000 <main>:
         0: 55                 pushl %ebp
         1: 89 e5              movl  %esp,%ebp
         3: e8 fc ff ff ff     call  4 <main+0x4>
                               4: R_386_PC32 a
         8: 6a 00              pushl $0x0
         a: e8 fc ff ff ff     call  b <main+0xb>
                               b: R_386_PC32 exit
         f: 90                 nop
      Disassembly of section .data:
      00000000 <e>:
         0: 07 00 00 00
  12. 12. a.o relocation info (.text)
      a.c:
          extern int e;
          int *ep=&e;
          int x=15;
          int y;
          int a() {
            return *ep+x+y;
          }
      Disassembly of section .text:
      00000000 <a>:
         0: 55                 pushl %ebp
         1: 8b 15 00 00 00 00  movl  0x0,%edx
                               3: R_386_32 ep
         7: a1 00 00 00 00     movl  0x0,%eax
                               8: R_386_32 x
         c: 89 e5              movl  %esp,%ebp
         e: 03 02              addl  (%edx),%eax
        10: 89 ec              movl  %ebp,%esp
        12: 03 05 00 00 00 00  addl  0x0,%eax
                               14: R_386_32 y
        18: 5d                 popl  %ebp
        19: c3                 ret
  13. 13. a.o relocation info (.data)
      a.c:
          extern int e;
          int *ep=&e;
          int x=15;
          int y;
          int a() {
            return *ep+x+y;
          }
      Disassembly of section .data:
      00000000 <ep>:
         0: 00 00 00 00
                               0: R_386_32 e
      00000004 <x>:
         4: 0f 00 00 00
  14. 14. Executable after relocation and external reference resolution (.text)
      08048530 <main>:
       8048530: 55                 pushl %ebp
       8048531: 89 e5              movl  %esp,%ebp
       8048533: e8 08 00 00 00     call  8048540 <a>
       8048538: 6a 00              pushl $0x0
       804853a: e8 35 ff ff ff     call  8048474 <_init+0x94>
       804853f: 90                 nop
      08048540 <a>:
       8048540: 55                 pushl %ebp
       8048541: 8b 15 1c a0 04 08  movl  0x804a01c,%edx
       8048547: a1 20 a0 04 08     movl  0x804a020,%eax
       804854c: 89 e5              movl  %esp,%ebp
       804854e: 03 02              addl  (%edx),%eax
       8048550: 89 ec              movl  %ebp,%esp
       8048552: 03 05 d0 a3 04 08  addl  0x804a3d0,%eax
       8048558: 5d                 popl  %ebp
       8048559: c3                 ret
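The patched bytes in this listing can be recomputed by hand. The sketch below assumes the standard i386 relocation formulas (S + A for R_386_32, S + A - P for R_386_PC32, where S is the symbol address, A the addend already stored in the .o file, and P the address of the word being patched); the function names are hypothetical. For the call to a, S = 0x8048540, A = -4 (the bytes fc ff ff ff in m.o) and P = 0x8048534, which gives 0x8, matching e8 08 00 00 00.

```c
#include <stdint.h>

/* R_386_32: the patched word is S + A (symbol address plus addend). */
uint32_t reloc_386_32(uint32_t s, int32_t a)
{
    return s + (uint32_t)a;
}

/* R_386_PC32: the patched word is S + A - P, where P is the address of
 * the word being patched, so the result is relative to the call site. */
uint32_t reloc_386_pc32(uint32_t s, int32_t a, uint32_t p)
{
    return s + (uint32_t)a - p;
}
```

The same arithmetic reproduces the call to exit: reloc_386_pc32(0x8048474, -4, 0x804853b) yields 0xffffff35, which is stored little-endian as 35 ff ff ff.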
  15. 15. Executable after relocation and external reference resolution (.data)
      m.c:
          int e=7;
          int main() {
            int r = a();
            exit(0);
          }
      a.c:
          extern int e;
          int *ep=&e;
          int x=15;
          int y;
          int a() {
            return *ep+x+y;
          }
      Disassembly of section .data:
      0804a010 <__data_start>:
       804a010: 00 00 00 00
      0804a014 <p.2>:
       804a014: f8 a2 04 08
      0804a018 <e>:
       804a018: 07 00 00 00
      0804a01c <ep>:
       804a01c: 18 a0 04 08
      0804a020 <x>:
       804a020: 0f 00 00 00
  16. 16. Strong and weak symbols Program symbols are either strong or weak. Strong: procedures and initialized globals. Weak: uninitialized globals. p1.c: int foo=5; (strong) and p1() { } (strong). p2.c: int foo; (weak) and p2() { } (strong).
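The slide classifies symbols but does not spell out how a clash is settled. The sketch below assumes the conventional Unix linker rules (two strong definitions are an error, a strong definition beats weak ones, and among weak definitions any one may be kept); the types and the resolve function are hypothetical illustrations, not linker code.

```c
#include <stddef.h>

typedef enum { SYM_WEAK, SYM_STRONG } strength;

typedef struct {
    const char *file;   /* object file that defines the symbol */
    strength    kind;
} symdef;

/* Picks the definition to keep among n definitions of one symbol name.
 * Returns its index, or -1 if two strong definitions collide (or n is 0). */
int resolve(const symdef *defs, size_t n)
{
    int chosen = -1;
    for (size_t i = 0; i < n; i++) {
        if (defs[i].kind == SYM_STRONG) {
            if (chosen >= 0 && defs[chosen].kind == SYM_STRONG)
                return -1;        /* multiply defined strong symbol */
            chosen = (int)i;      /* a strong definition overrides weak ones */
        } else if (chosen < 0) {
            chosen = (int)i;      /* first weak one, kept unless overridden */
        }
    }
    return chosen;
}
```

For foo in the example, the strong definition in p1.c would be kept and the weak one in p2.c would refer to the same storage.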
  17. 17. Static libraries (archives) p1.c and p2.c are passed through the translator to produce p1.o and p2.o. libc.a is a static library (archive) of relocatable object files concatenated into one file. The linker (ld) combines p1.o, p2.o and libc.a into the executable object file p, which contains only the code and data for libc functions that are called from p1.c and p2.c. Static libraries further improve modularity and efficiency by packaging commonly used functions (e.g., C standard library, math library). The linker selectively includes only the .o files in the archive that are actually needed by the program.
  18. 18. Creating static libraries atoi.c, printf.c, random.c, ... are each passed through the translator to produce atoi.o, printf.o, random.o, ... The archiver (ar) combines them into the C standard library libc.a: ar rs libc.a atoi.o printf.o … random.o. The archiver allows incremental updates: recompile the function that changes and replace its .o file in the archive.
  19. 19. The complete picture m.c and a.c are passed through the translator to produce m.o and a.o. The static linker (ld) combines m.o, a.o and libwhatever.a into p. The loader/dynamic linker then turns p into the running image p’.
  20. 20. Dynamically linked shared libraries m.c and a.c are passed through the translators (cc1, as) to produce m.o and a.o, which the linker (ld) combines into p. Shared libraries of dynamically relocatable object files are supplied to the loader/dynamic linker: the functions called by m.c and a.c are loaded, linked, and (potentially) shared among processes.
  21. 21. Solaris specific options for generating static and dynamic executables
  22. 22. Linking process in Solaris In Solaris/OpenSolaris, the linking process is performed in two steps. Compile-time linking is done by the ld tool, called the link-editor. The link-editor, ld(1), concatenates and interprets data from one or more input files. These files can be relocatable objects, shared objects, or archive libraries. From these input files, one output file is created. This file is either a relocatable object, an executable application, or a shared object. The link-editor is most commonly invoked as part of the compilation environment.
  23. 23. Dynamic linker in Solaris The runtime linker processes dynamic executables and shared objects at runtime, binding the executable and shared objects together to create a runnable process. During the link-editing of a dynamic executable, a special .interp section, together with an associated program header, is created. This section contains a path name specifying the program's interpreter. The default name supplied by the link-editor is the name of the runtime linker: /usr/lib/ for a 32-bit executable and /usr/lib/64/ for a 64-bit executable. The dynamic linker is itself an ELF shared library. At program startup, the system maps the dynamic linker into a part of the address space and runs its bootstrap code.
  24. 24. Link Editor Functions The following is a summary of the link-editor's functions. The concatenation of sections with the same characteristics from the input relocatable objects to form new sections within the output file. The concatenated sections can in turn be associated with output segments. The processing of symbol table information from both relocatable objects and shared objects to verify and unite references with definitions. The generation of a new symbol table, or tables, within the output file. The processing of relocation information from the input relocatable objects, and the application of this information to the output file by updating other input sections. In addition, output relocation sections might be generated for use by the runtime linker. The generation of program headers that describe all the segments that are created. The generation of dynamic linking information sections if necessary, which provide information such as shared object dependencies and symbol bindings to the runtime linker.
  25. 25. Symbol processing by Link Editor During input file processing, all local symbols from the input relocatable objects are passed through to the output file image. All global symbols are accumulated internally within the link-editor. Each global symbol supplied by a relocatable object is searched for within this internal symbol table. If a symbol with the same name has already been encountered from a previous input file, a symbol resolution process is called. This symbol resolution process determines which of the two entries is kept. On completing input file processing, and providing no fatal symbol resolution errors have occurred, the link-editor determines if any unresolved symbol references remain. Unresolved symbol references can cause the link-edit to terminate. Finally, the link-editor's internal symbol table is added to the symbol tables of the image being created.
  26. 26. Example
      $ cat main.c
      extern int u_bar;
      extern int u_foo();
      int t_bar;
      int d_bar = 1;
      d_foo() { return (u_foo(u_bar, t_bar, d_bar)); }
      $ cc -o main.o -c main.c
      $ nm -x main.o
      [Index]  Value      Size       Type  Bind  Other Shndx   Name
      ...............
      [8]     |0x00000000|0x00000000|NOTY |GLOB |0x0  |UNDEF  |u_foo
      [9]     |0x00000000|0x00000040|FUNC |GLOB |0x0  |2      |d_foo
      [10]    |0x00000004|0x00000004|OBJT |GLOB |0x0  |COMMON |t_bar
      [11]    |0x00000000|0x00000000|NOTY |GLOB |0x0  |UNDEF  |u_bar
      [12]    |0x00000000|0x00000004|OBJT |GLOB |0x0  |3      |d_bar
  27. 27. ELF File processing Sections are the smallest indivisible units that can be processed within an ELF file. Segments are collections of sections and are the smallest individual units that can be mapped to a memory image by the dynamic linker.
  28. 28. Functions of the Dynamic Linker The runtime linker: analyzes the executable's dynamic information section (.dynamic) and determines what dependencies are required; locates and loads these dependencies, analyzing their dynamic information sections to determine if any additional dependencies are required; performs any necessary relocations to bind these objects in preparation for process execution; calls any initialization functions provided by the dependencies; passes control to the application; and can be called upon during the application's execution to perform any delayed function binding.
  29. 29. ELF Parsing by Dynamic Linker Executable object file for example program p, from offset 0: ELF header; program header table (required for executables); .text section; .data section; .bss section; .symtab; .rel.text; .dynamic; .debug; section header table (required for relocatables). Process image, by virtual address: 0x080483e0 init and shared lib segments; 0x08048494 .text segment (r/o); 0x0804a010 .data segment (initialized r/w); 0x0804a3b0 .bss segment (uninitialized r/w).
  30. 30. 1. Resolving the Dependencies When linking a dynamic executable, one or more shared objects are explicitly referenced. These objects are recorded as dependencies within the dynamic executable. The runtime linker uses this dependency information to locate and load the associated objects. These dependencies are processed in the same order as they were referenced during the link-edit of the executable. Once all the dynamic executable's dependencies are loaded, each dependency is inspected, in the order in which it was loaded, to locate any additional dependencies. This process continues until all dependencies are located and loaded. This technique results in a breadth-first ordering of all dependencies.
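The breadth-first ordering described above can be illustrated with a small simulation. This is only a sketch: objects are represented by indices, the adjacency matrix and the function name load_order are hypothetical, and scanning columns in index order stands in for the link-edit order of the dependencies.

```c
#include <stddef.h>

#define MAX_OBJS 16

/* deps[i][j] != 0 means object i lists object j as a dependency.
 * Writes the load order into order[] and returns the number loaded. */
size_t load_order(int deps[MAX_OBJS][MAX_OBJS], size_t n,
                  size_t root, size_t order[MAX_OBJS])
{
    int loaded[MAX_OBJS] = {0};
    size_t count = 0;

    order[count++] = root;
    loaded[root] = 1;

    /* order[] doubles as the queue: head walks over objects that are
     * already loaded, newly found dependencies are appended at the tail. */
    for (size_t head = 0; head < count; head++) {
        size_t obj = order[head];
        for (size_t j = 0; j < n; j++) {
            if (deps[obj][j] && !loaded[j]) {
                loaded[j] = 1;
                order[count++] = j;
            }
        }
    }
    return count;
}
```

With an executable 0 that depends on objects 1 and 2, and object 1 depending on object 3, the resulting order is 0, 1, 2, 3: both direct dependencies are loaded before the indirect one, whereas a depth-first walk would have loaded 3 before 2.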
  31. 31. 1. Resolving the Dependencies The Solaris runtime linker looks in two default locations for dependencies: /lib and /usr/lib. The dependencies of a dynamic executable or shared object can be displayed using ldd. For example, the file /usr/bin/cat has the following dependencies:
      $ ldd /usr/bin/cat
       => /lib/
       => /lib/
      The dependencies recorded in an object can be inspected using dump. Use this command to display the file's .dynamic section, and look for entries that have a NEEDED tag.
      $ dump -Lvp prog
      prog:
      [INDEX]  Tag      Value
      [1]      NEEDED
      [2]      NEEDED
      [3]      RUNPATH  /home/me/lib:/home/you/lib
      .........
  32. 32. 1. Resolving the Dependencies The dynamic segment (pointed to by the program header) in the ELF file contains a pointer to the file's string table (DT_STRTAB) as well as the DT_NEEDED entries, each of which contains the offset in the string table of the name of a required library. The dynamic linker creates a scope list for the executable, consisting of the libraries to be loaded. For each entry in the scope list, the linker searches for the file containing the library. Once the file is found, the linker reads the ELF header to find the program header, which points to the dynamic segment. The linker maps the library into the process address space. From the dynamic segment, it adds the library's symbol table to the chain of symbol tables, and if the library has further dependencies, it adds those libraries to the list to be loaded and the process continues. More precisely, the linker creates a struct link_map for each library and adds it to a global linked list.
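The relationship between DT_NEEDED entries and the string table can be sketched in a few lines. The structure below is a simplified stand-in for Elf32_Dyn, the function name needed_names is hypothetical, and the string table is passed in directly instead of being located through DT_STRTAB; only the tag values DT_NULL (0) and DT_NEEDED (1) come from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag values from the generic ELF specification. */
enum { DT_NULL = 0, DT_NEEDED = 1 };

/* Simplified stand-in for Elf32_Dyn: a tag and its value. */
typedef struct {
    int32_t  tag;
    uint32_t val;
} dyn_entry;

/* Stores pointers to the names of the needed libraries in out[] (at most
 * max of them) and returns how many were stored. strtab is the string
 * table that the DT_STRTAB entry would point to. */
size_t needed_names(const dyn_entry *dyn, const char *strtab,
                    const char **out, size_t max)
{
    size_t n = 0;
    for (; dyn->tag != DT_NULL; dyn++) {      /* DT_NULL ends the array */
        if (dyn->tag == DT_NEEDED && n < max)
            out[n++] = strtab + dyn->val;     /* val is a string table offset */
    }
    return n;
}
```

Each name found this way would then be searched for on disk, mapped, and scanned in turn for its own DT_NEEDED entries.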
  33. 33. Symbol table structure
      typedef struct {
          Elf32_Word    st_name;
          Elf32_Addr    st_value;
          Elf32_Word    st_size;
          unsigned char st_info;
          unsigned char st_other;
          Elf32_Half    st_shndx;
      } Elf32_Sym;
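The st_info byte packs two values, the symbol binding in the upper four bits and the symbol type in the lower four. The sketch below mirrors the ELF32_ST_BIND, ELF32_ST_TYPE and ELF32_ST_INFO macros of the ELF specification as plain functions; the function names are hypothetical, the constants are the specification's.

```c
#include <stdint.h>

/* Binding and type values from the generic ELF specification. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };

/* Packs a binding and a type into one st_info byte. */
uint8_t st_info_pack(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) + (type & 0xf));
}

/* Upper four bits: binding (local, global, weak). */
unsigned st_bind(uint8_t info) { return info >> 4; }

/* Lower four bits: type (no type, object, function, ...). */
unsigned st_type(uint8_t info) { return info & 0xf; }
```

A global function such as d_foo in the earlier nm listing (Type FUNC, Bind GLOB) would carry st_info 0x12 under this encoding.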
  34. 34. Parsing other sections of ELF For dynamic linking, the dynamic linker primarily uses two processor-specific tables, the Global Offset Table (GOT) and the Procedure Linkage Table (PLT). Dynamic linkers support position-independent code (PIC) through the GOT in each shared library. The GOT contains absolute addresses of all of the static data referenced in the program. Both the executables that use the shared libraries and the shared libraries themselves have a PLT. Similar to how the GOT redirects any position-independent address calculations to absolute locations, the PLT redirects position-independent function calls to absolute locations.
  35. 35. Parsing other sections of ELF In the .dynamic section, the important tag types are: DT_NEEDED: holds the string table offset of a null-terminated string giving the name of a needed library. The offset is an index into the table recorded in the DT_STRTAB entry. DT_HASH: holds the address of the symbol hash table, which refers to the symbol table referenced by the DT_SYMTAB element. DT_STRTAB: holds the address of the string table. DT_SYMTAB: holds the address of the symbol table.
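The slide does not say how the DT_HASH table is indexed. As an assumption based on the System V ABI, which defines the classic ELF hash function, a lookup hashes the symbol name as sketched below and uses the result modulo the number of buckets to pick the chain to search; the function name elf_hash here is illustrative.

```c
#include <stdint.h>

/* Classic System V ELF hash of a null-terminated symbol name. */
uint32_t elf_hash(const char *name)
{
    uint32_t h = 0;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        h = (h << 4) + *p;
        uint32_t g = h & 0xf0000000u;   /* top four bits, if any are set */
        if (g)
            h ^= g >> 24;               /* fold them back into the low bits */
        h &= ~g;                        /* and clear them */
    }
    return h;
}
```

For a table with nbucket buckets, the search for a symbol would start at bucket elf_hash(name) % nbucket and follow the chain until the names match.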
  36. 36. 2. Relocation Processing After the runtime linker has loaded all the dependencies required by an application, the linker processes each object and performs all necessary relocations. Relocation is the process of connecting symbolic references with symbolic definitions. For example, when a program calls a function, the associated call instruction must transfer control to the proper destination address at execution. Relocatable files must have information that describes how to modify their section contents. This information allows executable and shared object files to hold the right information for a process's program image.
  37. 37. 3. Loading segments in memory The LD_BIND_NOW environment variable determines the dynamic linking behavior. If it is set, the dynamic linker evaluates the PLT entries, that is, all entries of type R_386_JMP_SLOT, at load time. Otherwise, the dynamic linker binds procedure addresses lazily, so an address is not bound until the routine is first called.
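Lazy binding can be pictured as a function pointer that initially points at a resolver and is overwritten on first use. The following is a toy model under that assumption, not the actual PLT mechanism: the names slot, lazy_stub, real_square and call_square are all hypothetical, and a single global pointer stands in for one R_386_JMP_SLOT entry.

```c
typedef int (*func_t)(int);

/* Stands in for a function that lives in a shared library. */
int real_square(int x) { return x * x; }

int resolver_calls = 0;          /* how often the "dynamic linker" ran */

int lazy_stub(int x);
func_t slot = lazy_stub;         /* stands in for one jump slot */

/* The first call lands here: bind the slot, then forward the call. */
int lazy_stub(int x)
{
    resolver_calls++;
    slot = real_square;          /* later calls bypass the stub entirely */
    return slot(x);
}

/* What a call site does: always jump through the slot. */
int call_square(int x) { return slot(x); }
```

After two calls to call_square the resolver has run only once. Setting LD_BIND_NOW corresponds, in this model, to initializing slot with real_square before the program starts.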
  38. 38. 4. Delayed Function Binding Under the delayed function binding, or lazy loading, model, any dependencies that are labeled for lazy loading are loaded only when explicitly referenced. By taking advantage of the lazy binding of a function call, the loading of a dependency is delayed until the function is first referenced. As a result, objects that are never referenced are never loaded. As a practical example, the dynamic section (.dynamic) below shows that the dependency added with -ldebug is marked for lazy loading. The symbol information section (.SUNW_syminfo) shows the symbol reference that triggers the loading.
      $ cc -o prog prog.c -L. -zlazyload -ldebug -znolazyload -lelf -R'$ORIGIN'
      $ elfdump -d prog
      Dynamic Section: .dynamic
      index  tag        value
      [0]    POSFLAG_1  0x1 [ LAZY ]
      [1]    NEEDED
      [2]    NEEDED     0x131
      [3]    NEEDED     0x13d
  39. 39. A look into Solaris runtime linker The file dl_runtime.c contains the following main routines: _dl_fixup (resolves the PLT symbols), _dl_profile_fixup, _dl_call_pltexit. Other related routines: elf_machine_plt_value, elf_machine_fixup_plt, _dl_lookup_symbol_x.
  40. 40. ARM Specific ELF Header Settings For the ARM target environment, the following values in the ELF header are specifically defined. All other values are as specified in the Tool Interface Standard Portable Formats Specification. e_machine is set to EM_ARM (defined as 40). e_ident[EI_CLASS] is set to ELFCLASS32. e_ident[EI_DATA] is set to ELFDATA2LSB for little-endian targets and ELFDATA2MSB for big-endian targets.
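A loader can verify these three settings before doing anything else with a file. The sketch below is an illustration with a hypothetical function name, is_arm_elf32; the numeric values of EI_CLASS, EI_DATA, ELFCLASS32, ELFDATA2LSB and ELFDATA2MSB and the position of e_machine (byte 18 of a 32-bit header) are taken from the generic ELF layout, and EM_ARM is 40 as stated above.

```c
#include <stddef.h>
#include <stdint.h>

enum { EI_CLASS = 4, EI_DATA = 5, E_MACHINE_OFF = 18 };
enum { ELFCLASS32 = 1, ELFDATA2LSB = 1, ELFDATA2MSB = 2, EM_ARM = 40 };

/* Returns 1 if buf looks like the start of a 32-bit ARM ELF file, else 0. */
int is_arm_elf32(const unsigned char *buf, size_t len)
{
    if (len < (size_t)E_MACHINE_OFF + 2)
        return 0;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                              /* not an ELF file */
    if (buf[EI_CLASS] != ELFCLASS32)
        return 0;                              /* not a 32-bit object */

    uint16_t machine;
    if (buf[EI_DATA] == ELFDATA2LSB)           /* little-endian target */
        machine = (uint16_t)(buf[E_MACHINE_OFF] | (buf[E_MACHINE_OFF + 1] << 8));
    else if (buf[EI_DATA] == ELFDATA2MSB)      /* big-endian target */
        machine = (uint16_t)((buf[E_MACHINE_OFF] << 8) | buf[E_MACHINE_OFF + 1]);
    else
        return 0;                              /* unknown byte ordering */

    return machine == EM_ARM;
}
```

Because e_machine is stored in the file's own byte order, EI_DATA has to be consulted before the field is decoded.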
  41. 41. Special sections in ARM ELF Files In executable ARM ELF, all executables have at least two sections, unless the linker has been invoked with -nodebug. The Symbol Table Section has the following attributes: sh_name: ".symtab"; sh_type: SHT_SYMTAB; sh_addr: 0 (to indicate it is not part of the image). The String Table Section has the following attributes: sh_name: ".strtab"; sh_type: SHT_STRTAB; sh_addr: 0 (to indicate it is not part of the image).
  42. 42. Special Sections in ARM ELF Debugging Sections: ARM Executable ELF supports three types of debugging information held in debugging sections. 1. ASD debugging tables: these provide backwards compatibility with ARM's Symbolic Debugger. ASD debugging information is stored in a single section in the executable named .asd. 2. DWARF Version 1.0. 3. DWARF Version 2.0.