Linux Internals - Part 2


Published in: Technology

  1. Linux Internals (Day 2), Pradeep D. Tewani. © 2012 Pradeep Tewani. All Rights Reserved.
  2. What to Expect?
     - GNU OS & GNU Development Tools
     - Compiler, Assembler and Linker
     - Writing Large Programs
     - Writing Makefiles
     - Linux Process & Process Control
     - Programs on Process Control
  3. GNU Operating System
     - A Unix-like operating system that is free software and respects your freedom
     - GNU is a recursive acronym for "GNU's Not Unix": GNU's design is Unix-like, but it differs by being free software and containing no Unix code
     - A software collection of applications, libraries and developer tools, plus a program that allocates resources and talks to the hardware, known as the kernel
     - Used with the kernel called Linux
     - Together known as the GNU/Linux operating system
  4. GNU Tools
     - Contains a rich set of software packages, ranging from archiving, audio and business productivity to health, hobbies and so on
     - Archiving: cpio, gzip, tar...
     - Audio: EMMS, Gmediaserver
     - Database: Ferret, Gdbm, Guile-dbi...
     - Games: Acm, Aetherspace
     - And the list continues...
  5. GNU Development Tools
     - The GNU development tools are a collection of compiler, assembler, linker, libraries, make, debugger and so on
     - GCC: GNU Compiler Collection
     - Binutils: linker (ld), assembler (gas) and so on
     - GNU C Library: libc
     - Debugger: GDB, DDD
  6. GNU Compiler Collection (GCC)
     - One of the most widely used compilers
     - A full-featured ANSI C compiler with support for C, C++, Java and Fortran
     - Contains the libraries for these languages as well
     - Historically known as the GNU C Compiler
  7. W's of Building a Program
     - A process that translates a program from a higher-level language into an executable
     - Four stages for a C program to become an executable:
       - Pre-processing
       - Compiling
       - Assembling
       - Linking
  8. Stages for Building a Program
  9. Pre-Processor
     - Accepts source code as input and is responsible for:
       - Removing comments
       - Interpreting special pre-processor directives, denoted by #
     - #include: includes the contents of the named file, e.g. #include <stdio.h>
     - #define: defines a symbolic name or constant, e.g. #define MAX_SIZE 100
     - gcc -E hello_world.c -o hello_world.i produces hello_world.i
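To make the two directives concrete, here is a minimal sketch. The function name buffer_bytes is hypothetical and not from the slides. Running gcc -E on this file should show the comments removed and MAX_SIZE replaced by 100.

```c
#include <stddef.h>   /* #include: pulls in the definition of size_t */

#define MAX_SIZE 100  /* #define: every later MAX_SIZE becomes 100 */

/* After pre-processing, the declaration below reads
 * `static char buffer[100];` and this comment is gone. */
size_t buffer_bytes(void)
{
    static char buffer[MAX_SIZE];
    return sizeof(buffer);   /* size in bytes of the 100-element array */
}
```

Comparing the .c file with the generated .i file is a quick way to see exactly what the compiler proper receives.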
  10. Compiler, Assembler and Linker
      - Compiler
        - Translates pre-processed source code into assembly code
        - gcc -S hello_world.i -o hello_world.s produces hello_world.s
      - Assembler
        - Translates assembly code into machine code with unresolved symbols
        - gcc -c hello_world.s -o hello_world.o produces hello_world.o
  11. Compiler, Assembler and Linker (continued)
      - Linker
        - Combines the object code into the executable
        - Every call to a named function is resolved to the address of that function
        - gcc hello_world.o -o hello_world produces hello_world
  12. Useful gcc Compiler Options
      - -c: suppress the linking step and produce a .o file for each source file listed, e.g. gcc -c file1.c file2.c ...
      - -l<library>: link with an archived object library. Example, the math library: gcc sqrt.c -o sqrt -lm
      - -L<path>: search a non-standard path for libraries, e.g. gcc -o bar bar.o -L/home/user/ -lfoo
      - -I<path>: search a non-standard path for headers, e.g. gcc -c -I/home/user foo.c
      - -g: include debugging information
      - -D: define symbols either as identifiers or as values
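As an illustration of -D, the sketch below shows both uses: a symbol defined as a bare identifier (-DDEBUG) and a symbol defined with a value (-DLOG_LEVEL=3). The names DEBUG, LOG_LEVEL, log_level and debug_note are assumptions chosen for this example only.

```c
#include <stdio.h>

/* Default used when the build does not pass -DLOG_LEVEL=<n>. */
#ifndef LOG_LEVEL
#define LOG_LEVEL 1
#endif

/* Returns the value chosen at compile time. */
int log_level(void)
{
    return LOG_LEVEL;
}

/* Prints only when the file was compiled with -DDEBUG. */
void debug_note(const char *msg)
{
#ifdef DEBUG
    fprintf(stderr, "debug: %s\n", msg);
#else
    (void)msg;   /* compiled out; avoid an unused-parameter warning */
#endif
}
```

Built with plain gcc -c, log_level() returns the default 1; built with gcc -c -DLOG_LEVEL=3 -DDEBUG, it returns 3 and debug_note() prints.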
  13. GNU Binutils
      - The GNU binutils are a collection of binary tools
      - Most important:
        - as: GNU assembler
        - ld: GNU linker
      - Also includes addr2line, ar, gprof, nlmconv, nm, objcopy, objdump, ranlib, readelf, size, strings, strip
  14. Getting the Listing with Assembler
      - Use the -Wa option in gcc to pass arguments to the assembler
      - The arguments must be separated from each other by commas
      - E.g. gcc -c -g -Wa,-alh,-L file.c
        - -alh: emit the listing to standard output with high-level and assembly source
        - -L: retain local symbols in the symbol table
  15. Exploring GCC Linking Process
      - gcc -c main.c func.c gives main.o and func.o
      - gcc func.o main.o -o main
      - ldd main prints the shared libraries needed by the program (e.g. the C library, for printf)
      - Code relocations
        - The entries within a binary that are left to be filled in at link time or run time
        - Use objdump and readelf to inspect them
  16. nm command
      - Provides information on the symbols used in an object file or executable file
      - For each symbol, provides:
        - The virtual address of the symbol
        - A character depicting the symbol type (lower case: symbol is local; upper case: symbol is global)
        - The name of the symbol
      - nm <object file or executable name>
      - nm -A ./*.o: prefixes each symbol with the name of the file that contains it
      - nm -u <executable>: displays all the unresolved (undefined) symbols in an executable file
  17. strings Command
      - Finds and displays the printable strings in a given executable, binary or object file
      - strings <filename>
  18. Building Libraries
      - A collection of object files can be used to form a library
      - Use the ar command
        - ar cru libfoo.a foo1.o foo2.o foo3.o creates the file libfoo.a from foo1.o, foo2.o and foo3.o
        - ranlib libfoo.a generates the symbol table and adds it to the .a file
        - gcc main.o libfoo.a links main.o against the library
  19. Writing Larger Programs
      - Divide the program into modules
        - Separate source files: main() in main.c, and the other files contain the functions
        - We can create our own library of functions, which can be shared amongst many programs
      - Advantages
        - The modules divide naturally into common groups of functions
        - We can compile each module separately and link in the compiled modules
  20. Header Files
      - Each module needs its own variable definitions, function definitions and so on
      - Several modules need to share some declarations
      - Centralize the pre-processor commands and function prototypes in a file called a header, with the extension .h
      - #include <stdio.h>: standard header file
      - #include "my_head.h": user-defined header file
  21. Sharing the Variables
      - Pass the variables as parameters
        - Passing a long variable list to many functions can be laborious
        - Large arrays and structures are difficult to store locally: memory problems with the stack
      - External variables and functions
        - Defined outside of functions
        - Potentially available to the whole program
        - Example: a global variable AnotherString, declared in main.c and shared with WriteMyString
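The AnotherString / WriteMyString arrangement could look roughly like the sketch below. The slides give only the two names; the string contents, the function signature, and the file layout marked in the comments are assumptions. Everything is shown in one listing so it compiles as a single file, with comments marking where each part would live in a multi-file project.

```c
#include <stdio.h>
#include <string.h>

/* --- would live in a shared header such as my_head.h --- */
extern char AnotherString[];            /* declaration only, no storage */
size_t WriteMyString(const char *s);    /* function prototype */

/* --- would live in main.c --- */
char AnotherString[] = "Hello Everyone";   /* the single definition */

/* --- would live in a second source file (name is hypothetical) --- */
size_t WriteMyString(const char *s)
{
    printf("%s\n", s);
    printf("Global variable = %s\n", AnotherString);  /* seen via extern */
    return strlen(AnotherString);
}
```

The key point is that the variable is defined exactly once and only declared (with extern) everywhere else.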
  22. Advantages of Using Several Files
      - Each programmer can work on a different file
      - An object-oriented style can be used
      - Files can contain all functions from a related group
      - Code re-usability for well-implemented objects or functions
      - When changes are made to a file, only that file needs to be recompiled to rebuild the program
  23. Makefile & make Utility
      - make is a utility for automatically building large applications
      - The instructions for make are specified in a file named Makefile, in the directory in which make is invoked
      - make is an expert system that tracks which files have changed since the last time the project was built, and invokes the compiler on only those files and the files that depend on them
  24. Why make?
      - A software project that consists of many source files can have long and complex compiler commands; with make, these are reduced to a single short command
      - Programming projects sometimes need specialized compiler options that are so rarely used they are hard to remember; with make, they are written once in the Makefile
      - Maintaining a consistent development environment
      - Automating the build process, because make can be called easily from a shell script
  25. Makefile Components
      - A Makefile is a set of rules. Each rule has the form:
          TARGET: DEPENDENCIES
          <TAB> COMMAND
          <TAB> COMMAND
      - The <TAB>s are mandatory
      - Target
        - Either the name of a file that is generated by a program, or the name of an action to carry out
        - Object files and executable files are examples of files that are generated by other programs
        - Cleaning up the object files is an example of an action that we might need to carry out
  26. Makefile Components (continued)
      - Dependency
        - The name of a file that is used to create the target
        - Multiple dependencies must be separated by spaces
      - Commands
        - Must be preceded by a <TAB>
        - If the target is a file, the commands explain how to create that file
        - If the target is an action, the commands describe the action
  27. How it works?
      - Works on timestamps
        - A target is up to date if and only if its latest update happened after the latest update of every dependency
        - A dependency has changed if and only if the target's latest update happened before the dependency's latest update
        - make takes action only if a dependency has changed
      - Starts from the first target and follows its chain of dependencies; all the other targets are ignored
      - Type make <target> to build a specific target
      - Let's try a simple Makefile
  28. Building a C Program
          a.o : a.c
          <TAB> gcc -c a.c -o a.o
          clean:
          <TAB> rm -rf *.o
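The timestamp rule that make applies to a rule such as a.o : a.c can be sketched in C with stat(). This is only an illustration of the comparison, not how make is implemented: the real tool also handles dependency chains, action targets such as clean, and finer-grained timestamps. The function name needs_rebuild is hypothetical.

```c
#include <sys/stat.h>

/* Returns 1 if `target` must be rebuilt, 0 if it is up to date,
 * and -1 if the dependency itself cannot be found. */
int needs_rebuild(const char *target, const char *dependency)
{
    struct stat t, d;

    if (stat(dependency, &d) != 0)
        return -1;                    /* nothing to build from */
    if (stat(target, &t) != 0)
        return 1;                     /* target does not exist yet */
    return d.st_mtime > t.st_mtime;   /* dependency modified after target */
}
```

For the Makefile above, needs_rebuild("a.o", "a.c") returning 1 corresponds to make running gcc -c a.c -o a.o.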
  29. What is a Process?
      - A running instance of a program
      - A program in execution
      - Executable program loaded -> process
      - Two instances of the same program -> ? processes
  30. Program Vs Process
  31. Why do we need a process?
      - To improve system performance
      - Multitasking
      - Which in turn needs:
        - Time sharing (on the same processor)
        - Scheduling
        - Priority
      - And for all these: a Process Identifier (PID)
  32. Let's view
      - Shell-local processes: ps
      - Console-attached system processes: ps a
      - All system processes: ps ax
      - To list many other details: add l
      - Observe uid, ppid, priority, nice, status, tty, time
      - Dynamic process status: top
      - Try pid.c
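The slides' pid.c is not reproduced in this transcript. A minimal sketch of what such a program might do, assuming it simply prints the process ID and parent process ID, is shown here; the function name report_ids is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Prints this process's PID and its parent's PID, and returns the PID. */
pid_t report_ids(void)
{
    pid_t self = getpid();

    printf("PID  = %ld\n", (long)self);
    printf("PPID = %ld\n", (long)getppid());
    return self;
}
```

When a program calling this is started from a shell, the PPID it prints should match the shell's PID as shown by ps.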
  33. Linux Process Scheduling
      - Based on the time-sharing technique
        - Several processes are allowed to run concurrently
        - CPU time is divided into "slices", one for each runnable process
      - Preemption
        - Based on various task priorities
        - Specified by the process's scheduling policy
  34. Types of Processes
      - I/O bound
        - Spend much of their time waiting on I/O
        - Run for only short durations
      - CPU bound
        - Spend much of their time executing code
        - Tend to run until they are preempted
      - Linux implicitly favours I/O-bound processes over CPU-bound processes
  35. Process Context Switch
  36. Generic Process State Diagram
  37. Linux Process States
      - The possible states are:
        - TASK_RUNNING (R): ready or running
        - TASK_INTERRUPTIBLE (S): blocked (waiting for an event)
        - TASK_UNINTERRUPTIBLE (D): blocked (usually for I/O)
        - TASK_ZOMBIE (Z): terminated, but not cleaned up by its parent
        - TASK_STOPPED (T): execution stopped
      - The states are mutually exclusive
      - Additional information: foreground (+), threaded (l)
  38. Process Management in Linux
      - Information associated with each process
        - Address space (code, variables, stack, heap)
        - Processor state (PC, registers, ...)
        - CPU registers
        - Scheduling info, priority
        - Memory management information
      - Stored in a structure of type task_struct
        - Maintained by the Linux kernel for each process
        - Also called the Process Descriptor or Process Control Block
  39. Basic Process Management
      - bg: resumes a suspended process in the background
      - fg: resumes a suspended process in the foreground
      - jobs: lists the running jobs
      - pidof: finds the process ID of a running process
      - top: displays the processes that are using the most CPU resources
  40. Process Creation
      - From the shell
        - By running a command / program / script
        - By exec'ing a command / program
      - By programming
        - Using system(): simple, but inefficient
        - Using fork() and the exec family of functions: comparatively complex, but with greater flexibility, speed and security
  41. Using the system() Function
      - Execution flow
        - Creates a sub-process running the standard shell
        - Hands the command to that shell for execution
      - Used to execute a command from within a program
      - Subject to the features and limitations of the system shell
      - Try this example (system.c)
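The slides' system.c is not included in this transcript. The sketch below shows one plausible use of system(), assuming we want the command's exit code back; the wrapper name run_in_shell is hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Runs `command` through the standard shell and returns its exit code,
 * or -1 if the shell could not be started or did not exit normally. */
int run_in_shell(const char *command)
{
    int status = system(command);   /* equivalent to: sh -c command */

    if (status == -1)
        return -1;                  /* child process could not be created */
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* exit code reported by the shell */
    return -1;                      /* e.g. terminated by a signal */
}
```

For example, run_in_shell("ls -l /") lists the root directory and returns ls's exit code. Because the string is interpreted by the shell, it should never be built from untrusted input.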
  42. Using fork()
      - Creates a child process that is a copy of its parent
  43. Distinguishing Parent & Child
      - The child has a distinct PID from that of the parent
      - fork() provides different return values to the parent and the child
      - Return values
        - PID of the child, in the parent process
        - 0, in the child process
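A minimal sketch of branching on fork()'s return value follows. The function name spawn_child is hypothetical, and the parent waits for the child here only so that the example does not leave a zombie behind.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once; returns the child's PID in the parent, or -1 on failure. */
pid_t spawn_child(void)
{
    fflush(stdout);               /* avoid duplicating buffered output */
    pid_t pid = fork();

    if (pid == 0) {               /* child: fork() returned 0 */
        printf("child:  my PID is %ld\n", (long)getpid());
        fflush(stdout);
        _exit(0);                 /* leave without running parent code */
    }
    if (pid > 0) {                /* parent: fork() returned child's PID */
        printf("parent: created child %ld\n", (long)pid);
        waitpid(pid, NULL, 0);    /* reap the child */
    }
    return pid;                   /* -1 here means fork() failed */
}
```

Both branches run the same program text; only the value returned by fork() tells them apart.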
  44. Using the exec Functions
      - The exec family of functions replaces the current program with a new one
      - Before exec() [diagram]: the process has its PID, registers and program counter, and holds the code, data and stack sections of program a.c
  45. Using the exec Functions (continued)
      - After exec() [diagram]:
        - Preserved: PID
        - Reset: registers, program counter
        - Overwritten by the new program: code, data, stack and heap sections of program a.c
  46. Exec Family of Functions
      - execl(), execle(), execlp()
      - execv(), execve(), execvp()
  47. Cons of fork & exec
      - fork
        - The forked child process typically executes a copy of the parent process's program
        - Can't execute a program external to the current executable
      - exec
        - Doesn't create a new process
        - The calling program ceases to execute
  48. fork complementing exec [diagram]
      - The initial process calls fork()
      - In the original process (the parent), fork() returns the new PID and the parent continues
      - In the new process (a copy of the parent), fork() returns pid=0 and it runs as a cloned parent until execv is called
      - execv(new_program) replaces the copy with new_program
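The fork-then-exec pattern can be sketched as follows. The slide uses execv; this sketch substitutes execvp, another member of the exec family, so that the program is looked up on PATH. The function name run_program and the choice of 127 as the failure code are assumptions.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with the given arguments in a child process and
 * returns its exit code, or -1 on error or abnormal termination. */
int run_program(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {                /* child: a clone of the parent ... */
        execvp(argv[0], argv);     /* ... until exec replaces its program */
        _exit(127);                /* reached only if exec failed */
    }

    int status;                    /* parent: continues and waits */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call might look like run_program((char *[]){"ls", "-l", NULL}). Unlike system(), no shell is involved, so the arguments are passed to the new program exactly as given.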
  49. Copy On Write (COW)
      - Parent and child share the address space (read-only)
      - Data and other resources are marked COW
      - If a resource is written to, a duplicate is made and each process receives a unique copy
      - Consequently, the duplication of resources occurs only when they are written to
      - Avoids copying in the case of an immediate exec
      - fork()'s only overheads:
        - Duplication of the parent's page tables
        - Creation of a unique PCB for the child
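Copy-on-write itself is invisible to a program; what can be observed is its effect, namely that a write in the child never changes the parent's data. The sketch below demonstrates only that observable behaviour. The names counter and parent_value_after_child_write are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* lives in a page shared after fork() */

/* Forks, lets the child modify `counter`, and returns the value the
 * parent sees afterwards (or -1 if fork() failed). */
int parent_value_after_child_write(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 2;          /* the write gives the child its own copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);    /* make sure the child's write has happened */
    return counter;           /* parent's copy is untouched */
}
```

Until the child's assignment, both processes read the same physical page; afterwards each has its own.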
  50. Process Termination
      - Parent and child processes terminate as usual
        - Success exit: normal success
        - Error exit: normal failure
        - Fatal exit: signaled from kernel space for a bug
        - Kill exit: signaled by a process
      - But which one of them terminates first?
      - If it matters, a parent can wait for its children using the wait family of system calls
  51. Wait Family of System Calls
      - Four different system calls in the wait family
        - wait(): blocks until one of the child processes terminates
        - waitpid(): waits for a specific child to exit/stop/resume
        - wait3(): like wait(), and additionally returns resource usage information about the exiting/stopping/resuming child process
        - wait4(): like wait3(), but for a specific child or children
      - All of these fill in a status code
        - In an integer pointer argument
        - About how the child process exited
        - Which can be decoded using the wait status macros
  52. wait Status Macros
      - WEXITSTATUS: extracts the child process's exit code
      - WIFEXITED: determines whether the child process exited normally, rather than dying from an unhandled signal
      - WIFSIGNALED: determines whether the child process was terminated by a signal
      - WTERMSIG: extracts, from the process exit status, the signal by which it died
      - Others: WIFSTOPPED, WCOREDUMP, WSTOPSIG, WIFCONTINUED
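The macros are typically applied in pairs: first a WIF... test, then the matching extractor. A minimal sketch follows, assuming a helper (child_outcome, a hypothetical name) that creates a child which either exits with a code or kills itself with a signal, and encodes the result as a single integer.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child exits with `exit_code`, or raises `sig` first if sig != 0.
 * Returns the exit code, minus the signal number if the child was
 * killed by a signal, or -1000 on any other outcome. */
int child_outcome(int exit_code, int sig)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1000;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);             /* die from the signal, if fatal */
        _exit(exit_code);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1000;
    if (WIFEXITED(status))          /* normal exit: read the exit code */
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))        /* killed: read the signal number */
        return -WTERMSIG(status);
    return -1000;
}
```

Calling WEXITSTATUS without first checking WIFEXITED gives a meaningless value when the child was killed by a signal, which is why the tests come first.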
  53. Zombie Process
      - A process that has terminated, but whose resources are not yet cleaned up, is called a zombie process
      - Reason
        - The child process exited before the parent
        - The parent didn't call wait
      - Even if the parent does a wait later, the child remains a zombie till then
  54. Who cleans up a Zombie?
      - Typically, again the parent, by doing a wait on it
      - What if the parent exits without a wait? Does the zombie stay around in the system?
      - Not really. It gets inherited by init, which then cleans it up right there
  55. Orphan Process
      - If the parent dies before its children:
        - The children become orphans
        - They are adopted by init, which then does the clean-up on their exit
  56. What have we learnt?
      - GNU OS & GNU Development Tools
      - Compiler, Assembler and Linker
      - Writing Large Programs
      - Writing Makefiles
      - Linux Process & Process Control
      - Programs on Process Control
  57. Any Queries?