Ganesh naik linux_kernel_internals
1. 1
Linux Kernel Internals
- By Ganesh Naik
Director – Levana Technologies
Embedded Android & Linux Consultant and Trainer
www.levanatech.com
2. 2
About
Author of the book “Learning Linux Shell Scripting”, published by Packt Publishing
Website : https://www.packtpub.com/networking-and-servers/learning-linux-shell-scripting
3. 3
• Compiling & testing hello World
– Preprocessing
– Compilation
– Assembling
– Linking
Understanding Hello World Program
5. 5
• Why do we need device drivers?
• Conceptual understanding of a real-world character device driver
• User Space, Kernel Space & Hardware interaction
Conceptual Understanding of Device Driver
8. 8
• Device registers: command, status and data buffer
• Accessing device registers requires: (i) the address of the device’s first register, (ii) the address space where these registers live
• I/O space registers:
(i) inb(): read a single byte from an I/O port
(ii) outb(): write a single byte to an I/O port
• Memory-mapped registers:
(i) readb(): read a single byte from a memory-mapped I/O register
(ii) writeb(): write a single byte to a memory-mapped I/O register
Memory-mapped device registers and I/O space ports
9. 9
There are three types of device driver:
Char (c)
Block (b)
Network
Linux treats every device as a file. Char and block drivers are reached through device nodes, which historically live in /dev.
Network device drivers, unlike char and block drivers, have no device node; they are reached through network interfaces instead.
CHAR DRIVERS
10. 10
Kernel architecture
[Figure: layered diagram.
User space: App1, App2, ... on top of the C library.
Kernel space: the system call interface sits above process management, memory management, filesystem support, device control and networking; beneath these are CPU support code, filesystem types, storage drivers, character device drivers, network device drivers and CPU / MMU support code.
Hardware: CPU, RAM, storage.]
11. 11
Linux Kernel Generic Structure
[Figure: the system call interface sits above five subsystems. Each is listed with its feature and the layer beneath it.
Process Management: concurrency; CPU
Memory Management: VM, paging; MMU
File Management: handling files; file systems, block devices (disks, CDs)
Device Control: tty access; character devices (keyboard, monitor)
Network Management: N/W connectivity; network subsystem, network card]
12. 12
Memory Management - Introduction
Memory management: one of the most important kernel subsystems
Virtual memory: the programmer need not worry about the size of RAM (large address space)
Static allocation: internal fragmentation
Dynamic allocation: external fragmentation
Avoiding fragmentation: thrashing, overhead
13. 13
Memory mapping
The executable image is split into equal-sized small parts (normally one page each)
Virtual memory assigns a virtual address to each part
This links the executable image into the process's virtual memory
Swapping
Swap space is on a hard disk partition
If a page is waiting for a certain event to occur, swap it out
This uses physical memory space efficiently
If there is no space in physical memory, swap LRU (least recently used) pages out to swap space
14. 14
Demand Paging
Don't load all the pages of a process into memory
Load only the necessary pages initially
If a required page is not in memory, a page fault is generated
The page fault handler brings the corresponding page into memory
15. 15
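A tour of a program's execution
[Figure: flow of executing ./a.out. Memory mapping sets up the process's virtual memory and page table (VPFN entries); the processor executes from virtual memory; demand paging loads a page when a page fault occurs, using swap space on the hard disk when needed; the program then exits.]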
16. 16
• Regular
• Directory
• Link – hard link, soft (symbolic) link
• Pseudo - /proc
• Special files – Device drivers
• Pipe - FIFO
• Socket
Everything in Linux is a file
18. 18
When you create a file system, Linux creates a number of blocks on that
device.
– Boot Block
– Super-block
– I-node table
– Data Blocks
• Linux also creates an entry for the “/” (root) directory in the I-node table, and allocates
data block to store the contents of the “/” directory.
File Systems - Creating
[Figure: on-disk layout: B (boot block) | S (super-block) | I-node table | Data blocks]
19. 19
• The super-block contains info. such as:
– a bitmap of blocks on the device, each bit specifies whether a block is
free or in use.
– the size of a data block
– the count of entries in the I-node table
– the date and time when the file system was last checked
– the date and time when the file system was last backed up
• Each device contains more than one copy of the super-block, because the super-block holds information that must be available in order to use the device.
• If the original super-block is corrupted, an alternate super-block can be used to mount the file system.
File Systems - Superblock
20. 20
• The I-node table contains an entry for each file stored in the file system. The total number of I-nodes in a file system determines the number of files that the file system can contain.
• When a file system is created, the I-node for the root directory of the file
system is automatically created.
• Each I-node entry describes one file.
File Systems – I-node table
21. 21
• Each I-node contains the following info:
– file owner UID and GID
– file type and access permissions
– date/time the file was created, last modified, last accessed
– the size of the file
– the number of hard links to the file
– Each I-node entry can track a very large file
File Systems – I-node table
22. 22
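ext File System I-node
[Figure: I-node structure, looked up by I-node number. Each I-node holds direct block addresses (Block 1 ... Block N), followed by a single indirect, a double indirect and a triple indirect block address. The single indirect block holds N direct block addresses, the double indirect block holds N single indirect block addresses, and the triple indirect block holds N double indirect block addresses.]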
23. 23
• A device special file describes the following characteristics of a device:
– Device name
– Device type (block device or character device)
– Major device number (for example “2” for floppy, “3” for hard disk)
– Minor device number (such as “1” for “hda1”)
• Switch table - the Linux kernel maintains a set of tables indexed by major device number.
• The switch table is used to find out which device driver should be invoked.
• For example: fd -> file table -> inode table -> switch table -> device drivers
Device Special Files
24. GNU compiler
• A C source file should have a .c extension.
• The command used to compile a C source file is
gcc [options] <filename>.c
• When a C source file is compiled, the compiler generates the executable.
• By default, if no options are given to the compiler, the executable is named a.out.
25. • size command
$ size sample
   text    data     bss     dec     hex filename
    968     260      12    1240     4d8 sample
text: executable code, in bytes (decimal)
data: initialized data, in bytes (decimal)
bss: uninitialized data, in bytes (decimal)
dec, hex: total size of text, data and bss, in decimal and hex notation
26. • readelf
• readelf is useful for inspecting an executable in detail.
• Use the -d option to list the dynamic libraries an executable depends on
$ readelf -d sample
Dynamic segment at offset 0x4f4 contains 20 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libc.so.6]
$ readelf -d /bin/ls
Dynamic segment at offset 0x104d4 contains 22 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libacl.so.1]
0x00000001 (NEEDED) Shared library: [libtermcap.so.2]
0x00000001 (NEEDED) Shared library: [libc.so.6]
27. Memory layout of a C program
High address
Command line args & environment variables
Stack
Heap
Uninitialized data (bss): gets initialized to 0 at runtime
Initialized data: initialized global data
Text: program code
Low address
28. 28
A parent process creates child processes, which in turn create other processes, forming a tree of processes.
Resource sharing (possible models)
– Parent and children share all resources.
– Children share a subset of the parent’s resources.
– Parent and child share no resources.
Execution (possible models)
– Parent and children execute concurrently.
– Parent waits until children terminate.
Address space (possible models)
– Child is a duplicate of the parent.
– Child has a new program loaded into it.
Process Creation
29. 29
pid_t fork(void); creates a new process
All statements after the fork() system call in a program are executed by two processes: the original process that called fork(), plus the new process that fork() creates.
#include <stdio.h>
#include <unistd.h>
int main(void) {
printf("Hello fork: %d\n", fork());
}
– Hello fork: 0 (printed by the child)
– Hello fork: x (> 0) (printed by the parent; x is the child's PID)
– Hello fork: -1 (printed by the parent if fork() fails)
fork ( )
30. 30
pid_t pid = fork();
if (pid == 0) {
/* Child code */
}
else {
/* parent code (pid > 0; pid is -1 if fork() failed) */
wait(NULL); /* or */
waitpid(pid, ….);
}
Parent and child
31. 31
To run a new program in a process, you use one of the “exec” family of calls (such as “execl”) and specify the following:
– the pathname of the program to run
– the name of the program (this becomes argv[0])
– each parameter to the program
– (char *)0 or NULL as the last parameter, to mark the end of the parameter list
execl
32. 32
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
All the above library functions internally call the execve system call.
int execve(const char *filename, char *const argv[], char *const envp[]);
exec family
33. 33
An executable image
Stack
Heap
Uninitialised data
Initialized read-write data
Initialized Read-only data
Text
$ size a.out (man size )
text data bss dec hex filename
920 268 24 1212 4bc a.out
34. 34
Very flexible and easy to use.
The fastest IPC mechanism.
Shared memory is used to provide access to
– Global variable
– Shared libraries
– Word processors
– Multi-player gaming environment
– Http daemons
– Other programs written in languages like Perl, C etc.,
Shared Memory - Introduction
35. 35
Shared memory is a much faster method of communication than either semaphores or message queues.
It does not require an intermediate kernel buffer.
Using shared memory is quite easy. After a shared memory segment
is set up, it is manipulated exactly like any other memory area.
Why go for Shared Memory
36. 36
The steps involved are
– Creating shared memory
– Connecting to the memory & obtaining a pointer to the memory
– Reading/Writing & changing access mode to the memory
– Detaching from memory
– Deleting the shared segment
Steps to access Shared Memory
37. 37
shmat is used to attach the created shared memory segment to a process's address space.
void *shmat(int shmid, const void *shmaddr, int shmflg);
Example: data = shmat(shmid, (void *)0, 0);
On success the system call returns a pointer, which the process can use to read or write the segment; on failure it returns (void *)-1.
shmat
38. 38
Reading from or writing to shared memory is the easiest part.
Data is read and written through the pointer, just as with normal memory.
E.g. Read:
printf("SHM contents : %s\n", data);
Write:
printf("Enter a string : ");
scanf(" %[^\n]", data);
Reading/ writing to SM
39. 39
An attached shared memory segment is detached with shmdt, which takes the address returned by shmat as its argument.
Syntax: int shmdt(const void *shmaddr);
To remove the shared memory segment, call:
shmctl(shmid, IPC_RMID, NULL);
These functions return -1 on error and 0 on successful execution.
Shmdt & shmctl