Analyzing a decade of Linux
system calls
Mojtaba Bagherzadeh, Nafiseh Kahani, Cor-Paul Bezemer,
Ahmed E. Hassan, Juergen Dingel, James R. Cordy
Journal First — Empirical Software Engineering Journal
The Linux kernel forms the central part of various
operating systems used by millions of users
The Linux kernel provides its services
through 393 system calls
[Bar chart: % of system calls per category. Categories: File system & I/O, Process management, IPC & network, Memory management, Signal handling, Time operations, System info & settings, Scheduling, Security & capabilities, Modules. Bar labels (ascending): 1%, 2%, 4%, 5%, 6%, 6%, 7%, 13%, 18%, 37%.]
System calls are extensively used by
most applications
Running the ls command
requires the execution of 26
different system calls!
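One way to arrive at a count like this is to trace a command and count the distinct call names in the trace. The following is a minimal sketch, assuming trace lines in the `name(args) = result` form that `strace` prints; the sample lines are made up for illustration, and the count on a real system may differ from the 26 reported above.

```python
import re

# Matches the call name at the start of an strace-style line,
# e.g. 'openat(AT_FDCWD, ".", O_RDONLY) = 3'.
SYSCALL_LINE = re.compile(r"^([a-z_][a-z0-9_]*)\(")


def distinct_syscalls(trace_lines):
    """Return the set of distinct system call names found in the lines."""
    names = set()
    for line in trace_lines:
        match = SYSCALL_LINE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


# Illustrative sample only, not real output of tracing ls.
sample = [
    'execve("/bin/ls", ["ls"], 0x7ffd...) = 0',
    'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'close(3) = 0',
    'openat(AT_FDCWD, ".", O_RDONLY|O_DIRECTORY) = 3',
    'getdents64(3, /* 5 entries */, 32768) = 144',
    'close(3) = 0',
    '+++ exited with 0 +++',
]

names = distinct_syscalls(sample)
print(len(names), sorted(names))
```

On a Linux machine, a trace file for this function could be produced with `strace -o trace.txt ls` and read line by line.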
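System call execution requires
context switching
User Mode: the user application calls open()
System call interface: the call crosses from User Mode into Kernel Mode
Kernel Mode: the system call table maps entry xx to sys_open
Kernel Mode: the service function runs the implementation of open()
The result is returned to the user application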
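The lookup step in this flow can be sketched as a toy model. This is not kernel code: the table, the call numbers, and the Python functions named `sys_open` and `sys_close` are stand-ins chosen for illustration, and a real kernel performs the mode switch in hardware-assisted entry code.

```python
import errno


def sys_open(path, flags):
    """Stand-in for the service function that implements open()."""
    return f"fd for {path} (flags={flags})"


def sys_close(fd):
    """Stand-in for the service function that implements close()."""
    return 0


# Toy system call table: call number -> service function.
# The numbers are illustrative.
SYSCALL_TABLE = {
    2: sys_open,
    3: sys_close,
}


def syscall_interface(number, *args):
    """Model the boundary: look up the number, run the service, return."""
    service = SYSCALL_TABLE.get(number)
    if service is None:
        # Unknown call numbers are reported as a negative ENOSYS here.
        return -errno.ENOSYS
    return service(*args)


result = syscall_interface(2, "/etc/hostname", 0)
print(result)
```

The user application only ever names a number and arguments; which function runs is decided entirely by the table on the kernel side.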
Studying the evolution of the system calls can
lead to valuable insights for several groups
Developers
Researchers
We extracted 8,770 changes (commits) of
system calls from April 2005 to December 2014
1. Extract system call names
Sources: man pages, syscalls*.tbl, syscall*.S → names of 393 system calls
2. Extract commits (2005-2014) and filter system call commits
Linux kernel: ≅500k commits → ≅12k commits
3. Manual cleanup
≅12k commits → ≅9k commits
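The slides do not say how the filtering step decides that a commit touches a system call. The sketch below shows one plausible heuristic, under the assumption that a commit is a candidate when its message or diff mentions `sys_<name>` or a `SYSCALL_DEFINEn(<name>` definition for a known name. The `Commit` record and the sample data are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:
    """Hypothetical commit record used only for this sketch."""
    sha: str
    message: str
    diff: str


def build_matcher(syscall_names):
    """Compile a pattern for sys_<name> or SYSCALL_DEFINEn(<name> mentions."""
    alternatives = "|".join(re.escape(name) for name in sorted(syscall_names))
    return re.compile(
        rf"\b(?:sys_(?:{alternatives})\b"
        rf"|SYSCALL_DEFINE\d\((?:{alternatives})\b)"
    )


def filter_syscall_commits(commits, syscall_names):
    """Keep commits whose message or diff mentions a known system call."""
    matcher = build_matcher(syscall_names)
    return [
        c for c in commits
        if matcher.search(c.message) or matcher.search(c.diff)
    ]


commits = [
    Commit("a1", "fix race in sys_open", ""),
    Commit("b2", "docs: fix typo", "+ see README"),
    Commit("c3", "vfs cleanup", "+SYSCALL_DEFINE1(close, unsigned int, fd)"),
    Commit("d4", "touch sys_openat only", ""),
]

kept = filter_syscall_commits(commits, {"open", "close"})
print([c.sha for c in kept])
```

A heuristic like this over-matches and under-matches, which is consistent with the manual cleanup step that follows it in the pipeline.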
We classified 8,770 changes (commits) of
system calls
Bug fixes: 3,247 (36%)
Refactoring: 3,164 (35%)
Improvement: 2,131 (25%)
Add/remove: 482 (5%)
We classified 3,247 bug fix related commits
[Bar chart: % of bug fix commits per bug type. Types: Semantic, Concurrency, Memory, Compatibility, Error code. Bar labels (ascending): 7%, 8%, 10%, 16%, 61%.]
Our study provides several insightful results
for the development of system calls
Complex system calls
Sibling system calls
The cost of supporting several architectures
Race graph
4,498 (50%) of commits were made to only
25 (6%) of the system calls
[Bar chart: % of commits per system call. System calls: ptrace(), signal(), ioctl(), futex(), ipc(). Bar labels (ascending): 3%, 3%, 5%, 8%, 8%.]
A sibling call is a system call that is similar in
functionality, and often in name, to another system call
Parameter extension, e.g., dup, dup2
Architecture support, e.g., truncate, truncate64
Working directory, e.g., open, openat
Backward compatibility, e.g., vm86, vm86old
Real-time support, e.g., sigreturn, rt_sigreturn
Others, e.g., waitpid(), wait4()
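The parameter-extension case can be seen from user space with the `dup`/`dup2` pair, which Python exposes as `os.dup` and `os.dup2`. The following is a small sketch, assuming a POSIX-like system: both calls duplicate a file descriptor, and the sibling only adds a parameter that lets the caller choose the new descriptor number.

```python
import os
import tempfile


def dup_sibling_demo():
    """Duplicate one descriptor with dup() and with its sibling dup2()."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        fd_a = os.dup(fd)   # dup: the kernel chooses the new number
        fd_b = os.dup(fd)   # reserve a number to pass to dup2
        os.dup2(fd, fd_b)   # dup2: the caller chooses the number
        # All three descriptors refer to the same open file, so data
        # written through one is readable through another.
        os.write(fd_a, b"hello")
        os.lseek(fd_b, 0, os.SEEK_SET)
        data = os.read(fd_b, 5)
        os.close(fd_a)
        os.close(fd_b)
        return data


print(dup_sibling_demo())
```

The two calls do the same job; the extra argument is the only difference, which is why such pairs count as siblings.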
53% of the new system calls and 26% of the
existing system calls are sibling calls
[Bar chart: % of sibling calls per type. Types: Parameter extension, Architecture, Working directory, Backwards compatibility, Real time, Others. Bar labels: 29%, 8%, 6%, 14%, 31%, 12%.]
In most cases, sibling calls
are a repayment of technical
debt in the Linux kernel API
Kernel developers use flag and
struct parameters to minimize
the number of new sibling calls
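The flag-parameter pattern can be sketched as follows: a single entry point takes a bitmask, so new behaviour arrives as a new bit instead of a new sibling call. The function and flag names here are hypothetical and only model the pattern; they are not kernel APIs. Rejecting unknown bits is included on the assumption that it keeps unused bits available for later extensions.

```python
from enum import IntFlag


class ChannelFlags(IntFlag):
    """Hypothetical flag bits for a hypothetical call."""
    NONE = 0
    NONBLOCK = 1
    CLOEXEC = 2


KNOWN_FLAGS = ChannelFlags.NONBLOCK | ChannelFlags.CLOEXEC


def open_channel(name, flags=ChannelFlags.NONE):
    """Single entry point whose behaviour is selected by flag bits."""
    if int(flags) & ~int(KNOWN_FLAGS):
        # Unknown bits are rejected so they can be given a meaning later.
        raise ValueError("unknown flag bits")
    return {
        "name": name,
        "nonblocking": bool(flags & ChannelFlags.NONBLOCK),
        "close_on_exec": bool(flags & ChannelFlags.CLOEXEC),
    }


chan = open_channel("events", ChannelFlags.NONBLOCK | ChannelFlags.CLOEXEC)
print(chan)
```

Adding a third behaviour would mean adding one enum member and one check, while existing callers stay unchanged.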
A new system call is usually not activated
for all architectures at the same time
[Timeline: activation year of the accept4 system call on different architectures. Years shown: 2008, 2010, 2013. Architectures shown: Sparc-64, ARM, Xtensa.]
Kernel developers made
many changes to make the
system call code more generic
Race graphs based on prior bugs
Race graph of memory management system calls
Nodes: mprotect, move_pages, migrate_pages, mremap, mmap, truncate, munmap, mlock, swapoff, swapon, set_mempolicy, other components
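The slides do not spell out how a race graph is built. The sketch below shows one plausible construction, under the assumption that two system calls are linked whenever a prior race bug fix involved both, with the edge weight counting such bugs. The bug records are hypothetical and do not come from the study's data.

```python
from collections import defaultdict
from itertools import combinations


def build_race_graph(bug_records):
    """Build an undirected, weighted graph from per-bug lists of calls."""
    graph = defaultdict(lambda: defaultdict(int))
    for calls in bug_records:
        # Link every pair of distinct calls named in the same bug fix.
        for a, b in combinations(sorted(set(calls)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


# Hypothetical records: each tuple names the calls in one race bug fix.
records = [
    ("mmap", "munmap"),
    ("mremap", "mmap"),
    ("mmap", "munmap"),
    ("swapon", "swapoff"),
]

graph = build_race_graph(records)
print(sorted(graph["mmap"].items()))
```

Calls with many or heavy edges would then be candidates for extra care in review and testing when either endpoint changes.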
Our findings are of value to other UNIX-based
operating systems
# of system calls: 447 vs. 393 (Linux)
# of system calls with the same signature: 199 (44%)
# of system calls that provide similar services: 164 (37%)
mojtaba@cs.queensu.ca
