Guest lecture given at UC Davis during my interview day in May 2018.
Description: Using the printf() function is one of the very first steps every beginner learns when taking a programming class. It is also one of the most ubiquitous functions in software programs, across the many languages that define it. But how many programmers actually know how this common function works behind the scenes?
During this lecture, I will trace a brief history of printf(), delve into the nuts and bolts of a simple implementation through interactive coding, and branch out into interesting facts related to this function.
printf("%s from %c to Z, in %d minutes!\n", "printf", 'A', 45);
1. printf("%s from %c to Z, in %d minutes!\n", "printf", 'A', 45);
Joël Porquet, PhD
UC Davis - May 15th, 2018
Copyright © 2018 Joël Porquet - CC BY-NC-SA 4.0 International License
2. Multi-talented printf()
101
printf("Hello world!\n");
$ ./a.out
Hello world!
201
printf("%s from %c to Z, in %d minutes!\n", "printf", 'A', 45);
$ ./a.out
printf from A to Z, in 45 minutes!
pow(101, 2)
int i;
printf(" %n", &i);
printf("%s\bD is \033[1;31m#%d\033[0m!\n", "UCB", i);
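The last example relies on three lesser-known features. `%n` prints nothing and instead stores the number of characters written so far into the `int` its argument points to, so `i` becomes 1 after the single space. `\b` is a backspace, which on a terminal lets the `D` overwrite the `B`, and `\033[1;31m` ... `\033[0m` are ANSI escape sequences that switch bold red on and off. The sketch below isolates the `%n` behavior. The helper name `count_with_percent_n` is hypothetical, and the code assumes a C library that still honors `%n` (some hardened libraries restrict or disable it).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the lecture): format `text` followed by
 * '!' into `buf`, and use %n to report how many characters had been
 * produced at the point where %n appears in the format string.
 */
int count_with_percent_n(char *buf, size_t size, const char *text)
{
    int written = -1;

    assert(buf != NULL && text != NULL);
    /* %n outputs nothing; it stores the running character count */
    snprintf(buf, size, "%s%n!", text, &written);
    return written;
}
```

Called with the text "UCD", the helper fills the buffer with "UCD!" and returns 3, the count at the position of `%n`.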
3. printf(): an odyssey
Fortran I
Special statement for building formatting descriptions:
WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA
601 FORMAT (4H A= ,I5,5H B= ,I5,5H C= ,I5,8H AREA= ,F10.2,
+ 13H SQUARE UNITS)
(Approximate) translation in C:
printf(" A= %5d B= %5d C= %5d AREA= %10.2f SQUARE UNITS", a, b, c, area);
BCPL
Printing and formatting are merged into a single statement:
WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)
(Approximate) translation in C:
printf("%2d-queens problem has %5d solutions\n", numqueens, count);
4. printf(): an odyssey
C
printf("Hello %s, you are %d years old\n", name, age);
Trickle-down string formatting
Unix printf
$ printf "%s, stop lying; you're not %d!\n" Bob 21
Bob, stop lying; you're not 21!
$
Other languages...
awk, C++, Objective C, D, F#, G (LabVIEW), GNU MathProg, GNU Octave, Go, Haskell, J, Java
(since version 1.5) and JVM languages (Clojure, Scala), Lua (string.format), Maple, MATLAB,
Max (via the sprintf object), Mythryl, PARI/GP, Perl, PHP, Python (via % operator), R,
Red/System, Ruby, Tcl (via format command), Transact-SQL (via xp_sprintf), Vala.
5. Output
$ cowsay "I love lectures about printf!"
 _______________________________
< I love lectures about printf! >
 -------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
$
$ cowthink -f vader "I wish there was free pizza too..."
 ____________________________________
( I wish there was free pizza too... )
 ------------------------------------
        o    ,-^-.
         o   !oYo!
          o /./=\.\______
               ##        )\/\
                ||-----w||
                ||     ||

               Cowth Vader
$
$ du /bin/* | sort -rn
20672 /bin/js52
19164 /bin/inkscape
19136 /bin/inkview
18720 /bin/node
18684 /bin/clementine
17452 /bin/mariabackup
17112 /bin/mysqld
16432 /bin/mysql_client_test_embedded
16332 /bin/mysql_embedded
16232 /bin/mysqltest_embedded
7600 /bin/gdb
...
Introspection
Booting kernel from Legacy Image at 20080000 ...
Image Name: Linux-2.6.37
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1256880 Bytes = 1.2 MiB
Load Address: 20008000
Entry Point: 20008000
Verifying Checksum ... OK
Loading Kernel Image ... OK
OK
Starting kernel ...
Uncompressing Linux... done, booting the kernel.
Linux version 2.6.37 (nkinar at matilda) (gcc version 4.3.5 (Buildroot 2011.02) ) #3 Sat Apr 2 17:28:21 CST 2011
CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177
CPU: VIVT data cache, VIVT instruction cache
Machine: Atmel AT91SAM9RL-EK
Memory policy: ECC disabled, Data cache writeback
Clocks: CPU 200 MHz, master 100 MHz, main 12.000 MHz
Built 1 zonelists in Zone order, mobility grouping on.
Total pages: 16256
Kernel command line: console=ttyS0,115200
mtdparts=flash:10M(kernel),100M(root),-(storage) rw rootfstype=ubifs
PID hash table entries: 256 (order: -2, 1024 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 64MB = 64MB total
Memory: 62348k/62348k available, 3188k reserved, 0K highmem
...
Why printf()?
6. Tell me where you printf()!
Typical GNU/Linux computer
/usr/bin/date +"%Y" /usr/bin/xterm
User-space
Kernel-space
...
printf("2018\n")
...
(Insanely) complex code
dealing with consoles...
Ingo Molnár (Linux kernel core developer): "The tty layer is one of the very few pieces of
kernel code that scares the hell out of me :-)"
7. Tell me where you printf()!
Embedded systems or when display is not available
/usr/bin/date +"%Y"
User-space
Kernel-space
...
printf("2018\n")
...
Same (insanely) complex code
dealing with consoles...
Serial device (UART)
driver
Hardware
Software
Characters are sent one by one over a serial port
9. Playstation 4 Linux on PS4
PS4 controller
A serial port on every board?
10. How to print?
putchar(): the cornerstone
/*
 * Code for IBM-PC x86: write character to COM1 serial port
 * (taken from NuttX)
 */
void putchar(char ch)
{
    /* Wait until the Transmitter Holding Register (THR) is empty. */
    while ((inb(COM1_PORT+COM_LSR) & LSR_THRE) == 0);
    /* Then output the character to the THR */
    outb(ch, COM1_PORT+COM_THR);
}
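The serial-port version above only runs on bare metal or inside a kernel. On a hosted POSIX system, the same cornerstone role can be played by the write() system call, which hands the character to the kernel instead of poking the UART directly. The sketch below is an illustration under that assumption. The name `my_putchar_fd` and its file-descriptor parameter are hypothetical and not part of the lecture.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical hosted counterpart of the slide's putchar(): send one
 * character to the file descriptor `fd` (1 is standard output).
 * Returns 0 on success and -1 if the character could not be written.
 */
int my_putchar_fd(int fd, char ch)
{
    assert(fd >= 0);
    /* write() returns the number of bytes written, or -1 on error */
    return write(fd, &ch, 1) == 1 ? 0 : -1;
}
```

Calling `my_putchar_fd(1, 'A')` prints a single `A` on the terminal. Everything that printf() adds on top is formatting.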
11. How to print?
No formatting is simply puts()
void my_printf(char *str)
{
    /* Just print all the characters until the NUL terminator */
    while (*str)
        putchar(*str++);
}
int main(void)
{
    /* printf 101 */
    my_printf("Hello world!\n");
}
$ ./a.out
Hello world!
$
12. How to print?
With formatting...
Input: printf("%s from %c to Z, in %d minutes!\n", "printf", 'A', 45);
Output: printf from A to Z, in 45 minutes!\n
For each placeholder, need to retrieve next parameter
Depending on placeholder:
'%s': get string
'%c': get character
'%d': convert integer into characters
13. Variadic functions
Example: function prototype
#include <stdarg.h> /* Macro/function definitions for variadic functions */
#include <stdio.h>
/*
 * Sum a variable number of integers
 * @count: the number of integer parameters
 *
 * Receive @count integers, sum them up and return the result
 */
int sum_ints(int count, ...)
{
    /* ??? */
}
int main(void)
{
    /* sum up 3 integers: 10, 20 and 30 */
    printf("Sum is %d\n", sum_ints(3, 10, 20, 30));
    /* sum up 5 integers: 10, 20, 30, 40, 50 */
    printf("Sum is %d\n", sum_ints(5, 10, 20, 30, 40, 50));
    return 0;
}
14. Variadic functions
Example: function implementation
int sum_ints(int count, ...)
{
    int i, sum = 0;
    va_list ap;
    va_start(ap, count); /* Init variable parameter list */
    for (i = 0; i < count; i++) {
        int n = va_arg(ap, int); /* Get next integer parameter */
        sum += n;
    }
    va_end(ap); /* Clean up parameter list */
    return sum;
}
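The same va_start / va_arg / va_end pattern is enough to build a small formatted print. What follows is a sketch of how the pieces could fit together, not the code written during the lecture. It handles only '%s', '%c', '%d' and '%%'. The helpers `emit` and `emit_int` and the capture buffer `out_buf` are hypothetical names added so that the output can be inspected; `emit` stands in for the low-level putchar().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical capture buffer, only here so the result can be checked */
static char out_buf[128];
static size_t out_len;

/* Stand-in for the low-level putchar(): print and record one character */
static void emit(char ch)
{
    putchar(ch);
    if (out_len < sizeof(out_buf) - 1) {
        out_buf[out_len++] = ch;
        out_buf[out_len] = '\0';
    }
}

/* '%d': convert an integer into characters (assumes a 32-bit int) */
static void emit_int(int n)
{
    unsigned int u = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;
    char digits[10];
    int i = 0;

    if (n < 0)
        emit('-');
    do {
        digits[i++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (i > 0)
        emit(digits[--i]);
}

void my_printf(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {          /* Ordinary character: print as is */
            emit(*fmt);
            continue;
        }
        fmt++;                      /* Look at the placeholder type */
        switch (*fmt) {
        case 's': {                 /* Get string */
            const char *s = va_arg(ap, const char *);
            while (*s)
                emit(*s++);
            break;
        }
        case 'c':                   /* Get character (promoted to int) */
            emit((char)va_arg(ap, int));
            break;
        case 'd':                   /* Get integer and convert it */
            emit_int(va_arg(ap, int));
            break;
        case '\0':                  /* Lone '%' at the end of the format */
            emit('%');
            fmt--;                  /* Let the loop stop on the terminator */
            break;
        default:                    /* '%%' or unsupported: print literally */
            if (*fmt != '%')
                emit('%');
            emit(*fmt);
            break;
        }
    }
    va_end(ap);
}
```

With this sketch, `my_printf("%s from %c to Z, in %d minutes!\n", "printf", 'A', 45)` prints the line from the title slide. Field widths, padding, and the other conversions are deliberately left out.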
16. printf("Tell me more!\n");
Complex family of functions
Call graph from musl (a lightweight libc implementation):
vfprintf() in various C standard libraries:
glibc: ~2400 SLOC
uClibc: ~2000 SLOC
musl: ~700 SLOC
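The family is layered: the variadic entry points (printf(), fprintf(), snprintf(), ...) typically just build a `va_list` and forward it to a `v`-prefixed sibling, which is where the real formatting work happens. The sketch below shows that forwarding pattern from user code, using the standard vsnprintf(). The wrapper name `log_format` and its "[log] " prefix are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: format a message into `buf`, preceded by a
 * "[log] " prefix. Returns the total length, or -1 on error or if the
 * result did not fit into `size` bytes.
 */
int log_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int prefix, body;

    assert(buf != NULL && fmt != NULL);
    prefix = snprintf(buf, size, "[log] ");
    if (prefix < 0 || (size_t)prefix >= size)
        return -1;

    /* Forward our own variable arguments to the v* variant */
    va_start(ap, fmt);
    body = vsnprintf(buf + prefix, size - (size_t)prefix, fmt, ap);
    va_end(ap);
    if (body < 0 || (size_t)body >= size - (size_t)prefix)
        return -1;

    return prefix + body;
}
```

For example, `log_format(buf, sizeof buf, "%d-%s", 7, "x")` produces "[log] 7-x" and returns 9.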
17. printf("Tell me more!\n");
Security vulnerability: Uncontrolled format string
Vulnerable version (user input is used as the format string):
#include <stdio.h>
int main(void)
{
    char name[256];
    fgets(name, 256, stdin);
    printf("Your name is: ");
    printf(name);
    return 0;
}
$ ./a.out
%s%s%s%s%s
Segmentation fault (core dumped)
$ ./a.out
%08x.%08x.%08x.%08x.%08x.%08x
Your name is: 00000020.00000000.00000000.00000003.f7fbc4c0.78383025
Fixed version (user input is passed as an argument to a constant format string):
#include <stdio.h>
int main(void)
{
    char name[256];
    fgets(name, 256, stdin);
    printf("Your name is: ");
    printf("%s", name);
    return 0;
}
$ ./a.out
%s%s%s%s%s
Your name is: %s%s%s%s%s
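The fix works because the format string becomes a compile-time constant and the user's text is only ever treated as data, so any '%' it contains is copied verbatim instead of being interpreted. Here is a small sketch of that rule, using snprintf() so the result can be inspected. The function name `format_name` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the greeting in `out`. `name` may contain
 * arbitrary user input, including conversion specifiers.
 * Returns the length snprintf() reports for the full message.
 */
int format_name(char *out, size_t size, const char *name)
{
    assert(out != NULL && name != NULL);
    /* Constant format string: `name` is an argument, never the format */
    return snprintf(out, size, "Your name is: %s", name);
}
```

With the input "%s%n", the buffer ends up containing "Your name is: %s%n" and nothing is read from or written to unintended memory. GCC and Clang can also flag the vulnerable pattern at compile time with -Wformat -Wformat-security.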