2. Problem
Special application on top of general OS (Linux)
[Diagram: PostgreSQL runs on Linux, which runs on the hardware]
Linux is general purpose: web server, office, game, …
PostgreSQL has special needs: ACID, high concurrency, fast data access, …
Linux is the bottleneck for DB performance
4. Solution
Special application on top of special OS (AppOS)
[Diagram: three design variants]
V1: PostgreSQL on a modified Linux, on the hardware
V2: PostgreSQL on Linux extended with modules, on the hardware
V3: PostgreSQL on AppOS, on Linux, on the hardware
6. Technology: Portability
AppOS doesn't require modifications
PostgreSQL is built on top of Linux system calls, and AppOS provides system call APIs
AppOS runs seamlessly by hooking system calls such as open(), read(), ...
[Diagram: PostgreSQL issues open(), read(), ...; AppOS is made up of libhook, libapp, and libcore, above the syscall interface]
7. Technology: Portability
AppOS runs in diverse environments
AppOS is highly portable because it is built on top of Linux ABIs
[Diagram: several environments, each labeled Linux]
11. Technology: Performance
Linux imposes unnecessary overheads for file accesses
[Diagram: threads T1 to T5 issue write() and rename() calls against pages in the Linux Page Cache, backed by the Linux File System (metadata, file, journal)]
Only one thread can write to a file
Complex locking behaviors for file system consistency
12. Technology: Performance
AppOS decouples page cache from file system
[Diagram: threads T1, T2, T4, and T5 issue write() and rename() calls against pages in the AppOS Page Cache, backed by the Linux File System]
Multiple threads can concurrently write to different positions of a file
Simple locking based on data consistency guarantees provided by DB
Asynchronous direct I/Os by I/O workers
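Concurrent writes to different positions of one file can be sketched with positional writes. This is a minimal illustration using POSIX `pwrite` through Python's `os.pwrite`; it shows the access pattern only, not AppOS's page cache or locking. The chunk size and the helpers `write_chunks_concurrently` and `demo` are assumptions made for the example.

```python
import os
import tempfile
import threading

CHUNK = 4096  # assumed page-sized chunk; the slides give no size


def write_chunks_concurrently(path, n_threads):
    """Each thread writes its own CHUNK-sized region of the same file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        def worker(i):
            payload = bytes([65 + i]) * CHUNK   # 'A', 'B', 'C', ...
            # The offset is explicit, so threads share no file position.
            os.pwrite(fd, payload, i * CHUNK)

        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)


def demo(n_threads=4):
    """Write n_threads regions concurrently and return the file contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_chunks_concurrently(path, n_threads)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Because the regions do not overlap, no application-level lock is needed for correctness in this sketch; a database would rely on its own consistency guarantees to decide which regions may be written concurrently.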
13. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path as a stack of abstraction layers: Application, Caching Layer, File System Layer, Block Layer, Storage Device]
14. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); read() and write() enter the Buffer Cache, subject to admission control]
15. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); read() and write() pass the Buffer Cache and admission control, then a block-level queue (Block-level Q)]
16. Technology: Performance
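Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); below the Buffer Cache, foreground (FG) and background (BG) requests from read() and write() are reordered, shown as FG, FG, BG, BG]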
17. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); read() and write() pass the Buffer Cache and admission control, then a device-internal queue (Device-internal Q) holding FG, FG, BG]
19. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device), annotated with locks and condition variables]
20. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device) with condition variables; labels: I/O, FG, lock, BG, wait]
21. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); labels: I/O, FG, lock, BG, wait, FG, wait, wait, BG, var, wake]
22. Technology: Performance
Linux I/O path delays high-priority disk I/Os
[Diagram: Linux I/O path (Application, Caching Layer, File System Layer, Block Layer, Storage Device); labels: I/O, FG, wait, wait, BG, user var, wake, FG, wait]
23. Technology: Performance
AppOS schedules I/Os based on DB-internal priority
[Diagram: AppOS libapp tags I/Os and hands them to the AppOS I/O Engine, which issues them to storage]
I/O tags of the form (I/O class, share): (latency, 1000), (throughput, 500), (latency, 1000)
I/O types: WAL write; foreground data read; background read/write (e.g., checkpoint, compaction)
Dispatch I/Os based on priority only if there is no storage congestion
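A priority-aware dispatcher of this kind might look like the sketch below. It is not the AppOS I/O Engine: the class `IOEngine`, the `max_inflight` congestion threshold, and the rule that latency-class I/Os go before throughput-class I/Os are all assumptions. The slide's dispatch rule is read here as "hold requests back while storage is congested", and the share values from the slide are not modeled.

```python
import heapq
import itertools

# Assumed ordering: latency-class I/Os are issued before throughput-class I/Os.
CLASS_RANK = {"latency": 0, "throughput": 1}


class IOEngine:
    """Hypothetical sketch of a priority-aware I/O dispatcher."""

    def __init__(self, max_inflight):
        self.max_inflight = max_inflight   # assumed congestion threshold
        self.inflight = 0                  # I/Os issued but not yet completed
        self._heap = []
        self._seq = itertools.count()      # FIFO tie-break within a class

    def submit(self, name, io_class):
        """Queue an I/O tagged with its class."""
        heapq.heappush(self._heap, (CLASS_RANK[io_class], next(self._seq), name))

    def dispatch(self):
        """Return the next I/O to issue, or None if congested or idle."""
        if self.inflight >= self.max_inflight or not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)
        self.inflight += 1
        return name

    def complete(self):
        """Record that one in-flight I/O has finished."""
        self.inflight -= 1


def demo():
    engine = IOEngine(max_inflight=2)
    engine.submit("checkpoint write", "throughput")
    engine.submit("WAL write", "latency")
    engine.submit("foreground data read", "latency")
    first_round = [engine.dispatch(), engine.dispatch(), engine.dispatch()]
    engine.complete()                      # one I/O finishes, congestion clears
    return first_round, engine.dispatch()
```

In `demo`, the background checkpoint write is submitted first but is issued last, and the third dispatch attempt returns `None` because two I/Os are already in flight.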
24. Technology: Performance
Linux doesn't support atomic write
[Diagram, torn page problem: a data block in the PostgreSQL cache maps to Block 1 and Block 2 in the Linux cache and on storage; (1) a crash happens; (2) the data block is corrupted]
[Diagram, full page write: (1) write the block to the WAL file; (2) a crash happens; (3) recover upon reboot]
Cost of full page write: 2X write
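The torn page problem and the full page write workaround can be simulated in a few lines. This is a toy model, not PostgreSQL's WAL: block and page sizes are shrunk to a few bytes, storage is a `bytearray`, the WAL append is assumed to be atomic and durable, and all function names are hypothetical.

```python
BLOCK = 4          # toy block size in bytes
PAGE = 2 * BLOCK   # one data page spans two storage blocks, as on the slide


class Crash(Exception):
    """Simulated crash in the middle of a page write."""


def write_page(storage, offset, page, crash_after_blocks=None):
    """Write a page block by block; optionally crash partway through."""
    for i in range(0, PAGE, BLOCK):
        if crash_after_blocks is not None and i // BLOCK == crash_after_blocks:
            raise Crash()
        storage[offset + i:offset + i + BLOCK] = page[i:i + BLOCK]


def full_page_write(storage, wal, offset, page, crash_after_blocks=None):
    wal.append((offset, bytes(page)))                       # 1. log the whole page first
    write_page(storage, offset, page, crash_after_blocks)   # 2. then write in place


def recover(storage, wal):
    """3. On reboot, replay the logged full-page images over storage."""
    for offset, page in wal:
        storage[offset:offset + PAGE] = page


def demo():
    old, new = b"AAAAAAAA", b"BBBBBBBB"
    storage, wal = bytearray(old), []
    try:
        full_page_write(storage, wal, 0, new, crash_after_blocks=1)
    except Crash:
        pass
    torn = bytes(storage)        # half new, half old: a torn page
    recover(storage, wal)
    return torn, bytes(storage)
```

Every page is written twice in this scheme, once to the WAL and once in place, which is the 2X write cost noted on the slide.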