2. Current Ext4 Journaling
• On File System (FS) failure, disk
contents can be corrupted
• Journals keep data consistent
during failure by writing data
twice
• Write (journal) -> commit -> write (final location)
• Failure will cause data to either
exist consistently or not at all
• Ordered Mode only journals
metadata, but ensures data is
written to disk first
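
The write -> commit -> write sequence above can be sketched as a toy in-memory model. This is only an illustration, not how Ext4 is implemented: the names `journaled_write`, `recover`, and `Crash` are hypothetical, the "disk" and "journal" are plain Python objects, and every write is journaled (closer to full data journaling than to Ordered Mode).

```python
class Crash(Exception):
    """Simulated failure in the middle of a write."""


def journaled_write(disk, journal, key, value, crash_after=None):
    # Step 1: write the new value to the journal (first copy).
    entry = {"key": key, "value": value, "committed": False}
    journal.append(entry)
    if crash_after == "journal":
        raise Crash()

    # Step 2: the commit record marks the journal entry as complete.
    entry["committed"] = True
    if crash_after == "commit":
        raise Crash()

    # Step 3: write to the final location (second copy), then retire the entry.
    disk[key] = value
    journal.remove(entry)


def recover(disk, journal):
    # Replay committed entries; uncommitted entries are simply discarded.
    for entry in journal:
        if entry["committed"]:
            disk[entry["key"]] = entry["value"]
    journal.clear()
```

A crash before the commit leaves the old value in place after `recover`; a crash after the commit results in the new value. Either way the data exists consistently or not at all.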
3. Current Ext4 Journaling cont…
• Sometimes, we do not need to journal the data, only the metadata:
i.e., data corruption is OK, breaking the directory tree is not OK
• Ordered Mode is default,
reduces the amount of
double writing, but allows
data corruption.
• Journal mode (data=journal), which also
journals data, is very slow
• Writeback (unordered) mode (data=writeback)
exists, but is much more dangerous
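
The three modes correspond to the ext4 `data=` mount option. As a minimal sketch, the helper below reads the mode from a comma-separated option string. The names `DATA_MODES` and `data_mode` are hypothetical, and treating a missing `data=` option as Ordered Mode is an assumption based on Ordered Mode being the default.

```python
# What each ext4 data mode journals, as summarized on the slides.
DATA_MODES = {
    "journal": "data and metadata are journaled (safest, slowest)",
    "ordered": "only metadata is journaled; data is written to disk first",
    "writeback": "only metadata is journaled; no data ordering guarantee",
}


def data_mode(mount_options, default="ordered"):
    """Return the data mode named in a comma-separated mount option string."""
    for opt in mount_options.split(","):
        if opt.startswith("data="):
            return opt[len("data="):]
    # Assumption: no explicit data= option means the default mode is in use.
    return default
```

For example, `DATA_MODES[data_mode("rw,noatime,data=writeback")]` looks up the description of the unordered mode.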
4. Current Ext4 Journaling cont…
• The fsync system call explicitly flushes a
file's in-memory data held by the OS to disk
through Ext4’s journaling mechanism
• Write barriers then force a flush-to-disk
call after the journal is sent to disk
• This ensures the journal is on
non-volatile disk area (instead of volatile
disk caches)
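
From an application's point of view, the flush described above looks roughly like the sketch below. The function name `durable_write` is hypothetical. Whether the data also leaves the drive's volatile cache depends on the barrier setting of the mount, which this code cannot observe.

```python
import os


def durable_write(path, payload):
    """Write payload to path and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # Python's buffer -> OS page cache
        os.fsync(f.fileno())  # OS page cache -> storage device
```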
5. PROBLEM
• After OTA, SSHD NAND cache is filled with OTA data
• Dex2oat does ahead-of-time compilation for Android apps
• Dex2oat calls fdatasync (similar to fsync) at regular intervals,
causing disk flushes
• Since NAND is full, every fsync causes all dirty data in the SSHD cache
(up to 64MB) to be flushed to platter
• Fsync therefore causes a synchronous IO block, preempting any other
disk reads and writes
• Causes severe sluggishness in the user experience
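
The cost of syncing at regular intervals can be observed with a small probe like the one below. This is a sketch of the access pattern only, not Dex2oat's actual code: the function name `measure_sync_latency`, the chunk size, and the number of rounds are all assumptions. It falls back to `os.fsync` on platforms where `os.fdatasync` is unavailable.

```python
import os
import time


def measure_sync_latency(path, chunk_size=4096, rounds=5):
    """Write `rounds` chunks, syncing after each; return per-sync latencies in seconds."""
    # fdatasync is not available on every platform; fall back to fsync there.
    sync = getattr(os, "fdatasync", os.fsync)
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(rounds):
            os.write(fd, b"\0" * chunk_size)
            start = time.perf_counter()
            sync(fd)  # blocks until the OS reports the data as written
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies
```

Running it against a file on the affected partition, once mounted with barrier and once with nobarrier, gives a rough per-sync comparison.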
6. Disabling write barrier
• Allows disk to reorder cache-to-disk writes
• Does not block disk reads while writes are queued to disk
• Risks:
• On power failure we can no longer ensure the journal is consistent, as volatile cache
will be lost
• Since only metadata is journaled, we can potentially introduce filesystem
corruption
• However…
• Filesystem metadata is rarely written to compared to data
• Disk drive uses a timeout system for cache-to-disk writes
• Power failures are uncommon for a set-top box device
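
To confirm which mounts actually run without barriers, the mount table can be inspected. The sketch below parses text in the `/proc/mounts` layout (device, mount point, filesystem type, options). The function names are hypothetical, and the code assumes that a disabled barrier appears as `nobarrier` or `barrier=0` and that barriers are on when neither is present.

```python
def barriers_enabled(mount_options):
    """Return False if the option string disables write barriers."""
    enabled = True  # assumption: ext4 enables barriers unless told otherwise
    for opt in mount_options.split(","):
        if opt in ("nobarrier", "barrier=0"):
            enabled = False
        elif opt in ("barrier", "barrier=1"):
            enabled = True
    return enabled


def ext4_barrier_report(mounts_text):
    """Map each ext4 mount point in /proc/mounts-style text to its barrier state."""
    report = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "ext4":
            report[fields[1]] = barriers_enabled(fields[3])
    return report
```

On a Linux device, `ext4_barrier_report(open("/proc/mounts").read())` would list the ext4 mounts and whether each still uses barriers.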
7. Dex2oat Fsync latency
• HDD mounted with barrier: 300 ms latency
• HDD mounted with nobarrier: 105 ms latency
8. Androbench SQLite
[Charts: Transactions Per Second (TPS), HDD mounted with barrier vs. HDD mounted with nobarrier]
9. SQLite Fsync latency
• eMMC mounted with barrier: 860 µs latency
• eMMC mounted with nobarrier: 361 µs latency
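
The fsync latency figures for HDD (300 ms vs. 105 ms) and eMMC (860 µs vs. 361 µs) can be compared as relative changes. The helper names below are hypothetical; the inputs are the numbers from the slides.

```python
def speedup(before, after):
    """How many times faster the 'after' latency is than the 'before' latency."""
    return before / after


def reduction_pct(before, after):
    """Latency reduction as a percentage of the 'before' latency."""
    return 100.0 * (before - after) / before


# Latencies from the slides (same unit within each pair).
hdd = (300, 105)    # ms, barrier vs. nobarrier
emmc = (860, 361)   # µs, barrier vs. nobarrier

for name, (barrier, nobarrier) in (("HDD", hdd), ("eMMC", emmc)):
    print(name, round(speedup(barrier, nobarrier), 2),
          round(reduction_pct(barrier, nobarrier), 1))
```

This works out to roughly a 2.9x speedup (65% lower latency) on HDD and roughly 2.4x (58% lower) on eMMC.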
10. Androbench SQLite
[Charts: Transactions Per Second (TPS), eMMC mounted with barrier vs. eMMC mounted with nobarrier]