Write Amplification: An Analysis of In-Memory
Database Durability Techniques
Jaemyung Kim, Kenneth Salem, Khuzaima Daudjee
University of Waterloo
IMDM 2015
Durability Matters
OLTP IMDB
I/O is unavoidable!
(ACID, Durability)
Orders of magnitude faster!
Write I/O efficiency of in-memory
DBMS is an important issue.
cliparts from openclipart.org
Write Amplification?
Diagram: Application → In-memory Database, at update rate λ
Diagram: In-memory Database → Persistent Storage (On-disk Database), at I/O rate λP
I/O Per Update = λP / λ
Goal of Write Amplification Model
Quantify and compare the I/O efficiency of persistent storage management schemes
Provide insight into the different natures of update-in-place and copy-on-write storage managers
Lower the cost of operating a database management system (through improved I/O efficiency)
Lead to better system performance in situations where I/O capacity is constrained (restart recovery)
The following are not our goals:
Emulate a specific storage manager implementation
Compare specific implementations, e.g., "Hekaton is better than H-Store"
Architectural Diversity in IMDB SM
Two broad classes: Update In-Place and Copy-On-Write
Update In-Place (UIP)
UIP: conventional page-based (e.g., Shore-MT)
random writes for checkpointing
device sensitive: e.g., HDD vs. SSD
UIP-S: snapshot checkpointing (e.g., H-Store, SiloR)
Copy-On-Write (COW)
COW-D: logging-only (log-structured) database (e.g., Hekaton)
COW-M: log-structured memory and disk databases (e.g., RAMCloud)
UIP: Page-level Checkpoint
Example Space Constraint (α) = 1.2 × DB Size, Page Size = 2
Example Update Sequence: I, D, B
Memory: A B C D E F G H I J
Disk: DB = A B C D E F G H I J, LOG = 2 slots (0 = empty slot)
I/O Per Update = (#DBIO + #LogIO) / #Updates
Initial state: LOG = 0 0; LOG I/O History: (none); DB I/O History: (none)
After update I: LOG = I 0; LOG I/O History: I; DB I/O History: (none); I/O Per Update = (0 + 1) / 1 = 1
After update D: LOG = I D; LOG I/O History: I, D; DB I/O History: (none); I/O Per Update = (0 + 2) / 2 = 1
After checkpoint: LOG = 0 0; LOG I/O History: I, D; DB I/O History: C, D, I, J; I/O Per Update = (4 + 2) / 2 = 3
After update B: LOG = B 0; LOG I/O History: I, D, B; DB I/O History: C, D, I, J; I/O Per Update = (4 + 3) / 3 ≈ 2.33
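The page-level checkpoint walk-through above can be replayed with a small simulation. This is only a sketch under stated assumptions, not the authors' model: the function name `simulate_uip` is hypothetical, the log is assumed to hold 2 records (1.2 × 10 records minus the 10-record database), a checkpoint is assumed to fire when an update finds the log full, and every record written counts as one I/O.

```python
from string import ascii_uppercase


def simulate_uip(updates, db_size=10, page_size=2, log_slots=2):
    """Replay updates against a page-based UIP store (illustrative sketch).

    Returns (db_io, log_io, io_per_update).
    """
    if not updates:
        raise ValueError("need at least one update")
    records = ascii_uppercase[:db_size]   # records A..J for db_size=10
    log_used = 0                          # occupied log slots
    dirty_pages = set()                   # pages modified since last checkpoint
    db_io = 0
    log_io = 0
    for key in updates:
        if log_used == log_slots:
            # Assumed trigger: log is full, so flush every dirty page
            # (page_size records each) and reclaim the log space.
            db_io += len(dirty_pages) * page_size
            dirty_pages.clear()
            log_used = 0
        log_used += 1                     # append one log record
        log_io += 1
        dirty_pages.add(records.index(key) // page_size)
    return db_io, log_io, (db_io + log_io) / len(updates)


db_io, log_io, ratio = simulate_uip("IDB")
print(db_io, log_io, round(ratio, 2))  # 4 3 2.33
```

With the update sequence I, D, B the checkpoint flushes pages (C, D) and (I, J), which gives the 4 database I/Os and the ratio of about 2.33 shown on the slide.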
UIP-S: Snapshot Checkpoint
Example Space Constraint (α) = 1.2 × DB Size
Example Update Sequence: I, D, B
Memory: A B C D E F G H I J
Disk: DB = A B C D E F G H I J, LOG = 2 slots (0 = empty slot)
I/O Per Update = (#DBIO + #LogIO) / #Updates
After updates I and D: LOG = I D; LOG I/O History: I, D; DB I/O History: (none); I/O Per Update = (0 + 2) / 2 = 1
After snapshot checkpoint: LOG = 0 0; LOG I/O History: I, D; DB I/O History: A, B, C, D, E, F, G, H, I, J; I/O Per Update = (10 + 2) / 2 = 6
After update B: LOG = B 0; LOG I/O History: I, D, B; DB I/O History: A, B, C, D, E, F, G, H, I, J; I/O Per Update = (10 + 3) / 3 ≈ 4.33
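The snapshot variant can be sketched the same way. The assumptions here are illustrative rather than taken from the authors' model: `simulate_uip_snapshot` is a hypothetical name, the log holds 2 records, a snapshot is taken when an update finds the log full, and a snapshot writes every record of the database once.

```python
def simulate_uip_snapshot(updates, db_size=10, log_slots=2):
    """Replay updates against a snapshot-checkpointing store (sketch).

    Returns (db_io, log_io, io_per_update).
    """
    if not updates:
        raise ValueError("need at least one update")
    log_used = 0
    db_io = 0
    log_io = 0
    for _key in updates:
        if log_used == log_slots:
            # Assumed trigger: log is full, so write a full snapshot
            # of the database and reclaim the log space.
            db_io += db_size
            log_used = 0
        log_used += 1                     # append one log record
        log_io += 1
    return db_io, log_io, (db_io + log_io) / len(updates)


db_io, log_io, ratio = simulate_uip_snapshot("IDB")
print(db_io, log_io, round(ratio, 2))  # 10 3 4.33
```

The loop never looks at which record is updated, so in this sketch the result depends only on the number of updates, not on which records they touch.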
COW-D: Log-structured Disk
Example Space Constraint (α) = 1.2 × DB Size
Example Update Sequence: I, D, B
Memory: A B C D E F G H I J
Disk: log-structured DB with 12 slots (0 = empty slot)
I/O Per Update = (#ReadIO + #WriteIO) / #Updates
Initial state: Disk = H A E I F G D C B J 0 0; Read: (none); Write: (none)
After update I: Disk = H A E I F G D C B J I 0; Read: (none); Write: I; I/O Per Update = (0 + 1) / 1 = 1
After update D: Disk = H A E I F G D C B J I D; Read: (none); Write: I, D; I/O Per Update = (0 + 2) / 2 = 1
After H, A, E are read and rewritten: Disk = I F G D C B J I D H A E; Read: H, A, E; Write: I, D, H, A, E; I/O Per Update = (3 + 5) / 2 = 4
After the old copy of I is read and dropped: Disk = F G D C B J I D H A E 0; Read: H, A, E, I; Write: I, D, H, A, E; I/O Per Update = (4 + 5) / 2 = 4.5
After update B: Disk = F G D C B J I D H A E B; Read: H, A, E, I; Write: I, D, H, A, E, B; I/O Per Update = (4 + 6) / 3 ≈ 3.33
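The log-cleaning trace above can be reproduced with a short simulation. This is a sketch under assumptions that are not stated on the slides: `simulate_cow_d` is a hypothetical name, the on-disk log is treated as a 12-slot queue, cleaning starts when an update finds the log full, and the cleaner reads records from the head one at a time, rewriting live ones to the tail and dropping stale ones.

```python
from collections import deque


def simulate_cow_d(initial_log, updates, capacity):
    """Replay updates against a log-structured disk (illustrative sketch).

    Returns (reads, writes, final_log_order).
    """
    distinct = set(initial_log) | set(updates)
    if capacity <= len(distinct):
        raise ValueError("capacity must exceed the number of distinct records")
    latest = {}                 # key -> version number of its live copy
    log = deque()               # (key, version), oldest record at the left

    def append(key):
        latest[key] = latest.get(key, -1) + 1
        log.append((key, latest[key]))

    for key in initial_log:     # initial contents cost no I/O in this trace
        append(key)

    reads = 0
    writes = 0
    for key in updates:
        while len(log) >= capacity:
            old_key, version = log.popleft()
            reads += 1                          # cleaner reads from disk
            if latest[old_key] == version:      # still live: copy to tail
                log.append((old_key, version))
                writes += 1
        append(key)                             # write the new version
        writes += 1
    return reads, writes, "".join(k for k, _ in log)


reads, writes, order = simulate_cow_d("HAEIFGDCBJ", "IDB", capacity=12)
print(reads, writes, order, round((reads + writes) / 3, 2))
# 4 6 FGDCBJIDHAEB 3.33
```

The final log order and the counts of 4 reads and 6 writes match the last step of the slide.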
COW-M: Log-structured Memory
Example Space Constraint (α) = 1.2 × DB Size
Example Update Sequence: I, D, B
Memory: log-structured IMDB
Disk: log-structured DB with 12 slots (0 = empty slot)
I/O Per Update = #WriteIO / #Updates
Step 1: Memory = H A E I F G D C B J I; Disk = H A E I F G D C B J 0 0; Write: (none)
Step 2: Memory = H A E F G D C B J I D; Disk = H A E I F G D C B J I 0; Write: I; I/O Per Update = 1 / 1 = 1
Step 3: Memory = H A E F G C B J I D B; Disk = H A E I F G D C B J I D; Write: I, D; I/O Per Update = 2 / 2 = 1
Step 4: Memory = F G C B J I D H A E B; Disk = F G D C B J I D H A E 0; Write: I, D, H, A, E; I/O Per Update = 5 / 2 = 2.5
Step 5: Memory = F G C J I D H A E B; Disk = F G D C B J I D H A E B; Write: I, D, H, A, E, B; I/O Per Update = 6 / 3 = 2
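The disk side of this trace can be sketched as follows. The assumptions are illustrative: `simulate_cow_m` is a hypothetical name, the disk log has 12 slots, cleaning starts when an update finds the log full, and liveness is tracked in memory so the cleaner rewrites live records and drops stale ones without reading from disk.

```python
from collections import deque


def simulate_cow_m(initial_log, updates, capacity):
    """Replay updates against a memory-mirrored disk log (sketch).

    Returns (writes, final_disk_log_order). No reads are counted because
    the in-memory mirror is assumed to supply record contents and liveness.
    """
    distinct = set(initial_log) | set(updates)
    if capacity <= len(distinct):
        raise ValueError("capacity must exceed the number of distinct records")
    log = deque()               # entries are [key, is_live], oldest at left
    live = {}                   # key -> its live entry

    def append(key):
        if key in live:
            live[key][1] = False        # previous copy becomes stale
        entry = [key, True]
        live[key] = entry
        log.append(entry)

    for key in initial_log:     # initial contents cost no I/O in this trace
        append(key)

    writes = 0
    for key in updates:
        while len(log) >= capacity:
            entry = log.popleft()
            if entry[1]:                # live: rewrite at the tail
                log.append(entry)
                writes += 1
            # stale entries are simply dropped, with no I/O
        append(key)                     # write the new version
        writes += 1
    return writes, "".join(entry[0] for entry in log)


writes, order = simulate_cow_m("HAEIFGDCBJ", "IDB", capacity=12)
print(writes, order, writes / 3)  # 6 FGDCBJIDHAEB 2.0
```

Compared with the COW-D trace, the same 6 writes occur but the 4 cleaner reads disappear, which is where the lower ratio of 2 comes from in this example.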
What We Can Do Using WAF Model
Compare UIP and COW
Analyze the effect of persistent storage utilization (α)
Analyze the effect of update workload (uniform vs. skewed)
Analyze the effect of page size on UIP
Find the best page size for storage-specific performance characteristics
Analyze the effect of SSD vs. HDD persistent storage
Weight random and sequential I/O differently, depending on the device type
I/O Per Update = λP / λ
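Weighting random and sequential I/O differently can be sketched as a weighted form of the I/O-per-update ratio. The function name, the parameter names, and the weight of 10 used below are all hypothetical values chosen for illustration; the slides do not give the actual weights used for HDD or SSD.

```python
def weighted_io_per_update(random_ios, sequential_ios, updates,
                           random_weight=1.0, sequential_weight=1.0):
    """Weighted I/O per update (illustrative sketch, hypothetical weights)."""
    if updates <= 0:
        raise ValueError("updates must be positive")
    cost = random_weight * random_ios + sequential_weight * sequential_ios
    return cost / updates


# UIP page-level example: 4 checkpoint writes (treated as random I/O)
# and 3 log writes (treated as sequential I/O) over 3 updates.
print(round(weighted_io_per_update(4, 3, 3), 2))                      # 2.33
# Hypothetical device where a random I/O costs 10x a sequential one.
print(round(weighted_io_per_update(4, 3, 3, random_weight=10.0), 2))  # 14.33
```

With equal weights the function reduces to the unweighted ratio from the UIP example; raising the random-I/O weight penalizes schemes that checkpoint with random page writes.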
Result (Update In-Place)
[Figure: I/O per update (y-axis, ticks at 1, 2, 5, 10, 20) versus α (x-axis, 1.0 to 2.0). Series: UIP(HDD), UIP(SSD), UIP(HDD)-SKEW, UIP(SSD)-SKEW, UIP-S.]
Result (Copy-On-Write)
[Figure: I/O per update (y-axis, ticks at 1, 2, 5, 10, 20) versus α (x-axis, 1.0 to 2.0). Series: COW-D, COW-M, UIP-S.]
Result (COW vs. UIP)
[Figure: I/O per update (y-axis, ticks at 1, 2, 5, 10, 20) versus α (x-axis, 1.0 to 2.0). Series: COW-D, COW-M, UIP-S.]
Summary and Discussion
Write amplification model for in-memory database storage managers
Quantify and compare the I/O efficiency of two broad classes of storage managers (UIP, COW)
UIP (page-level)
commonly used in disk-based databases
effective when the capacity of persistent storage (or recovery time) is tightly constrained and/or the update workload is highly skewed (e.g., 99:1)
UIP-S
appealing option among UIPs (only sequential writes)
insensitive to update skew
COW-M
the most I/O efficient (lowest write amplification) in many settings
requires a log structure in memory that mirrors the log structure in persistent storage
Mahalo!
16

More Related Content

What's hot

A First Look at the DB2 10 DSNZPARM Changes
A First Look at the DB2 10 DSNZPARM ChangesA First Look at the DB2 10 DSNZPARM Changes
A First Look at the DB2 10 DSNZPARM ChangesWillie Favero
 
DB2UDB_the_Basics Day 6
DB2UDB_the_Basics Day 6DB2UDB_the_Basics Day 6
DB2UDB_the_Basics Day 6Pranav Prakash
 
DB2 for z/OS and DASD-based Disaster Recovery - Blowing away the myths
DB2 for z/OS and DASD-based Disaster Recovery - Blowing away the mythsDB2 for z/OS and DASD-based Disaster Recovery - Blowing away the myths
DB2 for z/OS and DASD-based Disaster Recovery - Blowing away the mythsFlorence Dubois
 
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...John Campbell
 
IBM SAN Volume Controller Performance Analysis
IBM SAN Volume Controller Performance AnalysisIBM SAN Volume Controller Performance Analysis
IBM SAN Volume Controller Performance Analysisbrettallison
 
DB2 Accounting Reporting
DB2  Accounting ReportingDB2  Accounting Reporting
DB2 Accounting ReportingJohn Campbell
 
Getting innodb compression_ready_for_facebook_scale
Getting innodb compression_ready_for_facebook_scaleGetting innodb compression_ready_for_facebook_scale
Getting innodb compression_ready_for_facebook_scaleNizameddin Ordulu
 
HBase Incremental Backup
HBase Incremental BackupHBase Incremental Backup
HBase Incremental BackupLee neal
 
Universal Table Spaces for DB2 10 for z/OS - IOD 2010 Seesion 1929 - favero
 Universal Table Spaces for DB2 10 for z/OS - IOD 2010 Seesion 1929 - favero Universal Table Spaces for DB2 10 for z/OS - IOD 2010 Seesion 1929 - favero
Universal Table Spaces for DB2 10 for z/OS - IOD 2010 Seesion 1929 - faveroWillie Favero
 
ALL ABOUT DB2 DSNZPARM
ALL ABOUT DB2 DSNZPARMALL ABOUT DB2 DSNZPARM
ALL ABOUT DB2 DSNZPARMIBM
 
Db2 for z os trends
Db2 for z os trendsDb2 for z os trends
Db2 for z os trendsCuneyt Goksu
 
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...PingCAP
 
zIIP Capacity Planning
zIIP Capacity PlanningzIIP Capacity Planning
zIIP Capacity PlanningMartin Packer
 
DB2UDB_the_Basics Day2
DB2UDB_the_Basics Day2DB2UDB_the_Basics Day2
DB2UDB_the_Basics Day2Pranav Prakash
 
zIIP Capacity Planning
zIIP Capacity PlanningzIIP Capacity Planning
zIIP Capacity PlanningMartin Packer
 
11 cool features in Defrag.nsf+ 11
11 cool features in Defrag.nsf+ 1111 cool features in Defrag.nsf+ 11
11 cool features in Defrag.nsf+ 11aosborne
 
DB2UDB_the_Basics Day 7
DB2UDB_the_Basics Day 7DB2UDB_the_Basics Day 7
DB2UDB_the_Basics Day 7Pranav Prakash
 
Db2 analytics accelerator technical update
Write Amplification: An Analysis of In-Memory Database Durability Techniques

  • 1. Write Amplification: An Analysis of In-Memory Database Durability Techniques Jaemyung Kim, Kenneth Salem, Khuzaima Daudjee University of Waterloo IMDM 2015
  • 5. Durability Matters. OLTP IMDB: I/O is unavoidable (ACID, Durability). Orders of magnitude faster! Write I/O efficiency of an in-memory DBMS is an important issue. Clip art from openclipart.org.
  • 8. Goal of Write Amplification Model. Quantify and compare the I/O efficiency of persistent storage management schemes. Provide insight into the different natures of update-in-place and copy-on-write storage managers. Lower the cost of operating a database management system (through improved I/O efficiency). Lead to better system performance in situations where I/O capacity is constrained (restart recovery). The following are not our goals: emulating a specific storage manager implementation; comparing specific implementations (e.g., "Hekaton is better than H-Store").
  • 15. Architectural Diversity in IMDB SM. Two broad classes: Update In-Place and Copy-On-Write. Update In-Place (UIP): UIP is conventional page-based storage (e.g., Shore-MT), with random writes for checkpointing, and is device sensitive (e.g., HDD vs. SSD); UIP-S uses snapshot checkpointing (e.g., H-Store, SiloR). Copy-On-Write (COW): COW-D is a logging-only (log-structured) database (e.g., Hekaton); COW-M uses log-structured memory and disk databases (e.g., RAMCloud).
  • 24. UIP: Page-level Checkpoint Example. Space constraint (α) = 1.2 × DB size; page size = 2. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: (empty) (empty). LOG I/O history: none. DB I/O history: none. I/O Per Update = (#DBIO + #LogIO) / #Updates.
  • 25. UIP: Page-level Checkpoint Example. Space constraint (α) = 1.2 × DB size; page size = 2. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: I (empty). LOG I/O history: I. DB I/O history: none. I/O Per Update = (#DBIO + #LogIO) / #Updates = (0 + 1) / 1 = 1.
  • 26. UIP: Page-level Checkpoint Example. Space constraint (α) = 1.2 × DB size; page size = 2. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: I D. LOG I/O history: I, D. DB I/O history: none. I/O Per Update = (#DBIO + #LogIO) / #Updates = (0 + 2) / 2 = 1.
  • 27. UIP: Page-level Checkpoint Example. Space constraint (α) = 1.2 × DB size; page size = 2. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: (empty) (empty). LOG I/O history: I, D. DB I/O history: C, D, I, J. I/O Per Update = (#DBIO + #LogIO) / #Updates = (4 + 2) / 2 = 3.
  • 28. UIP: Page-level Checkpoint Example. Space constraint (α) = 1.2 × DB size; page size = 2. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: B (empty). LOG I/O history: I, D, B. DB I/O history: C, D, I, J. I/O Per Update = (#DBIO + #LogIO) / #Updates = (4 + 3) / 3 ≈ 2.33.
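The page-level checkpoint walk-through above can be replayed with a small simulation. This is a minimal sketch, not the authors' model: the function name `uip_page_checkpoint` and the rule "checkpoint as soon as the log fills up" are assumptions chosen to match the slide's I/O histories.

```python
from string import ascii_uppercase

def uip_page_checkpoint(updates, db_size=10, page_size=2, alpha=1.2):
    """Return (db_io, log_io) for a page-level update-in-place store.

    Assumption: a checkpoint runs as soon as the log is full, writes every
    dirty page in full (page_size record I/Os each), and empties the log.
    """
    records = ascii_uppercase[:db_size]
    page_of = {rec: i // page_size for i, rec in enumerate(records)}
    log_capacity = round(alpha * db_size) - db_size  # space left for the log
    log_len = 0
    dirty_pages = set()
    db_io = log_io = 0
    for rec in updates:
        log_len += 1               # every update appends one log record
        log_io += 1
        dirty_pages.add(page_of[rec])
        if log_len == log_capacity:
            db_io += len(dirty_pages) * page_size  # flush dirty pages
            dirty_pages.clear()
            log_len = 0            # the log can be truncated
    return db_io, log_io

db_io, log_io = uip_page_checkpoint("IDB")
print(db_io, log_io, round((db_io + log_io) / 3, 2))  # 4 3 2.33
```

Updating I and D dirties the pages (I, J) and (C, D), so the checkpoint costs four record writes, which reproduces the slide's (4 + 3) / 3.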
  • 30. UIP-S: Snapshot Checkpoint Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: I D. LOG I/O history: I, D. DB I/O history: none. I/O Per Update = (#DBIO + #LogIO) / #Updates = (0 + 2) / 2 = 1.
  • 31. UIP-S: Snapshot Checkpoint Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: (empty) (empty). LOG I/O history: I, D. DB I/O history: A, B, C, D, E, F, G, H, I, J. I/O Per Update = (#DBIO + #LogIO) / #Updates = (10 + 2) / 2 = 6.
  • 32. UIP-S: Snapshot Checkpoint Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk DB: A B C D E F G H I J. Disk LOG: B (empty). LOG I/O history: I, D, B. DB I/O history: A, B, C, D, E, F, G, H, I, J. I/O Per Update = (#DBIO + #LogIO) / #Updates = (10 + 3) / 3 ≈ 4.33.
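The snapshot walk-through above reduces to simple counting, because a snapshot checkpoint writes the whole database regardless of which records changed. The sketch below is illustrative only; the name `uip_snapshot` and the "checkpoint when the log is full" trigger are assumptions.

```python
def uip_snapshot(updates, db_size=10, alpha=1.2):
    """Return (db_io, log_io) for snapshot checkpointing (UIP-S).

    Assumption: when the log fills up, the entire database is written
    sequentially as a new snapshot and the log is emptied.
    """
    log_capacity = round(alpha * db_size) - db_size
    log_len = db_io = log_io = 0
    for _ in updates:              # the identity of the record is irrelevant
        log_len += 1
        log_io += 1
        if log_len == log_capacity:
            db_io += db_size       # write every record of the database
            log_len = 0
    return db_io, log_io

db_io, log_io = uip_snapshot("IDB")
print(db_io, log_io, round((db_io + log_io) / 3, 2))  # 10 3 4.33
```

Because the loop never looks at which record was updated, any three-update sequence gives the same counts in this sketch.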
  • 35. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): H A E I F G D C B J (empty) (empty). DB I/O history: Read: none; Write: none. I/O Per Update = (#ReadIO + #WriteIO) / #Updates.
  • 36. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): H A E I F G D C B J I (empty). DB I/O history: Read: none; Write: I. I/O Per Update = (#ReadIO + #WriteIO) / #Updates = (0 + 1) / 1 = 1.
  • 37. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): H A E I F G D C B J I D. DB I/O history: Read: none; Write: I, D. I/O Per Update = (#ReadIO + #WriteIO) / #Updates = (0 + 2) / 2 = 1.
  • 38. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): I F G D C B J I D H A E. DB I/O history: Read: H, A, E; Write: I, D, H, A, E. I/O Per Update = (#ReadIO + #WriteIO) / #Updates = (3 + 5) / 2 = 4.
  • 39. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): F G D C B J I D H A E (empty). DB I/O history: Read: H, A, E, I; Write: I, D, H, A, E. I/O Per Update = (#ReadIO + #WriteIO) / #Updates = (4 + 5) / 2 = 4.5.
  • 40. COW-D: Log-structured Disk Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory: A B C D E F G H I J. Disk (log-structured DB): F G D C B J I D H A E B. DB I/O history: Read: H, A, E, I; Write: I, D, H, A, E, B. I/O Per Update = (#ReadIO + #WriteIO) / #Updates = (4 + 6) / 3 ≈ 3.33.
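The log-structured disk walk-through above can be reproduced with a queue and a per-record version counter. This is a hedged sketch rather than the authors' model: the name `cow_d`, the version bookkeeping, and the lazy trigger (clean only when an append finds the disk full) are assumptions. The slides charge the cleaning to the second update instead, but the totals after three updates agree.

```python
from collections import deque

def cow_d(updates, layout="HAEIFGDCBJ", alpha=1.2):
    """Return (reads, writes, final_disk_layout) for a log-structured disk.

    Assumption: when the disk is full, the cleaner reads records from the
    head of the log; live records are copied to the tail, stale ones dropped.
    """
    capacity = round(alpha * len(layout))
    version = {rec: 0 for rec in layout}         # latest version per record
    disk = deque((rec, 0) for rec in layout)     # oldest entry on the left
    reads = writes = 0
    for rec in updates:
        while len(disk) >= capacity:             # no free slot: clean
            old, ver = disk.popleft()
            reads += 1                           # cleaner must read the entry
            if version[old] == ver:              # still live: copy forward
                disk.append((old, ver))
                writes += 1
        version[rec] += 1
        disk.append((rec, version[rec]))         # append the new version
        writes += 1
    return reads, writes, "".join(rec for rec, _ in disk)

reads, writes, final_layout = cow_d("IDB")
print(reads, writes, final_layout)  # 4 6 FGDCBJIDHAEB
```

The cleaner reads H, A, E (live, rewritten) and then the old copy of I (stale, dropped), which matches the slide's read history and its final (4 + 6) / 3.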
  • 42. COW-M: Log-structured Memory Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory (log-structured IMDB): H A E I F G D C B J I. Disk (log-structured DB): H A E I F G D C B J (empty) (empty). DB I/O history: Write: none. I/O Per Update = #WriteIO / #Updates.
  • 43. COW-M: Log-structured Memory Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory (log-structured IMDB): H A E F G D C B J I D. Disk (log-structured DB): H A E I F G D C B J I (empty). DB I/O history: Write: I. I/O Per Update = #WriteIO / #Updates = 1 / 1 = 1.
  • 44. COW-M: Log-structured Memory Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory (log-structured IMDB): H A E F G C B J I D B. Disk (log-structured DB): H A E I F G D C B J I D. DB I/O history: Write: I, D. I/O Per Update = #WriteIO / #Updates = 2 / 2 = 1.
  • 45. COW-M: Log-structured Memory Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory (log-structured IMDB): F G C B J I D H A E B. Disk (log-structured DB): F G D C B J I D H A E (empty). DB I/O history: Write: I, D, H, A, E. I/O Per Update = #WriteIO / #Updates = 5 / 2 = 2.5.
  • 46. COW-M: Log-structured Memory Example. Space constraint (α) = 1.2 × DB size. Example update sequence: I, D, B. Memory (log-structured IMDB): F G C J I D H A E B. Disk (log-structured DB): F G D C B J I D H A E B. DB I/O history: Write: I, D, H, A, E, B. I/O Per Update = #WriteIO / #Updates = 6 / 3 = 2.
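The COW-M walk-through above counts only writes, since the in-memory log mirrors the on-disk log and the cleaner never has to read from disk. The sketch below illustrates that accounting under stated assumptions: the name `cow_m_writes`, the version counters, and cleaning only when an append finds the disk full are all hypothetical choices, not the authors' implementation.

```python
from collections import deque

def cow_m_writes(updates, layout="HAEIFGDCBJ", alpha=1.2):
    """Return the number of disk writes for a log-structured memory store.

    Assumption: the in-memory log mirrors the disk log, so the cleaner knows
    which head entries are live and rewrites them from memory without a read.
    """
    capacity = round(alpha * len(layout))
    version = {rec: 0 for rec in layout}         # latest version per record
    disk = deque((rec, 0) for rec in layout)     # oldest entry on the left
    writes = 0
    for rec in updates:
        while len(disk) >= capacity:             # no free slot: clean
            old, ver = disk.popleft()            # no disk read is charged
            if version[old] == ver:              # live: rewrite at the tail
                disk.append((old, ver))
                writes += 1
        version[rec] += 1
        disk.append((rec, version[rec]))         # append the new version
        writes += 1
    return writes

writes = cow_m_writes("IDB")
print(writes, writes / 3)  # 6 2.0
```

The six writes are I, D, H, A, E, B, as in the slide's write history, giving 6 / 3 = 2.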
  • 48. What We Can Do Using WAF Model. Compare UIP and COW. Analyze the effect of persistent storage utilization (α). Analyze the effect of update workload (uniform vs. skew). Analyze the effect of page size on UIP: find the best page size for storage-specific performance characteristics. Analyze the effect of SSD vs. HDD persistent storage: weight random and sequential I/O differently, depending on the device type. I/O Per Update = λP / λ, where the diagram labels the application's updates to the in-memory database with λ and the I/O to persistent storage (the on-disk database) with λP.
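Weighting random and sequential I/O differently, as the list above proposes, can be sketched as a weighted sum divided by the number of updates. The function name and the weight values below are purely illustrative assumptions; the slides do not give device weights.

```python
def weighted_io_per_update(random_io, sequential_io, updates,
                           random_weight=1.0, sequential_weight=1.0):
    """Weighted I/O per update; weights model device-specific I/O cost.

    Assumption: checkpoint page writes count as random I/O and log appends
    as sequential I/O. The weights are hypothetical, not measured values.
    """
    cost = random_weight * random_io + sequential_weight * sequential_io
    return cost / updates

# Page-level UIP example: 4 checkpoint writes, 3 log writes, 3 updates.
print(round(weighted_io_per_update(4, 3, 3), 2))  # 2.33
# Hypothetical device on which a random write costs 10x a sequential write.
print(round(weighted_io_per_update(4, 3, 3, random_weight=10.0), 2))  # 14.33
```

With equal weights the sketch reduces to the unweighted I/O per update; raising `random_weight` penalizes schemes that checkpoint with random page writes.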
  • 55. Result (Update In-Place). [Figure: I/O per update (y-axis ticks at 1, 2, 5, 10, 20) versus α (1.0 to 2.0) for UIP(HDD), UIP(SSD), UIP(HDD)-SKEW, UIP(SSD)-SKEW, and UIP-S; successive builds annotate the HDD, SSD, skew, and UIP-S curves.]
  • 59. Result (Copy-On-Write). [Figure: I/O per update (y-axis ticks at 1, 2, 5, 10, 20) versus α (1.0 to 2.0) for COW-D, COW-M, and UIP-S; successive builds annotate the COW-D and COW-M curves.]
  • 61. Result (COW vs. UIP). [Figure: I/O per update (y-axis ticks at 1, 2, 5, 10, 20) versus α (1.0 to 2.0) for COW-D, COW-M, and UIP-S, with all three curves annotated.]
  • 62. Summary and Discussion. Write amplification model for in-memory database storage managers: quantifies and compares the I/O efficiency of two broad classes of storage managers (UIP, COW). UIP (page-level): commonly used in disk-based databases; effective when the capacity of persistent storage (or recovery time) is tightly constrained and/or updates are highly skewed (e.g., 99:1). UIP-S: an appealing option among UIPs (only sequential writes); insensitive to update skew. COW-M: the most I/O efficient (lowest write amplification) in many settings; requires a log structure in memory that mirrors the log structure in persistent storage.