Vacuum More Efficient Than Ever
Masahiko Sawada
NTT Open Source Software Center
@PGCon 2018
Copyright©2018 NTT Corp. All Rights Reserved.
Who Am I?
• Masahiko Sawada
  - from Tokyo, Japan
• PostgreSQL contributor
  - Multiple synchronous replication: FIRST and ANY methods (9.6 and 10)
  - Freeze map (9.6)
  - Skipping index cleanup (11)
Agenda
• What Is Vacuum?
• Three Vacuum Improvements
  - Problems
  - Solutions
  - Challenges
  - Evaluations
• Conclusion
What Is Vacuum?
• PostgreSQL's garbage collection feature
  - Recovers or reuses occupied disk space
• VACUUM command
  - =# VACUUM tbl1, tbl2;
  - =# VACUUM (ANALYZE, VERBOSE) tbl1;
• Autovacuum
  - autovacuum launcher process
  - autovacuum worker processes
History of Vacuum Evolution
• Autovacuum (8.1~)
• Vacuum delay (8.1~)
• Visibility map (8.4~)
• Freeze map (part of the visibility map) (9.6~)
• Skipping index cleanup (11~)
What's Needed For "Good Vacuum"?
1. Shorten the vacuum execution time
  - Use resources as much as possible
  - Reduce the amount of work
  - Work in parallel
2. But reduce the impact on transaction processing
  - Work lazily
How Vacuum Actually Works
• Vacuum works in three phases:
  1. Collect dead tuple TIDs until maintenance_work_mem of memory is consumed
  2. Vacuum the indexes
  3. Vacuum the table
• Vacuum is a disk-intensive operation
[Figure: phase 1 collects dead tuple TIDs, which feed the index vacuums (2-1, 2-2) and the table vacuum (3)]
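The three phases above can be sketched as a loop (purely illustrative Python; the function and data-structure names are hypothetical, and the real code works on pages and item pointers, not Python lists):

```python
# Sketch of lazy vacuum's phased loop: collect dead-tuple TIDs until the
# memory budget fills, then vacuum every index, then the table, and repeat.
# All names here are illustrative, not PostgreSQL internals.

def lazy_vacuum(table_pages, indexes, maintenance_work_mem_tids):
    dead_tids = []
    for page_no, page in enumerate(table_pages):
        # Phase 1: collect TIDs of dead tuples on this page
        for offset, tuple_is_dead in enumerate(page):
            if tuple_is_dead:
                dead_tids.append((page_no, offset))
        if len(dead_tids) >= maintenance_work_mem_tids:
            _vacuum_indexes_and_table(indexes, dead_tids)  # phases 2 and 3
            dead_tids.clear()
    if dead_tids:
        _vacuum_indexes_and_table(indexes, dead_tids)

def _vacuum_indexes_and_table(indexes, dead_tids):
    dead = set(dead_tids)
    for index in indexes:
        # Phase 2: every index is scanned and matching entries removed
        index[:] = [tid for tid in index if tid not in dead]
    # Phase 3: table vacuum would now mark these item pointers reusable
```

Note how a small maintenance_work_mem forces the index/table passes to run multiple times per vacuum, which is exactly the cost the later sections attack.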
Vacuum With Visibility Map
• 2 bits per block: all-visible and all-frozen
• Tracks which pages "might" have garbage
• all-visible bit = 1 means the corresponding page has only visible tuples, so we don't need to vacuum it
[Figure: same flow as before, but the TID-collecting scan skips all-visible pages :-)]
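With 2 bits per heap block, four blocks pack into one byte; a sketch of reading the map and finding skippable blocks (illustrative, loosely modeled on the visibility map's bit layout):

```python
# Sketch: a visibility map keeps 2 bits per heap block, so 4 blocks
# fit in one byte. Illustrative model only, not PostgreSQL's actual code.
ALL_VISIBLE = 0x01
ALL_FROZEN = 0x02

def vm_bits(vm: bytearray, block: int) -> int:
    """Return the 2-bit status for a heap block."""
    byte = vm[block // 4]
    shift = (block % 4) * 2
    return (byte >> shift) & 0x03

def skippable(vm: bytearray, n_blocks: int):
    """Blocks an ordinary vacuum may skip: those marked all-visible."""
    return [b for b in range(n_blocks) if vm_bits(vm, b) & ALL_VISIBLE]
```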
Factors Of Vacuum Performance
• Table size
• Number of indexes
• Resources
Existing features:
• Visibility map
• Vacuum delays (make vacuum lazy)
• Skipping index cleanup
Factors Of Vacuum Performance
• Table size
• Number of indexes
• Resources
Today's talk:
• Parallel vacuum
• Deferring index vacuums
• Range vacuum
Existing features:
• Visibility map
• Vacuum delays (make vacuum lazy)
• Skipping index cleanup
PARALLEL VACUUM
On Very Large Tables
• Vacuum is performed by a single process
• Vacuum can take a very long time
  - Days, or even longer!
• It takes even longer if the table has multiple indexes
Current Solutions
• Divide the large table
• Reduce autovacuum_vacuum_cost_delay / raise autovacuum_vacuum_cost_limit
  - Puts an additional burden on disk I/O instead
Idea: Parallel Vacuum
• Execute vacuum with parallel workers
  - Shortens the execution time of vacuum
  - Note that this consumes more disk I/O
• A patch has been proposed
  - "Block level Parallel Vacuum" (2016)
• However, the relation extension (RelExt) lock issue must be resolved first
  - See "Moving relation extension locks out of heavyweight lock manager" (2016)
How Does It Work?
• Perform both TID collection and table vacuum with parallel workers
• Dead tuple TIDs are shared in shared memory (DSM)
• Each index is assigned to one worker
• Some synchronization points among the workers
[Figure: workers 1-3 each collect TIDs and vacuum the table in parallel; index A and index B are vacuumed by different workers; garbage TIDs are cleared after each round]
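The one-index-per-worker split can be sketched as follows (illustrative only; Python threads stand in for PostgreSQL background workers, and a shared set stands in for the DSM segment of dead-tuple TIDs):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_index_vacuum(indexes, dead_tids, n_workers):
    """Vacuum each index in its own worker. dead_tids is shared
    read-only, standing in for the DSM dead-tuple TID array."""
    dead = set(dead_tids)

    def vacuum_one_index(index):
        # Remove index entries pointing at dead tuples
        index[:] = [tid for tid in index if tid not in dead]
        return len(index)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each index is handed to a worker; workers proceed independently
        # and the pool shutdown acts as the synchronization barrier.
        return list(pool.map(vacuum_one_index, indexes))
```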
Evaluation (~8 indexes)
• 5x faster!!
[Chart: execution time (ms) by number of indexes, with and without parallel vacuum]
• Setup: 32GB RAM, shared_buffers = 512MB, 4GB table, 1 index vacuum pass, 256GB ioDrive SSD
Summary
• Parallel vacuum makes vacuum significantly faster
• It consumes more CPU and disk I/O
• A patch has been proposed
• The relation extension lock issue must be solved first!
DEFERRING INDEX VACUUM
Looking Back at the Analysis of Vacuum
• Index vacuums can still be very long
  - Table vacuum can skip pages using the visibility map, but index vacuum has no such facility
  - Index vacuum can be invoked N times within a single vacuum run
• Almost all index AMs require a full scan of the index
  - Just 10 dead tuples in a 1TB table still require whole-index scans!
[Figure: plain vacuum vs. VM-assisted vacuum; the index-vacuum passes remain in both]
Current Solutions
• Don't trigger autovacuum with a small threshold
  - What about manual vacuum?
  - Indexes don't bloat as easily as tables
• Increase maintenance_work_mem to avoid calling index vacuum multiple times
  - However, index vacuum is still required at least once
Idea: Deferring Index Vacuum
• Spool garbage TIDs
• Don't trigger index vacuum until the amount of spooled garbage TIDs reaches a threshold
• Reduces the number of index vacuums
How Does It Work?
• Amount of garbage TIDs < threshold
  - Vacuum only the table and spool the dead tuple TIDs
• Amount of garbage TIDs >= threshold
  - Vacuum the indexes too
[Figure: repeated table vacuums spool TIDs into a spool area; once the threshold is reached, the indexes are vacuumed]
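A minimal sketch of the spool-then-defer logic (the threshold plays the role of the vacuum_index_defer_size storage parameter named in the evaluation; everything else here is hypothetical):

```python
# Sketch of deferring index vacuum: table vacuum always runs, index
# vacuum only once the spooled garbage reaches a threshold.
# Illustrative model, not the actual patch.

class DeferredIndexVacuum:
    def __init__(self, defer_threshold):
        self.defer_threshold = defer_threshold  # max spooled TIDs
        self.spooled = []                       # stands in for the DSM spool

    def vacuum(self, dead_tids, indexes):
        """Returns True when an index vacuum was actually performed."""
        # (table vacuum of dead_tids would happen here, every time)
        self.spooled.extend(dead_tids)
        if len(self.spooled) < self.defer_threshold:
            return False                        # index vacuum deferred
        dead = set(self.spooled)
        for index in indexes:
            # bulk-delete consults spooled TIDs as well as new ones
            index[:] = [tid for tid in index if tid not in dead]
        self.spooled.clear()
        return True
```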
Related Discussions
• "Proposal: Another attempt at vacuum improvements" (2011)
• "Single pass vacuum - take 1" (2011)
• But these break the on-disk format
Evaluation
• Evaluate the performance improvement from reducing the number of index vacuums
  - Spool garbage TIDs to DSM
  - During bulk deletion, look up both the collected TIDs and the spooled TIDs
• Introduce a new storage parameter, vacuum_index_defer_size, which controls how many dead tuples can be spilled out
• However, this prototype doesn't care about concurrent updates or durability :(
Evaluation
=# \dt+
                      List of relations
 Schema |     Name     | Type  |  Owner   |  Size   | Description
--------+--------------+-------+----------+---------+-------------
 public | defer_table  | table | masahiko | 3458 MB |
 public | normal_table | table | masahiko | 3458 MB |
(2 rows)

-- Spool size is 100kB
=# ALTER TABLE defer_table SET (vacuum_index_defer_size = 100);
-- Disable deferring index vacuum
=# ALTER TABLE normal_table SET (vacuum_index_defer_size = 0);
Evaluation
1. Load data
2. Vacuum the table to build the visibility map
3. Loop until the amount of garbage reaches the threshold (= 17,000 tuples)
   1. Delete 5,000 tuples to make garbage
   2. Vacuum
Vacuum is performed 4 times, and index vacuum is executed only at the 4th vacuum.
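The arithmetic behind that schedule can be checked directly (5,000 deleted tuples per round accumulating against the 17,000-tuple threshold):

```python
# With 5,000 tuples deleted per round, the spooled garbage first
# reaches the 17,000-tuple threshold on the 4th vacuum.
import math

per_round, threshold = 5000, 17000
rounds_until_index_vacuum = math.ceil(threshold / per_round)
```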
Evaluation
• Index vacuum was skipped at the 1st, 2nd and 3rd vacuums
• Deferring index vacuum made vacuum 2.1x faster
• At the 4th vacuum, deferring index vacuum took twice as long as the normal one
  - It looks up the spooled TIDs as well as the collected TIDs
Summary
• Deferring index vacuum has the potential to speed up vacuums considerably
  - In this evaluation it was 2.1x faster
• More tricks are required for a correct implementation
  - Prevent vacuumed item pointers from being reused before index vacuum
  - Avoid breaking the on-disk format
RANGE VACUUM
Dilemma
• DBAs want to avoid disk I/O bursts and any impact of vacuum on TPS as much as possible
• DBAs also want vacuum to complete as quickly as possible
Long-running Vacuum Problems
• A long-running vacuum is likely to be canceled
  - Vacuum then restarts from the beginning of the table
• It cannot reclaim garbage created after the vacuum started
Is it possible to use vacuum delays and still complete vacuum in a short time?
Efficiency Analysis of Vacuum
• The cost of vacuuming a block can be regarded as almost constant
  - Most of the time is spent on disk I/O (reading buffers, writing WAL)
• Garbage on a table tends to have locality
• Even though vacuum reclaims a block, the new free space gained depends on how much garbage exists on that block
Efficiency Analysis of Vacuum
• If vacuuming M bytes yields N bytes of free space, the efficiency of vacuum is k = N/M
  - k = 1 means we get as much free space as we vacuumed
  - k ≈ 0 means we get almost no free space even after vacuuming lots of blocks
• The all-visible bit in the VM is cleared if even one tuple is inserted or deleted
[Chart: amount of garbage vs. block number; ranges with concentrated garbage have good efficiency (k≈1), sparse ranges have poor efficiency (k≈0)]
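Given per-range garbage statistics, choosing which ranges to vacuum first is a straightforward ranking by k (illustrative sketch; the function names are hypothetical):

```python
# Sketch: per-range vacuum efficiency k = freed bytes / vacuumed bytes,
# and picking the most efficient ranges first.

def vacuum_efficiency(freed_bytes, vacuumed_bytes):
    return freed_bytes / vacuumed_bytes

def best_ranges(garbage_bytes_per_range, range_bytes, top_fraction):
    """Return range indexes ranked by k, keeping only top_fraction."""
    ranked = sorted(
        range(len(garbage_bytes_per_range)),
        key=lambda r: vacuum_efficiency(garbage_bytes_per_range[r],
                                        range_bytes),
        reverse=True,
    )
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]
```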
Range Vacuum with Garbage Map
• Garbage map
  - Tracks the garbage status of bunches of blocks
  - Reproduces the garbage status of the table
• Range vacuum
  - Preferentially vacuums block ranges having a higher k
  - Triggered more frequently
[Figure: before/after vacuuming the top 10% of ranges]
Building Garbage Maps
• WAL-based
  - WAL knows all block-change information
  - Avoids increasing transaction latency as much as possible
  - Logical decoding didn't match (so far): it would need to track block-level changes and aborted transactions
• "WALker" module
  - A background worker that continuously reads WAL
  - Invokes the corresponding plugin callbacks
  - The "garbagemap" plugin builds garbage maps
  - Repository: https://github.com/MasahikoSawada/walker
Garbage Map Details
• Divide the table logically into ranges of 4,096 blocks (32MB)
• Track the number of garbage tuples per range with an integer: 4MB for 2^32 blocks
• Reorder transaction information to build the garbage maps
  - In a committed transaction, deleted tuples become garbage tuples
  - In an aborted transaction, inserted tuples become garbage tuples
• Vacuum only the ranges with higher efficiency
  - Also added a lower bound on table size for using range vacuum
[Figure: backends write WAL; the WALker background worker reads it and generates the garbage maps; the autovacuum worker uses them to range-vacuum the table]
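The sizing claims above (4,096-block ranges, one integer per range, 4MB for 2^32 blocks) can be checked in a few lines (constants taken from the slide; the helper names are illustrative):

```python
# Sketch of the garbage-map bookkeeping: 4,096-block (32MB) ranges,
# one integer counter per range.
BLOCK_SIZE = 8192
BLOCKS_PER_RANGE = 4096          # 4096 * 8kB = 32MB per range
COUNTER_BYTES = 4                # one integer per range

def range_of_block(block_no):
    return block_no // BLOCKS_PER_RANGE

def map_size_bytes(n_blocks):
    n_ranges = -(-n_blocks // BLOCKS_PER_RANGE)  # ceiling division
    return n_ranges * COUNTER_BYTES
```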
Evaluation
• Machine
  - 144 cores, 126GB RAM, 1.5TB SSD
• Target
  - master branch (ff49430 snapshot), with and without the range vacuum feature
• Configuration
  - autovacuum_vacuum_scale_factor = 0.04
  - autovacuum_vacuum_cost_limit = 1000 (default is 200)
  - autovacuum_vacuum_cost_delay = 20ms (the default)
• Workload
  - pgbench (TPC-B) at scale factor 16000 (about 200GB)
  - Custom script (gaussian : uniform = 9 : 1)
  - 5 hours
  - An open transaction runs for 10 min at 30 min intervals (to generate garbage faster)
• Observations
  - Transaction TPS
  - Transaction latency
  - Relation size
Results: Relation Size
• Autovacuum started about 2 hours in
• Master branch
  - Didn't complete autovacuum within 5 hours
  - Took over 9 hours (not recorded)
• Range vacuum
  - Ran 6 times
  - Processed 800 ranges (27GB, 10% of the table) within 50 minutes on average
[Chart: relation size over time, master vs. range vacuum (212GB / 215GB)]
Results: TPS and Latency
• In the master branch, latency sometimes spiked after autovacuum started
  - Frequently updated blocks are likely to be loaded into shared buffers
• TPS and latency on the range vacuum branch were more stable
[Charts: TPS and latency over time for master and range vacuum; markers show where autovacuum starts]
Summary
• Range vacuum reclaims garbage space in a short time with minimal side effects
• Invoking range vacuum more frequently also means calling index vacuum more frequently
  - Combining it with deferring index vacuum would be a good idea
• Each range holds the number of garbage tuples
  - It could hold the size of the garbage instead
• The whole table still needs vacuuming if garbage is placed uniformly across the table
Conclusion
• Improvement ideas
  - Parallel vacuum
  - Deferring index vacuum
  - Range vacuum and garbage map
• More improvement points
  - Autovacuum scheduling (a patch has been proposed)
  - Resource management (e.g. using cgroups)
  - etc.
Thank you!
Masahiko Sawada
Mail: sawada.mshk@gmail.com
Twitter: @sawada_masahiko
Spooling Dead Tuple TIDs
1. HOT pruning and table vacuum mark all item pointers that are still pointed to by index tuples as VACUUM_DEAD
2. Spool dead tuple TIDs as a bitmap per block
3. During an index vacuum, scan each index page and check whether index tuples point to spooled dead tuple TIDs
4. Reclaim matched index tuples and clear the corresponding bits
   - If all bits are cleared, record the LSN at which the index vacuum was invoked, along with the bitmap
5. At HOT pruning or vacuum, mark VACUUM_DEAD item pointers as UNUSED if the current LSN > the stored LSN
• Data representation of dead tuple TIDs
  - Dead tuple TIDs are stored in a new fork, <relfilenode>_dt
  - 300 bits (25 byte) for the bitmap plus 8 bytes for the LSN per 8kB block
  - One dt page holds information for 234 blocks
  - 1GB table -> 4MB dt fork; 1TB -> 4GB dt fork
  - To make the existence check faster, before starting an index vacuum, build a Bloom filter over the blocks that have any bits set
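The dt-fork sizing above (information for 234 heap blocks per 8kB dt page) can be sanity-checked as follows (numbers from the slide; the fork layout itself is only a proposal):

```python
# Sanity check of the dt-fork sizing: with 234 heap blocks described per
# 8kB dt page, a 1GB table needs roughly 4MB of dt fork and a 1TB table
# roughly 4GB.
BLOCK_SIZE = 8192
HEAP_BLOCKS_PER_DT_PAGE = 234

def dt_fork_bytes(table_bytes):
    heap_blocks = -(-table_bytes // BLOCK_SIZE)              # ceiling
    dt_pages = -(-heap_blocks // HEAP_BLOCKS_PER_DT_PAGE)    # ceiling
    return dt_pages * BLOCK_SIZE
```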
Configurations
• autovacuum_vacuum_scale_factor = 0.04
• autovacuum_naptime = 10
• autovacuum_vacuum_cost_limit = 1000
• autovacuum_vacuum_cost_delay = 20ms
• checkpoint_completion_target = 0.3
• garbagemap.min_range_vacuum_size = 10GB
• garbagemap.range_vacuum_percent = 30
• shared_buffers = 50GB
• max_wal_size = 100GB
• min_wal_size = 50GB
Dead Tuples
OSS 開発ってどうやっているの? ~ PostgreSQL の現場から~Masahiko Sawada
 
PostgreSQL10徹底解説
PostgreSQL10徹底解説PostgreSQL10徹底解説
PostgreSQL10徹底解説Masahiko Sawada
 
What’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorWhat’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorMasahiko Sawada
 
PostgreSQL 9.6 新機能紹介
PostgreSQL 9.6 新機能紹介PostgreSQL 9.6 新機能紹介
PostgreSQL 9.6 新機能紹介Masahiko Sawada
 
pg_bigmと類似度検索
pg_bigmと類似度検索pg_bigmと類似度検索
pg_bigmと類似度検索Masahiko Sawada
 
pg_bigmを触り始めた人に伝えたいこと
pg_bigmを触り始めた人に伝えたいことpg_bigmを触り始めた人に伝えたいこと
pg_bigmを触り始めた人に伝えたいことMasahiko Sawada
 
XID周回問題に潜む別の問題
XID周回問題に潜む別の問題XID周回問題に潜む別の問題
XID周回問題に潜む別の問題Masahiko Sawada
 
PostgreSQL共有バッファと関連ツール
PostgreSQL共有バッファと関連ツールPostgreSQL共有バッファと関連ツール
PostgreSQL共有バッファと関連ツールMasahiko Sawada
 

More from Masahiko Sawada (20)

PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説PostgreSQL 15の新機能を徹底解説
PostgreSQL 15の新機能を徹底解説
 
行ロックと「LOG: process 12345 still waiting for ShareLock on transaction 710 afte...
行ロックと「LOG:  process 12345 still waiting for ShareLock on transaction 710 afte...行ロックと「LOG:  process 12345 still waiting for ShareLock on transaction 710 afte...
行ロックと「LOG: process 12345 still waiting for ShareLock on transaction 710 afte...
 
PostgreSQL 15 開発最新情報
PostgreSQL 15 開発最新情報PostgreSQL 15 開発最新情報
PostgreSQL 15 開発最新情報
 
Vacuum徹底解説
Vacuum徹底解説Vacuum徹底解説
Vacuum徹底解説
 
PostgreSQL 12の話
PostgreSQL 12の話PostgreSQL 12の話
PostgreSQL 12の話
 
OSS活動のやりがいとそれから得たもの - PostgreSQLコミュニティにて -
OSS活動のやりがいとそれから得たもの - PostgreSQLコミュニティにて -OSS活動のやりがいとそれから得たもの - PostgreSQLコミュニティにて -
OSS活動のやりがいとそれから得たもの - PostgreSQLコミュニティにて -
 
Database Encryption and Key Management for PostgreSQL - Principles and Consid...
Database Encryption and Key Management for PostgreSQL - Principles and Consid...Database Encryption and Key Management for PostgreSQL - Principles and Consid...
Database Encryption and Key Management for PostgreSQL - Principles and Consid...
 
今秋リリース予定のPostgreSQL11を徹底解説
今秋リリース予定のPostgreSQL11を徹底解説今秋リリース予定のPostgreSQL11を徹底解説
今秋リリース予定のPostgreSQL11を徹底解説
 
Vacuumとzheap
VacuumとzheapVacuumとzheap
Vacuumとzheap
 
アーキテクチャから理解するPostgreSQLのレプリケーション
アーキテクチャから理解するPostgreSQLのレプリケーションアーキテクチャから理解するPostgreSQLのレプリケーション
アーキテクチャから理解するPostgreSQLのレプリケーション
 
Parallel Vacuum
Parallel VacuumParallel Vacuum
Parallel Vacuum
 
PostgreSQLでスケールアウト
PostgreSQLでスケールアウトPostgreSQLでスケールアウト
PostgreSQLでスケールアウト
 
OSS 開発ってどうやっているの? ~ PostgreSQL の現場から~
OSS 開発ってどうやっているの? ~ PostgreSQL の現場から~OSS 開発ってどうやっているの? ~ PostgreSQL の現場から~
OSS 開発ってどうやっているの? ~ PostgreSQL の現場から~
 
PostgreSQL10徹底解説
PostgreSQL10徹底解説PostgreSQL10徹底解説
PostgreSQL10徹底解説
 
What’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributorWhat’s new in 9.6, by PostgreSQL contributor
What’s new in 9.6, by PostgreSQL contributor
 
PostgreSQL 9.6 新機能紹介
PostgreSQL 9.6 新機能紹介PostgreSQL 9.6 新機能紹介
PostgreSQL 9.6 新機能紹介
 
pg_bigmと類似度検索
pg_bigmと類似度検索pg_bigmと類似度検索
pg_bigmと類似度検索
 
pg_bigmを触り始めた人に伝えたいこと
pg_bigmを触り始めた人に伝えたいことpg_bigmを触り始めた人に伝えたいこと
pg_bigmを触り始めた人に伝えたいこと
 
XID周回問題に潜む別の問題
XID周回問題に潜む別の問題XID周回問題に潜む別の問題
XID周回問題に潜む別の問題
 
PostgreSQL共有バッファと関連ツール
PostgreSQL共有バッファと関連ツールPostgreSQL共有バッファと関連ツール
PostgreSQL共有バッファと関連ツール
 

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Vacuum more efficient than ever

  • 1. Copyright©2018 NTT Corp. All Rights Reserved. Vacuum More Efficient Than Ever Masahiko Sawada NTT Open Source Software Center @PGCon 2018
  • 2. • Masahiko Sawada  from Tokyo, Japan • PostgreSQL contributor  Multiple synchronous replication: FIRST and ANY methods (9.6 and 10)  Freeze map (9.6)  Skipping cleanup index vacuum (11) Who Am I?
  • 3. • What Is Vacuum? • Three Vacuum Improvements • Problems • Solutions • Challenges • Evaluations • Conclusion Agenda
  • 4. • PostgreSQL's garbage collection feature • Recovers or reuses disk space occupied by dead tuples • VACUUM command • =# VACUUM tbl1, tbl2; • =# VACUUM (ANALYZE, VERBOSE) tbl1; • Auto vacuum • autovacuum launcher process • autovacuum worker processes What Is Vacuum?
  • 5. • Auto Vacuum (8.1~) • Vacuum Delay (8.1~) • Visibility Map (8.4~) • Freeze Map (part of the visibility map) (9.6~) • Skipping index cleanup (11~) History of Vacuum Evolution
  • 6. 1. Shorten the vacuum execution time • Use resources as much as possible • Reduce the amount of work • Work in parallel 2. But reduce the impact on transaction processing • Work lazily What’s Needed For “Good Vacuum”?
  • 7. • Vacuum works in three phases: 1. Collect dead tuple TIDs until maintenance_work_mem of memory is consumed 2. Vacuum the indexes 3. Vacuum the table • Vacuum is a disk-intensive operation How Vacuum Actually Works 1. Collecting TIDs 3. Table vacuum 2-1. Index vacuum 2-2. Index vacuum dead tuple TIDs
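The three-phase loop above can be sketched in a few lines of Python. This is purely illustrative (plain lists and sets stand in for heap pages and index AMs; `max_tids` stands in for the maintenance_work_mem budget), not PostgreSQL internals:

```python
def lazy_vacuum(heap_pages, indexes, max_tids):
    """Vacuum heap_pages (lists of (tid, is_dead) pairs) in rounds.

    Each round: (1) collect dead TIDs up to the max_tids budget,
    (2) vacuum every index, (3) vacuum the heap and reset the buffer.
    Returns the number of index-vacuum rounds performed.
    """
    pending, rounds = [], 0

    def flush():
        nonlocal rounds
        for idx in indexes:            # phase 2: one full pass per index
            idx.difference_update(pending)
        pending.clear()                # phase 3: heap vacuum frees the slots
        rounds += 1

    for page in heap_pages:
        for tid, is_dead in page:      # phase 1: collect dead tuple TIDs
            if is_dead:
                pending.append(tid)
        if len(pending) >= max_tids:   # memory budget exhausted
            flush()
    if pending:                        # final round for the remainder
        flush()
    return rounds
```

Note how a small `max_tids` forces multiple index-vacuum rounds over the same indexes, which is exactly why the slides treat index vacuum count as a key cost factor.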
  • 8. • 2 bits per block: all-visible and all-frozen • Track which pages “might” have garbage • all-visible bit = 1 means the corresponding page has only visible tuples, so we don’t need to vacuum it Vacuum With Visibility Map 1. Collecting TIDs 3. Table vacuum 2-1. Index vacuum 2-2. Index vacuum dead tuple TIDs Skip all-visible pages :-)
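A minimal model of the 2-bits-per-block map and the skip logic described above. The bit packing here is hypothetical, not the on-disk visibility map format:

```python
ALL_VISIBLE, ALL_FROZEN = 0x1, 0x2

class VisibilityMap:
    """Toy visibility map: 2 bits per heap block, 4 blocks per byte."""

    def __init__(self, nblocks):
        self.bits = bytearray((nblocks + 3) // 4)

    def set(self, blkno, flags):
        self.bits[blkno // 4] |= (flags & 0x3) << ((blkno % 4) * 2)

    def get(self, blkno):
        return (self.bits[blkno // 4] >> ((blkno % 4) * 2)) & 0x3

    def blocks_to_vacuum(self, nblocks):
        # Skip all-visible pages: they contain only visible tuples.
        return [b for b in range(nblocks)
                if not (self.get(b) & ALL_VISIBLE)]
```

Any insert or delete on a page would clear its all-visible bit, putting the page back on the to-vacuum list.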
  • 9. • Table size • Number of indexes • Resources Factors Of Vacuum Performance • Visibility map • Vacuum delays (make lazy) • Skipping index cleanup
  • 10. • Table size • Number of indexes • Resources Factors Of Vacuum Performance • Parallel vacuum • Deferring index vacuums • Range vacuum • Visibility map • Vacuum delays (make lazy) • Skipping index cleanup Today’s talk
  • 11. • Table size • Number of indexes • Resources Factors Of Vacuum Performance • Parallel vacuum • Deferring index vacuums • Range vacuum • Visibility map • Vacuum delays (make lazy) • Skipping index cleanup Today’s talk
  • 12. PARALLEL VACUUM
  • 13. • Vacuum is performed by a single process • Vacuum can take a very long time • Days or even more! • It takes even longer if the table has multiple indexes On Very Large Tables
  • 14.  Divide the large table  Lower autovacuum_vacuum_cost_delay or raise autovacuum_vacuum_cost_limit • At the cost of an additional burden on disk I/O Current Solutions
  • 15. • Execute vacuum with parallel workers • Shortens the execution time of vacuum • Note that this will consume more disk I/O • A patch has been proposed • “Block level Parallel Vacuum” (2016) • However, the relation extension (RelExt) lock issue must be resolved first • Please refer to “Moving relation extension locks out of heavyweight lock manager” (2016) Idea: Parallel Vacuum
  • 16. How Does It Work? Collect TIDs Vacuum Table Vacuum IndexA Vacuum IndexB Collect TIDs Collect TIDs Vacuum Table Vacuum Table Collect TIDs Collect TIDs Collect TIDs • Perform both TID collection and table vacuum with parallel workers • Dead tuple TIDs are shared in dynamic shared memory (DSM) • Each index is assigned to a worker • Some synchronization is needed among the workers Worker 1 Worker 3 Worker 2 : Clear garbage TIDs
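The index phase of the worker model above can be sketched as follows: a shared dead-TID collection is built once, then each index is handed to a worker for bulk deletion. This is an illustration using Python threads and sets; the actual patch uses DSM segments and background worker processes:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_index_vacuum(indexes, dead_tids, nworkers):
    """Assign each index (modeled as a set of TIDs) to a worker and
    bulk-delete entries matching the shared dead-TID collection."""
    dead = set(dead_tids)          # stands in for the shared DSM array

    def vacuum_one_index(idx):
        removed = idx & dead
        idx -= dead                # bulk-delete matching index entries
        return len(removed)

    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        return sum(pool.map(vacuum_one_index, indexes))
```

Since each worker touches only its own index, no locking is needed here; in the real system the synchronization points come between phases (all workers must finish index vacuum before heap vacuum proceeds).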
  • 17. Evaluation (~8 indexes) 5x faster!! RAM: 32GB shared_buffers: 512MB Table: 4GB Index vacuum: 1 ioDrive SSD: 256GB Time (ms)
  • 18. • Parallel vacuum makes vacuum significantly faster • It consumes more CPU and disk I/O • A patch has been proposed • The relation extension lock issue must be solved first! Summary
  • 19. • Table size • Number of indexes • Resources Factors of Vacuum Performance • Parallel vacuum • Deferring index vacuums • Range vacuum • Visibility map • Vacuum delays (make lazy) • Skipping index cleanup Today’s talk
  • 20. DEFERRING INDEX VACUUM
  • 21. • Index vacuum can still take a very long time • Table vacuum can skip pages using the visibility map, but index vacuum has no such facility • Index vacuum can be invoked N times within one vacuum operation • Almost all index AMs require a full scan of the index • Just 10 dead tuples in a 1TB table require whole-index scans! Looking Back At the Analysis of Vacuum Collecting TIDs Table vacuum Index vacuum 1 Index vacuum 2 Collecting TIDs Table vacuum Index vacuum 1 Index vacuum 2 Vacuum Efficient vacuum with VM
  • 22.  Don’t trigger autovacuum with a small threshold • What about manual vacuum? • Indexes don’t bloat as easily as tables do  Increase maintenance_work_mem to avoid invoking index vacuum multiple times • However, index vacuum is still required at least once Current Solutions
  • 23. • Spool garbage TIDs • Don’t trigger index vacuum until the amount of spooled garbage TIDs reaches the threshold • Reduces the number of index vacuums Idea: Deferring Index Vacuum
  • 24. • Amount of garbage TIDs < threshold • Vacuum only the table and spool the dead tuple TIDs • Amount of garbage TIDs >= threshold • Vacuum the indexes How Does It Work? Table Spool TIDs Spool TIDs Spool TIDs Index Spool area Threshold reached! Vacuum table Vacuum table Vacuum table Vacuum table Vacuum table Vacuum table Vacuum index Vacuum index
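A minimal model of the deferring scheme above: table vacuum always runs and spools its dead TIDs, while index vacuum fires only once the spool reaches the threshold. Class and attribute names are hypothetical, chosen for the sketch:

```python
class DeferringVacuum:
    """Spool dead TIDs across vacuums; vacuum indexes only when the
    spool reaches defer_threshold entries."""

    def __init__(self, indexes, defer_threshold):
        self.indexes = indexes         # each index modeled as a set of TIDs
        self.threshold = defer_threshold
        self.spool = set()

    def vacuum(self, dead_tids):
        self.spool.update(dead_tids)   # table vacuum: spool the garbage TIDs
        if len(self.spool) < self.threshold:
            return False               # index vacuum deferred this time
        for idx in self.indexes:       # threshold reached: vacuum all indexes
            idx -= self.spool
        self.spool.clear()
        return True
```

Several table-only vacuums thus share the cost of a single full index scan, which is the whole point of the proposal.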
  • 25. • There are related discussions • “Proposal: Another attempt at vacuum improvements” (2011) • “Single pass vacuum – take 1” (2011) • But they break the on-disk format Related Discussions
  • 26. • Evaluate the performance improvement from reducing the number of index vacuums  Spool garbage TIDs to DSM  During bulk deletion, look up both the collected TIDs and the spooled TIDs • Introduce a new storage parameter vacuum_index_defer_size, which controls how many dead tuples can be spilled out • However, this prototype doesn’t handle concurrent updates or durability :( Evaluation
  • 27. =# \dt+ List of relations Schema | Name | Type | Owner | Size | Description --------+--------------+-------+----------+---------+------------ public | defer_table | table | masahiko | 3458 MB | public | normal_table | table | masahiko | 3458 MB | (2 rows) -- Spool size is 100kB =# ALTER TABLE defer_table SET (vacuum_index_defer_size = 100); -- Disable deferring index vacuum =# ALTER TABLE normal_table SET (vacuum_index_defer_size = 0); Evaluation
  • 28. 1. Load data 2. Vacuum the table to build the VM 3. Loop until the amount of garbage reaches the threshold (= 17000 tuples) 1. Delete 5000 tuples to make garbage 2. Vacuum Vacuum will be performed 4 times, and index vacuum will be executed only at the 4th vacuum Evaluation
  • 29. Evaluation • Index vacuum was skipped at the 1st, 2nd and 3rd vacuums • Deferring index vacuum made vacuum 2.1x faster • At the 4th vacuum, deferring index vacuum took twice as long as the normal one • Because it looks up the spooled TIDs as well as the collected TIDs
  • 30. • Deferring index vacuum has the potential to speed up vacuum considerably • In this evaluation, it was 2.1x faster • More tricks are required for a correct implementation • Prevent vacuumed item pointers from being reused before index vacuum • Avoid breaking the on-disk format Summary
  • 31. • Table size • Number of indexes • Resources Factors Of Vacuum Performance • Parallel vacuum • Deferring index vacuums • Range vacuum • Visibility map • Vacuum delays (make lazy) • Skipping index cleanup Today’s talk
  • 32. RANGE VACUUM
  • 33. Dilemma The DBA wants to avoid disk I/O bursts and vacuum’s impact on TPS as much as possible The DBA wants to complete vacuum as quickly as possible
  • 34. • A long-running vacuum is likely to be canceled • Vacuum then restarts from the beginning of the table • It cannot reclaim garbage created after the vacuum started Is it possible to use vacuum delays and still complete vacuum in a short time? Problems of Long-running Vacuum
  • 35. • The cost of vacuuming a block can be regarded as almost constant • Most of the time is spent on disk I/O (reading buffers, writing WAL) • Garbage on a table tends to have locality • Even when vacuum reclaims a block, the new free space gained depends on how much garbage the block contains Efficiency Analysis of Vacuum
  • 36. • If vacuuming M bytes yields N bytes of free space, the vacuum efficiency k is N/M • k = 1 means we gain as much free space as we vacuumed • k ≈ 0 means we gain little free space even after vacuuming lots of blocks • The all-visible bit in the VM is cleared if even one tuple is inserted or deleted Efficiency Analysis of Vacuum Block number Amount of garbage Good efficiency (k≈1) Poor efficiency (k≈0)
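As a tiny worked example of the metric defined above, k = N / M, free bytes gained divided by bytes vacuumed:

```python
def vacuum_efficiency(free_bytes_gained, bytes_vacuumed):
    """k = N / M: free space gained per byte of vacuum work."""
    return free_bytes_gained / bytes_vacuumed

# A range packed with garbage approaches k = 1;
# a mostly-live range wastes I/O, with k near 0.
```

Range vacuum's whole premise is to spend its I/O budget where k is highest.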
  • 37. Range Vacuum with Garbage Map Before After Vacuum the top-10% ranges • Garbage map  Tracks the garbage status of groups of blocks  Reproduces the garbage status of the table • Range vacuum  Preferentially vacuums blocks with a higher “k”  Triggers vacuum more frequently
  • 38. • WAL-based • WAL holds all the block-change information • Avoid increasing transaction latency as much as possible • Logical decoding didn’t fit (so far) • Need to track block-level changes • Need to track aborted transactions • “WALker” module • A background worker that continuously reads WAL • Invokes the corresponding plugin callbacks • The “garbagemap” plugin builds garbage maps • Repository at https://github.com/MasahikoSawada/walker Building Garbage Maps
  • 39. • Divide the table logically into ranges of 4096 blocks (32MB) • Track the number of garbage tuples per range with an integer: 4MB for 2^32 blocks • Reorder transaction information and build the garbage maps • In a committed transaction, deleted tuples become garbage tuples • In an aborted transaction, inserted tuples become garbage tuples • Vacuum only the ranges with higher efficiency • Also added a lower bound for using range vacuum Garbage Map Details WAL WALker (bgworker) Backend Backend Backend auto-vacuum worker Table write range vacuum read generate garbage maps use Garbage Maps write
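The accounting above can be sketched as a per-range counter updated from decoded WAL records, plus a selector for the highest-garbage ranges. This is an illustration of the idea, not the walker/garbagemap plugin code; the update rule encodes the two garbage cases from the slide (committed DELETE, aborted INSERT):

```python
BLOCKS_PER_RANGE = 4096  # 32 MB of 8 kB blocks per range

def update_garbage_map(gmap, blkno, ntuples, committed, is_delete):
    """Count garbage tuples per range. Committed deletes and aborted
    inserts both turn tuples into garbage; committed inserts and
    aborted deletes do not."""
    if committed == is_delete:
        gmap[blkno // BLOCKS_PER_RANGE] += ntuples

def ranges_to_vacuum(gmap, percent):
    """Pick the top-`percent` ranges by garbage count, i.e. the ranges
    where vacuum efficiency k should be highest."""
    n = max(1, len(gmap) * percent // 100)
    return sorted(range(len(gmap)), key=lambda r: gmap[r], reverse=True)[:n]
```

With a 4-byte counter per 4096-block range, a map covering 2^32 blocks fits in about 4 MB, matching the figure on the slide.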
  • 40. • Machine • 144 cores, 126GB RAM, 1.5TB SSD • Target • master branch (ff49430 snapshot) and with the range vacuum feature • Configurations • autovacuum_vacuum_scale_factor = 0.04 • autovacuum_vacuum_cost_limit = 1000 (default is 200) • autovacuum_vacuum_cost_delay = 20ms (the default) • Workload • pgbench (TPC-B) at scale factor 16000 (about 200GB) • Using a custom script (gaussian : uniform = 9 : 1) • 5 hours • Run an open transaction for 10 min at 30 min intervals (to generate garbage faster) • Observation • Transaction TPS • Transaction latency • Relation size Evaluation
  • 41. Results: Relation size • Auto-vacuum started about 2 hours in • Master branch • Didn’t complete auto-vacuum within 5 hours • Took over 9 hours (not recorded) • Range vacuum • Ran 6 times • Processed 800 ranges (27GB, 10% of the table) within 50 min on average Started auto-vacuum master range vac 212GB 215GB
  • 42. Results: TPS and latency • On the master branch, latency sometimes spiked after auto-vacuum started  Frequently updated blocks are likely to be loaded into shared buffers • TPS and latency on the range vacuum branch were more stable TPS Latency Master Range Vacuum auto-vacuum starts auto-vacuum starts
  • 43. • Range vacuum reclaims garbage space in a short time with minimal side effects • Invoking range vacuum more frequently also means invoking index vacuum more frequently • Combining it with deferring index vacuum would be a good idea • Each range holds the number of garbage tuples • It could hold the size of the garbage instead • The whole table still needs vacuuming if garbage is distributed uniformly across it Summary
  • 44. • Improvement ideas • Parallel vacuum • Deferring index vacuum • Range vacuum and garbage map • More improvement points • Auto vacuum scheduling • A patch has been proposed • Resource management • Using cgroups • etc. Conclusion
  • 45. Thank you! Masahiko Sawada Mail: sawada.mshk@gmail.com Twitter: @sawada_masahiko
  • 46. 1. HOT pruning and table vacuum mark all item pointers that are still pointed to by index tuples as VACUUM_DEAD 2. Spool the dead tuple TIDs as a bitmap per block 3. During index vacuum, scan each index page and check whether index tuples point to spooled dead tuple TIDs 4. Reclaim matched index tuples and clear the corresponding bits • If all bits are cleared, record the LSN at which index vacuum was invoked along with the bitmap 5. During HOT pruning or vacuum, mark VACUUM_DEAD item pointers as UNUSED if the current LSN > the stored LSN • Data representation of dead tuple TIDs • Dead tuple TIDs are stored in a new fork <relfilenode>_dt • 300 bits (25 byte) for the bitmap and 8 bytes for the LSN per 8kB block • One dt page holds information for 234 blocks • 1GB table -> 4MB dt fork, 1TB -> 4GB dt fork • To make the existence check faster, before starting index vacuum create a Bloom filter over the blocks that have any bits set Spooling Dead Tuple TIDs
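A back-of-envelope check of the dt-fork sizes quoted above, assuming roughly 33 bytes per 8 kB heap block (the 25-byte bitmap plus the 8-byte LSN) and ignoring dt-page header overhead:

```python
def dt_fork_bytes(table_bytes, block_size=8192, per_block=25 + 8):
    """Approximate size of the hypothetical <relfilenode>_dt fork:
    one per-block entry for every heap block of the table."""
    return (table_bytes // block_size) * per_block

# 1 GB table -> ~4.1 MB dt fork; 1 TB table -> ~4.1 GB,
# consistent with the 4 MB / 4 GB figures on the slide.
```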
  • 47. • autovacuum_vacuum_scale_factor = 0.04 • autovacuum_naptime = 10 • autovacuum_vacuum_cost_limit = 1000 • autovacuum_vacuum_cost_delay = 20ms • checkpoint_completion_target = 0.3 • garbagemap.min_range_vacuum_size = 10GB • garbagemap.range_vacuum_percent = 30 • shared_buffers = 50GB • max_wal_size = 100GB • min_wal_size = 50GB Configurations
  • 48. Dead Tuples