In this session we dive deep into the HOT (Heap Only Tuple) update optimization. Using this optimization can improve write rates, reduce index bloat, and lower vacuum effort, but enabling PostgreSQL to use it may require changes to your application design and database settings. We will examine how the number of indexes, the frequency of updates, fillfactor, and vacuum settings influence when HOT is used and what benefits you may be able to gain.
2. What is HOT - Heap Only Tuples
The Heap Only Tuple (HOT) feature eliminates redundant index entries and
allows the re-use of space taken by DELETEd or obsoleted UPDATEd tuples
without performing a table-wide vacuum. It does this by allowing
single-page vacuuming, also called "defragmentation".
Full description - src/backend/access/heap/README.HOT
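One way to see how often HOT is already being used is to compare the update counters PostgreSQL keeps per table. The query below is a minimal sketch that reads the standard pg_stat_user_tables view; the hot_update_pct column alias and the LIMIT are illustrative choices, not part of the talk.

```sql
-- Share of updates that were HOT, per user table.
-- Counters are cumulative since the statistics were last reset.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_update_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 20;
```

Tables with many updates and a low percentage are the likely candidates for the tuning discussed in the rest of the session.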
35. Fillfactor on Insert & Single Update Workload
Insert at 2K TPS and update one row 100 times while having a long running transaction
 Fillfactor | Single Key Fetch | Table Scan
------------+------------------+-------------
 100        | 101 blocks       | 5.5K blocks
 90         | 18 blocks        | 6K blocks
 50         | 5 blocks         | 11K blocks
 10         | 3 blocks         | 60K blocks
Bloat comes in many disguises
37. Fillfactor on Insert & Single Update Workload
Insert at 2K TPS and update one row 100 times while having a long running transaction
 Fillfactor | Single Key Fetch | Table Scan
------------+------------------+-------------
 100        | 2 blocks         | 5.5K blocks
 90         | 1 block          | 6K blocks
 50         | 1 block          | 11K blocks
 10         | 1 block          | 60K blocks
Bloat comes in many disguises
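For reference, fillfactor is a per-table storage parameter. The statements below are a sketch of how it could be changed and checked; benchmark_serial is used only as an example table name, and 90 is an arbitrary value taken from the table above, not a recommendation.

```sql
-- Leave 10% of each heap page free for future row versions.
ALTER TABLE benchmark_serial SET (fillfactor = 90);

-- The setting applies to pages written from now on. Existing pages are only
-- repacked by a table rewrite (for example VACUUM FULL, which takes an
-- exclusive lock on the table).

-- Check the stored setting:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'benchmark_serial';
```

As the table scan column shows, a lower fillfactor trades a larger table for more room to keep updated rows on the same page.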
38. Heap + One Index - Update Single Row
[Chart: size in KB (0 to 8,000) over 60 minutes. Series: Regular Long Transaction, Regular 1 Min Transaction, Regular No Transaction, HOT Long Transaction, HOT 1 Min Transaction, HOT No Transaction]
39. One Index - Update Single Row
[Chart: size in KB (0 to 900) over 60 minutes. Series: Regular Long Transaction, Regular 1 Min Transaction, Regular No Transaction, HOT Long Transaction, HOT 1 Min Transaction, HOT No Transaction]
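Sizes like the ones plotted in these charts can be sampled with the built-in size functions. This is a sketch only; the talk does not say how its numbers were collected, and benchmark_serial is an example table name.

```sql
-- Heap size and combined index size of one table, in KB.
SELECT pg_relation_size('benchmark_serial') / 1024 AS heap_kb,
       pg_indexes_size('benchmark_serial')  / 1024 AS indexes_kb;
```

In psql, ending the query with \watch 60 instead of the semicolon re-runs it every minute, which gives one data point per minute.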
40. Measuring Longest Running Transaction
postgres=# select max(now() - xact_start ) from pg_stat_activity;
max
-----------------
02:02:48.021408
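The query above gives only the age of the oldest transaction. To find out which session is holding it open, a variation along these lines may help; the column aliases and the LIMIT are illustrative.

```sql
-- Sessions with an open transaction, oldest first.
SELECT pid,
       usename,
       state,
       now() - xact_start AS xact_age,
       backend_xmin,
       left(query, 60)    AS query_start
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 5;
```

Sessions in state "idle in transaction" with a large xact_age are the usual suspects when cleanup of old row versions is being held back.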
41. On the fly cleanup - HEAP
[Diagram: heap (block0, block1) with line pointers (lp) 1 and 2, holding tuple v1 and tuple v2; index leaf pages for index A, index B and index C]
42. On the fly cleanup - HEAP
[Diagram: same layout with tuple v1 removed; only tuple v2 remains]
62. Is that an update? ORM cases
ORMs and other software sometimes update every column in a table, including all indexed columns
UPDATE to NEW VALUE
postgres=# update benchmark_uuid2 set last_updated = now() where id=2;
UPDATE 1
postgres=# select n_tup_upd, n_tup_hot_upd from pg_stat_all_tables where relname = 'benchmark_uuid2';
n_tup_upd | n_tup_hot_upd
-----------+---------------
1 | 0
UPDATE to SAME VALUE (i.e. ORM case)
postgres=# update benchmark_uuid2 set last_updated = ( select last_updated from benchmark_uuid2 where id=2)
where id=2;
UPDATE 1
postgres=# select n_tup_upd, n_tup_hot_upd from pg_stat_all_tables where relname = 'benchmark_uuid2';
n_tup_upd | n_tup_hot_upd
-----------+---------------
2 | 1
(1 row)
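The two updates are not equivalent (!=): setting the column to a new value was counted as a regular update, while setting it to its existing value was counted as a HOT update.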
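To repeat an experiment like this from a clean slate, the per-table counters can be reset first. This is a sketch using the built-in reset function; the counters are reported asynchronously, so the values may lag briefly behind the statements that caused them.

```sql
-- Zero the statistics counters for one table.
SELECT pg_stat_reset_single_table_counters('benchmark_uuid2'::regclass);

-- Confirm that both counters start at 0 before running the updates.
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_all_tables
WHERE relname = 'benchmark_uuid2';
```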
63. Index Tests – A table with 2 to 64 indexes
Create the table with a PK, 64 Random int Columns and 1 last updated timestamp
create table benchmark_serial (
pk serial constraint pk_benchmark_serial_pk PRIMARY KEY,
a1 int not null,
…..
a64 int not null,
last_updated timestamp
);
Create between 2 and 64 indexes on the random int columns
create index i_benchmark_serial_a1 on benchmark_serial (a1);
….
create index i_benchmark_serial_a64 on benchmark_serial (a64);
For the Regular Test
create index i_benchmark_serial_lu on benchmark_serial (last_updated);
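Rather than typing up to 64 create index statements by hand, they could be generated in a loop. The block below is a sketch, assuming the columns are named a1 through a64 as in the table definition above; the upper bound of the loop is what would be varied between 2 and 64 for each test run.

```sql
DO $$
BEGIN
  -- Change 64 to the number of indexes wanted for a given run.
  FOR i IN 1..64 LOOP
    EXECUTE format(
      'create index if not exists i_benchmark_serial_a%s on benchmark_serial (a%s)',
      i, i);
  END LOOP;
END
$$;
```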
68. Full Page Writes
[Diagram labels: Block in Memory; PostgreSQL; update t set y = 6;; Checkpoint; Datafile; Full Block; WAL; Archive; sizes 4K, 4K, 8K]
During crash recovery PostgreSQL uses the FPW block in the WAL to replace the bad checkpointed block.
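The WAL cost of a statement, including any full page images, can be measured from psql by comparing WAL positions before and after it runs. This is a sketch, assuming PostgreSQL 10 or later function names and that the table t from the slide exists; start_lsn is just a psql variable name chosen here.

```sql
-- Current settings that affect full page images.
SHOW full_page_writes;
SHOW wal_compression;

-- Remember the current WAL position in a psql variable.
SELECT pg_current_wal_lsn() AS start_lsn \gset

-- The statement from the slide (table t is assumed to exist).
update t set y = 6;

-- WAL written since the remembered position.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')
       ) AS wal_generated;
```

Running the same update once right after a CHECKPOINT and once again shortly afterwards should show the difference a full page image makes, since only the first change to a page after a checkpoint carries the full block. Other sessions writing at the same time will inflate the number.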
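82. Example – Keep track of table changes
 PK | C1 | C2 | C3  | C4     | C5 | Last Updated
----+----+----+-----+--------+----+--------------
 1  | X  | A1 | FOO | HOT    | 9  | 01-Jun-1999
 2  | Y  | A2 | BAR | IS     | 9  | 01-Jul-2001
 3  | Z  | A2 | RLL | REALLY | 7  | 19-OCT-2009
 4  | A  | A1 | MFM | USEFUL | 2  | 21-OCT-1972
Data Lake load: Full Table Scan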
84. Example – Build an index on Last Updated
 PK | C1 | C2 | C3  | C4     | C5 | Last Updated
----+----+----+-----+--------+----+--------------
 1  | X  | A1 | FOO | HOT    | 9  | 01-Jun-1999
 2  | Y  | A2 | BAR | IS     | 9  | 01-Jul-2001
 3  | Z  | A2 | RLL | REALLY | 8  | 15-OCT-2018
 4  | A  | A1 | MFM | USEFUL | 2  | 21-OCT-1972
Data Lake load: Index Scan
Before: Updates to C1, C3, C4 and Last Updated are HOT
After: Every update is a regular update
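Because an update generally can be HOT only when it changes no indexed column, it is worth listing what is already indexed before adding an index on a column that every update touches. A minimal sketch using the standard pg_indexes view, with benchmark_serial as an example table name:

```sql
-- All indexes on one table, with their definitions.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'benchmark_serial'
ORDER BY indexname;
```

If a timestamp column such as last_updated shows up in any indexdef, updates that set it to a new value will be regular updates.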
85. Example – Logical Replication
 PK | C1 | C2 | C3  | C4     | C5 | Last Updated
----+----+----+-----+--------+----+--------------
 1  | X  | A1 | FOO | HOT    | 9  | 01-Jun-1999
 2  | Y  | A2 | BAR | IS     | 9  | 01-Jul-2001
 3  | Z  | A2 | RLL | REALLY | 7  | 19-OCT-2009
 4  | A  | A1 | MFM | USEFUL | 2  | 21-OCT-1972
All changes WAL logged
WAL -> Logical Decoding and Replication -> Data Lake
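On the PostgreSQL side, feeding changes to a downstream consumer through logical decoding starts with a publication. The statements below are a sketch only: changes_for_lake is a hypothetical publication name, benchmark_serial is an example table, and how the data lake consumes the stream is outside what the slide shows.

```sql
-- Logical decoding requires wal_level = logical
-- (changing it needs a server restart).
SHOW wal_level;

-- Publish all changes to one table. The table has a primary key, which
-- serves as the default replica identity for updates and deletes.
CREATE PUBLICATION changes_for_lake FOR TABLE benchmark_serial;
```

With this approach the consumer no longer needs an index on Last Updated to find changed rows, so that index does not have to stand in the way of HOT updates.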