1. Tuning Autovacuum in PostgreSQL
Mohammad Zaid Patel
Database Consultant
Mydbops
Mydbops 16th MyWebinar
2. • Database Consultant for PostgreSQL
• Active Learner
• Technophile
• Intrigued by PostgreSQL and open source software
• Likes Cricket & Music
About Me
3. • Founded in 2016
• Services on top open source databases
• 80+ Member team
• AWS Partner
• PCI & ISO Certified Organisation
About Mydbops
7. Agenda
• Database Bloat
• What is Vacuum
• What is Autovacuum
• Why Autovacuum
• Tuning Autovacuum
• Best Practices
8. ID NAME
1 TONY
2 STEVE
3 THOR
AVENGERS
>> UPDATE AVENGERS SET NAME='BRUCE' WHERE ID=3 ;
ID NAME
1 TONY
2 STEVE
3 BRUCE
ID NAME
1 TONY
2 STEVE
3 THOR
3 BRUCE
• SELECTs, INSERTs, UPDATEs and DELETEs are some of the most common actions in any database
• UPDATE = old row version marked as deleted + new row version inserted
User A
Introduction
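9. • Updated or deleted rows are marked invisible, i.e. as 'dead tuples'
• Old row versions are kept inside the table itself (PostgreSQL has no separate UNDO log)
• Their space is not reclaimed until VACUUM runs
• Dead tuples accumulate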
Introduction
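The AVENGERS example above can be sketched as a toy model in Python. This is only an illustration of the idea (an UPDATE leaves the old version behind as a dead tuple until a vacuum removes it), not how PostgreSQL stores rows; `RowVersion`, `update_name` and `vacuum` are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowVersion:
    id: int
    name: str
    dead: bool = False  # True once a newer version has replaced this one


def update_name(heap: List[RowVersion], row_id: int, new_name: str) -> None:
    """UPDATE = mark the old version dead + append a new version."""
    for version in heap:
        if version.id == row_id and not version.dead:
            version.dead = True
            heap.append(RowVersion(row_id, new_name))
            return
    raise KeyError(row_id)


def vacuum(heap: List[RowVersion]) -> int:
    """Remove dead versions so their space can be reused; return the count."""
    before = len(heap)
    heap[:] = [v for v in heap if not v.dead]
    return before - len(heap)


heap = [RowVersion(1, "TONY"), RowVersion(2, "STEVE"), RowVersion(3, "THOR")]
update_name(heap, 3, "BRUCE")
dead_count = sum(v.dead for v in heap)  # one dead tuple: the old THOR version
```

Until `vacuum(heap)` is called, the table holds four row versions for three logical rows, which is the accumulation the slide describes.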
11. ID NAME
1 TONY
2 STEVE
3 THOR
3 BRUCE
ID NAME
1 TONY
2 STEVE
3 BRUCE
User A
User B
>> SELECT NAME FROM AVENGERS WHERE ID=3 ;
ID NAME
1 TONY
2 STEVE
3 THOR
ID NAME
3 THOR
User B
ID NAME
1 TONY
2 STEVE
3 THOR
MVCC model
12. • Data consistency is maintained
• Past image and latest image of a row can coexist
• Every row version in a PostgreSQL table carries the ID of the transaction that created it (xmin) and of the one that deleted it (xmax)
• Transaction isolation for each database session
• Locks acquired for querying data don't conflict with locks acquired for writing
MVCC model
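A minimal sketch of how two sessions can see different images of the same row. It is a deliberately simplified toy rule that assumes every transaction has committed and ignores in-progress transactions, so it is not PostgreSQL's real visibility logic; `RowVersion`, `visible`, `select_name` and the snapshot numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RowVersion:
    id: int
    name: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that deleted/replaced it, if any


def visible(version: RowVersion, snapshot_xid: int) -> bool:
    """Toy rule: created at or before the snapshot and not yet deleted as of it."""
    created = version.xmin <= snapshot_xid
    deleted = version.xmax is not None and version.xmax <= snapshot_xid
    return created and not deleted


def select_name(heap: List[RowVersion], row_id: int, snapshot_xid: int) -> List[str]:
    """SELECT NAME ... WHERE ID = row_id, as seen by the given snapshot."""
    return [v.name for v in heap if v.id == row_id and visible(v, snapshot_xid)]


heap = [
    RowVersion(1, "TONY", xmin=1),
    RowVersion(2, "STEVE", xmin=1),
    RowVersion(3, "THOR", xmin=1, xmax=10),  # replaced by transaction 10
    RowVersion(3, "BRUCE", xmin=10),         # created by transaction 10
]

user_b_sees = select_name(heap, 3, snapshot_xid=9)   # snapshot from before the UPDATE
user_a_sees = select_name(heap, 3, snapshot_xid=10)  # snapshot including the UPDATE
```

User B's older snapshot still returns THOR while User A sees BRUCE, and neither read has to wait for the writer.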
14. Query to check bloat (requires the pgstattuple extension):
postgres=# SELECT pg_size_pretty(pg_relation_size('test')) AS table_size, (pgstattuple('test')).dead_tuple_percent;
• Database bloat is caused by dead tuples
• More dead tuples = more bloat
Database bloat
15. Why is bloat bad?
• Slow sequential scans
• Risk of transaction ID wraparound when vacuum falls behind
• Eats up storage
• Dead tuples > live tuples = most of the table is wasted space, and every scan and update pays for it
16. ID NAME
3 THOR
Xmin Xmax
1 10
PostgreSQL uses 32-bit transaction IDs, i.e. 2^32 (about 4.29 billion) possible values
Transaction ID wraparound
18. • Transaction IDs are limited
• Once the upper limit is reached, the IDs wrap around
• After a wraparound, PostgreSQL can no longer tell whether a row's transaction is in the past or in the future
• PostgreSQL stops accepting commands that need a new transaction ID in order to protect the data
• PostgreSQL emits warnings as the limit approaches
• Bringing the database back is very tedious
Transaction ID wraparound
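The past/future ambiguity can be illustrated with circular arithmetic. The sketch below assumes the usual modulo-2^32 comparison for normal transaction IDs (the difference is read as a signed 32-bit number) and ignores the few reserved special IDs; `xid_precedes` and the sample values are hypothetical names and numbers chosen for this example.

```python
XID_SPACE = 2 ** 32   # number of distinct 32-bit transaction IDs
HALF_SPACE = 2 ** 31  # IDs up to this far "behind" count as the past


def xid_precedes(a: int, b: int) -> bool:
    """Return True if transaction ID a is considered older than b.

    The difference is taken modulo 2^32 and read as a signed 32-bit
    number, so "older" means at most 2^31 IDs behind on the circle.
    """
    return (a - b) % XID_SPACE >= HALF_SPACE  # signed difference is negative


row_xmin = 100

# A row written 4,900 transactions ago is in the past, as expected.
recent = xid_precedes(row_xmin, 5_000)

# The comparison still works across the wrap point itself.
across_wrap = xid_precedes(XID_SPACE - 10, 50)

# Once more than 2^31 transactions have passed without freezing the row,
# its xmin no longer compares as "older": the row appears to be in the future.
still_in_past = xid_precedes(row_xmin, row_xmin + HALF_SPACE + 1)
```

Here `recent` and `across_wrap` are true while `still_in_past` is false, which is why old rows must be frozen by vacuum before that distance is reached.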
23. VACUUM FULL :
• Reclaimed storage is returned to the OS
• High resource consumption
• Holds an exclusive lock on the table, blocking reads and writes
• Slower than plain VACUUM
• Requires extra disk space, since it writes a new copy of the table
Vacuum
24. Vacuum
VACUUM FREEZE :
• Marks a table's rows as frozen, so they are treated as older than every running and future transaction
• Freezes committed transaction IDs
• The next update of a row creates a new, unfrozen row version
25. VERBOSE :
• Prints a detailed vacuum activity report for each table
Vacuum
26. Vacuum
ANALYZE :
• Updates the statistics used by the query planner
• The planner uses them to determine the most efficient way to execute a query
31. • Built into the server since PostgreSQL 8.1 and enabled by default since version 8.3
• Background utility
• Automates the execution of VACUUM and ANALYZE
• Does not need to be triggered manually
• Does not return the freed space to the OS
• Makes the space held by dead tuples reusable within the table
Autovacuum
32. The autovacuum daemon consists of multiple processes :
• Autovacuum launcher: distributes the work across time, attempting to start one worker within each database every 'autovacuum_naptime' seconds
• A maximum of 'autovacuum_max_workers' worker processes are allowed to run at the same time
• Autovacuum workers: each one checks every table within its database and executes VACUUM and/or ANALYZE as needed
• The behaviour of autovacuum depends on the settings in the postgresql.conf file
Autovacuum
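What "as needed" means for a worker can be sketched with the documented trigger rule: a table is vacuumed once its dead tuples exceed a base threshold plus a fraction of its row count. The settings `autovacuum_vacuum_threshold` (default 50) and `autovacuum_vacuum_scale_factor` (default 0.2) are real postgresql.conf parameters, although they are not named on the slides shown here; the Python function names are hypothetical.

```python
def vacuum_limit(reltuples: float,
                 threshold: int = 50,          # autovacuum_vacuum_threshold
                 scale_factor: float = 0.2     # autovacuum_vacuum_scale_factor
                 ) -> float:
    """Number of dead tuples above which autovacuum vacuums the table."""
    return threshold + scale_factor * reltuples


def needs_autovacuum(n_dead_tup: int, reltuples: float,
                     threshold: int = 50,
                     scale_factor: float = 0.2) -> bool:
    """True once the dead tuple count exceeds the computed limit."""
    return n_dead_tup > vacuum_limit(reltuples, threshold, scale_factor)


# With the defaults, a table of 1,000,000 rows tolerates roughly
# 200,050 dead tuples before a worker vacuums it.
big_table_limit = vacuum_limit(1_000_000)

# Lowering the scale factor for a large table triggers vacuum much earlier.
tuned_limit = vacuum_limit(1_000_000, threshold=1_000, scale_factor=0.01)
```

The comparison between `big_table_limit` and `tuned_limit` shows why the default scale factor is often considered too lax for very large tables: the absolute number of dead tuples allowed grows with table size.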
33. Disabling AUTOVACUUM on a Table:
ALTER TABLE table_name SET (autovacuum_enabled = false);
Autovacuum
35. • Table bloating is prevented
• Does not block normal reads and writes on tables
• Not resource-intensive
• Better storage space utilization
• Keeps the free space map (FSM) up to date
Advantages of Autovacuum
36. • Prevents transaction id wraparound
• Faster query execution
• Faster sequential scans
• Better performance of Database
Advantages of Autovacuum
53. • Detecting bloat
• Configuring autovacuum settings that suit most of the tables
• Maintenance windows
• Removing bloat from indexes
• Avoiding manual VACUUM/ANALYZE runs without a reason
• Considering important and critical queries
• Studying performance after VACUUM/ANALYZE
Best Practices