MVCC for Ruby developers
Michał Młoźniak
@roninek
Multiversion Concurrency Control
Postgres internals
Motivation
To optimize your SQL queries
To quickly solve performance problems
MVCC, why?
● Support for many concurrent users
● Atomicity and Isolation (ACID)
● Performance
● Fewer locks
MVCC, how?
● Postgres stores multiple versions of the same row in the table
● INSERT is just a plain insert
● DELETE marks the row as deleted
● UPDATE is a DELETE of the old row plus an INSERT of the new row
● Postgres shows different versions of a row to different transactions
● After a while, deleted rows are no longer visible to any running transaction
● They are called dead rows
● Postgres needs to clean up dead rows from time to time
● It is like Garbage Collection in dynamic languages
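The bullets above can be sketched as a toy model in plain Ruby. This is only an illustration of the idea, not how Postgres lays out rows on disk; the names `RowVersion`, `ToyTable`, `created_by` and `deleted_by` are made up for this sketch.

```ruby
# Toy model of MVCC row versions (hypothetical names, not Postgres internals).
RowVersion = Struct.new(:data, :created_by, :deleted_by)

class ToyTable
  attr_reader :versions

  def initialize
    @versions = []
  end

  # INSERT: append a new version stamped with the creating transaction.
  def insert(txid, data)
    version = RowVersion.new(data, txid, nil)
    @versions << version
    version
  end

  # DELETE: the version stays in the table, only the deleter is recorded.
  def delete(txid, version)
    version.deleted_by = txid
  end

  # UPDATE: mark the old version as deleted and insert a new one.
  def update(txid, version, data)
    delete(txid, version)
    insert(txid, data)
  end
end

table = ToyTable.new
v1 = table.insert(100, { name: "Ann" })
v2 = table.update(101, v1, { name: "Anna" })
table.delete(102, v2)

table.versions.each do |v|
  puts "#{v.data[:name]} created_by=#{v.created_by} deleted_by=#{v.deleted_by.inspect}"
end
```

After one INSERT, one UPDATE and one DELETE the toy table still holds two versions of the same logical row, which is exactly the kind of leftover that later needs cleaning up.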
MVCC, details
● Postgres stores two additional columns for each row
● ID of the transaction that created the row
● ID of the transaction that deleted the row
● Each transaction gets its own ID (TXID) at the start of its first modifying statement
● TXIDs are 32-bit incrementing integers
● Lower TXIDs mean earlier transactions
MVCC, details
● Two additional columns: xmin and xmax
● xmin is the ID of the transaction that created the row
● xmax is the ID of the transaction that deleted the row
● These are hidden columns available in all tables
● You can see them by naming them explicitly in a SELECT statement
● You will get an error if you try to add your own columns with these names
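A minimal sketch of such an explicit select, written as a Ruby helper that only builds the SQL text. The helper name `row_versions_sql` and the `users` table are assumptions for illustration; actually running the query needs a Postgres connection, which is left out here.

```ruby
# Hypothetical helper: builds a query that lists the hidden xmin/xmax
# columns next to the regular ones. The table name is validated because
# it is interpolated directly into the SQL text.
def row_versions_sql(table)
  name = table.to_s
  unless name.match?(/\A[a-z_][a-z0-9_]*\z/)
    raise ArgumentError, "unexpected table name: #{table.inspect}"
  end

  "SELECT xmin, xmax, * FROM #{name}"
end

puts row_versions_sql(:users)
# In a Rails app the string could be passed to
# ActiveRecord::Base.connection.select_all (needs a live connection).
```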
MVCC, inspecting
● You can look into the physical table files and find deleted rows
● Or you can use the pageinspect extension
● It can fetch raw page data, page headers, the rows stored in a page, etc.
Transaction Snapshots
● A frozen view of the status of current transactions
● A snapshot has the format xmin:xmax:xip, for example 12:16:12,14
● xmin = 12 means that the earliest still-running transaction ID is 12
● All earlier transactions (IDs less than 12) are either committed and visible, or aborted and dead
● xmax = 16 is the first as-yet unassigned transaction ID
● All transactions with IDs equal to or greater than 16 have not yet started and are thus invisible
● xip = [12, 14] lists the transactions in progress between xmin and xmax
● Transactions 13 and 15 are either committed and visible, or aborted and dead
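The snapshot format described above can be parsed with a few lines of Ruby. This is a sketch under the assumption that the text always has the shape xmin:xmax:xip; `ParsedSnapshot` and `status_of` are names invented for the example.

```ruby
# Hypothetical helper: parses "xmin:xmax:xip" and classifies transaction IDs
# relative to the moment the snapshot was taken.
ParsedSnapshot = Struct.new(:xmin, :xmax, :xip) do
  def self.parse(text)
    xmin, xmax, xip = text.split(":", 3)
    new(Integer(xmin), Integer(xmax), xip.to_s.split(",").map { |id| Integer(id) })
  end

  # :finished    -> ended before the snapshot (committed or aborted)
  # :in_progress -> still running when the snapshot was taken
  # :future      -> not yet started when the snapshot was taken
  def status_of(txid)
    return :future if txid >= xmax
    return :in_progress if xip.include?(txid)

    :finished
  end
end

snapshot = ParsedSnapshot.parse("12:16:12,14")
(11..16).each { |txid| puts "#{txid}: #{snapshot.status_of(txid)}" }
```

For the example snapshot, 12 and 14 come out as in progress, 11, 13 and 15 as finished, and 16 as a future transaction.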
MVCC, visibility checks
Current snapshot 101:101:, all transactions were committed
xmin   xmax   visible?
25     -      YES
25     50     NO
50     110    YES
110    -      NO
110    120    NO
MVCC, visibility checks
Current snapshot 25:101:25,50,75, all transactions were committed
xmin   xmax   visible?
30     -      YES
50     -      NO
110    -      NO
30     80     NO
30     75     YES
30     110    YES
MVCC, visibility checks
Current snapshot 101:101:, transaction 75 was aborted.
xmin   xmax   visible?
30     -      YES
30     75     YES
75     -      NO
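The three visibility tables follow one simple rule, sketched below in Ruby. This is a simplification: real Postgres also handles a transaction's own changes, hint bits and more. The class name `VisibilitySnapshot` and the `aborted:` list (standing in for a commit-status lookup) are assumptions made for the example.

```ruby
# Simplified visibility rule (hypothetical class, not Postgres code).
class VisibilitySnapshot
  def initialize(text, aborted: [])
    _xmin, xmax, xip = text.split(":", 3)
    @xmax = Integer(xmax)
    @xip = xip.to_s.split(",").map { |id| Integer(id) }
    @aborted = aborted # assumption: stands in for a commit-status lookup
  end

  # True when txid committed before the snapshot was taken.
  def committed_before?(txid)
    txid < @xmax && !@xip.include?(txid) && !@aborted.include?(txid)
  end

  # A row version is visible when its creator counts for this snapshot
  # and its deleter (if any) does not.
  def visible?(row_xmin, row_xmax = nil)
    return false unless committed_before?(row_xmin)

    row_xmax.nil? || !committed_before?(row_xmax)
  end
end

snapshot = VisibilitySnapshot.new("25:101:25,50,75")
puts snapshot.visible?(30)     # true
puts snapshot.visible?(30, 80) # false
puts snapshot.visible?(30, 75) # true
```

A row deleted by an in-progress, future or aborted transaction stays visible, and a row created by such a transaction stays invisible, which matches every row in the tables above.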
Snapshots and Isolation Levels
● Postgres supports 3 isolation levels (READ COMMITTED, REPEATABLE READ and SERIALIZABLE)
● In READ COMMITTED, a new snapshot is taken at the start of each SQL statement
● In the higher isolation levels, one snapshot is taken at transaction start and reused for the whole transaction
Model.transaction(isolation: :repeatable_read) do
  # transaction block ...
end
Model.transaction(isolation: :serializable) do
  # transaction block ...
end
Commit Log
● 2 bits per transaction (in progress, committed, aborted, ...)
● Committing or aborting a transaction is just flipping a bit in the Commit Log
● All transactions (committed and aborted) have side-effects
● Hint bits in table rows are an optimization to avoid Commit Log lookups
● A seemingly innocent table scan can update a lot of hint bits and cause heavy table writes
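To make "2 bits per transaction" concrete, here is a toy commit log in Ruby that packs four statuses into each byte. The class name `ToyCommitLog` and the numeric status codes are assumptions chosen for illustration, not values taken from the slides.

```ruby
# Toy commit log (hypothetical): four 2-bit statuses per byte.
class ToyCommitLog
  # Status codes are illustrative assumptions.
  STATUSES = { in_progress: 0b00, committed: 0b01, aborted: 0b10 }.freeze

  def initialize
    @bytes = []
  end

  # Overwrites the two bits that belong to txid.
  def set(txid, status)
    byte, shift = locate(txid)
    current = @bytes[byte] || 0
    cleared = current & ~(0b11 << shift) & 0xFF
    @bytes[byte] = cleared | (STATUSES.fetch(status) << shift)
  end

  # Reads the two bits that belong to txid; unset slots read as in progress.
  def get(txid)
    byte, shift = locate(txid)
    STATUSES.key(((@bytes[byte] || 0) >> shift) & 0b11)
  end

  def bytes_used
    @bytes.size
  end

  private

  # Byte index and bit offset of the 2-bit slot for txid.
  def locate(txid)
    [txid / 4, (txid % 4) * 2]
  end
end

log = ToyCommitLog.new
log.set(5, :committed)
log.set(6, :aborted)
puts log.get(5)      # committed
puts log.get(6)      # aborted
puts log.get(7)      # in_progress
puts log.bytes_used  # 2
```

Committing or aborting only touches two bits in this structure; the rows written by the transaction are not revisited at that moment, which is why a later reader has to look the status up (or rely on hint bits).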
Vacuuming
MVCC, vacuuming
● Vacuum works like a garbage collector
● It looks for rows that are no longer visible to any running transaction and removes them
● Avoid long-running transactions: they keep old row versions visible, so vacuum cannot remove them
● Vacuuming makes room for new rows in existing pages
● Autovacuum can happen at any time
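A toy version of this cleanup in Ruby shows why long-running transactions matter. It is only a sketch: `Tuple` and `toy_vacuum` are invented names, and it assumes every deleting transaction committed.

```ruby
# Toy vacuum (hypothetical): a row version is dead once the transaction
# that deleted it is older than every transaction still running.
# Assumption: all deleting transactions committed.
Tuple = Struct.new(:data, :xmin, :xmax)

def toy_vacuum(tuples, running_txids, next_txid)
  # Oldest transaction that may still need to see old versions.
  horizon = running_txids.min || next_txid
  tuples.reject { |t| t.xmax && t.xmax < horizon }
end

tuples = [
  Tuple.new("a", 10, 20),  # deleted by transaction 20
  Tuple.new("b", 20, nil), # live row
  Tuple.new("c", 15, 40)   # deleted by transaction 40
]

puts toy_vacuum(tuples, [], 50).map(&:data).inspect   # ["b"]
puts toy_vacuum(tuples, [12], 50).map(&:data).inspect # ["a", "b", "c"]
```

With no running transactions both deleted versions are removed; a single old transaction (ID 12 here) is enough to keep all of them around.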
Transaction Wraparound
Transaction Wraparound, huh?
● Transaction IDs (TXIDs) are 32-bit integers
● That is ~4 billion transactions
● With enough traffic the counter can wrap around quickly
● Suddenly transactions that were in the past appear to be in the future
● And their output becomes invisible
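The effect can be sketched in Ruby with a circular comparison on a 32-bit counter, where an ID counts as "in the past" only if it is at most about 2 billion steps behind. The method name `precedes?` is made up for this example, and the sketch ignores any specially reserved IDs.

```ruby
# Circular comparison on a 32-bit counter (hypothetical helper).
MODULUS = 2**32
HALF = 2**31

# True when a is "in the past" relative to b, i.e. a - b is negative
# when interpreted as a signed 32-bit value.
def precedes?(a, b)
  ((a - b) % MODULUS) >= HALF
end

puts precedes?(100, 200)                # true: ordinary case
puts precedes?(4_294_967_000, 50)       # true: still correct just after the counter wraps
puts precedes?(100, 100 + 2**31 + 1)    # false: too far apart, 100 now looks like the future
```

The last line is the failure mode from the slide: once more than about 2 billion transactions separate a row from the current ID, the old row is suddenly treated as created in the future.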
Transaction Wraparound, solutions?
● Vacuum freezes rows created by old transactions that are far in the past
● Freezing sets a special flag on those rows
● A set flag means that the row is visible to all transactions
● Freezing can also be triggered manually with VACUUM FREEZE
Main takeaways
● Postgres stores multiple versions of the same row in the table
● All transactions (committed or aborted) have side-effects
● All updates to the table create bloat
● Vacuum removes bloat and can happen at any time
● Avoid long-running transactions
More resources
● https://momjian.us/main/presentations/internals.html
● https://blog.sentry.io/2015/07/23/transaction-id-wraparound-in-postgres.html
● https://www.joyent.com/blog/manta-postmortem-7-27-2015
● http://www.interdb.jp/pg/index.html
● https://queue.acm.org/detail.cfm?id=3099561
Thank You!
Michał Młoźniak
@roninek
www.michalmlozniak.com
