Scaling a data tier requires multiple concurrent database connections that are all vying for read and write access to the same data. To meet this demand, PostgreSQL implements a concurrency method known as Multi-Version Concurrency Control, or MVCC. By understanding MVCC, you will be able to take advantage of advanced features such as transactional memory, atomic data isolation, and point-in-time consistent views.
This presentation will show you how MVCC works at both a theoretical and a practical level. Furthermore, you will learn how to optimize common tasks such as database writes, vacuuming, and index maintenance. Afterwards, you will have a fundamental understanding of how PostgreSQL operates on your data.
Key points discussed:
* MVCC; what is really happening when I write data.
* Vacuuming; why it is needed and what is really going on.
* Transactions; much more than just an undo button.
* Isolation levels; seeing only the data you want to see.
* Locking; ensure writes happen in the order you choose.
* Cursors; how to stream chronologically correct data more efficiently.
SQL examples given during the presentation are available here: http://www.reactive.io/academy/presentations/postgresql/mvcc/mvcc-examples.zip
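As a rough illustration of the first two key points (what happens on a write, and why vacuuming is needed), the sketch below inspects PostgreSQL's hidden system columns before and after an update. It is not taken from the presentation's example archive. The `people` table appears in the slides, but its `name` column and the sample values are assumptions made for illustration.

```sql
-- Setup. The name column and the sample row are assumptions.
create table if not exists people (id integer primary key, name text);
insert into people values (1, 'Alice') on conflict (id) do nothing;

-- xmin is the transaction that created this row version,
-- xmax the transaction that deleted or locked it (0 if none),
-- ctid the physical location of the row version.
select xmin, xmax, ctid, * from people where id = 1;

-- An update does not overwrite the row in place. It writes a new row
-- version and leaves the old one behind as a dead tuple.
update people set name = 'Alicia' where id = 1;

-- The same logical row now reports a different xmin and ctid.
select xmin, xmax, ctid, * from people where id = 1;

-- Dead tuples are counted in the statistics views. These counters are
-- updated asynchronously, so they may lag behind by a moment.
select n_live_tup, n_dead_tup
from pg_stat_user_tables
where relname = 'people';

-- Vacuum reclaims the space held by dead row versions that no open
-- transaction can still see.
vacuum (verbose) people;
```

Running the two `select xmin, xmax, ctid` statements side by side is usually the quickest way to see that a write creates a new version rather than changing the old one.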
3. AGENDA
MVCC: what it is and why it matters
Transactions: more than just an undo button
Isolation Levels: seeing what you need to see
Locking: control when your data is written
Cursors: stream chronologically correct data
Summary: bringing it all together
Questions: ready, fire, aim
6. WHAT IS MVCC USING BOXES AND ARROWS
[Diagram: boxes and arrows pairing the terms Multi Version, Concurrency, and Control with the labels Isolated, Data, Parallel, Multi-User, and Declarative]
7. WHAT DOES MVCC DO?
1: Multiple users are able to access the same data at the same time.
2: Every user sees their own isolated snapshot of the database.
3: Changes made by one user will not be seen by any other user until their transaction is committed.
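The three points above can be sketched with two database sessions. This is a minimal illustration, assuming the default READ COMMITTED isolation level and a `people` table with a hypothetical `name` column; run the statements in two separate connections in the order shown.

```sql
-- Setup. The name column and the sample row are assumptions.
create table if not exists people (id integer primary key, name text);
insert into people values (1, 'Alice') on conflict (id) do nothing;

-- Session A: change the row inside an open transaction.
begin;
update people set name = 'Alicia' where id = 1;
select name from people where id = 1;  -- A sees its own uncommitted change

-- Session B (second connection, while A is still open):
select name from people where id = 1;  -- B still sees the last committed value

-- Session A: make the change visible to everyone.
commit;

-- Session B:
select name from people where id = 1;  -- B now sees the value A committed
```

Note that session B is never blocked here. It simply reads the older row version until session A commits.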
8. THE MVCC SALES PITCH
MULTI VERSION
Atomic Updates
Consistent Data
Isolated Reads
CONCURRENCY
Higher efficiency
Simpler operations
Engineering agility
20. HOW TO SOLVE THIS PROBLEM
Pessimistic locking: lock everything during writes
Imperative controls: synchronization and mutexes
System build out: everyone gets their own database
Let the cards fall: whatever happens, happens
MVCC: Let the database handle the particulars
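To make "let the database handle the particulars" concrete, the sketch below shows a reader and a writer working on the same table with no explicit locks or mutexes. It assumes the `people` table from the slides with a hypothetical `name` column, and uses the REPEATABLE READ isolation level so the reader keeps one snapshot for its whole transaction.

```sql
-- Setup. The name column and the sample rows are assumptions.
create table if not exists people (id integer primary key, name text);
insert into people values (1, 'Alice') on conflict (id) do nothing;

-- Session A: start a transaction that keeps a single snapshot.
begin isolation level repeatable read;
select count(*) from people;  -- the snapshot is taken by this first query

-- Session B (second connection): write while A is still open.
-- This statement does not wait for session A.
insert into people values (2, 'Bob') on conflict (id) do nothing;

-- Session A: same query, same snapshot, same result as before,
-- even though session B's insert has already committed.
select count(*) from people;
commit;

-- Session A, after its transaction has ended:
select count(*) from people;  -- now includes any row session B added
```

With READ COMMITTED instead of REPEATABLE READ, each statement in session A would take a fresh snapshot, so the second count would already include session B's row.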
45. ROW LEVEL LOCKING
Name        Lock Type      Blocks Update  Blocks Select For Update
For Share   Row Share      ✔
For Update  Row Exclusive  ✔              ✔

For Share:  select * from people where id = 1 for share;
For Update: select * from people where id = 1 for update;
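The following sketch shows how a row lock taken with `for update` orders two competing writers. It assumes two separate connections and a `people` table with a hypothetical `name` column; the sample values are made up for illustration.

```sql
-- Setup. The name column and the sample row are assumptions.
create table if not exists people (id integer primary key, name text);
insert into people values (1, 'Alice') on conflict (id) do nothing;

-- Session A: lock the row before deciding what to write.
begin;
select * from people where id = 1 for update;

-- Session B (second connection): try to change the same row.
begin;
update people set name = 'Bob' where id = 1;  -- waits for session A

-- Session A: write and release the lock.
update people set name = 'Alicia' where id = 1;
commit;  -- session B's update proceeds from here

-- Session B:
commit;

-- Shared locks do not block each other: two sessions can both run
--   select * from people where id = 1 for share;
-- at the same time, while an update from a third session has to wait.
```

If waiting is not acceptable, `select ... for update nowait` raises an error immediately instead of blocking when the row is already locked.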
47. CURSORS
Streaming: break large datasets into smaller segments
Efficient: reduce a query's memory consumption
Isolated: return chronologically correct data
Traversable: can scan forward, backwards and more
Flexible: PL/pgSQL functions can return/accept cursors
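A minimal sketch of the cursor properties listed above. The table `people_demo`, its columns, and the cursor name are hypothetical and exist only to give the cursor something to stream.

```sql
-- Setup: a throwaway table with 1000 rows (hypothetical names).
create temp table people_demo as
select n as id, 'person ' || n as name
from generate_series(1, 1000) as n;

-- A cursor without "with hold" lives inside a transaction.
begin;

-- scroll allows the cursor to move backwards as well as forwards.
declare people_cursor scroll cursor for
    select id, name from people_demo order by id;

-- Streaming: pull the result in segments instead of all at once.
fetch forward 100 from people_cursor;  -- rows 1 to 100
fetch forward 100 from people_cursor;  -- rows 101 to 200

-- Traversable: step back over rows that were already fetched.
fetch backward 10 from people_cursor;

close people_cursor;
commit;
```

Rows changed by other sessions after the cursor is declared do not show up in later fetches, which is what keeps the stream chronologically consistent.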
49. SUMMARY
Powerful: interact with your data on your terms
Declarative: easy to use, less chance of mistakes
Efficient: use less resources to work with more data
Scalable: handle more processes with larger volume
Flexible: do what you need to do when you need to do it