Postgres Vision 2018: The Promise of zheap


PostgreSQL has advanced in many ways, but bloat remains a challenge. One solution under development is zheap, a new storage format in which only the latest version of the data is kept in main storage and old versions are moved to an undo log. In this presentation, delivered at Postgres Vision 2018, Robert Haas explains the project. Haas is a Major Contributor to the PostgreSQL project and Vice President, Chief Database Architect at EnterpriseDB, where he is leading development of zheap.

Published in: Technology

  1. zheap: Why is EnterpriseDB developing a new storage format for PostgreSQL?
     • Robert Haas | 2018-06-05
     © 2013 EDB All rights reserved.
  2. zheap: What is it?
     • New storage format being developed by EnterpriseDB
     • Work in progress is already released on github under PostgreSQL license
     • Basics work, but much remains to be done
     • Goal is to get it integrated into PostgreSQL
  3. Bloat: Motivation and Definition
     • Original motivation for zheap: in some workloads, PostgreSQL tables tend to bloat, and when they do, it’s hard to get rid of the bloat.
     • Bloat occurs when the table and indexes grow even though the amount of real data being stored has not increased.
     • Bloat is caused mainly by updates, because we must keep both the old and new row versions for a period of time.
     • Bloat can be a concern because of increased disk consumption, but typically a bigger concern is performance loss: if a table is twice as big as it “should be”, scanning it takes twice as long.
  4. Bloat: Why a new storage format?
     • All systems that use MVCC must deal with multiple row versions, but they store them in different places.
       – PostgreSQL and Firebird put all row versions in the table.
       – Oracle and MySQL put old row versions in the undo log.
       – SQL Server puts old row versions in tempdb.
     • Leaving old row versions in the table makes cleanup harder; sometimes you need to use CLUSTER or VACUUM FULL.
     • Improving VACUUM helps contain bloat, but can’t prevent it completely.
  5. zheap: How does it work?
     • Whenever a transaction performs an operation on a tuple, the operation is also recorded in an undo log.
     • If the transaction aborts, the undo log can be used to reverse all of the operations performed by the transaction.
     • The undo log also contains most of the data we need for MVCC purposes, so little transactional data needs to be stored in the table itself. This, and a reduction in alignment padding, mean that zheap is smaller on disk.
     • We avoid dirtying the page except when the data has been modified, or after an abort. No VACUUM, no freezing, no “routine maintenance” at all!
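
As a rough illustration of the abort path on slide 5, the following Python sketch models a table that stores only the current row version and records each prior value in an undo log. `ToyTable` and its methods are hypothetical names invented for this example, not zheap's actual data structures, and the sketch assumes that at most one transaction modifies a given key at a time.

```python
_ABSENT = object()  # marks "the key did not exist before this operation"


class ToyTable:
    """Toy model (an assumption, not zheap's design): current rows plus an undo log."""

    def __init__(self):
        self.rows = {}  # key -> current value
        self.undo = []  # (xid, key, previous value or _ABSENT), oldest first

    def write(self, xid, key, value):
        # Record the prior state in undo before changing the row.
        self.undo.append((xid, key, self.rows.get(key, _ABSENT)))
        self.rows[key] = value

    def delete(self, xid, key):
        self.undo.append((xid, key, self.rows.get(key, _ABSENT)))
        self.rows.pop(key, None)

    def abort(self, xid):
        # Apply this transaction's undo records newest-first to reverse its work.
        for _, key, old in reversed([r for r in self.undo if r[0] == xid]):
            if old is _ABSENT:
                self.rows.pop(key, None)
            else:
                self.rows[key] = old
        # The aborted transaction's undo is no longer needed.
        self.undo = [r for r in self.undo if r[0] != xid]


table = ToyTable()
table.write(1, "a", 10)   # transaction 1, treated as committed
table.write(2, "a", 20)   # transaction 2 overwrites "a" ...
table.write(2, "b", 5)    # ... and inserts "b"
table.abort(2)
print(table.rows)  # {'a': 10}
```

The aborted transaction leaves nothing behind in `rows`, which mirrors the slide's point that dead versions are removed by undo rather than by a later VACUUM.
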
  6. zheap: Basic Operations
     • INSERT: Same as current heap, but in case of an abort, dead row versions will be removed immediately by undo, not at a later time by VACUUM.
     • DELETE: Same as current heap … mostly.
     • UPDATE: Very different! Whenever possible, we want to update the tuple “in place,” without storing a second version in the heap.
       – The old version of the tuple will be stored in the undo log.
       – In-place updates prevent bloat! The undo log may bloat, but it will shrink as soon as the relevant transactions are no longer running.
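
A minimal sketch of how an in-place UPDATE can coexist with readers that still need the old version, assuming each row points at a chain of undo records. `Row`, `UndoRecord`, `update_in_place`, and `read` are hypothetical names, and visibility is reduced to membership in a set of transaction IDs, which is far simpler than PostgreSQL's real snapshot rules.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class UndoRecord:
    value: str
    xid: int                      # transaction that wrote this older version
    prev: Optional["UndoRecord"]  # next-older version, if any


@dataclass
class Row:
    value: str
    xid: int                           # transaction that wrote the current value
    undo: Optional[UndoRecord] = None  # newest undo record for this row


def update_in_place(row: Row, new_value: str, xid: int) -> None:
    # Push the current version onto the undo chain, then overwrite the row.
    row.undo = UndoRecord(row.value, row.xid, row.undo)
    row.value, row.xid = new_value, xid


def read(row: Row, visible_xids: Set[int]) -> Optional[str]:
    """Return the newest version written by a transaction the reader can see."""
    if row.xid in visible_xids:
        return row.value
    rec = row.undo
    while rec is not None:
        if rec.xid in visible_xids:
            return rec.value
        rec = rec.prev
    return None  # no version is visible to this reader


row = Row("v1", xid=1)
update_in_place(row, "v2", xid=2)
print(read(row, {1}))     # v1  (reader that cannot see transaction 2)
print(read(row, {1, 2}))  # v2
```

The table itself holds a single version of the row; only the undo chain grows, and it can be discarded once no reader needs the older versions.
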
  7. Updates: “In Place” or not?
     • In the existing heap, every update is either HOT (no indexes updated) or non-HOT (insert into every index).
     • In zheap, every update is either in-place or not. At present, like a HOT update, an in-place update cannot modify any indexed columns.
     • An “in-place” update is significantly better than a HOT update because it does not require that the page contain adequate space for the entire new tuple.
       – We don’t need any extra space at all unless the new version of the tuple is wider than the old one.
     • In the future, we will also be able to perform in-place updates when indexed columns have been modified.
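
The space rules on slide 7 can be sketched as two predicates. This is a simplified model under stated assumptions: the function names and parameters are hypothetical, widths are in bytes, and per-tuple overhead such as line pointers is ignored.

```python
def can_update_hot(new_width: int, page_free: int, indexed_column_changed: bool) -> bool:
    # A HOT update stores a second tuple version on the same page,
    # so the page needs room for the entire new tuple.
    return not indexed_column_changed and new_width <= page_free


def can_update_in_place(old_width: int, new_width: int, page_free: int,
                        indexed_column_changed: bool) -> bool:
    # At present an in-place update cannot modify indexed columns either,
    # but it only needs space for however much the tuple grows.
    if indexed_column_changed:
        return False
    extra_needed = max(0, new_width - old_width)
    return extra_needed <= page_free


# A completely full page, with a same-width update of a non-indexed column:
print(can_update_hot(100, 0, False))            # False
print(can_update_in_place(100, 100, 0, False))  # True
```
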
  8. zheap: Index Support
     • Current version of zheap works without any changes to index access methods.
     • We plan to continue supporting the use of unmodified index access methods with zheap.
     • However, if indexes are modified to support “delete-marking,” we could do in-place updates even when indexed columns are modified.
     • When performing an in-place update, mark the old index entry as possibly-deleted, and insert a new one. No changes to indexes where the corresponding columns aren’t modified!
  9. zheap: Index Support (2)
     • In the current heap, each non-HOT update incurs one insert to every index.
     • With zheap, each in-place update will incur one insert and one delete-marking operation for each index, but only for indexes where the indexed columns are modified.
     • By removing the restriction that no indexed columns can be modified, we will be able to perform nearly all updates in place!
     • Only updates that expand the row so that it no longer fits on the page will need to be performed as not-in-place.
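
To make the index-cost comparison on slide 9 concrete, here is a small counting sketch. It assumes the planned delete-marking support exists and that the zheap update is performed in place; the function names are hypothetical and the counts are index operations per row update, not measured costs.

```python
def heap_index_ops(total_indexes: int, modified_indexes: int,
                   fits_on_page: bool = True) -> int:
    # HOT update: no indexed column changed and the new version fits on the page.
    if modified_indexes == 0 and fits_on_page:
        return 0
    # Non-HOT update: one insert into every index on the table.
    return total_indexes


def zheap_index_ops(modified_indexes: int) -> int:
    # In-place update with delete-marking: one insert plus one delete-mark,
    # but only for indexes whose columns were modified.
    return 2 * modified_indexes


# Table with 5 indexes, update touches a column covered by 1 of them:
print(heap_index_ops(5, 1), zheap_index_ops(1))  # 5 2
# Update touches columns covered by all 5 indexes:
print(heap_index_ops(5, 5), zheap_index_ops(5))  # 5 10
```

The second case shows why an update that modifies most or all indexed columns at once could end up slower under this scheme.
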
  10. Eliminating VACUUM
      • If all indexes on the table support delete-marking, maybe we don’t need VACUUM any more.
      • Remember, zheap pages don’t need to be hinted, frozen, etc. If there are leftover tuples, we can remove them when we want to reuse the space, rather than in advance.
      • Delete-marked index tuples can be removed “lazily”, perhaps when they are scanned, or when they are evicted from shared buffers. Index pages that are never accessed again might be bloated, but that doesn’t have much impact on performance.
  11. Eliminating VACUUM (2)
      • If we don’t VACUUM, we can’t ever “lose” free space. This will require changes to free space tracking.
        – UPDATE can create free space if the new row version is narrower than the old one, or if the update is not-in-place.
        – DELETE always creates free space.
        – In either case, the free space can’t be used until the transaction commits.
      • VACUUM could still be an option for users wanting to clean up more aggressively.
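
One way to picture the free-space rule on slide 11 is a per-page tracker that keeps freed bytes in a pending bucket until the freeing transaction commits. This is only a sketch under that assumption; `PageFreeSpace` and its methods are hypothetical and do not describe zheap's actual free space tracking.

```python
class PageFreeSpace:
    """Toy per-page tracker: freed space becomes usable only after commit."""

    def __init__(self, usable: int = 0):
        self.usable = usable  # bytes that new tuples may use right now
        self.pending = {}     # xid -> bytes freed by a still-open transaction

    def record_freed(self, xid: int, nbytes: int) -> None:
        # Called for a DELETE, a narrowing UPDATE, or a not-in-place UPDATE.
        self.pending[xid] = self.pending.get(xid, 0) + nbytes

    def commit(self, xid: int) -> None:
        # Only now can other transactions reuse the space.
        self.usable += self.pending.pop(xid, 0)

    def abort(self, xid: int) -> None:
        # Undo restores the old tuples, so the space was never really free.
        self.pending.pop(xid, None)


page = PageFreeSpace(usable=100)
page.record_freed(xid=7, nbytes=40)
print(page.usable)  # 100  (transaction 7 is still open)
page.commit(7)
print(page.usable)  # 140
```
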
  12. Performance Data – Test Setup
      • pgbench, scale factor 1000
      • Simple-update test (1 select, 1 insert, 1 update)
      • 64-bit Linux, x86, 2 sockets, 14 cores/socket, 64GB
      • shared_buffers=32GB, min_wal_size=15GB, max_wal_size=20GB, checkpoint_timeout=1200, maintenance_work_mem=1GB, checkpoint_completion_target=0.9, synchronous_commit=off
  13.
      • Initial size of the accounts table is 13GB in heap, but only 11GB in zheap.
      • heap grows to 19GB in the 8-client test and 26GB in the 64-client test. zheap stays at 11GB!
      • All the undo generated during the test is discarded within a few seconds after the open transaction ends.
      • TPS for zheap is ~40% higher than for heap in the above tests at 8 clients. On some other high-end machines, we have seen up to ~100% improvement for a similar test.
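
The table sizes quoted on slide 13 can be turned into relative growth with a few lines of arithmetic. The helper name `growth_pct` is made up for this example; the sizes are the ones stated on the slide.

```python
def growth_pct(initial_gb: float, final_gb: float) -> float:
    """Percentage growth of a table relative to its initial size."""
    return 100.0 * (final_gb - initial_gb) / initial_gb


print(round(growth_pct(13, 19), 1))  # 46.2   heap, 8 clients
print(round(growth_pct(13, 26), 1))  # 100.0  heap, 64 clients
print(round(growth_pct(11, 11), 1))  # 0.0    zheap, both tests
```

In other words, the heap table roughly doubles in the 64-client test while the zheap table does not grow at all.
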
  14. Benefits
      • Because zheap is smaller on-disk, we get a small performance boost.
      • No worries about VACUUM kicking in unexpectedly.
      • Undo bloat is self-healing: good for cloud or other “unattended” workloads.
      • In workloads where the heap bloats and zheap only bloats the undo, we get a massive performance boost.
      • Discarding undo happens in the background and is cheaper than HOT pruning; that helps, too!
  15. Drawbacks
      • Transaction abort will be more expensive.
      • Deletes might not perform as well.
      • Could be slow if most/all indexed columns are updated at the same time.
      • Huge amount of development work.
  16. Pluggable Storage: Plan
      • Allow PostgreSQL to support pluggable storage formats.
      • Allows innovation: major changes to the heap are impossible because everyone relies on it. Can’t go backwards for any use case!
      • Allows for user choice: if there are multiple storage formats available, pick the one that is best for your use case.
      • Hope to see this in PostgreSQL 12 (Fall 2019).
  17. Pluggable Storage: Examples
      • Columnar storage
        – Most queries don’t need all columns.
      • Write-once read-many (WORM)
        – No support for UPDATE, DELETE, or SELECT FOR UPDATE/SHARE.
      • Index-organized storage
        – One index is more important than all of the others.
      • In-memory storage
        – No need to spill to disk.
      • Non-transactional storage
        – No MVCC.
  18. When?
      • PostgreSQL 12 or 13
      • There will still be much more to do for “v2”.
  19. Who?
      zheap
      • Amit Kapila (development lead)
      • Dilip Kumar
      • Kuntal Ghosh
      • Mithun CY
      • Ashutosh Sharma
      • Rafia Sabih
      • Beena Emerson
      • Amit Khandekar
      • Thomas Munro
      Pluggable Storage
      • Haribabu Kommi (Fujitsu)
      • Alexander Korotkov (Postgres Pro)
      • Andres Freund
      • Ashutosh Bapat
  20. Thanks
      • Any Questions?