Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

PostgreSQL Logical Decoding


Published on

Logical decoding is one of the most prominent features of PostgreSQL 9.4 but, probably, not the most praised or known. It is the basis for in-core logical replication and many other replication solutions that will be coming to future PostgreSQL versions. But logical decoding is much more than that, it opens the doors to a whole new slew of applications that are not related, or do not require, replication.

Following PostgreSQL's extensibility tradition, logical decoding is pluggable (you can plug in code to determine what the output format is) and, most importantly, it is externally controllable. With this toolset, database DMLs can be streamed to third-party applicationss, enabling middleware solutions for data replication, filtering, modification, routing or interchange with different systems. A new world of data interchange and integration.

This talk explained how logical decoding works, what features does it have, and how to use it from external applications, diving deep into the techniques and the code. Real world examples would be provided and a discussion about new PostgreSQL-based middleware solutions would be started.

Published in: Technology

PostgreSQL Logical Decoding

  1. 1. Logical decoding Logical decoding: a new world of data exchange applications for PostgreSQL Álvaro Hernández <>
  2. 2. Logical Decoding ● Research & Development in Databases ● Consulting, Training and Support in PostgreSQL ● Founders of PostgreSQL España, 5th largest PUG in the world (500 members) ● Myself: Álvaro Hernández, 8Kdata CTO Twitter: @ahachete LinkedIn: About
  3. 3. Logical Decoding ● Extract changes (INSERT, UPDATE, DELETE) from PostgreSQL in a database-independent way ● Changes are per-row and (commit) ordered. Changes can be streamed from PostgreSQL ● Replicate the state of the database externally: ✔ Replication solutions ✔ Materialized databases ✔ Third-party data-processing applications Logical Decoding in 1 slide
  4. 4. Logical Decoding: How it works
  5. 5. Logical Decoding “Logical decoding is the process of extracting all persistent changes to a database’s tables into a coherent, easy to understand format which can be interpreted without detailed knowledge of the database’s internal state” ● Changes are decoded row by row, even if they were produced by a single command ● Unlogged and temp tables are not decoded Logical Decoding: the basics
  6. 6. Logical Decoding ● As of today, no DDL is decoded (empty tx may appear in the stream) ● Requires superuser or replication permissions ● Logical decoding works on logical replication slots: fine control the amount of WAL to be kept at the server Logical Decoding: the basics
  7. 7. Logical Decoding ● The output (database-independent representation of the change) format is controlled by an output plugin ● Loaded dynamically (shared library) ● Text or binary output ● A default one is in contrib ('test_decoding') Logical Decoding Plugins
  8. 8. Logical Decoding SQL ● Poll for changes ● SQL interface (function calls) ● Primarily meant for testing/debugging Logical Decoding interfaces Streaming Replication ● Changes are pushed by PostgreSQL ● Exports the snapshot while connection open ● Allows for synchronous (logical) replication
  9. 9. Logical Decoding ● wal_level = logical ● max_replication_slots = <x> ● If accessed over the replication interface: ➔ max_wal_senders = <y> ➔ Configure pg_hba.conf to allow replication ● synchronous_commit = on (optional; decoding starts as data is flushed) Logical Decoding configuration
  10. 10. Logical Decoding SELECT * FROM pg_create_logical_replication_slot ('<slot_name>', 'test_decoding'); -- do changes in the db SELECT * FROM pg_logical_slot_get_changes ('<slot_name>', null, null); SQL interface
  11. 11. Logical Decoding ● Obtain the changes: ➔ get vs peek (consume / not consume changes) ➔ {get,peek}_binary: output is bytea ● Output plugin options: control output format test_decoding options: ➔ …, 'include-timestamp', 'on', … ➔ …, 'include-xids', '1', ... ● Other arguments: upto_lsn, n_entries SQL interface
  12. 12. Logical Decoding ● Drop the slot when finished using it: select pg_drop_replication_slot('<slot_name>'); ● If slot is not consumed, all WAL since slot creation are retained! ● Also influences vacuum on catalog tables ● If logical decoding client crashes, your database may end up stopping if pg_xlog fills! Slots lifecycle
  13. 13. Logical Decoding ● pg_replication_slots View with information about both physical and logical replication slots ● pg_stat_replication View replication statistics (logical decoding only if connected via the replication interface) Checking status
  14. 14. Logical Decoding ● On UPDATE or DELETE the old row information is decoded depending on REPLICA IDENTITY: ➔ DEFAULT: values from PK (if any) ➔ FULL: all values ➔ NOTHING: none ➔ USING INDEX <index_name>: values covered by the index (not null, not partial) ● ALTER TABLE … REPLICA IDENTITY … REPLICA IDENTITY
  15. 15. Logical Decoding ● Send commands over replication protocol ● Or test with psql: psql "dbname=postgres replication=database" -c "CREATE_REPLICATION_SLOT slotname LOGICAL test_decoding <options>" ● Or use pg_recvlogical! pg_recvlogical --slot slotname --create -d db pg_recvlogical --slot slotname --start -f - -d db Replication interface
  16. 16. Logical Decoding: the door to a new world of data exchange and integration applications for PostgreSQL
  17. 17. Logical Decoding ● Logical Decoding is primarily used for replication (UDR, BDR, Slony dev branch, others?) ● But it is much more than that. Much more! ➔ Extract data from PostgreSQL, reproduce in other systems ➔ Create externally-controlled data applications ➔ Integrate with other systems (even PostgreSQL!), like in a sharding environment Data exchange, data integration
  18. 18. Logical Decoding ● test_decoding Text output, not easily parseable, but works. Included in contrib ● master/decoder_raw Decodes to SQL ● Decodes to protocol buffers Output plugins
  19. 19. Logical Decoding ● Decodes to JSON ● Decodes to avro, injects into Kafka! ● Your plugin? Bottled Water
  20. 20. Logical Decoding ● Logical Decoding is event sourcing for Postgres ● Extract your changes, process them and: ➔ Data changes queueing ➔ Cache synchronization ➔ Perform real-time processing (in-memory dbs) ➔ Materialize databases ➔ Audit systems ➔ (Logical) Replication! Event sourcing
  21. 21. Logical Decoding Example architecture Source:
  22. 22. Logical Decoding ● Implement the replication protocol ● Open replication connection. Get snapshot ● Open new SQL connection. SET TRANSACTION SNAPSHOT, repeatable read tx. Dump all data ● COPY BOTH (replication connection). Receive changes ● Send feedback to the server! External Logical Decoding 101
  23. 23. Logical Decoding ● libpq If you are on C, that's all you need ;) ● Python: Zalando's psycopg fork ● Java: 8Kdata's incoming Phoebe project (open source, but not published yet) Logical Decoding Clients
  24. 24. Logical Decoding ● PostgreSQL docs ● decoding-pgcon-2014-05-23.pdf ● g.pdf ● n_logirep.pdf More information