walbouncer: Filtering WAL
Hans-Jürgen Schönig
www.postgresql-support.de
Current status
WAL streaming is the basis for replication
Limitations:
Currently an entire database instance has to be replicated
There is no way to replicate single databases
WAL used to be hard to read
The goal
Create a “WAL-server” to filter the transaction log
Put walbouncer between the PostgreSQL master and the
“partial” slave
How is it done?
The basic structure of the WAL is actually fairly nice to
filter
Each WAL record that accesses a database has a RelFileNode:
database OID
tablespace OID
data file OID
What more do we need?
WAL format
WAL logically is a stream of records.
Each record is identified by position in this stream.
WAL is stored in 16MB files called segments.
Each segment is composed of 8KB WAL pages.
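The arithmetic behind these sizes can be sketched in a few lines of Python. This is only an illustration: it assumes the default 16MB segments and 8KB pages and the 24-hex-digit segment file naming of PostgreSQL 9.3 and later, and the helper names `parse_lsn` and `locate` are made up for this example.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default segment size (assumption)
WAL_PAGE_SIZE = 8 * 1024             # default WAL page size (assumption)


def parse_lsn(text):
    """Convert 'XXX/XXX' notation into a 64-bit integer position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def locate(lsn_text, timeline):
    """Return (segment file name, page number in segment, offset in page)."""
    lsn = parse_lsn(lsn_text)
    segno, seg_offset = divmod(lsn, WAL_SEGMENT_SIZE)
    segs_per_id = 0x100000000 // WAL_SEGMENT_SIZE
    name = "%08X%08X%08X" % (timeline, segno // segs_per_id, segno % segs_per_id)
    page, page_offset = divmod(seg_offset, WAL_PAGE_SIZE)
    return name, page, page_offset


if __name__ == "__main__":
    print(locate("0/3B7E910", 2))
```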
WAL pages
WAL pages have a header containing:
16-bit “magic” value
Flag bits
Timeline ID
Xlog position of this page
Length of data remaining from last record on previous page
Additionally, the first page of each segment has the following
information for correctness validation:
System identifier
WAL segment and block sizes
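A rough sketch of reading the short page header fields listed above. The field widths and the little-endian layout are assumptions based on the PostgreSQL 9.4 page header and should be checked against the server headers; the names `PageHeader` and `parse_page_header` are hypothetical, and the magic value in the example is a placeholder, not the real one.

```python
import struct
from collections import namedtuple

# Assumed layout: magic (16 bit), flags (16 bit), timeline (32 bit),
# page address (64 bit), remaining length (32 bit); little-endian, unpadded.
PAGE_HEADER = struct.Struct("<HHIQI")
PageHeader = namedtuple("PageHeader", "magic info timeline page_addr rem_len")


def parse_page_header(page):
    """Decode the header fields at the start of a WAL page."""
    return PageHeader(*PAGE_HEADER.unpack_from(page, 0))


def is_continuation(header):
    """Simplification: treat any remaining length as a continued record."""
    return header.rem_len > 0


if __name__ == "__main__":
    # Synthetic page; 0xABCD is a placeholder magic value.
    raw = PAGE_HEADER.pack(0xABCD, 1, 2, 0x3B7E000, 100) + bytes(8192 - PAGE_HEADER.size)
    print(parse_page_header(raw))
```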
Xlog record structure
Total length of record
Transaction ID that produced the record
Length of record specific data excluding header and backup
blocks
Flags
Record type (e.g. Xlog checkpoint, transaction commit, btree
insert)
Start position of previous record
Checksum of this record
Record specific data
Full page images
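The fixed part of a record can be decoded in the same way. The sketch below follows the field order on this slide; the exact widths, the two padding bytes and the byte order are assumptions based on the 9.4 record header and need to be verified before use. `RecordHeader` and `parse_record_header` are names invented for this example.

```python
import struct
from collections import namedtuple

# Assumed layout: total length, transaction ID, data length (32 bit each),
# flags and record type (8 bit each), 2 padding bytes,
# previous record position (64 bit), checksum (32 bit).
RECORD_HEADER = struct.Struct("<IIIBB2xQI")
RecordHeader = namedtuple("RecordHeader", "tot_len xid data_len info rmid prev crc")


def parse_record_header(buf, offset=0):
    """Decode the fixed-size header of the record starting at offset."""
    return RecordHeader(*RECORD_HEADER.unpack_from(buf, offset))


if __name__ == "__main__":
    # Synthetic record header with made-up values.
    raw = RECORD_HEADER.pack(64, 1234, 32, 0, 10, 0x3B7E8D0, 0)
    print(parse_record_header(raw))
```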
Handling WAL positions
WAL positions are highly critical
WAL addresses must not be changed
Addresses are stored in data page headers to decide whether replay is
necessary.
The solution:
inject dummy records into the WAL
Dummy records
PostgreSQL has infrastructure for dummy WAL entries
(basically “zero” values)
Valid WAL records can therefore be replaced with dummy
ones quite easily.
The slave will consume and ignore them
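Why a same-length dummy is needed can be shown with a toy model. The record kinds and lengths below are made up, and alignment is ignored; the point is only that dropping a record shifts every later position, while replacing it with a dummy of equal length does not.

```python
def positions(records, start=0):
    """Start position of each (kind, total_length) record laid out back to back."""
    result = []
    pos = start
    for _kind, length in records:
        result.append(pos)
        pos += length
    return result


def replace_with_dummy(records, index):
    """Swap one record for a dummy that keeps the same total length."""
    out = list(records)
    _kind, length = out[index]
    out[index] = ("noop", length)
    return out


def drop(records, index):
    """Remove one record entirely (this is what must NOT be done)."""
    return records[:index] + records[index + 1:]


if __name__ == "__main__":
    records = [("heap", 64), ("btree", 96), ("heap", 48)]  # hypothetical stream
    print(positions(records))
    print(positions(replace_with_dummy(records, 1)))
    print(positions(drop(records, 1)))
```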
Question: What to filter?
What about the shared catalog?
We have to replicate the shared catalog
This has some consequences:
The catalog might think that a database called X is around
but in fact files are missing.
This is totally desirable
Catalog issues
There is no reasonably simple way to filter the content of the
shared catalog (skip rows or so).
It is hardly possible to add semantics to the filtering
But, this should be fine for most users
If you want to access a missing element, you will simply get
an error (missing file, etc.).
Replication protocol
Streaming replication
Replication connections use the same wire protocol as regular
client connections.
However, they are serviced by special backend processes called
walsenders.
Walsenders are started by including “replication=true” in the
startup packet.
Replication connections support different commands from a
regular backend.
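For illustration, a version 3.0 startup packet carrying the replication parameter could be built as follows. This is a sketch of the wire format only (length, protocol version, NUL-terminated key/value pairs, final NUL); the user name and application name are example values, and `startup_packet` is a hypothetical helper.

```python
import struct

PROTOCOL_VERSION_3 = 196608  # 3 << 16, i.e. protocol 3.0


def startup_packet(params):
    """Build a startup packet from a dict of connection parameters."""
    body = struct.pack("!i", PROTOCOL_VERSION_3)
    for key, value in params.items():
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last pair
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


if __name__ == "__main__":
    packet = startup_packet({
        "user": "replicator",          # example value
        "replication": "true",
        "application_name": "slave1",  # example value
    })
    print(packet)
```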
Starting up replication 1/2
When a slave connects, it first executes IDENTIFY_SYSTEM:
postgres=# IDENTIFY_SYSTEM;
      systemid       | timeline |  xlogpos  | dbname
---------------------+----------+-----------+--------
 6069865756165247251 |        2 | 0/3B7E910 |
Then any necessary timeline history files are fetched:
postgres=# TIMELINE_HISTORY 2;
     filename     |                 content
------------------+------------------------------------------
 00000002.history | 1 0/3000090 no recovery target specified+
                  |
Starting up replication 2/2
Streaming of the write-ahead log is started by executing:
START_REPLICATION [SLOT slot_name] [PHYSICAL] XXX/XXX [TIMELINE tli]
START_REPLICATION switches the connection to a
bidirectional COPY mode.
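A small sketch of assembling that command from a numeric WAL position. The helper names `format_lsn` and `start_replication` are made up; only the command syntax comes from the slide.

```python
def format_lsn(lsn):
    """Render a 64-bit WAL position in XXX/XXX notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def start_replication(lsn, timeline=None, slot=None):
    """Build a physical START_REPLICATION command string."""
    parts = ["START_REPLICATION"]
    if slot is not None:
        parts += ["SLOT", slot]
    parts += ["PHYSICAL", format_lsn(lsn)]
    if timeline is not None:
        parts += ["TIMELINE", str(timeline)]
    return " ".join(parts)


if __name__ == "__main__":
    print(start_replication(0x3B7E910, timeline=2))
```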
Replication messages
Replication messages are embedded in protocol-level copy
messages. There are four types of messages:
XLogData (server -> client)
Keepalive message (server -> client)
Standby status update (client -> server)
Hot Standby feedback (client -> server)
To end replication either end can close the copying with a
CopyDone protocol message.
If the WAL stream was from an old timeline the server sends a
result set with the next timeline ID and start position upon
completion.
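The four message types are distinguished by the first byte of the copy payload. The sketch below classifies a payload and decodes the XLogData header (start position, current end of WAL, send time, all in network byte order); the function names are hypothetical and the other message bodies are not decoded.

```python
import struct

MESSAGE_NAMES = {
    b"w": "XLogData",
    b"k": "Keepalive",
    b"r": "Standby status update",
    b"h": "Hot Standby feedback",
}

# Type byte, WAL start, WAL end, send timestamp.
XLOG_DATA = struct.Struct("!cQQq")


def classify(payload):
    """Name the replication message carried in a copy payload."""
    return MESSAGE_NAMES.get(payload[:1], "unknown")


def parse_xlog_data(payload):
    """Return (start position, end position, WAL bytes) of an XLogData message."""
    kind, start, end, _sent = XLOG_DATA.unpack_from(payload, 0)
    if kind != b"w":
        raise ValueError("not an XLogData message")
    return start, end, payload[XLOG_DATA.size:]


if __name__ == "__main__":
    msg = XLOG_DATA.pack(b"w", 0x3B7E910, 0x3B7E990, 0) + b"\x00" * 8
    print(classify(msg), parse_xlog_data(msg)[:2])
```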
Other replication commands
START_REPLICATION ... LOGICAL ...
CREATE_REPLICATION_SLOT
DROP_REPLICATION_SLOT
BASE_BACKUP
Not supported by walbouncer yet.
Software design
Client connects to the WAL bouncer instead of the master.
WAL bouncer forks, connects to the master and streams xlog
to the client.
At this point the WAL proxy does not buffer data.
One connection to the master per client.
Software design - filtering
WAL stream is split into records.
Splitting works as a state machine consuming input and
immediately forwarding it. We only buffer when we need to
wait for a RelFileNode to decide whether we need to filter.
Based on record type we extract the RelFileNode from the
WAL record and decide if we want to filter it out.
If we want to filter, we replace the record with an XLOG_NOOP
record that has the same total length.
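The filtering decision might look roughly like this. It is a sketch, not walbouncer's code: the `Record` and `RelFileNode` tuples, the rule names and all OIDs are illustrative, and passing through records with database OID 0 (shared relations) is an assumption made so that the shared catalog is always replicated.

```python
from collections import namedtuple

RelFileNode = namedtuple("RelFileNode", "tablespace database relation")
Record = namedtuple("Record", "kind total_len node")


def wanted(node, include_tablespaces=None, exclude_databases=()):
    """Decide whether a record touching this RelFileNode should be kept."""
    if node.database == 0:
        return True  # assumption: shared catalog, always replicated
    if node.database in exclude_databases:
        return False
    if include_tablespaces is not None and node.tablespace not in include_tablespaces:
        return False
    return True


def filter_record(record, **rules):
    """Keep the record, or replace it with a no-op of the same total length."""
    if record.node is None or wanted(record.node, **rules):
        return record
    return Record("noop", record.total_len, None)


if __name__ == "__main__":
    rec = Record("heap", 128, RelFileNode(16400, 16384, 24576))  # made-up OIDs
    print(filter_record(rec, exclude_databases={16384}))
```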
Synchronizing record decoding
Starting streaming is hard: the client may request streaming from
the middle of a record.
We have a state machine for synchronization.
Determine if we are in a continuation record from WAL page
header.
If we are, stream data until we have buffered the next record
header.
From next record header we read the previous record link, then
restart decoding from that position.
Once we hit the requested position, we stream the filtered data out
to the client.
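One step of this synchronization, finding where the first complete record on a page begins, can be sketched as below. The header size is passed in as a parameter because it differs between page types; the value 24 used in the example and the 8-byte alignment are assumptions, and `first_record_start` is a hypothetical name.

```python
def maxalign(n, alignment=8):
    """Round n up to the next multiple of the alignment."""
    return (n + alignment - 1) & ~(alignment - 1)


def first_record_start(page_addr, header_size, rem_len, page_size=8192):
    """Position of the first record beginning on this page.

    rem_len is the length of data remaining from the previous page's last
    record. Returns None if that record fills the rest of the page.
    """
    offset = maxalign(header_size + rem_len)
    if offset >= page_size:
        return None
    return page_addr + offset


if __name__ == "__main__":
    # Assumed 24-byte page header, 100 bytes continued from the previous page.
    print(hex(first_record_start(0x3B7E000, 24, 100)))
```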
Using WAL bouncer
Streaming
At this point we did not want to take on the complexity of
implementing buffering.
Getting right what we have is already an important step.
How to set things up
First of all, an initial base backup is needed.
The easiest tool for this is rsync.
Skip all directories in “base” that do not belong to your
desired setup.
pg_database is needed to figure out what to skip.
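As a sketch of that step, the exclude options for rsync can be derived from a name-to-OID mapping read from pg_database. The database names and OIDs below are invented, the helper `rsync_excludes` is hypothetical, and the patterns assume rsync is run with the data directory itself as the source root.

```python
def rsync_excludes(databases, keep):
    """Build --exclude options for every database directory not in keep.

    databases maps database name -> OID, as read from pg_database.
    """
    return [
        "--exclude=/base/%d/" % oid
        for name, oid in sorted(databases.items())
        if name not in keep
    ]


if __name__ == "__main__":
    # Hypothetical contents of pg_database.
    databases = {"postgres": 12141, "template1": 1, "app": 16384, "test": 16385}
    print(rsync_excludes(databases, keep={"postgres", "template1", "app"}))
```

The template and maintenance databases are kept in this example; which ones are needed depends on the desired setup.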
Start streaming
Once you have the initial copy, set up streaming replication just
as if you had a “normal” master.
Use the address of the walbouncer in your primary_conninfo.
You will not notice any difference during replication.
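For a 9.4 slave this amounts to a recovery.conf that points at the walbouncer rather than the master. A minimal sketch that generates such a file follows; the host, port and application name are example values, and `recovery_conf` is a hypothetical helper.

```python
def recovery_conf(host, port, application_name):
    """Return recovery.conf contents for a standby streaming via walbouncer."""
    conninfo = "host=%s port=%d application_name=%s" % (host, port, application_name)
    return "standby_mode = 'on'\nprimary_conninfo = '%s'\n" % conninfo


if __name__ == "__main__":
    # Example values: walbouncer listening on localhost:5433.
    print(recovery_conf("localhost", 5433, "slave1"), end="")
```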
Important aspects
If you use walbouncer, take the usual streaming replication
precautions:
enough wal_keep_segments or use
take care of conflicts (hot_standby_feedback)
etc.
You cannot promote a slave that has filtered data to master.
Adding databases
List of database OIDs to filter out is only fetched at
walbouncer backend startup.
Disconnect any streaming slaves and reconfigure filtering while
you are creating the database.
If you try to do it online, the slaves will not know to filter
the new database.
Dropping databases
Have all slaves that want to filter out the dropped database
actively streaming before you execute the drop.
Otherwise the slaves will not know to skip the drop record and
xlog replay will fail with an error.
Can we filter on individual tables?
For filtering individual tables you can use the tablespace filtering
functionality.
The same caveats apply for adding and removing tablespaces as for
databases.
You can safely add/remove tables in filtered tablespaces and
even move tables between filtered/non-filtered tablespaces.
Bonus feature
You can use walbouncer to switch the master server without
restarting the slave.
Limitations
Currently PostgreSQL 9.4 only.
No SSL support yet.
Simple configuration
A sample config file (1)
listen_port: 5433
master:
    host: localhost
    port: 5432
configurations:
    - slave1:
        match:
            application_name: slave1
        filter:
            include_tablespaces: [spc_slave1]
            exclude_databases: [test]
A sample config file (2)
    - slave2:
        match:
            application_name: slave2
        filter:
            include_tablespaces: [spc_slave2]
application_name
The application name is needed to support synchronous
replication as well as better monitoring.
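How a connecting slave could be matched to its filter rules can be sketched with the sample configuration expressed as a Python structure. This only illustrates the matching idea; the structure, the `select_filter` helper and the behaviour for unmatched clients (returning None) are assumptions, not walbouncer's actual implementation.

```python
# The sample configuration from the slides, written as plain Python data.
CONFIGURATIONS = [
    {
        "name": "slave1",
        "match": {"application_name": "slave1"},
        "filter": {"include_tablespaces": ["spc_slave1"],
                   "exclude_databases": ["test"]},
    },
    {
        "name": "slave2",
        "match": {"application_name": "slave2"},
        "filter": {"include_tablespaces": ["spc_slave2"]},
    },
]


def select_filter(startup_params, configurations=CONFIGURATIONS):
    """Return the filter of the first configuration whose match rules all apply."""
    for conf in configurations:
        if all(startup_params.get(k) == v for k, v in conf["match"].items()):
            return conf["filter"]
    return None  # assumption: no matching configuration found


if __name__ == "__main__":
    print(select_filter({"application_name": "slave2", "replication": "true"}))
```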
include_tablespaces / exclude_tablespaces
A good option to exclude entire groups of databases.
In a roundabout way, this can also be used to filter on tables.
Needless to say, you should not.
Finally
Where can we download the stuff?
For download and more information visit:
www.postgresql-support.de/walbouncer/
Thank you for your attention
Any questions?
Contact
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
www.postgresql-support.de
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

Walbouncer: Filtering PostgreSQL transaction log

  • 1. walbouncer: Filtering WAL. Hans-Jürgen Schönig, www.postgresql-support.de
  • 2. walbouncer: Filtering WAL
  • 3. Current status: WAL streaming is the basis for replication. Limitations: currently an entire database instance has to be replicated; there is no way to replicate single databases; WAL used to be hard to read.
  • 4. The goal: create a “WAL server” to filter the transaction log. Put walbouncer between the PostgreSQL master and the “partial” slave.
  • 5. How is it done? The basic structure of the WAL is actually fairly nice to filter. Each WAL record that accesses a database has a RelFileNode: database OID, tablespace OID, data file OID. What more do we need?
  • 7. WAL format: WAL is logically a stream of records. Each record is identified by its position in this stream. WAL is stored in 16MB files called segments. Each segment is composed of 8KB WAL pages.
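The arithmetic behind slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not walbouncer code: the function names are made up for this example, and the 24-hex-digit segment file naming is an assumption taken from how PostgreSQL names WAL files, not from the slides.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024   # 16MB segments, as on the slide
WAL_PAGE_SIZE = 8 * 1024              # 8KB WAL pages, as on the slide
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def parse_lsn(text):
    """Convert the textual 'hi/lo' hex form into a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def locate(lsn, timeline):
    """Return (segment file name, page number in segment, offset in page)."""
    segno, seg_offset = divmod(lsn, WAL_SEGMENT_SIZE)
    page_no, page_offset = divmod(seg_offset, WAL_PAGE_SIZE)
    # Assumed naming scheme: timeline, then the segment number split in two.
    name = "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )
    return name, page_no, page_offset
```

For example, `locate(parse_lsn("0/3B7E910"), 2)` places the position from the IDENTIFY_SYSTEM output on slide 16 in the fourth segment of timeline 2.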
  • 8. WAL pages: WAL pages have a header containing: a 16-bit “magic” value; flag bits; the timeline ID; the xlog position of this page; the length of data remaining from the last record on the previous page. Additionally, the first page of each segment carries the following information for correctness validation: the system identifier; the WAL segment and block sizes.
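As a rough sketch, the page header fields listed on slide 8 could be decoded like this. The exact byte layout, the 24-byte padded header size, and the flag values are assumptions for a little-endian PostgreSQL 9.4 build; the slides only name the fields, so check the server headers before relying on any of it.

```python
import struct

# Assumed layout: magic, flag bits, timeline ID, page address, remaining length.
SHORT_HEADER = struct.Struct("<HHIQI")
# Assumed extra fields on the first page of a segment:
# system identifier, segment size, block size.
LONG_EXTRA = struct.Struct("<QII")
SHORT_HEADER_SIZE = 24            # assumed: 20 bytes of fields padded to 24
XLP_FIRST_IS_CONTRECORD = 0x0001  # assumed flag: page starts mid-record
XLP_LONG_HEADER = 0x0002          # assumed flag: long header present


def parse_page_header(page):
    """Decode the header fields at the start of one WAL page."""
    magic, info, tli, pageaddr, rem_len = SHORT_HEADER.unpack_from(page, 0)
    header = {
        "magic": magic,
        "timeline": tli,
        "page_address": pageaddr,
        "remaining_length": rem_len,
        "starts_with_continuation": bool(info & XLP_FIRST_IS_CONTRECORD),
        "long_header": bool(info & XLP_LONG_HEADER),
    }
    if header["long_header"]:
        sysid, seg_size, block_size = LONG_EXTRA.unpack_from(page, SHORT_HEADER_SIZE)
        header.update(system_identifier=sysid, segment_size=seg_size, block_size=block_size)
    return header
```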
  • 9. Xlog record structure: total length of the record; transaction ID that produced the record; length of the record-specific data, excluding header and backup blocks; flags; record type (e.g. xlog checkpoint, transaction commit, btree insert); start position of the previous record; checksum of this record; record-specific data; full page images.
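The record header from slide 9 can be read in the same way. The field order follows the slide; the field widths, the two padding bytes, and the split of "flags" and "record type" into an info byte plus a resource manager ID are assumptions about the 9.4-era on-disk format, and the Python names are hypothetical.

```python
import struct
from collections import namedtuple

# Assumed layout: total length, xid, data length, info byte, resource manager
# ID, two padding bytes, previous record position, checksum.
RECORD_HEADER = struct.Struct("<IIIBB2xQI")

RecordHeader = namedtuple(
    "RecordHeader", "total_length xid data_length info rmgr_id prev_position crc"
)


def parse_record_header(buf, offset=0):
    """Decode one record header starting at the given offset."""
    return RecordHeader(*RECORD_HEADER.unpack_from(buf, offset))
```

The `prev_position` field is what lets a reader walk backwards to the start of the previous record, which matters later when synchronizing on a stream that starts mid-record.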
  • 10. Handling WAL positions: WAL positions are highly critical. WAL addresses must not be changed, because addresses are stored in data page headers to decide whether replay is necessary. The solution: inject dummy records into the WAL.
  • 11. Dummy records: PostgreSQL has infrastructure for dummy WAL entries (basically “zero” values). Valid WAL records can therefore be replaced with dummy ones quite easily. The slave will consume and ignore them.
  • 12. Question: What to filter? What about the shared catalog? We have to replicate the shared catalog. This has some consequences: the catalog might think that a database called X is around while in fact its files are missing. This is totally desirable.
  • 13. Catalog issues: There is no reasonably simple way to filter the content of the shared catalog (skip rows or so). It is hardly possible to add semantics to the filtering. But this should be fine for most users. If you try to access a missing element, you will simply get an error (missing file, etc.).
  • 15. Streaming replication: Replication connections use the same wire protocol as regular client connections. However, they are serviced by special backend processes called walsenders. Walsenders are started by including “replication=true” in the startup packet. Replication connections support different commands from a regular backend.
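To make the startup packet on slide 15 concrete, here is a minimal sketch that builds one by hand, assuming protocol version 3.0 framing (length, version, then NUL-terminated key/value pairs). The function name and the `replicator` user are hypothetical; a real client would normally let libpq do this.

```python
import struct

PROTOCOL_VERSION_3_0 = 3 << 16  # major version 3, minor version 0


def build_startup_packet(params):
    """Serialize a startup message from a dict of string parameters."""
    body = struct.pack("!I", PROTOCOL_VERSION_3_0)
    for key, value in params.items():
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last key/value pair
    # The length field counts itself as well as the body.
    return struct.pack("!I", len(body) + 4) + body


packet = build_startup_packet({"user": "replicator", "replication": "true"})
```

The only difference from a regular client connection is the `replication` parameter, which is what makes the server hand the connection to a walsender.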
  • 16. Starting up replication 1/2: When a slave connects, it first executes IDENTIFY_SYSTEM:
        postgres=# IDENTIFY_SYSTEM;
              systemid       | timeline |  xlogpos  | dbname
        ---------------------+----------+-----------+--------
         6069865756165247251 |        2 | 0/3B7E910 |
    Then any necessary timeline history files are fetched:
        postgres=# TIMELINE_HISTORY 2;
             filename     |             content
        ------------------+----------------------------------
         00000002.history | 1 0/3000090 no recovery target specified+
                          |
  • 17. Starting up replication 2/2: Streaming of the write-ahead log is started by executing: START_REPLICATION [SLOT slot_name] [PHYSICAL] XXX/XXX [TIMELINE tli]. START_REPLICATION switches the connection to a bidirectional COPY mode.
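The command syntax on slide 17 can be assembled from a numeric WAL position as follows. This is only a string-building sketch; the helper names are made up for illustration, and the command still has to be sent over a replication connection.

```python
def format_lsn(lsn):
    """Render a 64-bit WAL position in the XXX/XXX hex form."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def start_replication_command(start_lsn, timeline=None, slot=None):
    """Build a physical START_REPLICATION command following the slide's syntax."""
    parts = ["START_REPLICATION"]
    if slot is not None:
        parts += ["SLOT", slot]
    parts += ["PHYSICAL", format_lsn(start_lsn)]
    if timeline is not None:
        parts += ["TIMELINE", str(timeline)]
    return " ".join(parts)
```

With the values returned by IDENTIFY_SYSTEM on slide 16, `start_replication_command(0x3B7E910, timeline=2)` produces the command a slave would issue next.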
  • 18. Replication messages: Replication messages are embedded in protocol-level copy messages. There are 4 types of messages: XLogData (server -> client); keepalive message (server -> client); standby status update (client -> server); hot standby feedback (client -> server). To end replication, either end can close the copying with a CopyDone protocol message. If the WAL stream was from an old timeline, the server sends a result set with the next timeline ID and start position upon completion.
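A hedged sketch of how the payloads of these copy messages might be decoded and built. The one-byte type tags (`w`, `k`, `r`) and the field order are assumptions based on the documented streaming replication protocol, not on the slides; timestamps are assumed to be microseconds since 2000-01-01, and the function names are hypothetical.

```python
import struct

XLOG_DATA = struct.Struct("!cQQq")        # 'w', start position, server WAL end, send time
KEEPALIVE = struct.Struct("!cQqB")        # 'k', server WAL end, send time, reply requested
STATUS_UPDATE = struct.Struct("!cQQQqB")  # 'r', written, flushed, applied, time, reply requested


def parse_server_message(payload):
    """Decode the payload of one CopyData message sent by the server."""
    kind = payload[:1]
    if kind == b"w":
        _, start, wal_end, sent = XLOG_DATA.unpack_from(payload)
        return {"type": "XLogData", "start": start, "wal_end": wal_end,
                "wal": payload[XLOG_DATA.size:]}
    if kind == b"k":
        _, wal_end, sent, reply = KEEPALIVE.unpack_from(payload)
        return {"type": "Keepalive", "wal_end": wal_end, "reply_requested": bool(reply)}
    raise ValueError("unexpected message type %r" % kind)


def build_status_update(written, flushed, applied, timestamp, request_reply=False):
    """Build the payload of a standby status update sent by the client."""
    return STATUS_UPDATE.pack(b"r", written, flushed, applied, timestamp, int(request_reply))
```

A proxy such as the one described in this talk only needs to rewrite the `wal` bytes of XLogData messages; the other message types can be relayed as they are.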
  • 19. Other replication commands: START_REPLICATION ... LOGICAL ..., CREATE_REPLICATION_SLOT, DROP_REPLICATION_SLOT, BASE_BACKUP. These are not supported by walbouncer yet.
  • 20. Software design: The client connects to the WAL bouncer instead of the master. The WAL bouncer forks, connects to the master and streams xlog to the client. At this point the WAL proxy does not buffer data. There is one connection to the master per client.
  • 21. Software design - filtering: The WAL stream is split into records. Splitting works as a state machine that consumes input and immediately forwards it. We only buffer when we need to wait for a RelFileNode to decide whether we need to filter. Based on the record type, we extract the RelFileNode from the WAL record and decide whether we want to filter it out. If we want to filter, we replace the record with an XLOG_NOOP that has the same total length.
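The key property of this design is that a filtered record is replaced by a placeholder of identical length, so every later record keeps its WAL position. The sketch below demonstrates that property on a deliberately simplified toy format (a length plus a database OID, followed by a payload). It is not the real WAL record layout, and zero-filling only stands in for a proper XLOG_NOOP record with a valid header and checksum.

```python
import struct

HEADER = struct.Struct("<II")  # toy header: total record length, database OID


def encode(db_oid, payload):
    """Encode one toy record."""
    return HEADER.pack(HEADER.size + len(payload), db_oid) + payload


def filter_stream(stream, excluded_dbs):
    """Replace records of excluded databases with same-length dummy records."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        total_len, db_oid = HEADER.unpack_from(stream, pos)
        record = stream[pos:pos + total_len]
        if db_oid in excluded_dbs:
            # Keep the length, blank out everything else.
            record = HEADER.pack(total_len, 0) + bytes(total_len - HEADER.size)
        out += record
        pos += total_len
    return bytes(out)


stream = encode(16384, b"keep") + encode(16385, b"secret!") + encode(16384, b"tail")
filtered = filter_stream(stream, {16385})
```

After filtering, `filtered` has the same length as `stream`, and the third record starts at the same offset as before even though the second one no longer carries any data.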
  • 22. Synchronizing record decoding: Starting streaming is hard. The client may request streaming from the middle of a record. We have a state machine for synchronization. Determine from the WAL page header whether we are in a continuation record. If we are, stream data until we have buffered the next record header. From the next record header we read the previous-record link, then restart decoding from that position. Once we hit the requested position, stream the filtered data out to the client.
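The first step of that synchronization, finding where the next record header begins on a page, can be sketched as below. The header sizes (24 and 40 bytes) and the 8-byte record alignment are assumptions about the PostgreSQL 9.4 format, and the function names are hypothetical; the real state machine in walbouncer handles more cases than this.

```python
WAL_PAGE_SIZE = 8192
SHORT_HEADER_SIZE = 24  # assumed size of a regular page header
LONG_HEADER_SIZE = 40   # assumed size of the header on a segment's first page
MAXALIGN = 8            # assumed record alignment


def align_up(n, alignment=MAXALIGN):
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def page_start(lsn):
    """Return the position of the page that contains lsn."""
    return lsn - lsn % WAL_PAGE_SIZE


def first_record_start(page_addr, rem_len, long_header=False):
    """Return where the first new record header on this page begins.

    rem_len is the page header's count of bytes still belonging to a record
    that started on an earlier page. Returns None when that record fills or
    runs past this page, so the search has to continue on the next page.
    """
    header = LONG_HEADER_SIZE if long_header else SHORT_HEADER_SIZE
    start = page_addr + header + align_up(rem_len)
    if start >= page_addr + WAL_PAGE_SIZE:
        return None
    return start
```

From the record header found at that position, the previous-record link points back to the start of the record the client landed in, which is where decoding can restart.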
  • 23. Using WAL bouncer
  • 24. Streaming: At this point we did not want to take on the complexity of implementing buffering. Getting right what we have is already an important step.
  • 25. How to set things up: First of all, an initial base backup is needed. The easiest thing here is rsync. Skip all directories in “base” that do not belong to your desired setup. pg_database is needed to figure out what to skip.
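One way to turn that advice into an rsync invocation is sketched below. The OIDs would come from a query such as `SELECT oid, datname FROM pg_database` on the master; the function name, paths, and OID values are hypothetical, and the usual base backup precautions still apply around the copy itself.

```python
def rsync_command(source, target, all_db_oids, wanted_db_oids):
    """Build an rsync argument list that skips unwanted directories in base/."""
    skipped = sorted(set(all_db_oids) - set(wanted_db_oids))
    cmd = ["rsync", "-a"]
    # Patterns are anchored at the data directory, which is the transfer root
    # because the source path ends with a slash.
    cmd += ["--exclude=/base/%d/" % oid for oid in skipped]
    cmd += [source.rstrip("/") + "/", target]
    return cmd


cmd = rsync_command(
    "/var/lib/pgsql/data",
    "standby:/var/lib/pgsql/data/",
    all_db_oids=[1, 12000, 16384, 16385],
    wanted_db_oids=[1, 12000, 16384],
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping the template and maintenance databases in the wanted set is a choice made for this example.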
  • 26. Start streaming: Once you have the initial copy, set up streaming replication just as if you had a “normal” master. Use the address of the walbouncer in your primary_conninfo. You will not notice any difference during replication.
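For PostgreSQL 9.4, the slave's settings live in recovery.conf. The following sketch renders such a file with the walbouncer address in primary_conninfo; the host name, user, and helper name are placeholders, and port 5433 matches the listen_port used in the sample config later in the talk.

```python
def recovery_conf(bouncer_host, bouncer_port, user, application_name):
    """Render a minimal recovery.conf that points the slave at walbouncer."""
    conninfo = "host=%s port=%d user=%s application_name=%s" % (
        bouncer_host, bouncer_port, user, application_name)
    return "standby_mode = 'on'\nprimary_conninfo = '%s'\n" % conninfo


text = recovery_conf("bouncer.example.com", 5433, "replicator", "slave1")
```

Setting application_name here matters because the sample configuration selects the filter rules for each slave by matching on it.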
  • 27. Important aspects: If you use walbouncer, take the usual streaming replication precautions: keep enough wal_keep_segments, take care of conflicts (hot_standby_feedback), etc. You cannot promote a slave that has filtered data to master.
  • 28. Adding databases: The list of database OIDs to filter out is only fetched at walbouncer backend startup. Disconnect any streaming slaves and reconfigure filtering while you are creating the database. If you try to do it online, the slaves will not know to filter the new database.
  • 29. Dropping databases: Have all slaves that want to filter out the dropped database actively streaming before you execute the drop. Otherwise the slaves will not know to skip the drop record, and xlog replay will fail with an error.
  • 30. Can we filter on individual tables? For filtering individual tables you can use the tablespace filtering functionality. The same caveats apply to adding and removing tablespaces as to databases. You can safely add or remove tables in filtered tablespaces and even move tables between filtered and non-filtered tablespaces.
  • 31. Bonus feature: You can use walbouncer to switch the master server without restarting the slave.
  • 32. Limitations: Currently PostgreSQL 9.4 only. No SSL support yet.
  • 34. A sample config file (1):
        listen_port: 5433
        master:
            host: localhost
            port: 5432
        configurations:
            - slave1:
                match:
                    application_name: slave1
                filter:
                    include_tablespaces: [spc_slave1]
                    exclude_databases: [test]
  • 35. A sample config file (2):
            - slave2:
                match:
                    application_name: slave2
                filter:
                    include_tablespaces: [spc_slave2]
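How such a configuration might drive the filtering decision is sketched below, with the sample config written as a Python dict. The matching and filtering semantics shown here are assumptions made for illustration: the slides do not spell out how include and exclude lists combine, walbouncer itself works on OIDs rather than names, and shared catalogs are always passed through.

```python
CONFIG = {
    "listen_port": 5433,
    "master": {"host": "localhost", "port": 5432},
    "configurations": [
        {"slave1": {
            "match": {"application_name": "slave1"},
            "filter": {"include_tablespaces": ["spc_slave1"],
                       "exclude_databases": ["test"]},
        }},
        {"slave2": {
            "match": {"application_name": "slave2"},
            "filter": {"include_tablespaces": ["spc_slave2"]},
        }},
    ],
}


def find_filter(config, application_name):
    """Return the filter section of the first configuration matching the client."""
    for entry in config["configurations"]:
        for section in entry.values():
            if section["match"].get("application_name") == application_name:
                return section["filter"]
    return None


def is_filtered_out(rules, database, tablespace):
    """Assumed semantics: drop excluded databases and non-included tablespaces."""
    if database in rules.get("exclude_databases", []):
        return True
    include = rules.get("include_tablespaces")
    if include is not None and tablespace not in include:
        return True
    return False
```

For instance, `is_filtered_out(find_filter(CONFIG, "slave1"), "test", "spc_slave1")` is true under these assumed rules, because the database is excluded even though the tablespace is included.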
  • 36. application_name: The application name is needed to support synchronous replication as well as better monitoring.
  • 37. include_tablespaces / exclude_tablespaces: A good option to exclude entire groups of databases. In a perverted way this can be used to filter on tables. No need to mention that you should not.
  • 39. Where can we download the stuff? For download and more information visit: www.postgresql-support.de/walbouncer/
  • 40. Thank you for your attention. Any questions?
  • 41. Contact: Cybertec Schönig & Schönig GmbH, Gröhrmühlgasse 26, A-2700 Wiener Neustadt, www.postgresql-support.de