Postgres Vienna DB Meetup 2014

Short overview of PostgreSQL in 2014 and assorted replication methods.

Published in: Technology

  1. PostgreSQL... awesome?
     So, Philip gave me the title for the talk and I have to run with it! ;)
  2. Michael Renner
     @terrorobe
     https://pganalyze.com
     My name is Michael Renner. The Twitter handle has already gotten me into trouble. Web operations, strong interest in databases, scaling, and performance. PG enthusiast since 2004. If you've got questions, please just ask!
  3. Quick poll!
     Who's using Postgres?
     ...with replication?
  4. Postgres. A free RDBMS done right
     RDBMS: relational database management system. It does SELECT, INSERT, UPDATE, DELETE in a sane & maintainable way. Tries hard not to surprise users; hype-resistant.
  5. Community-driven.
     No single commercial entity is behind the project. Developers from multiple consulting companies, distros, and large companies are core developers and have commit access.
  6. One major release per year
     Five years maintenance
     Multiple maintenance releases per year
     Does...
  7. Friendly & Competent Community
     • http://www.postgresql.org/list/
     • Freenode: #postgresql(-de)
     • http://pgconf.(de|eu|us)
     More often than not, consultants from various companies are hanging out in the channels.
  8. 9.4 ante portas
     ~Sep 2014
     http://www.postgresql.org/docs/devel/static/release-9-4.html
     That being said, the next major release will come after the summer; extrapolating from past releases, it should be here around September. It'll bring quite a few new features; I selected a few interesting ones.
  9. "ordered-set aggregate functions"
     http://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
     ...are aggregate functions over ordered sets! Aggregate functions are things like sum or count, which can operate on arbitrary, unordered sets of data. If the set is ordered you can do additional things like...
  10. Calculate 95th percentile
          postgres=# SELECT percentile_disc(0.95) WITHIN GROUP (ORDER BY i)
                     FROM generate_series(1,100) AS s(i);
           percentile_disc
          -----------------
                        95
          (1 row)
      ...calculate percentiles.
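
To make the "discrete percentile" idea concrete outside the database, here is a minimal Python sketch. It assumes the usual definition (the first value in sort order whose position reaches the requested fraction); the function name simply mirrors the SQL aggregate and is not a Postgres API.

```python
import math


def percentile_disc(values, fraction):
    """Return the first sorted value whose position reaches `fraction`."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)      # plays the role of WITHIN GROUP (ORDER BY ...)
    if not ordered:
        return None               # an empty group yields NULL in SQL
    # Smallest 1-based rank k with k / n >= fraction (rank 1 for fraction 0).
    rank = max(math.ceil(fraction * len(ordered)), 1)
    return ordered[rank - 1]


print(percentile_disc(range(1, 101), 0.95))
```

Unlike an interpolating percentile, the result is always a value that actually occurs in the input.
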
  11. json(b)
      http://www.postgresql.org/docs/devel/static/datatype-json.html
      Most importantly: a native datatype with jsonb. In the past, the json type stored only text that had been validated as correct JSON. Now there is a separate on-disk representation. It is a bit more expensive to write (serialization) but much faster to query, since the JSON doesn't need to be reparsed on each access.
  12. New JSON functions
          $ SELECT * FROM json_to_recordset(
              '[
                {"name":"e","value":2.718},
                {"name":"pi","value":3.141},
                {"name":"tau","value":6.283}
              ]', TRUE) AS x (name text, value numeric);
           name | value
          ------+-------
           e    | 2.718
           pi   | 3.141
           tau  | 6.283
          (3 rows)
      http://www.postgresql.org/docs/devel/static/functions-json.html
      http://www.depesz.com/2014/01/30/waiting-for-9-4-new-json-functions/
      ...and to complement the new data type, there are also new accessor functions.
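
For readers who think in application code, the following Python sketch imitates what the query above does: it expands a JSON array of objects into typed rows. The helper name and the `(name, cast)` column format are made up for illustration; this is not how Postgres implements the function.

```python
import json
from decimal import Decimal


def json_to_recordset(document, columns):
    """Turn a JSON array of objects into rows of typed column values."""
    # parse_float=Decimal keeps 2.718 exact, similar in spirit to numeric.
    objects = json.loads(document, parse_float=Decimal)
    rows = []
    for obj in objects:
        # Keys missing from an object become None, like NULL in SQL.
        rows.append(tuple(
            None if obj.get(name) is None else cast(obj[name])
            for name, cast in columns
        ))
    return rows


doc = ('[{"name":"e","value":2.718},'
       '{"name":"pi","value":3.141},'
       '{"name":"tau","value":6.283}]')
rows = json_to_recordset(doc, [("name", str), ("value", Decimal)])
for row in rows:
    print(row)
```
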
  13. Replication features...
      ...covered later
      And quite a few new replication features, which we'll cover later.
  14. Database Replication
      Which brings us right up to replication.
  15. A tale of sorrows
      or: "Brewer hates us"
      If you've got a strong stomach, read through: http://aphyr.com/tags/jepsen
      It is a tale of sorrows, and it is not limited to Postgres or SQL databases. Getting distributed database systems right is _HARD_. Even the distributed database poster children get it wrong.
  16. Brewer's CAP Theorem
      • It is impossible for a distributed system to simultaneously provide these guarantees:
        • Consistency
        • Availability
        • Partition tolerance
      In a nutshell:
      Consistency: all nodes see the same data at the same time.
      Availability: a guarantee that every request receives a response about whether it was successful or failed.
      Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.
      Brewer says it's impossible to get all three. Managers like things available & partition tolerant.
  17. PG Mantra: Scale up, not out
      Postgres, in the past, solved this problem by not dealing with it in the first place! So that we don't have to bother with this, most people will tell you to just scale up: throw more/bigger hardware at the problem and be done with it.
  18. Real world says: "NO"
      But that's not always possible. You might need geo-redundant database servers, or you might run in an environment where "scaling up" is not a feasible option (hello EC2!).
  19. So we need replication. What are our options?
      So we need replication... Postgres has a bit of a Perl problem: TMTOWTDI (there's more than one way to do it).
  20. Shared storage
      ...one of the oldest options.
      Usually achieved by using a SAN or DRBD, with an HA solution tacked on top of it: if one server goes down, the other starts up.
  21. Trigger-based
      • Add a trigger to all replicated tables
      • Changes get written to a separate table
      • Daemon reads changes from source DB and writes to destination DB
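
The three steps of trigger-based replication can be sketched end to end with Python's built-in sqlite3 module standing in for two Postgres servers. Everything here (the `items` table, the `changelog` table, the trigger names, the `replicate` function) is hypothetical and only illustrates the mechanism; real tools such as Slony or Bucardo are far more involved.

```python
import sqlite3

SCHEMA = "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);"

# Hypothetical change-log table plus one trigger per operation on the source.
CAPTURE = """
CREATE TABLE changelog (
    seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, id INTEGER, name TEXT);
CREATE TRIGGER items_ins AFTER INSERT ON items BEGIN
    INSERT INTO changelog (op, id, name) VALUES ('I', NEW.id, NEW.name);
END;
CREATE TRIGGER items_upd AFTER UPDATE ON items BEGIN
    INSERT INTO changelog (op, id, name) VALUES ('U', NEW.id, NEW.name);
END;
CREATE TRIGGER items_del AFTER DELETE ON items BEGIN
    INSERT INTO changelog (op, id, name) VALUES ('D', OLD.id, NULL);
END;
"""


def replicate(source, destination, last_seq):
    """One pass of the 'daemon': apply changes newer than last_seq."""
    changes = source.execute(
        "SELECT seq, op, id, name FROM changelog WHERE seq > ? ORDER BY seq",
        (last_seq,)).fetchall()
    for seq, op, row_id, name in changes:
        if op == "D":
            destination.execute("DELETE FROM items WHERE id = ?", (row_id,))
        else:  # inserts and updates both mean "make the row look like this"
            destination.execute(
                "INSERT OR REPLACE INTO items (id, name) VALUES (?, ?)",
                (row_id, name))
        last_seq = seq
    destination.commit()
    return last_seq


source = sqlite3.connect(":memory:")
source.executescript(SCHEMA + CAPTURE)
destination = sqlite3.connect(":memory:")
destination.executescript(SCHEMA)

source.execute("INSERT INTO items (id, name) VALUES (1, 'a'), (2, 'b')")
source.execute("UPDATE items SET name = 'c' WHERE id = 1")
source.execute("DELETE FROM items WHERE id = 2")
source.commit()

last_seq = replicate(source, destination, 0)
replica_rows = destination.execute(
    "SELECT id, name FROM items ORDER BY id").fetchall()
print(last_seq, replica_rows)
```

Note that every write on the source now costs a second write into the change-log table, which is the overhead this approach is known for. The sketch also ignores updates that change the primary key.
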
  22. Statement-based
      or "The proxy approach"
      • Connect to middleware instead of the real database
      • All queries executed on the middleware will be sent to many databases
      That's fine until one of the servers isn't reachable!
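
Why an unreachable server is a problem for the proxy approach is easy to show with a toy model. In this Python sketch, `Backend` and `broadcast` are invented names, sqlite3 in-memory databases stand in for the real servers, and the outage is simulated with a flag.

```python
import sqlite3


class Backend:
    """A stand-in for one database server behind the middleware."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE items (name TEXT)")
        self.reachable = True

    def execute(self, statement, params=()):
        if not self.reachable:
            raise ConnectionError("backend unreachable")
        self.db.execute(statement, params)
        self.db.commit()

    def count(self):
        return self.db.execute("SELECT count(*) FROM items").fetchone()[0]


def broadcast(backends, statement, params=()):
    """Naive proxy: send the same statement to every backend, skip failures."""
    failed = []
    for index, backend in enumerate(backends):
        try:
            backend.execute(statement, params)
        except ConnectionError:
            failed.append(index)
    return failed


backends = [Backend(), Backend()]
broadcast(backends, "INSERT INTO items VALUES (?)", ("row1",))
backends[1].reachable = False          # simulated outage
failed = broadcast(backends, "INSERT INTO items VALUES (?)", ("row2",))
counts = [backend.count() for backend in backends]
print(failed, counts)
```

After the outage the two backends hold different data, and the middleware has to either reject writes or resynchronize the failed node later.
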
  23. (Write Ahead) Log-based
      And the most common one:
      • Postgres writes all changes it makes to the table & index files into a log, which would be used during crash recovery
      • Send log contents to a secondary server
      • Secondary server does "continuous crash recovery"
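
The "continuous crash recovery" idea can be modelled in a few lines. This Python sketch is a toy: a dictionary plays the table files, a list of JSON strings plays the WAL, and the class and method names are invented. Real WAL records describe physical page changes, not key/value operations.

```python
import json


class Primary:
    def __init__(self):
        self.data = {}
        self.wal = []   # append-only list of serialized change records

    def put(self, key, value):
        # Write-ahead: log the change first, then touch the "table".
        self.wal.append(json.dumps({"op": "put", "key": key, "value": value}))
        self.data[key] = value

    def delete(self, key):
        self.wal.append(json.dumps({"op": "delete", "key": key}))
        self.data.pop(key, None)


class Standby:
    def __init__(self):
        self.data = {}
        self.replayed = 0   # number of log records already applied

    def replay(self, wal):
        """Apply every record not seen yet, as a restarted server would."""
        for record in wal[self.replayed:]:
            change = json.loads(record)
            if change["op"] == "put":
                self.data[change["key"]] = change["value"]
            else:
                self.data.pop(change["key"], None)
            self.replayed += 1


primary = Primary()
standby = Standby()
primary.put("a", 1)
primary.put("b", 2)
standby.replay(primary.wal)
primary.delete("a")
standby.replay(primary.wal)     # only the new record is applied
print(standby.data, standby.replayed)
```
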
  24. What should you use?
      With all those options, the question that comes up is... and "it depends" is probably not a sufficient answer for most of you.
  25. For now:
      log-based
      asynchronous
      master→slave
      I'd recommend looking at log-based replication first and only reconsidering when you're sure it won't fit. It has its own bag of things to look out for, but it's where most development and operations resources are spent nowadays.
  26. Two flavors
      • Log-Shipping
        • Completed WAL segments are copied to the slave and applied there
      • Streaming replication
        • Transactions are streamed to slave servers
        • Can also be configured for synchronous replication
      Log-based replication in Postgres comes in two flavors.
  27. On WAL handling
      • Server generates WAL with every modifying operation, 16MB segments
      • Normally gets rotated after a successful checkpoint
      • Lots of conditions and config settings that can change the behaviour
      • Slave needs a base copy from the master + all WAL files to reach a consistent state
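
To connect WAL positions with the 16MB segment files, here is a small Python sketch that maps a log position (LSN) to a segment file name. It assumes the 24-hex-digit naming scheme used by 9.3 and a timeline of 1; the helper names are made up, and the sample LSN is just an example value.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                  # 16MB, as on the slide
SEGMENTS_PER_LOG = 0x100000000 // WAL_SEGMENT_SIZE   # 256 segments per 4GB


def parse_lsn(text):
    """Convert an LSN like '0/5204A858' into a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def segment_file_name(lsn, timeline=1):
    """Name of the segment file holding the given byte position."""
    segment_number = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline,                              # assumed: no failover happened
        segment_number // SEGMENTS_PER_LOG,
        segment_number % SEGMENTS_PER_LOG,
    )


print(segment_file_name(parse_lsn("0/5204A858")))
```
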
  28. Master config
          $ $EDITOR pg_hba.conf
          host replication replication 192.0.2.0/24 trust
          $ $EDITOR postgresql.conf
          wal_level = hot_standby
          max_wal_senders = 5
          wal_keep_segments = 32
      http://wiki.postgresql.org/wiki/Streaming_Replication
      http://www.postgresql.org/docs/current/static/warm-standby.html
      This is a strict streaming replication example with no log archiving. If the slave server is offline for too long, it needs to be freshly initialized from the master.
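
How long is "too long" for an offline slave? A rough back-of-the-envelope estimate follows from `wal_keep_segments` and the 16MB segment size. The Python sketch below uses an invented write rate of 8 MB of WAL per minute; measure your own before relying on any number like this.

```python
WAL_SEGMENT_SIZE_MB = 16   # segment size from the WAL handling slide


def retained_wal_mb(wal_keep_segments):
    """Minimum amount of WAL the master keeps around for standbys."""
    return wal_keep_segments * WAL_SEGMENT_SIZE_MB


def max_offline_minutes(wal_keep_segments, wal_mb_per_minute):
    """Rough upper bound on standby downtime before a re-initialization."""
    return retained_wal_mb(wal_keep_segments) / wal_mb_per_minute


print(retained_wal_mb(32))           # wal_keep_segments = 32 from the slide
print(max_offline_minutes(32, 8))    # 8 MB/min is an invented write rate
```

With the settings above the master holds back at least 512 MB of WAL, so a write burst shortens the window considerably.
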
  29. Slave config
          $ pg_basebackup -R -D /path/to/cluster --host=master --port=5432
          $ $EDITOR postgresql.conf
          hot_standby = on
          $ $EDITOR recovery.conf
          standby_mode = 'on'
          primary_conninfo = 'host=master port=5432 user=replication'
          trigger_file = '/path/to/trigger'
  30. Caveats
      • Slaves are 100% identical to master
      • No selective replication (DBs, tables, etc.)
      • No slave-only indexes
      • WAL segment handling can be tricky
      • Slave query conflicts due to master TXs
      • Excessive disk space usage on master
      • Broken replication due to already-recycled segments on master
      But when running with log-based replication, there are things to look out for.
  31. Coming in 9.4
      Q3 2014
      All of the stuff so far works out of the box with 9.3. There are a few new things coming in Postgres 9.4.
  32. Logical decoding
      One of the most interesting additions is logical decoding. The master server generates a list of tuple modifications. This is similar to trigger-based replication, but much more efficient and easier to maintain. It is almost identical to the "row based replication" format in MySQL.
  33.     $ INSERT INTO z (whatever) VALUES ('row2');
          INSERT 0 1
          $ SELECT * FROM pg_logical_slot_get_changes('depesz', null, null, 'include-xids', '0');
            location  | xid |                            data
          ------------+-----+------------------------------------------------------------
           0/5204A858 | 932 | BEGIN
           0/5204A858 | 932 | table public.z: INSERT: id[integer]:1 whatever[text]:'row2'
           0/5204A928 | 932 | COMMIT
          (3 rows)
      http://www.depesz.com/2014/03/06/waiting-for-9-4-introduce-logical-decoding/
      Here's an example of what logical decoding will produce. You can find more extensive examples on Hubert Depesz's blog.
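
A consumer of this output has to turn the `data` column back into structured changes. The Python sketch below parses lines in the textual format shown above; the regular expressions are my own approximation of that format, not an official grammar, and they only cover simple quoted and unquoted values.

```python
import re

# Matches "table <schema>.<table>: <ACTION>: <columns...>" as shown above.
CHANGE_RE = re.compile(
    r"^table (?P<schema>[^.]+)\.(?P<table>[^:]+): "
    r"(?P<action>[A-Z]+): (?P<columns>.*)$")
# Matches one "name[type]:value" pair; values are 'quoted' or bare tokens.
COLUMN_RE = re.compile(
    r"(?P<name>\w+)\[(?P<type>[^\]]+)\]:(?P<value>'(?:[^']|'')*'|\S+)")


def parse_change(line):
    """Parse one data line into a dict, or None for BEGIN/COMMIT lines."""
    match = CHANGE_RE.match(line)
    if match is None:
        return None
    columns = {}
    for col in COLUMN_RE.finditer(match.group("columns")):
        value = col.group("value")
        if value.startswith("'"):
            value = value[1:-1].replace("''", "'")   # strip quotes, unescape
        columns[col.group("name")] = (col.group("type"), value)
    return {
        "schema": match.group("schema"),
        "table": match.group("table"),
        "action": match.group("action"),
        "columns": columns,
    }


change = parse_change(
    "table public.z: INSERT: id[integer]:1 whatever[text]:'row2'")
print(change)
```
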
  34. Replication slots
      Replication slots are an additional feedback mechanism between slave and master to communicate which WAL files are still needed. They are also the backbone for logical replication.
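
The feedback mechanism can be pictured with a toy model in Python. The `Master` class, its methods, and the segment numbers are all invented; the sketch only shows the bookkeeping idea that the master keeps every segment some slot still needs.

```python
class Master:
    def __init__(self):
        self.segments = []   # WAL segment numbers currently on disk
        self.slots = {}      # slot name -> oldest segment still needed

    def write_segment(self):
        number = self.segments[-1] + 1 if self.segments else 1
        self.segments.append(number)

    def confirm(self, slot, needed_from):
        """Feedback from a standby: segments before needed_from are done."""
        self.slots[slot] = needed_from

    def recycle(self):
        """Drop segments that no slot needs any more."""
        if not self.segments:
            return
        newest = self.segments[-1]
        keep_from = min(self.slots.values(), default=newest)
        self.segments = [s for s in self.segments if s >= keep_from]


master = Master()
master.confirm("standby_a", 1)
master.confirm("standby_b", 1)
for _ in range(5):
    master.write_segment()

master.confirm("standby_a", 5)
master.confirm("standby_b", 2)      # this standby is lagging
master.recycle()
while_lagging = list(master.segments)

master.confirm("standby_b", 5)      # the laggard caught up
master.recycle()
print(while_lagging, master.segments)
```

The flip side is visible in the first result: a standby that stops confirming holds WAL on the master indefinitely.
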
  35. Time-delayed replication
      Time-delayed replication provides an additional safeguard against operational accidents: commit/checkpoint records are only applied after a configured amount of time has passed since the TX was completed.
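
As a minimal sketch of the delay logic, assuming a standby that simply queues incoming changes with their commit time, the following Python model applies a change only once it is old enough. The class name, the plain-number timestamps, and the one-hour delay are illustrative choices.

```python
from collections import deque


class DelayedStandby:
    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self.pending = deque()   # (commit_time, change) in arrival order
        self.applied = []

    def receive(self, commit_time, change):
        self.pending.append((commit_time, change))

    def apply_due(self, now):
        """Apply changes whose commit is at least `delay` seconds old."""
        while self.pending and self.pending[0][0] + self.delay <= now:
            self.applied.append(self.pending.popleft()[1])


standby = DelayedStandby(delay_seconds=3600)
standby.receive(0, "INSERT row1")
standby.receive(1800, "DROP TABLE important")   # the operational accident
standby.apply_due(now=3660)
print(standby.applied, len(standby.pending))
```

At this point the accidental DROP has been received but not applied, which leaves a window to stop replay and recover the table from the standby.
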
  36. What's coming in 9.5+?
      Those were the things already included in 9.4; for the coming development cycles there are already a few things in the pipeline.
  37. Logical replication cont'd
      What's currently missing is a reliable consumer for the data generated by 9.4 logical replication. People, mostly Andres Freund from 2ndQuadrant, are working on this topic, and I expect that there will be more to talk about next year with 9.5. It will be possible to build Galera-like systems with this infrastructure.
  38. SQL MERGE
      "Upserts"
      http://wiki.postgresql.org/wiki/SQL_MERGE
      http://www.postgresql.org/message-id/CAM3SWZTG4pnn5DfVm0J6e_f+JBvDbD7JH_4hRfavq_RuxJ-0=g@mail.gmail.com
      ...or INSERT ON DUPLICATE KEY ... It was planned for 9.4 but turned out to be more complicated than anticipated. There is a developer meeting later this year where the course of action will be decided.
  39. Questions? Ideas? Closing remarks?
      That's all for now. Any questions, ideas?
  40. Thanks!
      Michael Renner
      @terrorobe
      https://pganalyze.com
      You can hit me up on Twitter or via mail.
  41. Link Collection
      Trigger-based:
        http://bucardo.org/wiki/Bucardo
        http://slony.info/
      Statement-based:
        http://www.pgpool.net/mediawiki/index.php/Main_Page
      Log-Shipping/Streaming replication:
        https://github.com/2ndQuadrant/repmgr
        https://github.com/omniti-labs/omnipitr
      Backup:
        http://dalibo.github.io/pitrery/
        http://www.pgbarman.org/
        http://www.postgresql.org/docs/current/static/app-pgbasebackup.html
        http://www.postgresql.org/docs/current/static/app-pgreceivexlog.html
      And there's also a link collection of tools and projects to look at when you're building your own replication setup.
