An overview of database replication variants (master-slave vs. master-master, physical vs. logical) and their pros and cons. A short history of replication in PostgreSQL, with an overview of the tools available today (Slony-I, pgpool-II, Londiste and Bucardo). Finally, a brief comparison with replication in two popular databases (MySQL and Oracle).
1. Replication in PostgreSQL
CSPUG, Prague
Tomáš Vondra (tv@fuzzy.cz)
Czech and Slovak PostgreSQL Users Group
19 April 2011
2. Agenda
Purposes of replication
Replication variants
History of replication in PostgreSQL
Built-in replication
External tools
Comparison with other databases
T. Vondra (CSPUG) Replication in PostgreSQL
3. Purposes of replication
high availability
performance scaling (read vs. write)
  load balancing
  query partitioning
migration and upgrade without downtime
  a different version of the same database
  a completely different database
faster access over WAN
  local copies at branch offices
  road warriors / copies for mobile devices
4. Replication variants
master-master vs. master-slave
synchronous vs. asynchronous
physical vs. logical
warm standby vs. hot standby
implementation method
  xlog (log file shipping vs. streaming)
  trigger
  statement-based
5. Master vs. slave
master
  the authoritative source of information
  processes change requests and passes them on
slave
  accepts changes only from the master database, otherwise read-only
  essentially just a "copy" of the master database
master-slave
  simpler "one-way" replication
  read scalability
master-master
  bidirectional replication - conflicts must be resolved
  write scalability
6. Synchronous vs. asynchronous replication
synchronous
  the commit waits for confirmation that replication has completed
  slower, but always consistent as a whole
asynchronous
  the commit does not wait for replication to complete, the write is only made locally
  faster, but the original and the replica may become inconsistent (the replica is "behind")
semi-synchronous
  a compromise between reliability and performance
  with multiple replicas, the commit waits only for confirmation from the first one
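The difference between the three modes can be illustrated with a toy latency model. This is only a sketch: the function name `commit_latency` and its parameters are hypothetical, and it assumes that the local write and the shipping of changes to the replicas proceed in parallel, which simplifies what a real database does.

```python
def commit_latency(mode, local_write_ms, replica_ack_ms):
    """Return how long a commit blocks the client, in milliseconds.

    mode            -- "sync", "semi-sync" or "async"
    local_write_ms  -- time to make the commit durable locally
    replica_ack_ms  -- time from commit start until each replica confirms
    """
    if mode == "async":
        # Do not wait for any replica; replicas may lag behind.
        return local_write_ms
    if mode == "semi-sync":
        # Wait only for the first (fastest) confirmation.
        return max(local_write_ms, min(replica_ack_ms))
    if mode == "sync":
        # Wait until every replica has confirmed.
        return max(local_write_ms, max(replica_ack_ms))
    raise ValueError(f"unknown mode: {mode}")


acks = [12, 40, 95]  # made-up confirmation times of three replicas
latencies = {mode: commit_latency(mode, 2, acks)
             for mode in ("async", "semi-sync", "sync")}
```

With these made-up numbers the asynchronous commit is bounded by the local write, the semi-synchronous one by the fastest replica, and the synchronous one by the slowest replica.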
7. Physical replication
binary copy of data blocks (xlog)
the master writes a "binary diff" to the XLog (change byte X in block Y to Z)
the slave reads it and applies it to the blocks (essentially recovery)
8. Physical replication
pros
  minimal overhead (1%)
  very simple to set up
cons
  whole database only
  same DB version (format)
  same HW (especially CPU architecture)
  master-slave only (no way to resolve conflicts)
9. Physical replication / Log file shipping
1 an UPDATE means a change to several data blocks
2 each change generates a record in the transaction log
3 when a segment (16MB) fills up, it is archived (NFS, ...)
4 the archived segment is applied to the slave database
5 the result is a binary copy
10. Physical replication / Streaming replication
1 an UPDATE means a change to several data blocks
2 each change generates a record in the transaction log
3 the records are transferred (asynchronously) to the slave database
4 the changes are applied
5 the result is (again) a binary copy
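The two physical variants can be contrasted in a small simulation. This is a toy model, not PostgreSQL's actual WAL format: `WalRecord`, `Node`, `Master` and `ship_segments` are hypothetical names, blocks are 8 bytes instead of 8 kB, and a "segment" holds two records instead of 16MB.

```python
from dataclasses import dataclass

BLOCK_SIZE = 8  # toy block size; real PostgreSQL blocks are 8 kB by default


@dataclass(frozen=True)
class WalRecord:
    block: int   # which block to touch ("block Y")
    offset: int  # which byte inside the block ("byte X")
    value: int   # new byte value ("Z")


class Node:
    def __init__(self, n_blocks):
        self.blocks = [bytearray(BLOCK_SIZE) for _ in range(n_blocks)]

    def apply(self, rec):
        # Redo: overwrite the byte, with no knowledge of tables or rows.
        self.blocks[rec.block][rec.offset] = rec.value


class Master(Node):
    def __init__(self, n_blocks):
        super().__init__(n_blocks)
        self.wal = []

    def write(self, block, offset, value):
        rec = WalRecord(block, offset, value)
        self.wal.append(rec)  # log the change first ...
        self.apply(rec)       # ... then modify the block
        return rec


def ship_segments(wal, segment_size):
    """Log file shipping: yield only completely filled segments."""
    full = len(wal) - len(wal) % segment_size
    for start in range(0, full, segment_size):
        yield wal[start:start + segment_size]


master = Master(n_blocks=2)
for i in range(5):
    master.write(block=i % 2, offset=i, value=100 + i)

# Log file shipping: the standby only sees full segments (2 records each).
shipped = Node(n_blocks=2)
for segment in ship_segments(master.wal, segment_size=2):
    for rec in segment:
        shipped.apply(rec)

# Streaming: every record is sent as soon as it has been written.
streamed = Node(n_blocks=2)
for rec in master.wal:
    streamed.apply(rec)
```

In this run the streamed standby ends up byte-for-byte identical to the master, while the log-shipping standby is still missing the fifth record, because its segment has not filled up yet.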
11. Warm standby vs. Hot standby
warm standby
  database started in "recovery mode"
  receives changes from the master database and applies them
  serves HA only - you cannot connect and run queries
hot standby
  database started in "read-only mode"
  read-only queries can be run (the query is not assigned an XID)
12. Logical replication
does not apply binary changes without knowledge of the data structure
applies logical operations (INSERT/UPDATE/DELETE)
pros
  replication of only part of the database (e.g. a single table)
  replication to a different version / a different DB (upgrade and migration)
  enables multi-master replication
cons
  more demanding to set up and heavier on resources
  conflicts must be resolved (application-specific)
13. Logical replication / implementation methods
a log of all SQL statements, replayed on the replica
reverse interpretation of xlog records
triggers
a proxy capturing SQL
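The trigger-based approach can be sketched with Python's built-in `sqlite3` module standing in for PostgreSQL. Everything here is illustrative: the `accounts` table, the `change_queue` table and the `replicate` function are hypothetical, the sketch assumes the primary key never changes, and real tools such as Slony-I or Londiste are considerably more elaborate.

```python
import sqlite3

SCHEMA = "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"

master = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
master.execute(SCHEMA)
replica.execute(SCHEMA)

# Queue table plus triggers that record every logical change on the master.
master.executescript("""
CREATE TABLE change_queue (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    op      TEXT,
    id      INTEGER,
    balance INTEGER
);
CREATE TRIGGER accounts_ins AFTER INSERT ON accounts BEGIN
    INSERT INTO change_queue (op, id, balance) VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER accounts_upd AFTER UPDATE ON accounts BEGIN
    INSERT INTO change_queue (op, id, balance) VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER accounts_del AFTER DELETE ON accounts BEGIN
    INSERT INTO change_queue (op, id, balance) VALUES ('D', OLD.id, NULL);
END;
""")


def replicate(src, dst, last_seq):
    """Replay queued changes with seq > last_seq on dst; return the new position."""
    rows = src.execute(
        "SELECT seq, op, id, balance FROM change_queue WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, op, row_id, balance in rows:
        if op == "I":
            dst.execute("INSERT INTO accounts (id, balance) VALUES (?, ?)",
                        (row_id, balance))
        elif op == "U":
            dst.execute("UPDATE accounts SET balance = ? WHERE id = ?",
                        (balance, row_id))
        elif op == "D":
            dst.execute("DELETE FROM accounts WHERE id = ?", (row_id,))
        last_seq = seq
    dst.commit()
    return last_seq


master.execute("INSERT INTO accounts VALUES (1, 100)")
master.execute("INSERT INTO accounts VALUES (2, 50)")
master.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
master.execute("DELETE FROM accounts WHERE id = 2")
master.commit()

position = replicate(master, replica, last_seq=0)
```

Because the queue stores logical row changes rather than block diffs, the replica only needs a compatible `accounts` table, not an identical binary layout. The returned position lets a later call continue where the previous one stopped.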
14. Replication does not replace backups
everything gets replicated (including mistakes)
  a hacker, an idiot, a unit test run against the production DB by mistake
  DROP DATABASE, DELETE, ...
Back up! Back up! Back up!
15. Replication in PostgreSQL
History, present and future
16. History of replication in PostgreSQL
the "Unix mentality" of the core team
  smaller, flexible tools that can be combined
  many possible implementations - better done externally
  reluctance to add such features
up to and including version 8.4
  XLog file shipping replication + "warm standby" (HA)
  interesting external tools (Bucardo, Londiste, Slony-I, ...)
version 9.0
  asynchronous XLog streaming replication
  the "hot standby" option
version 9.1
  synchronous XLog streaming replication
17. Built-in replication
physical (a)synchronous replication
a conflict between the demands of HA and reporting (queries get killed)
somewhat of a problem when the master dies (with multiple slaves)
18. HA vs. reporting / query cancellation
high availability
  the goal is minimal delay behind the master (because of failover)
reporting
  long-running queries over large data volumes
  a query needs a block that has changed → it is killed
  fast application of changes → higher probability of being killed
interesting settings
  vacuum_defer_cleanup_age (master)
  hot_standby_feedback (standby)
19. HA vs. reporting / query cancellation
solution - two slaves, one for HA and the other for reporting
20. Demo
1 create and configure the master
2 create a slave and connect it to the master
3 make some changes on the master
4 look at how they propagated to the slave
5 try some queries on the slave (read / write)
6 stop the master
7 perform a failover
21. Demo / master
postgresql.conf
  listen_addresses = '127.0.0.1'
  port = 5432
  # archive mode
  wal_level = hot_standby
  max_wal_senders = 10
  # archive mode
  archive_mode = on
  archive_command = 'cp %p /var/pg9/archive/%f'
pg_hba.conf
  # IPv4 local connections
  host replication repuser 127.0.0.1/32 trust
22. Demo / slave
postgresql.conf
  listen_addresses = '127.0.0.1'
  port = 5433
  hot_standby = on
recovery.conf
  standby_mode = 'on'
  primary_conninfo = 'host=127.0.0.1 port=5432 user=repuser'
  # end recovery (touch this file)
  trigger_file = '/var/pg9/failover'
  # restore from the log archive
  restore_command = 'cp /var/pg9/archive/%f "%p"'
23. Drawbacks of built-in replication
1 the slave is read-only
  you cannot populate a TEMP table (a problem for reporting)
  you cannot fetch a value from a sequence
  you cannot take a standard backup
2 monitoring is not entirely elegant
  replication lag on the slave can be monitored via "ps"
  improves significantly in version 9.1
3 cascading is not possible (all slaves hang off a single master)
24. External tools
Tool            Type      Technique  M/M   M/S  Sync  Async
PostgreSQL 9.0  physical  xlog       no    yes  no    yes
PostgreSQL 9.1  physical  xlog       no    yes  yes   yes
Londiste        logical   triggers   no    yes  no    yes
Bucardo         logical   triggers   yes   yes  no    yes
Slony-I         logical   triggers   no    yes  no    yes
pgpool-II       logical   proxy      yes*  no*  yes   no
Postgres-XC     cluster   -          yes   no   no    yes
* for a proxy, categories such as master or slave do not really make sense
25. Londiste
written by Skype, part of SkyTools (which includes other tools as well)
implemented in Python (like almost everything at Skype)
PgQ - Skype's own queue implementation
master/slave replication only (logical)
http://wiki.postgresql.org/wiki/Londiste_Tutorial
http://wiki.postgresql.org/wiki/Skytools
26. Bucardo
http://bucardo.org/
triggers and a daemon - implemented in Perl (PL/Perl)
based on LISTEN/NOTIFY
  transactional notifications built directly into the DB
  simple communication between sessions via a queue
cannot replicate DDL (there are no DDL triggers)
master to master - currently only two masters
master to many slaves
27. pgpool-II
http://pgpool.projects.postgresql.org/
uses the proxy concept (statement-based middleware)
combines several advanced features
  connection pooling
  replication (including online recovery)
  load balancing (distributing queries across replicas)
  parallel queries (distributed tables)
several modes, and not everything is possible in each of them (parallel vs. failover)
if you want an HA solution, there are probably simpler tools
28. Slony-I
http://slony.info/
master-slave replication (max. 20 subscribers)
based on triggers and C functions
pluses
  5 years of production experience
  an almost complete solution (failover, provisioning, ...)
minuses
  the event queue is implemented as a table (needs VACUUM)
  higher overhead than solutions that implement the queue differently
  complex - complicated setup, difficult troubleshooting
29. Other DBs
Oracle & MySQL
30. Other DBs / Oracle
Data Guard
  uses the XLog (Oracle's redo log), two modes - "Redo Apply" and "SQL Apply"
  Redo Apply - physical replication (= streaming replication)
  SQL Apply - logical replication, enriched XLog, various limitations (not all objects, not all data types)
  Active Data Guard (additional $) enables "hot standby"
Streams
  logical replication, built on top of Advanced Queueing
  a general tool for distributing information (not just replication)
GoldenGate
  log-based logical replication for heterogeneous environments (Oracle, DB2, MSSQL, MySQL, ...)
  recommended by Oracle as the replacement for Streams
31. Other DBs / MySQL
asynchronous logical master-slave replication (semi-synchronous since 5.5)
built on the so-called "binlog" (statement-based log)
statement-based (SBR)
  logs complete SQL statements (those that changed data)
  not all SQL statements are "safe"
row-based (RBR)
  the final changes to individual rows are logged
  safer, but a larger volume of data than SBR
mixed-based (MBR)
  SBR or RBR depending on the type of event
MySQL Cluster (NDB engine) - synchronous replication based on 2PC (not on the binlog)
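Why some statements are not "safe" under SBR can be shown with a toy model. This is a sketch under assumptions, not MySQL's binlog format: `run_statement` is a hypothetical stand-in for a non-deterministic statement (something like `UPDATE t SET token = UUID()`), and the two seeded random generators stand in for the independent state of the two servers.

```python
import random


def run_statement(table, rng):
    """Stand-in for a non-deterministic statement that rewrites every row."""
    for row_id in table:
        table[row_id] = rng.random()


master = {1: 0.0, 2: 0.0}
sbr_replica = dict(master)
rbr_replica = dict(master)

# The master executes the statement using its own server-local state.
run_statement(master, random.Random(1))

# Statement-based: the replica re-executes the same statement text,
# but with its own, independent server-local state.
run_statement(sbr_replica, random.Random(2))

# Row-based: the replica receives the final row images and just stores them.
row_images = list(master.items())
for row_id, value in row_images:
    rbr_replica[row_id] = value
```

The row-based replica matches the master exactly, while the statement-based replica has silently diverged. The price is visible in `row_images`: one entry per changed row has to be shipped instead of a single statement.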
32. Links / general
Replication @ Wikipedia
  http://en.wikipedia.org/wiki/Replication_(computer_science)
MySQL 5.5 Replication
  http://dev.mysql.com/doc/refman/5.5/en/replication.html
  http://dev.mysql.com/doc/refman/5.5/en/replication-sbr-rbr.html
  http://dev.mysql.com/doc/refman/5.5/en/replication-rbr-usage.html
Drizzle
  http://docs.drizzle.org/replication.html
  http://code.google.com/p/protobuf/
Oracle Data Guard
  http://en.wikipedia.org/wiki/Oracle_Data_Guard
Oracle Streams
  http://www.oracle.com/technetwork/database/features/data-integration/default-159085.html
Oracle GoldenGate
  http://www.oracle.com/technetwork/middleware/goldengate/overview/index.html
33. Links / PostgreSQL
Replication, Clustering, and Connection Pooling
  http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
Replication solutions for PostgreSQL (Peter Eisentraut)
  http://www.slideshare.net/petereisentraut/replication-solutions-for-postgresql
9.0 Streaming Replication vs Slony (Steve Singer)
  http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/
PostgreSQL / WAL config
  http://www.postgresql.org/docs/current/static/runtime-config-wal.html
  http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html
PostgreSQL / Comparison of Different Solutions
  http://developer.postgresql.org/pgdocs/postgres/different-replication-solutions.html
repmgr
  https://github.com/greg2ndQuadrant/repmgr