1. Supersized PostgreSQL: Postgres-XL for Scale-out OLTP and Big Data Analytics
June 17, 2014
Mason Sharp
©2014 TransLattice, Inc. All Rights Reserved.
2. Agenda
• Postgres-XL Background
• Architecture Overview
• Distributing Tables
• Configuring a Local Cluster
• Backup and Recovery
• Differences to PostgreSQL and Postgres-XC
• Community
4. Postgres-XL
• Open Source
• Scale-out RDBMS
– Massively Parallel Processing (MPP)
– OLTP write and read scalability
– Multi-tenant security
– Cluster-wide ACID properties
– Can also act as a distributed key-value store
• Mixed workloads; can eliminate the need for separate clusters in some cases
6. Postgres-XL Applications and Use Case Examples
Postgres-XL Application | Use Case Example
Business intelligence requiring MPP parallelism | Financial services big data analysis of opportunities across service lines
OLTP write-intensive workloads | Online ad tracking
Mixed workload environments | Real-time reporting on a transactional system
7. Postgres-XL Technology
• Data is automatically spread across the cluster
• Queries return data from all nodes in parallel
• The Global Transaction Manager maintains a consistent view of the data across the cluster
• Connect to any coordinator
8. Postgres-XL Features
• Multi Version Concurrency Control
• Referential Integrity*
• Full Text Search
• Geo-spatial Functions and Data Types
• JSON and XML Support
• Distributed Key-Value Store
9. Postgres-XL Missing Features
• Built-in High Availability
– Use an external tool such as Corosync/Pacemaker
– Area for future improvement
• “Easy” re-sharding / adding of nodes
– Some pieces are there
– Can, however, “pre-shard” multiple nodes on the same server or VM
• Some FK and UNIQUE constraints
10. Postgres-XL Connectors
Looks like a PostgreSQL 9.2 database
• C
• C++
• Perl
• Python
• PHP
• Erlang
• Haskell
• Java
• JavaScript
• .NET
• Node.js
• Ruby
• Scala
11. Performance
Transactional workloads
[Chart: Dell DVD Benchmark, StormDB vs. Amazon RDS]
14. Postgres-XL Architecture
• Based on the Postgres-XC project
• Coordinators
– Connection point for applications
– Parsing and planning of queries
• Data Nodes
– Actual data storage
– Local execution of queries
• Global Transaction Manager (GTM)
– Provides a consistent view to transactions
– GTM Proxy to increase performance
15. Coordinators
• Handle network connectivity to the client
• Parse and plan statements
• Perform final query processing of intermediate result sets
• Manage two-phase commit
• Store global catalog information
16. Data Nodes
• Store tables and indexes
• Only coordinators connect to data nodes
• Execute queries from the coordinators
• Communicate with peer data nodes for distributed joins
17. Global Transaction Manager (GTM)
• Handles necessary MVCC tasks
– Transaction IDs
– Snapshots
• Manages cluster-wide values
– Timestamps
– Sequences
• GTM Standby can be configured
18. GTM Proxy
• Runs alongside Coordinator or Datanode
• Groups requests together
– Obtain range of transaction IDs (XIDs)
– Obtain snapshot
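The request grouping described on this slide can be sketched in Python. This is a toy model under stated assumptions, not Postgres-XL code: `FakeGTM`, `GTMProxy`, `allocate_xids`, and the backend names are hypothetical, and the starting XID of 100 is arbitrary.

```python
class FakeGTM:
    """Toy stand-in for the GTM: hands out consecutive transaction IDs."""

    def __init__(self, first_xid=100):
        self.next_xid = first_xid
        self.round_trips = 0  # how many times the GTM was contacted

    def allocate_xids(self, count):
        """Reserve `count` consecutive XIDs in a single round trip."""
        self.round_trips += 1
        start = self.next_xid
        self.next_xid += count
        return list(range(start, start + count))


class GTMProxy:
    """Collects XID requests from local backends and forwards them as one batch."""

    def __init__(self, gtm):
        self.gtm = gtm
        self.pending = []  # backend names waiting for an XID

    def request_xid(self, backend):
        self.pending.append(backend)

    def flush(self):
        """Send one grouped request to the GTM and hand each backend its XID."""
        if not self.pending:
            return {}
        xids = self.gtm.allocate_xids(len(self.pending))
        assigned = dict(zip(self.pending, xids))
        self.pending = []
        return assigned


gtm = FakeGTM()
proxy = GTMProxy(gtm)
for backend in ("backend_a", "backend_b", "backend_c"):
    proxy.request_xid(backend)
assigned = proxy.flush()
print(assigned, "round trips:", gtm.round_trips)
```

Three backends receive distinct XIDs while the toy GTM is contacted only once, which is the saving the proxy is meant to provide.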
19. Data Distribution
• Replicated Tables
– Each row is replicated to all data nodes
– Statement-based replication
• Distributed Tables
– Each row is stored on one data node
– Available strategies: Hash, Round Robin, Modulo
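How each strategy picks a data node for a row can be sketched in Python. This is an illustration only: the node names are borrowed from the EXPLAIN output later in the deck, `zlib.crc32` stands in for the real hash function (an assumption), and all function names are hypothetical.

```python
import zlib
from itertools import count

NODES = ["datanode_1", "datanode_2"]


def by_hash(value, nodes=NODES):
    """Hash strategy: hash the column value, then take it modulo the node count.
    crc32 is a stand-in; Postgres-XL uses its own hash functions."""
    return nodes[zlib.crc32(str(value).encode()) % len(nodes)]


def by_modulo(value, nodes=NODES):
    """Modulo strategy: integer column value modulo the node count."""
    return nodes[value % len(nodes)]


def make_round_robin(nodes=NODES):
    """Round-robin strategy: ignore the row contents and cycle through nodes."""
    counter = count()
    return lambda: nodes[next(counter) % len(nodes)]


def replicated(nodes=NODES):
    """Replicated table: every row goes to every node."""
    return list(nodes)


next_node = make_round_robin()
for invoice_id in (1, 2, 3, 4):
    print(invoice_id, by_hash(invoice_id), by_modulo(invoice_id), next_node())
```

Hash and modulo always send the same key to the same node, which is what makes co-located joins possible; round robin spreads rows evenly but gives no such guarantee.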
20. Availability
• No Single Point of Failure
• Global Transaction Manager Standby
• Coordinators are load balanced
• Streaming replication for data nodes
• But, currently manual…
22. Defining Tables
CREATE TABLE my_table (…)
DISTRIBUTE BY
HASH(col) | MODULO(col) | ROUNDROBIN | REPLICATION
[ TO NODE (nodename[,nodename…])]
23. Defining Tables
CREATE TABLE state_code (
state_code CHAR(2),
state_name VARCHAR,
:
) DISTRIBUTE BY REPLICATION
An exact replica on each node
24. Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT,
:
) DISTRIBUTE BY HASH(invoice_id)
Distributed evenly amongst sharded nodes
25. Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT ….
) DISTRIBUTE BY HASH(invoice_id);
CREATE TABLE lineitem (
lineitem_id SERIAL,
invoice_id INT …
) DISTRIBUTE BY HASH(invoice_id);
Joins on invoice_id can be pushed down to the datanodes
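Why the join can be pushed down: when both tables are hashed on invoice_id, matching rows always land on the same node. A small Python sketch of that property follows, assuming `zlib.crc32` as a stand-in hash and made-up sample rows; `node_for` is a hypothetical helper.

```python
import zlib

NODES = ["datanode_1", "datanode_2"]


def node_for(key):
    """Stand-in hash placement (crc32 is an assumption, not the real hash)."""
    return NODES[zlib.crc32(str(key).encode()) % len(NODES)]


# Sample rows: invoice is (invoice_id, cust_id), lineitem is (lineitem_id, invoice_id).
invoices = [(1, 10), (2, 11), (3, 10)]
lineitems = [(100, 1), (101, 1), (102, 2), (103, 3)]

# Place each row on a node by hashing invoice_id, as in both table definitions.
placement = {node: {"invoice": [], "lineitem": []} for node in NODES}
for row in invoices:
    placement[node_for(row[0])]["invoice"].append(row)
for row in lineitems:
    placement[node_for(row[1])]["lineitem"].append(row)

# Each node joins only its own rows; the union is the full join result.
local_join = [
    (inv, li)
    for node in NODES
    for inv in placement[node]["invoice"]
    for li in placement[node]["lineitem"]
    if inv[0] == li[1]
]
global_join = [(inv, li) for inv in invoices for li in lineitems if inv[0] == li[1]]
print(sorted(local_join) == sorted(global_join))
```

No row has to leave its node for the per-node joins to add up to the complete result.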
26.
test2=# create table t1 (col1 int, col2 int) distribute by hash(col1);
test2=# create table t2 (col3 int, col4 int) distribute by hash(col3);
27.
explain select * from t1 inner join t2 on col1 = col3;
Remote Subquery Scan on all (datanode_1,datanode_2)
  -> Hash Join
        Hash Cond:
        -> Seq Scan on t1
        -> Hash
              -> Seq Scan on t2
Join push-down. Looks much like regular PostgreSQL.
28.
test2=# explain select * from t1 inner join t2 on col2 = col3;
Remote Subquery Scan on all (datanode_1,datanode_2)
  -> Hash Join
        Hash Cond:
        -> Remote Subquery Scan on all (datanode_1,datanode_2)
              Distribute results by H: col2
              -> Seq Scan on t1
        -> Hash
              -> Seq Scan on t2
Will read t1.col2 once and put it in a shared queue for consumption by the other datanodes for joining.
Datanode-datanode communication and parallelism
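The redistribution step in this plan can be modelled in Python. This is a simplified sketch, not the executor's actual mechanism: `zlib.crc32` stands in for the real hash, the sample rows are invented, and a plain dict of lists plays the role of the shared queues.

```python
import zlib

NODES = ["datanode_1", "datanode_2"]


def node_for(key):
    """Stand-in hash placement (crc32 is an assumption, not the real hash)."""
    return NODES[zlib.crc32(str(key).encode()) % len(NODES)]


# t1(col1, col2) is stored by hash(col1); t2(col3, col4) by hash(col3).
t1 = [(1, 7), (2, 8), (3, 7), (4, 9)]
t2 = [(7, 70), (8, 80), (9, 90)]

stored_t1 = {n: [r for r in t1 if node_for(r[0]) == n] for n in NODES}
stored_t2 = {n: [r for r in t2 if node_for(r[0]) == n] for n in NODES}

# Step 1: every node scans its local t1 rows once and routes each row to the
# node that owns hash(col2). The dict of lists plays the role of the shared queue.
queues = {n: [] for n in NODES}
for n in NODES:
    for row in stored_t1[n]:
        queues[node_for(row[1])].append(row)

# Step 2: each node joins the rows it received against its local t2 rows.
result = [
    (a, b)
    for n in NODES
    for a in queues[n]
    for b in stored_t2[n]
    if a[1] == b[0]
]
print(sorted(result))
```

After step 1, t1 is effectively distributed by col2, so the join on col2 = col3 becomes a local join on every node without routing rows through the coordinator.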
34. postgresql.conf
• Very similar to regular PostgreSQL
• But there are extra parameters
– max_pool_size
– min_pool_size
– pool_maintenance_timeout
– remote_query_cost
– network_byte_cost
– sequence_range
– pooler_port
– gtm_host, gtm_port
– shared_queues
– shared_queue_size
35. Install
• Download RPMs
– http://sourceforge.net/projects/postgres-xl/files/Releases/Version_9.2rc/
Or
• configure; make; make install
37. Avoid password with pgxc_ctl
• ssh-keygen -t rsa (in ~/.ssh)
• For a local test, no need to copy the key; it is already there
• cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
38. pgxc_ctl
• Initialize cluster
• Start/stop
• Deploy to node
• Add coordinator
• Add data node / slave
• Add GTM slave
39. pgxc_ctl.conf
• Let’s create a local test cluster!
– One GTM
– One Coordinator
– Two Datanodes
Make sure all are using different ports, including pooler_port
40. pgxc_ctl.conf
pgxcOwner=pgxl
pgxcUser=$pgxcOwner
pgxcInstallDir=/usr/postgres-xl-9.2
41. pgxc_ctl.conf
gtmMasterDir=/data/gtm_master
gtmMasterServer=localhost
gtmSlave=n
gtmProxy=n
42. pgxc_ctl.conf
coordMasterDir=/data/coord_master
coordNames=(coord1)
coordPorts=(5432)
poolerPorts=(20010)
coordMasterServers=(localhost)
coordMasterDirs=($coordMasterDir/coord1)
43. pgxc_ctl.conf
coordMaxWALSenders=($coordMaxWALsernder)
coordSlave=n
coordSpecificExtraConfig=(none)
coordSpecificExtraPgHba=(none)
44. pgxc_ctl.conf
datanodeNames=(dn1 dn2)
datanodeMasterDir=/data/dn_master
datanodePorts=(5433 5434)
datanodePoolerPorts=(20011 20012)
datanodeMasterServers=(localhost localhost)
datanodeMasterDirs=($datanodeMasterDir/dn1 $datanodeMasterDir/dn2)
45. pgxc_ctl.conf
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
datanodeSlave=n
primaryDatanode=dn1
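Since every component of a local cluster shares one host, a quick check for port clashes can help before initializing. The Python sketch below uses the port values from the pgxc_ctl.conf slides; `duplicate_ports` is a hypothetical helper, not part of pgxc_ctl, and the GTM port is omitted because the slides do not show it.

```python
from collections import Counter

# Ports taken from the pgxc_ctl.conf slides; the GTM port is not shown there,
# so it is left out rather than guessed.
ports = {
    "coord1": 5432,
    "coord1 pooler": 20010,
    "dn1": 5433,
    "dn2": 5434,
    "dn1 pooler": 20011,
    "dn2 pooler": 20012,
}


def duplicate_ports(assignments):
    """Return {port: [names]} for every port assigned more than once."""
    counts = Counter(assignments.values())
    return {
        port: sorted(name for name, p in assignments.items() if p == port)
        for port, n in counts.items()
        if n > 1
    }


print(duplicate_ports(ports))
```

An empty result means no two components in the dictionary claim the same port.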
47. pgxc_ctl – Add a second Coordinator
Format:
pgxc_ctl add coordinator master name host port pooler data_dir
pgxc_ctl add coordinator master coord2 localhost 15432 20020 /data/coord_master/coord2
49. pg_catalog
• pgxc_node
– Coordinator and Datanode definition
– Currently not GTM
• pgxc_class
– Table replication or distribution info
50. Additional Commands
• PAUSE CLUSTER / UNPAUSE CLUSTER
– Wait for current transactions to complete
– Prevent new commands until UNPAUSE
• EXECUTE DIRECT ON (nodename) 'command'
• CLEAN CONNECTION
– Cleans connection pool
• CREATE NODE / ALTER NODE
51. Noteworthy
• SELECT pgxc_pool_reload()
• pgxc_clean for 2PC
– Cleans prepared transactions
– If committed on one node, must be committed on all
– If aborted on one node, must be aborted on all
– If both committed and aborted, output an error
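The three cleanup rules can be written as a small decision function in Python. This is a sketch of the rules as stated on the slide, not the pgxc_clean source: `resolve_prepared` and the status strings are hypothetical, and the all-prepared case is returned as "undecided" because the slide does not cover it.

```python
def resolve_prepared(statuses):
    """Decide what to do with one prepared transaction.

    `statuses` maps node name -> "committed", "aborted", or "prepared".
    Returns "commit", "abort", or "undecided" (still prepared everywhere,
    a case the slide does not cover); raises if the nodes disagree.
    """
    seen = set(statuses.values())
    if "committed" in seen and "aborted" in seen:
        raise RuntimeError("committed on some nodes and aborted on others")
    if "committed" in seen:
        return "commit"
    if "aborted" in seen:
        return "abort"
    return "undecided"


print(resolve_prepared({"dn1": "committed", "dn2": "prepared"}))
```

The disagreement check comes first because a transaction that is committed on one node and aborted on another cannot be repaired automatically.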
53. Backup
• Like PostgreSQL
• pg_dump, pg_dumpall from any Coordinator
– Will pull all data from all nodes into one central location
• No notion yet of parallel pg_dump of individual nodes on a node-by-node basis, nor parallel pg_restore
54. Cold Backup
• Stop Cluster
• Back up individual data directories on the nodes’ file systems
– Gzip, compress, rsync off, etc.
• Cold restore: do the opposite
55. Hot Backup
• Similar to PostgreSQL
• Base backup of individual nodes
– Use streaming replication or WAL archiving
• Back up the GTM control file
– Contains XID
– Contains Sequence values
– Periodically written to
– On restart, it will skip ahead some values
What about PITR?
56. Hot Backup - Barriers
• CREATE BARRIER 'barrier_name'
– Waits for currently committing transactions to finish
– Writes a barrier in each node’s WAL for a consistent restoration point
• Currently done manually!
– Could be configured as a background process in the future to write barriers periodically
In recovery.conf, use recovery_target_barrier
58. Code Merging
[Diagram boxes: PostgreSQL, Postgres-XC, Postgres-XL, Postgres-XC Enhancements, Postgres-XL Enhancements]
Planner and Executor are different
68. Performance
Data node ↔ Data node Communication
– Avoid Coordinator-centric processing
– Can be many times faster for some queries
Pooler needs to be configured for each Data node, not just the coordinator
– More connections needed at each node
69. Performance
Re-parse avoidance
In XC, SQL strings are sent from the Coordinator to other nodes, where they are reparsed and replanned.
In XL, the query is planned once on the coordinator, and the plan is serialized and sent over.
70. Performance
Sequences
Detect frequently accessed sequences, fetch a range of values, and make fewer round trips to GTM
COPY
INSERT SELECT
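The effect of fetching a range of sequence values can be sketched in Python. This toy model always caches, whereas the slide describes detecting frequently accessed sequences first; `FakeGTMSequence`, `CachedSequence`, and `fetch_range` are hypothetical names, and only `sequence_range` comes from the parameter list in this deck.

```python
class FakeGTMSequence:
    """Toy stand-in for a sequence managed by the GTM."""

    def __init__(self):
        self.next_value = 1
        self.round_trips = 0

    def fetch_range(self, size):
        """Reserve `size` consecutive values in one round trip."""
        self.round_trips += 1
        start = self.next_value
        self.next_value += size
        return start, start + size  # half-open range [start, end)


class CachedSequence:
    """Serves nextval() from a locally cached range, refilling when empty."""

    def __init__(self, gtm_sequence, sequence_range):
        self.gtm_sequence = gtm_sequence
        self.sequence_range = sequence_range
        self.current = self.end = 0  # empty cache

    def nextval(self):
        if self.current >= self.end:
            self.current, self.end = self.gtm_sequence.fetch_range(self.sequence_range)
        value = self.current
        self.current += 1
        return value


gtm_seq = FakeGTMSequence()
seq = CachedSequence(gtm_seq, sequence_range=100)
values = [seq.nextval() for _ in range(250)]
print(values[0], values[-1], gtm_seq.round_trips)
```

With a range of 100, a bulk load of 250 rows contacts the toy GTM three times instead of 250, which is why the optimization matters for COPY and INSERT SELECT.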
71. Multi-tenant Security
User can only get own info from global pg_catalog
pg_database, pg_user
Harder to hack if names of other database objects are unknown
72. Statistics
storm_stats
Basic DML stats by time range
pg_stat_statements may be sufficient
74. Differences
GUC
• remote_query_cost
• network_byte_cost
• pool_maintenance_timeout
• sequence_range
• shared_queues
• shared_queue_size
77. Website and Code
• postgres-xl.org
• sourceforge.net/projects/postgres-xl
• Will add mirror for github.com
• Docs
– files.postgres-xl.org/documentation/index.html
78. Philosophy
• Stability and bug fixes over features
• Performance-focused
• Open and inclusive
– Welcome input for roadmap, priorities and design
– Welcome others to become core members and contributors
79. Planned Community Structure
• Like PostgreSQL
– Core
– Major Contributors
– Minor Contributors
– Advocacy
80. Roadmap
• Catch up to PostgreSQL Community -> 9.3, 9.4
• Make XL a more robust analytical platform
– Distributed Foreign Data Wrappers
• Remove blockers to adoption
• Native High Availability