Postgres-XC as a Key Value Store Compared To MongoDB


This presentation discusses how Postgres-XC can be used as a PostgreSQL-based key-value store using features like hstore and JSON. It also compares performance to MongoDB for a read workload.

Published in: Technology, Business

Postgres-XC as a Key Value Store Compared To MongoDB

  1. 1. A Postgres-XC Distributed Key-Value Store
     Mason Sharp
     April 15, 2013
     CC License: Attribution-NonCommercial-ShareAlike
  2. 2. Who Am I?
     Mason Sharp
     ● Original architect of Stado / GridSQL
     ● One of original architects of Postgres-XC
     ● Former architect at EnterpriseDB
     ● Co-organizer of NYC PostgreSQL User Group
     ● Co-founder and CTO of
  3. 3. Agenda
     ● Why use a key-value store?
     ● PostgreSQL features
     ● XML
     ● hstore
     ● JSON
     ● Postgres-XC Overview
     ● Measurements: MongoDB versus Postgres-XC
  4. 4. Agenda
     ● Why use a key-value store?
     ● PostgreSQL features
     ● XML
     ● hstore
     ● JSON
     ● Postgres-XC Overview
     ● Measurements: MongoDB versus Postgres-XC
  5. 5. Why Use a Key-Value Store?
     ● Document oriented vs. row oriented
     ● Unstructured data
     ● Semi-structured data
     ● Self-describing / schema-less
     ● Uses tags
     ● Dynamic attributes for different objects
     ● Dwight Merriman, CEO 10gen (paraphrasing):
       ● “Some customers use MongoDB just for the schema-less features. They don't need the scalability and run on one single server” (!)
       ● “Easier for developers” (...)
  6. 6. Why Use a Key-Value Store? (2)
     ● Key-value makes for an easy distributed store
     ● Multiple servers
     ● In-memory
     ● No complicated schema changes
     ● But PostgreSQL's ALTER TABLE exclusive locks may be brief
     ● Need to be “web-scale”
     ● Perception that it scales better
     ● What if it no longer fits in memory?
     ● A series of unfortunate anecdotes
  7. 7. PostgreSQL Document Store Capabilities
  8. 8. XML
     ● --with-libxml at build time
     ● Native data type
     ● CREATE TABLE foo (myid int, data xml)
     ● Validation
       INSERT INTO foo VALUES (2, '<aaa');
       ERROR: invalid XML content
       Detail: line 1: Couldn't find end of Start Tag aaa line 1
     ● XPath
     ● Mapping & export functions
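The XPath bullet can be illustrated with a short query. This is a minimal sketch against the foo table from the slide; the <person> document shape and its element names are assumptions made up for the example and do not come from the talk.

```sql
-- Hypothetical document shape; only the foo table comes from the slide.
INSERT INTO foo VALUES
    (1, '<person><name>fred</name><department>IT</department></person>');

-- xpath() returns an array of the XML nodes that match the expression.
SELECT myid,
       xpath('/person/name/text()', data) AS names
FROM foo;

-- Check well-formedness up front instead of catching the insert error
-- (xml_is_well_formed is available from PostgreSQL 9.1).
SELECT xml_is_well_formed('<aaa');
```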
  9. 9. hstore
     ● Contrib module
     ● CREATE EXTENSION hstore
     ● Key/value pairs
     ● Data type
  10. 10. hstore
      CREATE TABLE foo (myid int, hdata hstore);
      INSERT INTO foo VALUES (10, '"name"=>"fred", "department"=>"IT"');
  11. 11. hstore
      SELECT hdata->'name' FROM foo WHERE myid = 10;
       ?column?
      ----------
       fred
      (1 row)
      -- Extract all department values where it is an attribute
      SELECT hdata->'department'
      FROM foo
      WHERE hdata ? 'department';
  12. 12. Hstore Manipulation
      ● Concatenate
        'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore
        "a"=>"b", "c"=>"x", "d"=>"q"
      ● Delete element
        delete('a=>1,b=>2'::hstore, 'b')
        "a"=>"1"
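The same two operations can be applied to stored rows rather than to literals. A minimal sketch, assuming the foo table and the row with myid = 10 from the earlier slides; the "HR" value is made up for illustration.

```sql
-- Add or overwrite a key in the stored value. With ||, the right-hand
-- value wins when both sides contain the same key.
UPDATE foo
SET hdata = hdata || '"department"=>"HR"'::hstore
WHERE myid = 10;

-- Remove a key from the stored value.
UPDATE foo
SET hdata = delete(hdata, 'department')
WHERE myid = 10;
```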
  13. 13. hstore
      -- Get a list of unique keys
      SELECT DISTINCT (each(hdata)).key
      FROM foo
  14. 14. hstore - Indexes
      ● Btree index only helps with =
      ● GIN and GiST indexes will help with operators
        ● @> left operand contains right
        ● ? contains key
        ● ?& contains all keys in array
        ● ?| contains at least one key in array
      ● Can create index on custom function
        ● Extract a particular key value
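The index options above might look like the following against the foo table from the earlier slides. The index names are hypothetical, and the last index is an expression index on the -> operator instead of a custom function, which is one simple way to extract a single key.

```sql
-- GIN index on the whole hstore column; usable by @>, ?, ?& and ?|.
CREATE INDEX foo_hdata_gin ON foo USING gin (hdata);

-- Queries of this shape can use the GIN index.
SELECT myid FROM foo WHERE hdata @> '"department"=>"IT"'::hstore;
SELECT myid FROM foo WHERE hdata ? 'department';

-- Expression index that extracts one key; a plain btree is enough
-- for equality lookups on that key.
CREATE INDEX foo_department_idx ON foo ((hdata -> 'department'));
SELECT myid FROM foo WHERE hdata -> 'department' = 'IT';
```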
  15. 15. JSON
      ● JavaScript Object Notation
      ● PostgreSQL 9.2 basic support
      ● array_to_json
      ● row_to_json
      Note: Postgres-XC 1.0.2 based on PostgreSQL 9.1, will be based on 9.2 soon
  16. 16. JSON – looking ahead to PostgreSQL 9.3
      ● PostgreSQL 9.3
      ● json_agg
      ● hstore_to_json
      ● hstore_to_json_loose
      ● … and much more
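A brief sketch of how the 9.3 functions listed above could be combined with the hstore column from the earlier slides. It assumes PostgreSQL 9.3 or later with the hstore extension installed; the "qty" and "active" keys are invented for the example.

```sql
-- Convert each stored hstore value to a JSON object.
SELECT myid, hstore_to_json(hdata) AS doc FROM foo;

-- hstore_to_json_loose tries to emit numbers and booleans unquoted.
SELECT hstore_to_json_loose('"qty"=>"3", "active"=>"t"'::hstore);

-- json_agg collects the per-row JSON values into a single JSON array.
SELECT json_agg(hstore_to_json(hdata)) FROM foo;
```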
  17. 17. Composite Type
      CREATE TYPE address AS (
          street TEXT,
          city TEXT,
          state TEXT,
          zip CHAR(10)
      );
      CREATE TABLE customer (
          full_name TEXT,
          mail_address address
      );
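The slides do not show how the customer row was populated. One way to do it, sketched here with the values that appear in the row_to_json output on the next slide:

```sql
-- ROW(...) builds the composite address value in place.
INSERT INTO customer (full_name, mail_address)
VALUES ('Joe Lee', ROW('100 Broad Street', 'Red Bank', 'NJ', '07701'));

-- Individual fields are reachable with (column).field syntax.
SELECT full_name, (mail_address).city FROM customer;
```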
  18. 18. row_to_json
      test1=# select row_to_json(customer) from customer;
      {"full_name":"Joe Lee","mail_address": {"street":"100 Broad Street","city":"Red Bank","state":"NJ","zip":"07701 "}}
  19. 19. ● PostgreSQL-based database cluster
        Same API to Apps as PostgreSQL
        • Same drivers
      ● Symmetric Multi-headed Cluster
        No master, no slave
        • Not just PostgreSQL replication.
        • Application can read/write to any coordinator server
        Consistent database view to all the transactions
        • Complete ACID property to all the transactions in the cluster
      ● Scales both for Write and Read
  20. 20. Sep 20, 2012 Postgres-XC 20
  21. 21. Postgres-XC Cluster
      [Diagram: several PG-XC servers, each with a Coordinator and a Data Node, communicating among themselves and with the Global Transaction Manager (GTM). Add PG-XC servers as needed.]
      Application can connect to any server to have the same database view and service.
  22. 22. Coordinator Overview
      ● Based on PostgreSQL
      ● Accepts connections from clients
      ● Parses and plans requests
      ● Interacts with Global Transaction Manager
      ● Uses pooler for Data Node connections
      ● Sends down XIDs and snapshots to Data Nodes
      ● Collects results and returns to client
      ● Uses two phase commit if necessary
  23. 23. Data Node Overview
      ● Based on PostgreSQL
      ● Where user created data is actually stored
      ● Coordinators (not clients) connect to Data Nodes
      ● Accepts XID and snapshots from Coordinator
      ● The rest is fairly similar to vanilla PostgreSQL
  24. 24. Global Transaction Manager
      [Diagram: cluster nodes obtain XIDs, snapshots, timestamps and sequence values from the GTM.]
  25. 25. GTM Overview
      ● Issues Transaction IDs (XIDs)
      ● Issues Snapshots
      ● Issues Timestamps
      ● Issues Sequences
      ● Based on PostgreSQL procarray code
      ● Multi-threaded
  26. 26. GTM Proxy
      ● Runs on other nodes
      ● Groups requests together
      ● Reduces number of connections to GTM
      ● Reduces traffic to GTM
  27. 27. Summary
      ● Coordinator
        ● Visible to apps
        ● SQL analysis, planning, execution
        ● Connection pooling
      ● Datanode (or simply “NODE”)
        ● Actual database store
        ● Local SQL execution
      ● GTM (Global Transaction Manager)
        ● Provides consistent database view to transactions
          – GXID (Global Transaction ID)
          – Snapshot (List of active transactions)
          – Other global values such as SEQUENCE
      ● GTM Proxy, integrates server-local transaction requirement for performance
      Callouts: Postgres-XC core, based upon vanilla PostgreSQL; share same binary; may want to colocate; different binaries
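The slides do not show any cluster DDL, so the following is only a sketch of how a key-value table might be spread over the data nodes, using Postgres-XC's DISTRIBUTE BY clause. The table names kv and department and their columns are hypothetical.

```sql
-- Hash-distribute rows across the data nodes by key.
CREATE TABLE kv (
    k     int,
    hdata hstore
) DISTRIBUTE BY HASH (k);

-- Small lookup tables can instead be copied in full to every data node.
CREATE TABLE department (
    name text
) DISTRIBUTE BY REPLICATION;

-- List the coordinators and data nodes known to this coordinator.
SELECT node_name, node_type, node_host, node_port FROM pgxc_node;
```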
  28. 28. MongoDB vs Postgres-XC Performance Comparison
      ● Three data nodes (16GB RAM each)
      ● Postgres-XC also used a coordinator
        ● Adds latency
      ● Out-of-the-box default configuration
      ● No replicas
  29. 29. Insert Comparison – single thread
      ● 0 – 1M Rows
        ● MongoDB: 7m 06s
        ● Postgres-XC: 131m 1s
        ● Postgres-XC COPY: 43s
      ● 10M – 20M Rows
        ● MongoDB: 64m 48s
        ● Postgres-XC: 354m 56s
      GTM in XC adds a lot of latency, hurting single-threaded performance
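From the timings above, the first million rows load at roughly 127 rows per second with single-row inserts (1,000,000 rows in 7,861 s) and at roughly 23,000 rows per second with COPY (43 s). The two loading styles are sketched below against the foo table from the hstore slides; the benchmark's actual schema is not given, and the file paths are hypothetical.

```sql
-- Row-at-a-time loading: one statement and one round trip per row.
INSERT INTO foo VALUES (1, '"name"=>"fred"');

-- Bulk loading with COPY from a server-side file (path is hypothetical).
COPY foo (myid, hdata) FROM '/tmp/foo.csv' WITH (FORMAT csv);

-- From psql, \copy reads a client-side file instead:
-- \copy foo (myid, hdata) from 'foo.csv' with (format csv)
```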
  30. 30. Read Comparison (shorter is better)
      [Chart: read time in seconds (0 to 2.5) versus rows in millions (1 to 10), for MongoDB and Postgres-XC.]
  31. 31. Update Comparison – single thread
      50 GB, single thread
      ● 1000 Updates by partitioned key
        ● MongoDB: 43s
        ● Postgres-XC: 1m 6s
      ● 1000 Updates by indexed non-partitioned key
        ● MongoDB: 7m 55s
        ● Postgres-XC: 1m 54s
      Non-partitioned index-based faster in XC
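The two update patterns measured above can be sketched as follows. The account table, its columns and the index name are hypothetical, since the talk does not give the benchmark schema; the comments describe how Postgres-XC would typically route each statement.

```sql
-- Hypothetical table: rows are hash-distributed on id, and email carries
-- an ordinary secondary index on each data node.
CREATE TABLE account (
    id      int,
    email   text,
    balance numeric
) DISTRIBUTE BY HASH (id);
CREATE INDEX account_email_idx ON account (email);

-- Update by the partitioned key: the coordinator can route the statement
-- to the one data node that owns id = 42.
UPDATE account SET balance = balance + 1 WHERE id = 42;

-- Update by an indexed, non-partitioned key: the statement goes to all
-- data nodes, and each one can use its local index on email.
UPDATE account SET balance = balance + 1 WHERE email = 'fred@example.com';
```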
  32. 32. Update Concurrency on Key
  33. 33. Possible Future Tests
      ● Insert, Select concurrency test (important)
      ● Mixed workload
      ● Measure in-memory and not in-memory
      ● Impact of replicas for availability
      ● MongoDB replicas
      ● Postgres-XC streaming replication
      ● Have seen about 15% perf drop for two sync slaves
      ● MongoDB Write-Concern durability settings (try journaled)
      ● Hstore
  34. 34. Other PostgreSQL Results?
      ● Christophe
      ● Single laptop-based tests, but interesting
      ●
  35. 35. Summary
      ● PostgreSQL has schema-less functionality built-in and can act as a key-value store
      ● Postgres-XC can scale this out horizontally to multiple servers
      ● MongoDB performs much better for low concurrency for inserts
      ● In XC, use COPY or multiple threads to populate
      ● Postgres-XC performs better for non-partitioned indexed access
      ● Postgres-XC can perform about the same as MongoDB for reads
  36. 36. Summary (2)
      If Postgres-XC generally performs similarly to MongoDB, why not use XC and
      ● Stick with ACID
      ● Feel secure with PostgreSQL maturity
      ● Leverage PostgreSQL features and community
  37. 37. Thank You
      Mason
  38. 38. Content Attribution
      ● Postgres-XC Development Group
      ● Koichi Suzuki
      ● Michael Paquier
      ● Ashutosh Bapat
      ● Pavan Deolasee
      ● Christophe Pettus
      ● Mason Sharp
      ● ...