SCALING OUR SAAS BACKEND
WITH POSTGRESQL
OLIVER SEEMANN, BIDMANAGEMENT GMBH
BWB MEETUP, 2013-10-28
THIS TALK IS ABOUT …

Gigabytes → Terabytes
PRODUCTIVITY TOOLS FOR
ONLINE MARKETERS

Automatic Bid Management for
Auctioned Ads

“Organic” Search
SIGNIFICANT AMOUNTS OF DATA

10,000 Campaigns
5 million Keywords
4 million Ads
per AdWords account
SIGNIFICANT AMOUNTS OF DATA

Full History for all objects
over full lifetime
SLOW AND FAST DATA

“Slow” / OLAP data for
batch-processing jobs
“Fast” / OLTP data for
human interaction
INITIALLY SEPARATE

Slow
Data

Fast
Data
A LOT OF OVERLAP

Slow
Data

Fast
Data
THEN ONLY ONE

Slow
Data

Fast
Data
CURRENTLY

7 machines running PostgreSQL
3 Terabytes Data
Thousands of Databases
Largest Table: 120GB
HOW IT BEGAN

Experiment
DESIGN BY THE BOOK

[ER diagram: Customer, User, UserAccountAccess, Account, Campaign, Adgroup, Keywords and History tables, linked by primary/foreign keys on customer_id, account_id, campaign_id, adgroup_id and keyword_id]
MORE CUSTOMERS – MORE DATA
PARTITIONING

All accounts in one table, records interleaved:
Account 1 – Rec 1, Account 2 – Rec 1, Account 1 – Rec 2, Account 3 – Rec 1, Account 2 – Rec 2, …
PARTITIONING

One partition per account:
Account 1: Rec 1, Rec 2, Rec 3
Account 2: Rec 1, Rec 2, Rec 3
Account 3: Rec 1, Rec 2, Rec 3
PARTITION WITH INHERITANCE

SELECT on the parent table
INSERT into the child tables
CHECK CONSTRAINTS on the child tables guide the planner
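In SQL, the inheritance scheme above might look like the following sketch; the `history` table and its columns are invented for illustration, not the actual Adspert schema:

```sql
-- Parent table holds no rows itself; children inherit its columns.
CREATE TABLE history (
    account_id integer NOT NULL,
    day        date    NOT NULL,
    clicks     integer
);

-- One child per partition, with a CHECK constraint so the planner
-- can exclude non-matching children (constraint_exclusion).
CREATE TABLE history_account_1 (
    CHECK (account_id = 1)
) INHERITS (history);

-- SELECTs go through the parent and are routed by the constraints:
SELECT sum(clicks) FROM history WHERE account_id = 1;

-- INSERTs must target the child directly (or go through a trigger):
INSERT INTO history_account_1 VALUES (1, DATE '2013-10-28', 42);
```

As the notes mention, routing every insert to the right child pushes a lot of effort into application logic.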
ISOLATE ACCOUNTS

One DB → Many DBs
PARTITIONING VIA DATABASES

Excellent horizontal scaling
Easy cloning
pg_dump/pg_restore
Some Overhead
No direct references
WHY NOT SCHEMAS?

More lightweight
Full References
No easy cloning
No schemas inside schemas
SETUP

main DB plus account databases spread across machine-0, machine-1, machine-2, …
DB HARDWARE

Data > RAM
⇒ High I/O
EC2?
MIGRATION TO EC2

Must migrate all/most machines
No PostgreSQL in RDS
DB Instances run 24/7 ⇒ costly
EBS Performance limited
EBS I/O LIMITED

[Bar chart, MB/s (0-900): sequential read and write throughput for AWS instance storage, AWS EBS (RAID-0), SSD (RAID-0) and real 15k SAS2 disks (RAID-10)]
DEDICATED MACHINES

Moderate CPU / RAM
Fast Disks
Battery-backed caching controller
ALTERNATIVE HW

Use bigger (and slower) SATA drives
Evaluate EC2+EBS in production
SSDs
HARDWARE FAILS

Replication: Master → Slave
Availability
Query Load Balancing
REPLICATION

Account DBs: master-1 → slave-1, master-2 → slave-2
Main DB: master → slave
BACKUPS

compressed pg_dump to the Backup Server
REPLICATION

Account DBs: master-1 … master-4, no slaves
Main DB: master → slave
DISASTER RESTORE

concurrent pg_restore from the Backup Server
PERFORMANCE PROBLEMS
Too many concurrent full table scans
From 300MB/s to 30MB/s
More concurrent queries
⇒ longer query runtime
⇒ even more concurrent queries
DIFFERENT APPS

Web App Server: many fast queries
Compute Cluster: few very slow queries
DIFFERENT APPS

A Semaphore between the Web App Server (many fast queries) and the Compute Cluster (few very slow queries)

Simple counting semaphore using Advisory Locks
Implemented in the application
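A counting semaphore with N slots can be built by probing N advisory lock keys; the key values here (namespace 4711, slots 1…N) are arbitrary examples, and the retry loop would live in the application:

```sql
-- Probe slots (4711, 1), (4711, 2), …; the first call returning
-- true grants a slot without blocking.
SELECT pg_try_advisory_lock(4711, 1);  -- true ⇒ slot acquired
SELECT pg_try_advisory_lock(4711, 2);  -- on false, try the next slot

-- ... run the expensive analysis query in the session holding a slot ...

-- Release the slot (same session, same key):
SELECT pg_advisory_unlock(4711, 1);
```

Advisory locks are also released automatically when the session ends, so a crashed job cannot leak a slot.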
BULK INSERTS

20k – 80k INSERTs per second
up to 50M records
BULK INSERT BEST PRACTICE

COPY instead of INSERT
Drop indexes + recreate
Truncate
COPY into a new table, swap + drop
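The last pattern, sketched for an invented `keywords` table: load into a fresh table, then swap names, so readers keep seeing the old data until the switch (unlike TRUNCATE, which is not MVCC-safe):

```sql
BEGIN;
CREATE TABLE keywords_new (LIKE keywords INCLUDING ALL);

-- Bulk-load via COPY (server-side file path shown; the client-side
-- equivalent is psql's \copy).
COPY keywords_new FROM '/tmp/keywords.csv' WITH (FORMAT csv);

-- Swap the tables and drop the old data:
ALTER TABLE keywords     RENAME TO keywords_old;
ALTER TABLE keywords_new RENAME TO keywords;
DROP TABLE keywords_old;
COMMIT;
```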
SIGNUP PROBLEMS

Signup to the Adspert Service triggers CREATE DATABASE
Up to 5-10 min
PRE-CREATE DATABASES

Create DBs ahead of time
New signups rename DBs
Periodically create new
Fall back to direct create
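Pre-creation could look like this; the database names and the template are illustrative, not the actual naming scheme:

```sql
-- Done periodically, ahead of time (slow, but nobody is waiting):
CREATE DATABASE spare_01 TEMPLATE account_template;

-- On signup, renaming is near-instant; no sessions may be connected
-- to spare_01 at that moment:
ALTER DATABASE spare_01 RENAME TO account_4711;
```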
CONCLUDING ..

Partitioning into Databases
Physical Hardware
Check out advisory locks
THANKS FOR LISTENING

QUESTIONS?
Slides of the talk I gave on 2013-10-28 at the Backend Web Berlin Meetup as well as Developer Conference in Hamburg on 2013-11-08.

  • Hi, I’m Oliver, I’m a software developer, currently heading the development team at Bidmanagement GmbH in Berlin.
  • I’m going to talk about PostgreSQL. Not so much about the DBMS itself, but more about how we’re using it as the main datastore in our system.
  • About how in our company we’re running a large PostgreSQL installation, and how we’ve grown our setup.
  • Very popular, billions of dollars, a very important online marketing channel.
  • Google provides a very extensive API
  • The different kinds of data we store can be largely separated into two groups.
  • And we decided to go with PostgreSQL because it has been our go-to tool for storing data for many years. Problems from time to time, but we never looked back.
  • But it began much smaller …
  • Straightforward approach; nobody thought of scaling.
  • Pilots were successful and we started to acquire customers. Soon >10 million rows in some tables; query performance lagged (many full table scans). We did not want to scale vertically, because we aspired to much bigger growth (also: expensive).
  • PostgreSQL supports partitioning via inheritance. Use CHECK constraints to tell the query planner where to look. Cannot insert into the parent table; must insert into the child tables. A lot of effort goes into application logic. We tried it on one table and weren’t convinced.
  • One main DB with non-account-specific data, currently ~1-2 GB. Several machines dedicated to account databases, 50-1000 DBs per machine. PostgreSQL 9.0 and 9.3 on each machine, which allows us to migrate one DB after another.
  • The partitioning scheme allows easy horizontal scaling: more machines. But which? The dataset does not fit in RAM, so high I/O requirements. AWS EC2? We would have to migrate all or most machines due to latency. DB instances run 24/7 ⇒ costly. EBS performance is limited (GBit Ethernet). Also: not many cores.
  • Not that much elasticity required: as a B2B business our growth is more predictable, with batch processing of expensive backend jobs. One year of an EC2 instance ≅ buying one physical server. Using mid-sized machines: good price/value ratio.
  • SATA: 600 GB vs 3 TB. EC2: performance and latency unclear; evaluate to make an informed decision. SSDs: expensive. Reliable? RAID?
  • But when things go awry and data gets deleted …
  • Big cheap HDDs
  • The main DB is still replicated, to enable quick failover: here we can’t afford extended downtime.
  • Capacity doubled, cost reduced 40%. The more servers, the faster the restore; GBit Ethernet on the backup server is the limiting factor.
  • From sequential reads to random reads. Feedback loop: more concurrent queries ⇒ longer query runtimes ⇒ even more concurrent queries.
  • Webapp queries with humans waiting are quite fast. The problematic queries come from the analysis jobs: frequent full table scans, queries with huge results. We needed a way to synchronize queries and control concurrency. We could use a connection pooler, or an external synchronization mechanism, e.g. ZooKeeper.
  • We rewrite the history every day (for various reasons): conversions arrive up to 30 days later, and campaigns are added to optimization. For most accounts <1M records, for some 10-100M. We achieve up to 80k inserts/sec; the network is the bottleneck [check this].
  • We use COPY for all bulk inserts, even small bulks. Indexes are dropped and recreated with simple PL/pgSQL functions. For complete table rewrites: TRUNCATE is not MVCC-safe.
  • We added a self-service signup: a 2-minute process to add an AdWords account to the system (OAuth ⇒ user info ⇒ optimization bootstrap). Biggest problem: CREATE DATABASE can take several minutes, depending on the current amount of write activity.
  • We now always keep 10-20 spare databases in stock; we control the target host for new databases this way. Take care not to have race conditions when applying schema changes.