• Save

Loading…

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

Like this presentation? Why not share!

Scaling with Postgres (Highload++ 2010)

on

  • 3,906 views

 

Statistics

Views

Total Views
3,906
Views on SlideShare
3,905
Embed Views
1

Actions

Likes
0
Downloads
0
Comments
0

1 Embed 1

http://www.pinterest.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Scaling with Postgres (Highload++ 2010) Scaling with Postgres (Highload++ 2010) Presentation Transcript

  • Scaling with Postgres Robert Treat Highload++ 2010 Monday, October 25, 2010
  • Who Am I? ✤ Robert Treat ✤ OmniTI ✤ Design, Development, Database, Ops Monday, October 25, 2010
  • Who Am I? ✤ Robert Treat ✤ OmniTI ✤ Design, Development, DATABASE, Ops Monday, October 25, 2010
  • Who Am I? ✤ Robert Treat ✤ OmniTI ✤ Design, Development, DATABASE, Ops ✤ Etsy, Allisports, National Geographic, Gilt, etc... Monday, October 25, 2010
  • Who Am I? ✤ Robert Treat ✤ Postgres ✤ Web, Advocacy, phpPgAdmin ✤ Major Contributor Monday, October 25, 2010
  • Who Am I? ✤ Postgres 6.5 -> 9.1alpha1 ✤ Terabytes of data ✤ Millions of transactions per day ✤ OLTP, ODS, DW ✤ Perl, PHP, Java, Ruby, C# Monday, October 25, 2010
  • Who Am I? OBSERVATION == LEARNING (hopefully) Monday, October 25, 2010
  • Scalability It is the ability of a computer application or product (hardware or software) to continue to function well when it (or its context) is changed in size or volume in order to meet a user need. Monday, October 25, 2010
  • Scalability Given ever increasing load Monday, October 25, 2010
  • Scalability Given ever increasing load NEVER GO DOWN ALWAYS PERFORM WELL Monday, October 25, 2010
  • Scalability Given ever increasing load NEVER GO DOWN ALWAYS PERFORM WELL impossible goal, but we’ll try Monday, October 25, 2010
  • Scalability Given ever increasing load NEVER GO DOWN ALWAYS PERFORM WELL NOTE! data loss is not a goal, but ideally we won’t lose it :-) Monday, October 25, 2010
  • It starts with culture... Monday, October 25, 2010
  • ✤ Get over schema purity ✤ add column default not null Monday, October 25, 2010
  • ✤ Get over schema purity ✤ add column default not null Good performance comes from good schema design, HOWEVER, perfect relational modeling is NOT THE GOAL Monday, October 25, 2010
  • ✤ Devs must own schema and queries ✤ they design, you refine Monday, October 25, 2010
  • ✤ Devs must own schema and queries ✤ they design, you refine Performance and scalability cannot be managed solely within the database; both require application level knowledge. To achieve this, application developers need to have visibility of the resources they work on Monday, October 25, 2010
  • Gain Visibility Monday, October 25, 2010
  • Gain Visibility ✤ Monitoring ✤ Alerts ✤ Trending ✤ Capacity Planning ✤ Performance Tuning Monday, October 25, 2010
  • Gain Visibility ✤ Alerts ✤ server: out of disk space, high load, etc... ✤ database: connections, sequences, etc... ✤ business: registrations, revenue, etc... ✤ etc... check_postgres.pl Monday, October 25, 2010
  • Gain Visibility ✤ Trending ✤ server: disk usage, load, etc... ✤ database: connections, sequences, etc... ✤ business: registrations, revenue, etc... ✤ etc... cacti, mrtg, circonus Monday, October 25, 2010
  • Gain Visibility ✤ Capacity Planning ✤ disks, cpu, memory ✤ connections, vacuum, bloat simple projections, done regularly, are good enough Monday, October 25, 2010
  • Gain Visibility ✤ Performance tuning ✤ how long do queries take? ✤ how often do they run? pgfouine Monday, October 25, 2010
  • Gain Visibility COMMITS/PUSHES Monday, October 25, 2010
  • Gain Visibility ALL alerts, graphs, query reports, etc... MUST be available to EVERYONE on the team AT ALL TIMES Monday, October 25, 2010
  • Hands on You can’t succeed without first putting the right culture in place. Once you are on the right path, make sure you have the right technology Monday, October 25, 2010
  • Postgres Versions ✤ MINIMUM: 8.3 ✤ removes xid for read only queries, significant reduction in vacuum activity Monday, October 25, 2010
  • Postgres Versions ✤ MINIMUM: 8.3 ✤ removes xid for read only queries, significant reduction in vacuum activity seriously! Monday, October 25, 2010
  • Postgres Versions ✤ MINIMUM: 8.3 ✤ removes xid for read only queries, significant reduction in vacuum activity ✤ BETTER: 8.4 ✤ revised free space map management leads to more efficient vacuuming Monday, October 25, 2010
  • Postgres Versions ✤ MINIMUM: 8.3 ✤ removes xid for read only queries, significant reduction in vacuum activity ✤ BETTER: 8.4 ✤ revised free space map management leads to more efficient vacuuming ✤ WHY NOT? 9.0 ✤ Hot standby / streaming replication couldn’t hurt Monday, October 25, 2010
  • Speaking of replication ✤ Common practice for scaling websites ✤ Good for READ based loads ✤ We have used many: ✤ slony, rubyrep, bucardo, 9.0 built-in, mammoth, wrote-our-own Monday, October 25, 2010
  • Speaking of replication Monday, October 25, 2010
  • Speaking of replication ✤ No favorite system for this, evaluate based on: ✤ avoid solutions that duplicate writes at sql level (imho) ✤ how comfortable am I debugging the system? ✤ do you need automated schema changes? ✤ how much redundancy / complexity do you need? ✤ how does the system handle node failure for N nodes? Monday, October 25, 2010
  • So what would you use? (tm) ✤ 2 Nodes, master + standby: Postgres 9.0 ✤ Master + multiple slaves: Slony ✤ Master-Master: Bucardo All choices subject to change!! Monday, October 25, 2010
  • A word about “Sharding” ✤ Distributed computing is hard(er) ✤ we think of things in a singular global state ✤ the more we can work in that model, the better ✤ RDBM offer poor solutions for multiple masters ✤ you must manage that complexity on your own Monday, October 25, 2010
  • A word about “Sharding” ✤ Splitting systems by service: ✤ separate db for login, forums, sales, etc... ✤ allows for growth ✤ provides simple interface Monday, October 25, 2010
  • Pooling ✤ Postgres connections are expensive! ✤ fork new process per connection ✤ keep 1 process open per connection ✤ 1000+ processes you will notice trouble Monday, October 25, 2010
  • Pooling ✤ Postgres connections are expensive! ✤ fork new process per connection ✤ keep 1 process open per connection ✤ 1000+ processes you will notice trouble ✤ POOLING ✤ JDBC, mod-perl ✤ pgbouncer ftw! Monday, October 25, 2010
  • Summary ✤ Schema / Queries should be shared between dev, dba teams! ✤ Monitoring + Visibility! ✤ >= 8.3 Required! ✤ Replication, jump in it! ✤ Use connection pooling! Monday, October 25, 2010
  • Thanks! Oleg & Crew Highload++ OmniTI Postgres Community! You! more: @robtreat2 www.xzilla.net Monday, October 25, 2010