Managing terabytes

  • 1,426 views
Uploaded on

 

More in: Technology , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
1,426
On Slideshare
0
From Embeds
0
Number of Embeds
1

Actions

Shares
Downloads
15
Comments
0
Likes
2

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Managing Terabytes Problems and solutions with operating large Postgres installations Selena Deckelmann Prime Radiant @selenamarieSP ogm Ceon Cfo .En Uef2 re 1 0n 1c1e
  • 2. About me. 2 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 3. The Environment • 1.6 TB, 1 cluster,Version 8.2 • 1.1 TB, 1 cluster,Version 8.3 • 8.4/9.0 Dev systems • Working toward 9.0 into prod (May 2011) • pgpool, Redis, RabbitMQ, NFSSP ogm Ceon Cfo .En Uef2 re 3 0n 1c1e
  • 4. Some stats • daily peak: ~3000 commits per second • average writes: 4 MBps • average reads: 8 MBpsSP ogm Ceon Cfo .En Uef2 re 4 0n 1c1e
  • 5. What’s good • Most queries are fast! • Benchmarks say we’re pushing the limits of the hardware • Developers love working with PostgresSP ogm Ceon Cfo .En Uef2 re0n 1c1e
  • 6. And lots more. But... 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 7. 1c1e re0n Uef2 .En Cfo Ceon ogmSP
  • 8. The Problems 1. System resource exhaustion 2. Everything is slow: Huge catalogs, Backups 3. Handling VACUUM problems: Bloat, Transaction wraparound 4. Upgrades: Minor, MajorSP ogm Ceon Cfo .En Uef2 re0n 1c1e
  • 9. System Resource Exhaustion 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 10. Running out of inodes Problem: UFS on Solaris “The only way to add more inodes to a UFS filesystem is: 1. destroy the filesystem and create a new filesystem with a higher inode density 2. enlarge the filesystem - growfs man page”SP ogm Ceon Cfo .En Uef2 re 10 0n 1c1e
  • 11. Running out of inodes Solution 0: Delete files. Solution 1: Sharding/bigger filesystem Solution 2: xfsSP ogm Ceon Cfo .En Uef2 re 11 0n 1c1e
  • 12. Running out of file descriptors Problem: Too many open files by the database. selena@lulu:~ #508 18:43 :) sudo lsof -p 19121 | wc 40 355 4151 Solution: You need a connection pooler.SP ogm Ceon Cfo .En Uef2 re 12 0n 1c1e
  • 13. Running out of file descriptors Solution: You need a connection pooler. Recommended: pgbouncer (threaded, online upgrade) pgpool-II (failover)SP ogm Ceon Cfo .En Uef2 re 13 0n 1c1e
  • 14. Everything is slow. 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 15. Huge Catalogs 409,994 tablesSP ogm Ceon Cfo .En Uef2 re 15 0n 1c1e
  • 16. Maintenance problem Minor mistake in parent table definitions: not null default nextval(important_sequence::text) vs not null default nextval(important_sequence::regclass)SP ogm Ceon Cfo .En Uef2 re 16 0n 1c1e
  • 17. Huge Catalogs Problem: Slow scans of catalog data Solution: Upgrade to Postgres 8.4 or higher But really: Avoid making a cluster with >400k tables.SP ogm Ceon Cfo .En Uef2 re 17 0n 1c1e
  • 18. Stats collection 9,019,868 total data points for table stats 4,550,770 total data points for index stats Problem: This is slow to write. (128 MB written every second or so)SP ogm Ceon Cfo .En Uef2 re 18 0n 1c1e
  • 19. Stats collection 9,019,868 total data points for table stats 4,550,770 total data points for index stats Soution: Move stats file to RAM. stats_temp_directory (8.4 or higher) There’s a trivial patch for earlier versions.SP ogm Ceon Cfo .En Uef2 re 19 0n 1c1e
  • 20. Stats collection 9,019,868 total data points for table stats 4,550,770 total data points for index stats Problem: This is slow to read.SP ogm Ceon Cfo .En Uef2 re 20 0n 1c1e
  • 21. Stats collection 9,019,868 total data points for table stats 4,550,770 total data points for index stats Solution: Supposedly, this is better in 8.4 and higher. (fewer writes per minute) Still probably not fast.SP ogm Ceon Cfo .En Uef2 re 21 0n 1c1e
  • 22. Backups pg_dump takes longer and longer...SP ogm Ceon Cfo .En Uef2 re 22 0n 1c1e
  • 23.    backup     |   duration -------------------+--------------------  2009­11­22  |  02:44:36.821475   2009­11­23  |  02:46:20.003507  2009­11­24  |  02:47:06.260705  2009­12­06  |  07:13:04.174964  2009­12­13  |  05:00:01.082676  2009­12­20  |  06:24:49.433043  2009­12­27  |  05:35:20.551477  2010­01­03  |  07:36:49.651492  2010­01­10  |  05:55:02.396163  2010­01­17  |  07:32:33.277559  2010­01­24  |  06:22:46.522319  2010­01­31  |  10:48:13.060888  2010­02­07  |  21:21:47.77618  2010­02­14  |  14:32:04.638267  2010­02­21  |  11:34:42.353244  2010­02­28  |  11:13:02.102345SP ogm Ceon Cfo .En Uef2 re 23 0n 1c1e
  • 24. Backups Problem: pg_dump is too slow. Solutions: • patching pg_dump for SELECT ... LIMIT • crank down shared_buffers • Stop using pg_dump for backups • 64-bit might helpSP ogm Ceon Cfo .En Uef2 re 24 0n 1c1e
  • 25. How not to migrate to a 64-bit systemSP ogm Ceon Cfo .En Uef2 re 25 0n 1c1e
  • 26. Title Text Install 32-bit Postgres and libraries on a 64-bit system. Install 64-bit Postgres/libs of the same version. Copy “hot backup” from 32-bit sys over to 64-bit sys. Run pg_dump from 64-bit version on 32-bit Postgres.SP ogm Ceon Cfo .En Uef2 re 26 0n 1c1e
  • 27. A single warm standby is not a backup. But lots of people use them that way!SP ogm Ceon Cfo .En Uef2 re 27 0n 1c1e
  • 28. Ship WAL from Solaris x86 -> Linux It did work!SP ogm Ceon Cfo .En Uef2 re 28 0n 1c1e
  • 29. Handling VACUUM problems 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 30. Bloat Problem: Lots of dead tuples in tables. • Frequent UPDATEs to long tables of log data • Frequent DELETEs without a VACUUM • A terabyte of dead tuplesSP ogm Ceon Cfo .En Uef2 re 30 0n 1c1e
  • 31. Fixing bloat Solution: Write custom scripts to clean • VACUUM for small things • CLUSTER for everything else • Considered TRUNCATESP ogm Ceon Cfo .En Uef2 re 31 0n 1c1e
  • 32. Catalog Bloat Application allowed users to initiate ALTER TABLE. Regular VACUUM couldn’t fix it. VACUUM FULL of the catalog takes 2+ hours. Use of NOTIFY/LISTEN can also cause bloat.SP ogm Ceon Cfo .En Uef2 re 32 0n 1c1e
  • 33. Transaction wraparound avoidance Problem: autovacuum set off too frequently Watch age(datfrozenxid) Solution: Increase autovacuum_freeze_max_age (default is 200 million, we increase to one billion)SP ogm Ceon Cfo .En Uef2 re 33 0n 1c1e
  • 34. Upgrades 1c1e re0n Uef2 .En Cfo Ceon ogm SP
  • 35. Minor upgrades Problem: Restarting Postgres causes bad application performance. • Require a start/stop of database • Unexpected CHECKPOINT • Cold cacheSP ogm Ceon Cfo .En Uef2 re 35 0n 1c1e
  • 36. Minor upgrades Solutions: • Plan for a CHECKPOINT before shutdown • Warm the cache (Queries that exercise indexes, maybe table scans)SP ogm Ceon Cfo .En Uef2 re 36 0n 1c1e
  • 37. Major Version upgrades Problem: Major upgrades are a PITA. • <8.2 - no pg_upgrade :( • Time your restores. • Document your SLAs.SP ogm Ceon Cfo .En Uef2 re 37 0n 1c1e
  • 38. Major Version upgrades Solutions: :( • >=8.3 - pg_upgrade • Time your restores. • Document your SLAs.SP ogm Ceon Cfo .En Uef2 re 38 0n 1c1e
  • 39. Major Version upgrades Solutions: :( • Write tools to migrate data • Shard • Trigger-based replicationSP ogm Ceon Cfo .En Uef2 re 39 0n 1c1e
  • 40. The Problems 1. System resource exhaustion 2. Everything is slow: Huge catalogs, Backups 3. Handling VACUUM problems: Bloat, Transaction wraparound 4. Upgrades: Minor, MajorSP ogm Ceon Cfo .En Uef2 re0n 1c1e
  • 41. The Solutions 1. System resource exhaustion Choose a better filesystem, Pooling 2. Everything is slow: Huge catalogs, Backups Don’t do that, Monitor & Binary backupsSP ogm Ceon Cfo .En Uef2 re0n 1c1e
  • 42. The Solutions 3. Handling VACUUM problems: Bloat, Transaction wraparound Developer education, Monitoring, Cleanup, *_max_freeze_age 4. Upgrades: Minor, Major Plan, Plan, Plan (CHECKPOINT, warm cache, pg_upgrade)SP ogm Ceon Cfo .En Uef2 re0n 1c1e
  • 43. Thanks! Managing Terabytes Problems and solutions with operating large Postgres installations Selena Deckelmann Prime Radiant @selenamarieSP ogm Ceon Cfo .En Uef2 re 43 0n 1c1e