Managing Terabytes
                          Problems and solutions with
                       operating large Postgres installations
                                Selena Deckelmann
                                   Prime Radiant
                                  @selenamarie
About me.



The Environment

                       • 1.6 TB, 1 cluster, version 8.2
                       • 1.1 TB, 1 cluster, version 8.3
                       • 8.4/9.0 dev systems
                       • Working toward 9.0 in production (May 2011)
                       • pgpool, Redis, RabbitMQ, NFS
Some stats

                       • daily peak: ~3000 commits per second
                       • average writes: 4 MBps
                       • average reads: 8 MBps
What’s good

                       • Most queries are fast!
                       • Benchmarks say we’re pushing the limits of the hardware
                       • Developers love working with Postgres
And lots more. But...
The Problems

                  1. System resource exhaustion
                  2. Everything is slow: Huge catalogs, Backups
                  3. Handling VACUUM problems: Bloat,
                     Transaction wraparound
                  4. Upgrades: Minor, Major
System Resource Exhaustion
Running out of inodes

                       Problem: UFS on Solaris
                       “The only way to add more inodes to a UFS filesystem is:
                         1. destroy the filesystem and create a new filesystem
                            with a higher inode density
                         2. enlarge the filesystem - growfs man page”

                       Solution 0: Delete files.
                       Solution 1: Sharding/bigger filesystem
                       Solution 2: xfs
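
Inode exhaustion is easier to handle when you see it coming. A minimal sketch of a check using Python's os.statvfs; the function name inode_usage and the path "/" are illustrative assumptions, not something from the talk.

```python
import os


def inode_usage(path="/"):
    """Return (total, free, percent_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    if total == 0:
        # Some filesystems do not report a fixed inode count.
        return total, free, 0.0
    return total, free, 100.0 * (total - free) / total


total, free, pct = inode_usage("/")
print(f"inodes: {total} total, {free} free, {pct:.1f}% used")
```

Point it at the data directory's mount point and alert well before the percentage gets close to 100.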
Running out of file descriptors

                       Problem: Too many open files by the database.

                       selena@lulu:~ #508 18:43 :)
                       sudo lsof -p 19121 | wc
                           40     355        4151

                       Solution: You need a connection pooler.
                       Recommended:
                       pgbouncer (lightweight, event-driven; online upgrade)
                       pgpool-II (failover)
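
Why a pooler helps: many clients share a small, fixed set of backend connections, so the database holds far fewer files open. The toy in-process pool below only illustrates that idea. TinyPool and its connect factory are hypothetical names, and real poolers such as pgbouncer and pgpool-II run as separate services in front of Postgres.

```python
import queue
import threading
from contextlib import contextmanager


class TinyPool:
    """Toy pool: hands out at most `size` connections, opening them lazily."""

    def __init__(self, connect, size):
        self._connect = connect              # hypothetical factory for a real connection
        self._slots = queue.Queue(maxsize=size)
        self._lock = threading.Lock()
        self.opened = 0                      # how many real connections were created
        for _ in range(size):
            self._slots.put(None)            # None marks a slot with no connection yet

    @contextmanager
    def connection(self, timeout=None):
        conn = self._slots.get(timeout=timeout)  # blocks while every slot is in use
        try:
            if conn is None:
                conn = self._connect()
                with self._lock:
                    self.opened += 1
            yield conn
        finally:
            self._slots.put(conn)            # hand the slot back to the pool


pool = TinyPool(connect=object, size=2)      # object() stands in for a database connection
for _ in range(100):                         # 100 units of client work, one after another
    with pool.connection() as conn:
        pass                                 # run queries on conn here
print(pool.opened)                           # 2
```

However many clients come and go, the number of backends never exceeds the pool size.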
Everything is slow.
Huge Catalogs


                       409,994 tables
Maintenance problem
                       Minor mistake in parent table definitions:

                       not null default
                       nextval('important_sequence'::text)

                       vs

                       not null default
                       nextval('important_sequence'::regclass)
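
As the Postgres documentation describes it, a 'name'::regclass argument is resolved to the sequence when the default is defined, while a text argument is looked up by name on every call. With this many tables, fixing the defaults means generating statements instead of typing them. A sketch, assuming plain unquoted identifiers; the table names and the column name id are hypothetical.

```python
def fix_default_statements(tables, column, sequence):
    """Build one ALTER TABLE per table that switches the default to the regclass form."""
    template = (
        "ALTER TABLE {table} ALTER COLUMN {column} "
        "SET DEFAULT nextval('{sequence}'::regclass);"
    )
    return [template.format(table=t, column=column, sequence=sequence) for t in tables]


# Hypothetical table and column names, for illustration only.
statements = fix_default_statements(
    ["log_2010_01", "log_2010_02"], "id", "important_sequence"
)
for stmt in statements:
    print(stmt)
```

Review the generated SQL and run it in batches; each ALTER TABLE also touches the catalogs.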
Huge Catalogs
                       Problem: Slow scans of catalog data


                       Solution:
                       Upgrade to Postgres 8.4 or higher


                       But really: Avoid making a cluster with >400k tables.
Stats collection

                       9,019,868 total data points for table stats
                       4,550,770 total data points for index stats
                       Problem: This is slow to write.
                       (128 MB written every second or so)
                       Solution: Move stats file to RAM.
                       stats_temp_directory    (8.4 or higher)
                       There’s a trivial patch for earlier versions.
                       Problem: This is slow to read.
                       Solution:
                       Supposedly, this is better in 8.4 and higher.
                       (fewer writes per minute)
                       Still probably not fast.
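
To put the stats file numbers in perspective, here is the arithmetic using only figures from these slides. It assumes exactly one stats write per second and reads "MBps" as megabytes per second; both are assumptions.

```python
table_points = 9_019_868
index_points = 4_550_770
stats_write_mb_per_s = 128    # "128 MB written every second or so"
app_write_mb_per_s = 4        # "average writes: 4 MBps" from the stats slide

total_points = table_points + index_points
ratio = stats_write_mb_per_s / app_write_mb_per_s
gb_per_hour = stats_write_mb_per_s * 3600 / 1024

print(total_points)   # 13570638
print(ratio)          # 32.0
print(gb_per_hour)    # 450.0
```

Under those assumptions the stats file alone accounts for about 32 times the application's average write rate, which is why moving it to RAM pays off.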
Backups


                       pg_dump takes longer and longer...
                           backup     |    duration
                        --------------+-----------------
                         2009-11-22   | 02:44:36.821475
                         2009-11-23   | 02:46:20.003507
                         2009-11-24   | 02:47:06.260705
                         2009-12-06   | 07:13:04.174964
                         2009-12-13   | 05:00:01.082676
                         2009-12-20   | 06:24:49.433043
                         2009-12-27   | 05:35:20.551477
                         2010-01-03   | 07:36:49.651492
                         2010-01-10   | 05:55:02.396163
                         2010-01-17   | 07:32:33.277559
                         2010-01-24   | 06:22:46.522319
                         2010-01-31   | 10:48:13.060888
                         2010-02-07   | 21:21:47.77618
                         2010-02-14   | 14:32:04.638267
                         2010-02-21   | 11:34:42.353244
                         2010-02-28   | 11:13:02.102345
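
The trend is easier to see as a ratio. A small sketch that parses the durations from the table above and compares the longest run with the first one; the variable and function names are illustrative.

```python
from datetime import timedelta

# (backup date, duration) pairs copied from the table above.
BACKUPS = [
    ("2009-11-22", "02:44:36.821475"),
    ("2009-11-23", "02:46:20.003507"),
    ("2009-11-24", "02:47:06.260705"),
    ("2009-12-06", "07:13:04.174964"),
    ("2009-12-13", "05:00:01.082676"),
    ("2009-12-20", "06:24:49.433043"),
    ("2009-12-27", "05:35:20.551477"),
    ("2010-01-03", "07:36:49.651492"),
    ("2010-01-10", "05:55:02.396163"),
    ("2010-01-17", "07:32:33.277559"),
    ("2010-01-24", "06:22:46.522319"),
    ("2010-01-31", "10:48:13.060888"),
    ("2010-02-07", "21:21:47.77618"),
    ("2010-02-14", "14:32:04.638267"),
    ("2010-02-21", "11:34:42.353244"),
    ("2010-02-28", "11:13:02.102345"),
]


def to_timedelta(text):
    """Parse an HH:MM:SS.ffffff duration string."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


first_day, first = BACKUPS[0][0], to_timedelta(BACKUPS[0][1])
worst_day, worst = max(
    ((day, to_timedelta(text)) for day, text in BACKUPS), key=lambda row: row[1]
)
growth = worst / first   # dividing two timedeltas gives a plain float

print(f"first run   {first_day}: {first}")
print(f"longest run {worst_day}: {worst} ({growth:.1f}x the first run)")
```

Recording every backup's duration like this makes the slowdown visible long before a dump takes most of a day.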
Backups
                       Problem: pg_dump is too slow.
                       Solutions:
                       • patching pg_dump for SELECT ... LIMIT
                       • crank down shared_buffers
                       • Stop using pg_dump for backups
                       • 64-bit might help
How not to migrate to a 64-bit system



                       1. Install 32-bit Postgres and libraries on a 64-bit system.
                       2. Install 64-bit Postgres and libraries of the same version.
                       3. Copy a “hot backup” from the 32-bit system over to the 64-bit system.
                       4. Run pg_dump from the 64-bit version against the 32-bit Postgres.
A single warm standby is not a backup.


                       But lots of people use them that way!
Ship WAL from Solaris x86 -> Linux
                                  It did work!
Handling VACUUM problems
Bloat

                       Problem: Lots of dead tuples in tables.

                       • Frequent UPDATEs to long tables of log data
                       • Frequent DELETEs without a VACUUM
                       • A terabyte of dead tuples
Fixing bloat

                       Solution: Write custom scripts to clean

                       • VACUUM for small things
                       • CLUSTER for everything else
                       • Considered TRUNCATE
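
A cleanup script along these lines might pick the statement per table. This is a sketch only: reading "small" as table size, the 1 GB cutoff, and all table and index names are assumptions. CLUSTER ... USING is the 8.3+ syntax (8.2 uses CLUSTER indexname ON tablename), and CLUSTER locks the table exclusively while it rewrites it.

```python
ONE_GB = 1024 ** 3


def cleanup_statement(table, size_bytes, cluster_index, small_limit=ONE_GB):
    """Pick VACUUM for small tables and CLUSTER (a full rewrite) for the rest."""
    if size_bytes <= small_limit:
        return f"VACUUM ANALYZE {table};"
    return f"CLUSTER {table} USING {cluster_index};"


# Hypothetical tables and sizes, for illustration only.
small = cleanup_statement("sessions", 10 * 1024 ** 2, "sessions_pkey")
large = cleanup_statement("event_log", 200 * ONE_GB, "event_log_pkey")
print(small)   # VACUUM ANALYZE sessions;
print(large)   # CLUSTER event_log USING event_log_pkey;
```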
Catalog Bloat

                       Application allowed users to initiate ALTER TABLE.

                       Regular VACUUM couldn’t fix it.
                       VACUUM FULL of the catalog takes 2+ hours.
                       Use of NOTIFY/LISTEN can also cause bloat.
Transaction wraparound avoidance
                       Problem: autovacuum set off too frequently
                       Watch age(datfrozenxid)
                       Solution:
                       Increase autovacuum_freeze_max_age
                       (default is 200 million, we increase to one billion)
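
Watching age(datfrozenxid) can be as simple as comparing it with the configured limit. A sketch: the query uses the standard pg_database catalog, while the helper name freeze_fraction and the sample age are hypothetical.

```python
# Per-database age, oldest first; run it in psql or through your driver.
AGE_QUERY = "SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;"


def freeze_fraction(xid_age, freeze_max_age=200_000_000):
    """Fraction of autovacuum_freeze_max_age that a database has already used up."""
    return xid_age / freeze_max_age


age = 150_000_000                              # hypothetical age(datfrozenxid) reading
print(freeze_fraction(age))                    # 0.75 with the default of 200 million
print(freeze_fraction(age, 1_000_000_000))     # 0.15 after raising the limit to one billion
```

Raising the limit buys time between forced vacuums. It does not remove the need to vacuum before the XID age approaches the roughly 2 billion wraparound horizon.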
Upgrades
Minor upgrades

                       Problem: Restarting Postgres causes bad application performance.
                       • Minor upgrades require a stop/start of the database
                         • Unexpected CHECKPOINT
                         • Cold cache

                       Solutions:
                         • Plan for a CHECKPOINT before shutdown
                         • Warm the cache (queries that exercise indexes, maybe table scans)
Major Version upgrades

                       Problem: Major upgrades are a PITA.
                         • <=8.2 - no pg_upgrade :(
                         • Time your restores.
                         • Document your SLAs.

                       Solutions: :(
                         • >=8.3 - pg_upgrade
                         • Time your restores.
                         • Document your SLAs.
                         • Write tools to migrate data
                         • Shard
                         • Trigger-based replication
The Problems

                  1. System resource exhaustion
                  2. Everything is slow: Huge catalogs, Backups
                  3. Handling VACUUM problems: Bloat,
                     Transaction wraparound
                  4. Upgrades: Minor, Major
The Solutions

                  1. System resource exhaustion
                     Choose a better filesystem, Pooling
                  2. Everything is slow: Huge catalogs, Backups
                     Don’t do that, Monitor & Binary backups

                  3. Handling VACUUM problems: Bloat,
                     Transaction wraparound
                     Developer education, Monitoring,
                     Cleanup, *_freeze_max_age
                  4. Upgrades: Minor, Major
                     Plan, Plan, Plan
                       (CHECKPOINT, warm cache, pg_upgrade)
Thanks!
                           Managing Terabytes
                          Problems and solutions with
                       operating large Postgres installations
                                Selena Deckelmann
                                   Prime Radiant
                                  @selenamarie
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Managing terabytes

  • 1. Managing Terabytes: Problems and solutions with operating large Postgres installations. Selena Deckelmann, Prime Radiant, @selenamarie
  • 2. About me.
  • 3. The Environment
      • 1.6 TB, 1 cluster, version 8.2
      • 1.1 TB, 1 cluster, version 8.3
      • 8.4/9.0 dev systems
      • Working toward 9.0 in production (May 2011)
      • pgpool, Redis, RabbitMQ, NFS
  • 4. Some stats
      • Daily peak: ~3000 commits per second
      • Average writes: 4 MBps
      • Average reads: 8 MBps
  • 5. What’s good
      • Most queries are fast!
      • Benchmarks say we’re pushing the limits of the hardware
      • Developers love working with Postgres
  • 6. And lots more. But...
  • 7. (no text)
  • 8. The Problems
      1. System resource exhaustion
      2. Everything is slow: huge catalogs, backups
      3. Handling VACUUM problems: bloat, transaction wraparound
      4. Upgrades: minor, major
  • 9. System Resource Exhaustion
  • 10. Running out of inodes
      Problem: UFS on Solaris
      “The only way to add more inodes to a UFS filesystem is:
        1. destroy the filesystem and create a new filesystem with a higher inode density
        2. enlarge the filesystem - growfs man page”
  • 11. Running out of inodes
      Solution 0: Delete files.
      Solution 1: Sharding/bigger filesystem
      Solution 2: xfs
  • 12. Running out of file descriptors
      Problem: Too many open files by the database.
        $ sudo lsof -p 19121 | wc
        40 355 4151
      Solution: You need a connection pooler.
  • 13. Running out of file descriptors
      Solution: You need a connection pooler.
      Recommended:
      • pgbouncer (threaded, online upgrade)
      • pgpool-II (failover)
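To see why slides 12 and 13 point to a connection pooler, it can help to multiply the number of backends by the files each one holds open. The following is a rough sketch only: the figure of about 40 open files per backend comes from the `lsof` line count on slide 12, while the backend counts and the system-wide descriptor limit are hypothetical values chosen for illustration.

```python
def estimated_open_files(backends: int, files_per_backend: int) -> int:
    """Rough total of open files if every backend holds the same number."""
    return backends * files_per_backend


def fits_within_limit(backends: int, files_per_backend: int, fd_limit: int) -> bool:
    """True when the estimated total stays at or below the descriptor limit."""
    return estimated_open_files(backends, files_per_backend) <= fd_limit


# ~40 open files per backend, as suggested by the lsof output on slide 12.
FILES_PER_BACKEND = 40
# Hypothetical system-wide descriptor limit, not taken from the talk.
FD_LIMIT = 32768

# Hypothetical backend counts: direct connections vs. a small pooled set.
for backends in (1000, 50):
    total = estimated_open_files(backends, FILES_PER_BACKEND)
    print(backends, total, fits_within_limit(backends, FILES_PER_BACKEND, FD_LIMIT))
```

A pooler reduces the first factor: fewer server backends means fewer descriptors, regardless of how many clients connect to the pooler itself.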
  • 14. Everything is slow.
  • 15. Huge Catalogs
      409,994 tables
  • 16. Maintenance problem
      Minor mistake in parent table definitions:
        not null default nextval('important_sequence'::text)
      vs
        not null default nextval('important_sequence'::regclass)
  • 17. Huge Catalogs
      Problem: Slow scans of catalog data
      Solution: Upgrade to Postgres 8.4 or higher
      But really: Avoid making a cluster with >400k tables.
  • 18. Stats collection
      9,019,868 total data points for table stats
      4,550,770 total data points for index stats
      Problem: This is slow to write. (128 MB written every second or so)
  • 19. Stats collection
      9,019,868 total data points for table stats
      4,550,770 total data points for index stats
      Solution: Move stats file to RAM.
      stats_temp_directory (8.4 or higher)
      There’s a trivial patch for earlier versions.
  • 20. Stats collection
      9,019,868 total data points for table stats
      4,550,770 total data points for index stats
      Problem: This is slow to read.
  • 21. Stats collection
      9,019,868 total data points for table stats
      4,550,770 total data points for index stats
      Solution: Supposedly, this is better in 8.4 and higher. (fewer writes per minute)
      Still probably not fast.
  • 22. Backups
      pg_dump takes longer and longer...
  • 23.
        backup     |    duration
      -------------+-----------------
       2009-11-22  | 02:44:36.821475
       2009-11-23  | 02:46:20.003507
       2009-11-24  | 02:47:06.260705
       2009-12-06  | 07:13:04.174964
       2009-12-13  | 05:00:01.082676
       2009-12-20  | 06:24:49.433043
       2009-12-27  | 05:35:20.551477
       2010-01-03  | 07:36:49.651492
       2010-01-10  | 05:55:02.396163
       2010-01-17  | 07:32:33.277559
       2010-01-24  | 06:22:46.522319
       2010-01-31  | 10:48:13.060888
       2010-02-07  | 21:21:47.77618
       2010-02-14  | 14:32:04.638267
       2010-02-21  | 11:34:42.353244
       2010-02-28  | 11:13:02.102345
  • 24. Backups
      Problem: pg_dump is too slow.
      Solutions:
      • Patching pg_dump for SELECT ... LIMIT
      • Crank down shared_buffers
      • Stop using pg_dump for backups
      • 64-bit might help
  • 25. How not to migrate to a 64-bit system
  • 26. (How not to migrate, continued)
      • Install 32-bit Postgres and libraries on a 64-bit system.
      • Install 64-bit Postgres/libs of the same version.
      • Copy “hot backup” from 32-bit sys over to 64-bit sys.
      • Run pg_dump from 64-bit version on 32-bit Postgres.
  • 27. A single warm standby is not a backup. But lots of people use them that way!
  • 28. Ship WAL from Solaris x86 -> Linux
      It did work!
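The trend in the slide 23 table is easier to judge as a ratio against the first run. The sketch below uses a few rows copied from that table, with fractional seconds dropped; the helper names are illustrative and not part of the talk.

```python
from datetime import timedelta

# (backup date, duration) rows copied from slide 23, fractional seconds dropped.
BACKUPS = [
    ("2009-11-22", "02:44:36"),
    ("2010-01-31", "10:48:13"),
    ("2010-02-07", "21:21:47"),
    ("2010-02-28", "11:13:02"),
]


def parse_duration(text: str) -> timedelta:
    """Parse an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def growth_factor(first: str, later: str) -> float:
    """How many times longer the later run took than the first one."""
    return parse_duration(later) / parse_duration(first)


baseline = BACKUPS[0][1]
for day, duration in BACKUPS:
    print(day, duration, round(growth_factor(baseline, duration), 1))
```

Tracking this ratio over time is one simple way to notice that pg_dump has stopped scaling before a backup window is missed.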
  • 29. Handling VACUUM problems
  • 30. Bloat
      Problem: Lots of dead tuples in tables.
      • Frequent UPDATEs to long tables of log data
      • Frequent DELETEs without a VACUUM
      • A terabyte of dead tuples
  • 31. Fixing bloat
      Solution: Write custom scripts to clean
      • VACUUM for small things
      • CLUSTER for everything else
      • Considered TRUNCATE
  • 32. Catalog Bloat
      Application allowed users to initiate ALTER TABLE.
      Regular VACUUM couldn’t fix it.
      VACUUM FULL of the catalog takes 2+ hours.
      Use of NOTIFY/LISTEN can also cause bloat.
  • 33. Transaction wraparound avoidance
      Problem: autovacuum set off too frequently
      Watch age(datfrozenxid)
      Solution: Increase autovacuum_freeze_max_age (default is 200 million, we increase to one billion)
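Slide 33 says to watch `age(datfrozenxid)`. A minimal sketch of what that monitoring might look like follows: the query uses the standard `pg_database` catalog and `age()` function, while the Python helpers and the 80% warning threshold are assumptions made for illustration, not something the talk specifies.

```python
# Query to run against the cluster (for example with psql); it lists each
# database with the age of its oldest unfrozen transaction ID.
WRAPAROUND_QUERY = """
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
"""

DEFAULT_FREEZE_MAX_AGE = 200_000_000   # default quoted on slide 33
RAISED_FREEZE_MAX_AGE = 1_000_000_000  # value the talk raised it to


def freeze_headroom(xid_age: int, freeze_max_age: int) -> float:
    """Fraction of freeze_max_age still unused (negative once past it)."""
    return 1.0 - xid_age / freeze_max_age


def needs_attention(xid_age: int, freeze_max_age: int,
                    warn_fraction: float = 0.8) -> bool:
    """True once xid_age reaches warn_fraction of freeze_max_age.

    warn_fraction is a hypothetical alerting threshold.
    """
    return xid_age >= warn_fraction * freeze_max_age


# Hypothetical xid_age of 180 million, checked against both settings.
for limit in (DEFAULT_FREEZE_MAX_AGE, RAISED_FREEZE_MAX_AGE):
    print(limit, freeze_headroom(180_000_000, limit),
          needs_attention(180_000_000, limit))
```

Raising the setting buys headroom before forced autovacuum runs, which is why monitoring the age matters more, not less, after the change.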
  • 34. Upgrades
  • 35. Minor upgrades
      Problem: Restarting Postgres causes bad application performance.
      • Require a start/stop of database
      • Unexpected CHECKPOINT
      • Cold cache
  • 36. Minor upgrades
      Solutions:
      • Plan for a CHECKPOINT before shutdown
      • Warm the cache (queries that exercise indexes, maybe table scans)
  • 37. Major Version upgrades
      Problem: Major upgrades are a PITA.
      • 8.2 and earlier - no pg_upgrade :(
      • Time your restores.
      • Document your SLAs.
  • 38. Major Version upgrades
      Solutions: :(
      • >=8.3 - pg_upgrade
      • Time your restores.
      • Document your SLAs.
  • 39. Major Version upgrades
      Solutions: :(
      • Write tools to migrate data
      • Shard
      • Trigger-based replication
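The minor-upgrade advice on slide 36 (CHECKPOINT before shutdown, then warm the cache) can be scripted. The sketch below only builds the SQL statements; the table names are hypothetical, and using `SELECT count(*)` as the warming query is an assumption standing in for the "maybe table scans" on the slide. Queries that exercise indexes depend on the schema and are left out.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def pre_shutdown_statements() -> List[str]:
    """Issue an explicit CHECKPOINT so the shutdown checkpoint has less to flush."""
    return ["CHECKPOINT;"]


def warmup_statements(tables: List[str]) -> List[str]:
    """One full-table count per table, to pull its pages back into cache."""
    return [f"SELECT count(*) FROM {quote_ident(table)};" for table in tables]


# Hypothetical table names, for illustration only.
HOT_TABLES = ["events", "user_sessions"]

for statement in pre_shutdown_statements() + warmup_statements(HOT_TABLES):
    print(statement)
```

The CHECKPOINT statement would run before stopping the old version and the warm-up statements after the new one starts, ideally before application traffic is switched back.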
  • 40. The Problems
      1. System resource exhaustion
      2. Everything is slow: huge catalogs, backups
      3. Handling VACUUM problems: bloat, transaction wraparound
      4. Upgrades: minor, major
  • 41. The Solutions
      1. System resource exhaustion
         Choose a better filesystem, pooling
      2. Everything is slow: huge catalogs, backups
         Don’t do that, monitor & binary backups
  • 42. The Solutions
      3. Handling VACUUM problems: bloat, transaction wraparound
         Developer education, monitoring, cleanup, *_freeze_max_age
      4. Upgrades: minor, major
         Plan, Plan, Plan (CHECKPOINT, warm cache, pg_upgrade)
  • 43. Thanks! Managing Terabytes: Problems and solutions with operating large Postgres installations. Selena Deckelmann, Prime Radiant, @selenamarie