Addressing Vendor Weaknesses in User-Space


                        ROBERT TREAT,
                           OmniTI

                        Highload++ 2011




                          @robtreat2
                           xzilla.net
                         +Robert Treat
Monday, October 3, 11
Who Am I?

                        OmniTI - Internet Scalability Consultants
                            Lead Database Operations

                                     “Large Scale”

                                   High Transactions
                                      TB+ Data

                                    Mission Critical



Who Am I?

                        Database Operations @OmniTI

                                 Postgres
                                 MySQL
                                  Oracle
                                 & More




Postgres for Scalability

                             Traditional RDBMS
                              Highly Extensible
                              Runs Everywhere
                             Talks To Everything
                               “BSD” Licensed
                           15+ Years Development
                        Open Development Community




The Bloat Problem

                 Data Footprint Can Be Critical To Performance


                         Size On Disk Affects The Needs Of
                             RAM, Disk Speed, Storage

                         “Bloat” is unused, wasted disk space,
                              taken up by the database,
                        but not needed for actual data storage

                                        Why?
MVCC Architecture

                Multiversion Concurrency Control (MVCC) allows
                Postgres to offer high concurrency even during
                significant database read/write activity. MVCC
                specifically offers behavior where "readers never block
                writers, and writers never block readers".




MVCC Architecture

                • Oracle
                • MySQL (InnoDB)
                • Informix
                • Firebird
                • MSSQL (optional)
                • CouchDB




“Bloat” Manifests Differently, But Is Common

                • MongoDB (deletes, some updates)
                   • dump/restore
                   • mongod --repair
                   • db.runCommand( { compact : 'mycollectionname' } )
                • Lucene (updates)
                • Hadoop / HDFS (small files)


Postgres MVCC Architecture

       • Implemented In Postgres 6.5
       • 1999, Vadim Mikheev
       • Further Reading: “MVCC Unmasked” (Bruce Momjian)
       • http://momjian.us/main/writings/pgsql/mvcc.pdf




Postgres MVCC Architecture

       • Postgres maintains global transaction counters
       • Keeps track of transaction counter per row for
          • creating transaction
          • removing transaction
       • Using these counters, Postgres allows different
       transactions to see different rows, based on visibility rules.

                           Transaction Reading An Old Row
                        Doesn’t Block Transaction Writing A Row
MVCC Architecture


              user_id   X42
              Create     32      INSERT
              Expire

              user_id   X42
              Create     32      DELETE
              Expire     38




MVCC Architecture


             user_id    X69
             Create      43   OLD(delete)
             Expire      56
                                            UPDATE
             user_id    X69
             Create      56   NEW(insert)
             Expire




MVCC Architecture


             user_id     X69
             Create       43   <~~ DEAD ROW
             Expire       56
                                                 Clean Up / Bloat
             user_id     X69
             Create       56   <~~ VISIBLE ROW
             Expire


                        Speed Up SQL Commands By
                        Dealing With Clean Up Later
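
The create/expire bookkeeping above can be sketched in a few lines of Python. This is a deliberately simplified model, not Postgres's actual visibility code: it assumes every transaction has committed, treats a snapshot as a single transaction id, and assumes the new row version is stamped with the id of the updating transaction (56). The names `RowVersion`, `is_visible`, and `is_dead` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    user_id: str
    create_xid: int                    # transaction that created this version
    expire_xid: Optional[int] = None   # transaction that deleted/replaced it


def is_visible(row: RowVersion, snapshot_xid: int) -> bool:
    """Simplified rule: a version is visible if it was created at or before
    the snapshot and had not expired as of the snapshot."""
    created = row.create_xid <= snapshot_xid
    expired = row.expire_xid is not None and row.expire_xid <= snapshot_xid
    return created and not expired


def is_dead(row: RowVersion, oldest_active_xid: int) -> bool:
    """A version is dead (reclaimable) once it expired before the oldest
    transaction that is still running."""
    return row.expire_xid is not None and row.expire_xid < oldest_active_xid


# The UPDATE from the slides: the old version expires at 56, and the new
# version is assumed to be created by that same transaction.
old = RowVersion("X69", create_xid=43, expire_xid=56)
new = RowVersion("X69", create_xid=56)

print(is_visible(old, 50), is_visible(new, 50))   # reader whose snapshot is 50
print(is_visible(old, 60), is_visible(new, 60))   # reader whose snapshot is 60
print(is_dead(old, 50), is_dead(old, 60))         # cleanup waits for old readers
```

A reader with snapshot 50 still needs the old version, so in this model the old row only becomes reclaimable once no transaction older than 56 is running. That is the sense in which cleanup is deferred until later.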

How Postgres Deals With Bloat

            • Heap-Only-Tuples (HOT)
               • On-The-Fly, Per Page Cleanup
               • Marks Given Row’s Space Reusable
               • Update Only
            • VACUUM
               • Non-Blocking Bulk Cleanup
               • Removes End-Of-File Pages
               • “autovacuum” Process Monitors Tables


Problems With Automatic Cleanup

            • HOT
               • Update Only
               • Doesn’t Work When Changing Index Data
            • VACUUM
               • Must Wait For Long Transactions To Complete
               • Costs I/O, Can Only Work So Fast
               • Can’t Remove Non End-Of-File Pages
               • Leaves A “High Water Mark”


Dealing With Bloat - The Hard Way

                • VACUUM FULL / CLUSTER
                • The Good
                   • Reclaims All “Dead Rows”
                • The Bad
                   • Exclusive Lock
                   • Rewrite All Data In Tables
                   • Needs Working Space
                   • Heavy I/O


Monitoring Your Bloat

             • check_postgres.pl
                • Nagios plugin
                • Compares physical size to row size estimates
                • http://bucardo.org/wiki/Check_postgres
             • “bloat report”
                • Script to measure table/index bloat
                • Compares physical size to row size estimates
                • http://labs.omniti.com/labs/pgtreats/browser/trunk/tools/
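
The idea of comparing physical size to a row size estimate can be illustrated with rough arithmetic. The sketch below is not the query used by check_postgres.pl or the bloat report script. It assumes the default 8 kB block size, a 24-byte page header, and roughly 28 bytes of per-row overhead, and it ignores alignment padding, fillfactor, and TOAST. The function names and the sample numbers are hypothetical.

```python
import math

PAGE_SIZE = 8192      # default Postgres block size, in bytes
PAGE_HEADER = 24      # assumed per-page header size
ROW_OVERHEAD = 28     # assumed per-row cost: 24-byte tuple header + 4-byte line pointer


def estimated_pages(row_count: int, avg_row_bytes: int) -> int:
    """Pages the table would need if live rows were packed with no dead space."""
    rows_per_page = max(1, (PAGE_SIZE - PAGE_HEADER) // (avg_row_bytes + ROW_OVERHEAD))
    return math.ceil(row_count / rows_per_page)


def bloat_report(row_count: int, avg_row_bytes: int, actual_pages: int) -> dict:
    """Compare the on-disk page count with the packed estimate."""
    expected = estimated_pages(row_count, avg_row_bytes)
    wasted = max(actual_pages - expected, 0)
    return {
        "expected_pages": expected,
        "wasted_pages": wasted,
        "wasted_bytes": wasted * PAGE_SIZE,
        # Guard against empty tables, where the estimate is zero pages.
        "bloat_ratio": actual_pages / expected if expected else 0.0,
    }


# Hypothetical table: one million rows of about 100 bytes stored in 30,000 pages.
report = bloat_report(row_count=1_000_000, avg_row_bytes=100, actual_pages=30_000)
print(report)
```

Because the inputs are estimates, a result like this is best read as a signal for which tables deserve a closer look, not as an exact measurement.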

Dealing With Bloat In Userspace

       • Solving MVCC Bloat Is A “Hard Problem”
          • Even a good solution would be hard to
                 implement in core

       • Can we build a tool in user space?
          • Develop solution quicker
          • Easier to deploy and maintain
          • Provide a prototype for future development


Dealing With Bloat Redux

       • Updating A Row Rewrites Data To New Location
       • Use Vacuum To Mark Old Rows “Reusable”
       • Update Row To Rewrite Data At “Front” Of Page
       • Use Vacuum To Reclaim Space From End Of File
       • Put A Script On It
          • https://labs.omniti.com/pgtreats/trunk/tools/compact_table
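
The steps above can be pictured with a toy model of a table file. This is not the compact_table script and it glosses over how a real UPDATE chooses its target page. It only assumes that pages are fixed lists of slots, that a free slot is `None`, and that vacuum can return space to the operating system only from the end of the file. All names here are illustrative.

```python
from typing import List, Optional

Page = List[Optional[str]]   # each slot holds a row id, or None when reusable


def vacuum_truncate(pages: List[Page]) -> int:
    """Drop completely empty pages, but only from the end of the file."""
    removed = 0
    while pages and all(slot is None for slot in pages[-1]):
        pages.pop()
        removed += 1
    return removed


def compact(pages: List[Page]) -> int:
    """Move rows from the last pages into free slots nearer the front."""
    moved = 0
    front, back = 0, len(pages) - 1
    while front < back:
        if None not in pages[front]:
            front += 1                       # no reusable slot on this page
            continue
        live = [i for i, slot in enumerate(pages[back]) if slot is not None]
        if not live:
            back -= 1                        # tail page is already empty
            continue
        free = pages[front].index(None)
        pages[front][free] = pages[back][live[-1]]   # "update" rewrites the row up front
        pages[back][live[-1]] = None                 # old location becomes reusable
        moved += 1
    return moved


table = [
    ["r1", None, None],
    [None, None, "r2"],
    [None, None, None],     # empty, but not at the end of the file
    ["r3", None, "r4"],
]
truncated_before = vacuum_truncate(table)
moved = compact(table)
truncated_after = vacuum_truncate(table)
print(truncated_before, moved, truncated_after, table)
```

In this model the first vacuum cannot shrink the file because the last page still holds live rows; only after those rows are rewritten toward the front can the tail be truncated.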


Dealing With Bloat Redux

                • “Compact Table”
                   • Requires Lots of Time, I/O
                   • Often Causes Heavy Index Bloat
                   • Heavy Concurrency Bloats Faster Than
                        We Can Recover It




Dealing With Bloat For Real!

                • Enter “pg_reorg”
                  • Vacuum / Cluster Replacement
                  • Command Line Tool
                  • Online Table Rewrite
                     • Uses Minimal Locking
                  • Developed By NTT
                  • BSD Licensed
                  • C Code
                  • http://pgfoundry.org/projects/reorg/

How pg_reorg Works


    • Create a log table for changes
    • Create triggers on the old table to log changes (INSERT/UPDATE/DELETE)
    • Create a new table with a copy of all data in the old table
    • Create all indexes on the new table
    • Apply all changes from the log table to the new table
    • MODIFY THE SYSTEM CATALOGS
     INFORMATION ABOUT THE TABLE FILES (!!!)
    • Drop the old table, leaving the new table in its place
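
The sequence above can be modeled in memory to show why the change log matters. This is a minimal sketch and not pg_reorg's implementation: a dict stands in for the table, a Python list stands in for the log table and its triggers, index builds are skipped, and the catalog swap is reduced to replacing one dictionary entry. `Table`, `apply_log`, and `online_rebuild` are hypothetical names.

```python
class Table:
    """Toy table: primary key -> value, with an optional change log."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.log = None                      # set to a list while "triggers" are installed

    def _record(self, op, key, value=None):
        if self.log is not None:
            self.log.append((op, key, value))

    def insert(self, key, value):
        self.rows[key] = value
        self._record("I", key, value)

    def update(self, key, value):
        self.rows[key] = value
        self._record("U", key, value)

    def delete(self, key):
        del self.rows[key]
        self._record("D", key)


def apply_log(target_rows, log):
    """Replay logged changes, in order, onto the copied rows."""
    for op, key, value in log:
        if op == "D":
            target_rows.pop(key, None)
        else:
            target_rows[key] = value
    log.clear()


def online_rebuild(catalog, name, concurrent_writes):
    old = catalog[name]
    old.log = []                             # log table + triggers on the old table
    new_rows = dict(old.rows)                # copy all data into the new table
    concurrent_writes(old)                   # writes that arrive while the copy runs
    apply_log(new_rows, old.log)             # catch up from the log
    catalog[name] = Table(new_rows)          # swap: the name now points at the new table
    return catalog[name]


def writes_during_copy(table):
    table.insert(3, "c")
    table.update(1, "A")
    table.delete(2)


catalog = {"orders": Table({1: "a", 2: "b"})}
rebuilt = online_rebuild(catalog, "orders", writes_during_copy)
print(rebuilt.rows)
```

The swap is the step the slide shouts about: in this toy it is a harmless dictionary assignment, whereas the real tool edits system catalog entries directly.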


Dealing With Bloat For Real!

                             Open Source Code

                        The Power Is In Your Hands

                               Look At Code
                             Examine the SQL
                        (User Space Is Really Visible)

                                   TEST!


Dealing With Bloat For Real!

                        What Does Testing Look Like?

                           Create Some Tables,
                           Create Artificial Bloat,
                               run pg_reorg



                                   WIN!


Dealing With Bloat For Real!

                              Test In “Prod”

                         Find Some Bloated Tables,
                          Make Backup Of Tables,
                               Cross Fingers,
                                 pg_reorg


                                   WIN!


Dealing With Bloat For Real!




                         Eventually You Have To Use It
                          On Something That Matters




pg_reorg In The Real World

                • Production Database (OLTP)
                    • 540GB Size
                    • 2000 TPS (off-peak time, multiple statements)
                    • Largest Table (pre-reorg) 127GB

                • Rebuild Stats
                   • 5.75 Hours To Rebuild
                   • Reclaimed 52GB Disk Space
                   • No Outages Reported For Website/APIs


pg_reorg In The Real World




                               YAY!



Return Of The Jedi



“your overconfidence is
                            your weakness.”

                               -Luke Skywalker


“your faith in your
                                friends is yours.”

                            -Emperor Palpatine


Sometimes You Can Have Both




                        Trust in NTT’s Code == faith in friends

                        Success in production == overconfidence




When Good pg_reorgs Go Bad!

                   WARNING:  unexpected attrdef record found
                          for attr 61 of rel orders

                   WARNING:  1 attrdef record(s) missing for rel
                                    orders




                           Yes, On A Production System
                 Yes, Trying To Take 1000’s of Orders Per Second

When Good pg_reorgs Go Bad!

                  create table test (
                   a int4,
                   b int4 default 2112,
                   c bool
                );

                Postgres internals track defaults / constraints
                based on column position “2”, not column name “b”

                If you drop column “a” and then run pg_reorg, column
                “c” becomes column “2”, and default 2112 lands on the boolean

                This Is Fair - pg_reorg hacks the system tables
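
The failure mode can be reproduced with a toy model in which defaults are keyed by column position. This is an illustration of the explanation above, not the actual catalog layout: a list stands in for the column definitions, `None` marks a dropped column whose slot is kept, and a dict stands in for the stored defaults. The function name is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Column = Optional[Tuple[str, str]]   # (name, type); None marks a dropped column


def columns_with_defaults(columns: List[Column],
                          defaults: Dict[int, int]) -> Dict[str, Tuple[str, int]]:
    """Resolve defaults stored by 1-based column position to column names."""
    resolved = {}
    for position, value in defaults.items():
        column = columns[position - 1] if position <= len(columns) else None
        if column is not None:
            name, col_type = column
            resolved[name] = (col_type, value)
    return resolved


defaults = {2: 2112}                 # "default 2112" recorded against position 2

# After dropping column "a": its slot stays, so "b" is still at position 2.
before_rebuild = [None, ("b", "int4"), ("c", "bool")]

# After a rebuild that renumbers the surviving columns while the old
# defaults entry is kept: position 2 now means "c".
after_rebuild = [("b", "int4"), ("c", "bool")]

print(columns_with_defaults(before_rebuild, defaults))
print(columns_with_defaults(after_rebuild, defaults))
```

The stored default never changes; only the meaning of "position 2" does, which is why the integer default ends up attached to the boolean column.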
When Good pg_reorgs Go Bad!



                            Basic Fix: Drop All Defaults And Recreate

                        Alternative Fix: Hack System Catalogs Some More

                                     Haven’t we had enough
                                     system catalog hacking
                                            for now?




When Good pg_reorgs Go Bad!



                        “now, if you'll excuse me,
                        I'll go away and have a
                        heart attack.”



What Next?

                        Report Problem To Mailing List
                                Submit A Patch
                        Ultimately The Problem Is Fixed
                              Everyone’s Happy?




Hackers Discussion
                           Postgres Development Community Is Funny

                                Sometimes Hard To Get Them To
                                      Recognize Problems

                        Not Everyone Sees Online Rebuild As A Big Problem

                                      In All Fairness,
                                 Not Everyone Has This Problem




Hackers Discussion
                                Hackers Meeting 2011,
                        Discussion On Internal Queuing System

                        Could Be Used As Underlying Basis For
                                 On-Line Rebuilding

                                    Until Then...




pg_reorg Is A Great Tool!
                        Best Option For Difficult Situation

                                 Just Be Careful!




THANKS!

                                 Highload++
                                     NTT
                                   OmniTI
                             Postgres Community
                        Momjian, Depesz, Patel, Kocoloski




                                    xzilla.net
                                  @robtreat2
                                 + Robert Treat
                                                            78
Monday, October 3, 11

More Related Content

Similar to Addressing vendor weaknesses in user space (Robert Treat)

soft-shake.ch - Data grids and Data Grids
soft-shake.ch - Data grids and Data Gridssoft-shake.ch - Data grids and Data Grids
soft-shake.ch - Data grids and Data Gridssoft-shake.ch
 
Interop 2011 - Scaling Platform As A Service
Interop 2011 - Scaling Platform As A ServiceInterop 2011 - Scaling Platform As A Service
Interop 2011 - Scaling Platform As A ServicePatrick Chanezon
 
Building A Scalable Open Source Storage Solution
Building A Scalable Open Source Storage SolutionBuilding A Scalable Open Source Storage Solution
Building A Scalable Open Source Storage SolutionPhil Cryer
 
Java Tech & Tools | Big Blobs: Moving Big Data In and Out of the Cloud | Adri...
Java Tech & Tools | Big Blobs: Moving Big Data In and Out of the Cloud | Adri...Java Tech & Tools | Big Blobs: Moving Big Data In and Out of the Cloud | Adri...
Java Tech & Tools | Big Blobs: Moving Big Data In and Out of the Cloud | Adri...JAX London
 
CloudFoundry and MongoDb, a marriage made in heaven
CloudFoundry and MongoDb, a marriage made in heavenCloudFoundry and MongoDb, a marriage made in heaven
CloudFoundry and MongoDb, a marriage made in heavenPatrick Chanezon
 
The Coming Database Revolution
The Coming Database RevolutionThe Coming Database Revolution
The Coming Database RevolutionDATAVERSITY
 
Riak intro to..
Riak intro to..Riak intro to..
Riak intro to..Adron Hall
 
Riak Use Cases : Dissecting The Solutions To Hard Problems
Riak Use Cases : Dissecting The Solutions To Hard ProblemsRiak Use Cases : Dissecting The Solutions To Hard Problems
Riak Use Cases : Dissecting The Solutions To Hard ProblemsAndy Gross
 
DTrace talk at Oracle Open World
DTrace talk at Oracle Open WorldDTrace talk at Oracle Open World
DTrace talk at Oracle Open WorldAngelo Rajadurai
 
Non Relational Databases And World Domination
Non Relational Databases And World DominationNon Relational Databases And World Domination
Non Relational Databases And World DominationJason Davies
 
Big Bad "Upgraded" Postgres
Big Bad "Upgraded" PostgresBig Bad "Upgraded" Postgres
Big Bad "Upgraded" PostgresRobert Treat
 
Long and winding road - 2014
Long and winding road  - 2014Long and winding road  - 2014
Long and winding road - 2014Connor McDonald
 
Designing for Massive Scalability at BackType #bigdatacamp
Designing for Massive Scalability at BackType #bigdatacampDesigning for Massive Scalability at BackType #bigdatacamp
Designing for Massive Scalability at BackType #bigdatacampMichael Montano
 
soft-shake.ch - Data grids and Data Caching
soft-shake.ch - Data grids and Data Cachingsoft-shake.ch - Data grids and Data Caching
soft-shake.ch - Data grids and Data Cachingsoft-shake.ch
 
CodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the CloudCodeFutures - Scaling Your Database in the Cloud
CodeFutures - Scaling Your Database in the CloudRightScale
 

Similar to Addressing vendor weaknesses in user space (Robert Treat) (20)

soft-shake.ch - Data grids and Data Grids
soft-shake.ch - Data grids and Data Gridssoft-shake.ch - Data grids and Data Grids
soft-shake.ch - Data grids and Data Grids
 
Data Grids vs Databases
Data Grids vs DatabasesData Grids vs Databases
Data Grids vs Databases
 
Interop 2011 - Scaling Platform As A Service
Interop 2011 - Scaling Platform As A ServiceInterop 2011 - Scaling Platform As A Service
Interop 2011 - Scaling Platform As A Service
 
Ruby-on-Infinispan
Addressing vendor weaknesses in user space (Robert Treat)

  • 1. Addressing Vendor Weaknesses in User-Space. Robert Treat, OmniTI. Highload++ 2011. @robtreat2, xzilla.net, +Robert Treat. Monday, October 3, 2011
  • 5. Who Am I?
      • OmniTI - Internet Scalability Consultants
      • Lead Database Operations
      • “Large Scale”
          • High Transactions
          • TB+ Data
          • Mission Critical
  • 6. Who Am I? Database Operations @OmniTI
      • Postgres
      • MySQL
      • Oracle
      • & More
  • 7. Postgres for Scalability
      • Traditional RDBMS
      • Highly Extensible
      • Runs Everywhere
      • Talks To Everything
      • “BSD” Licensed
      • 15+ Years Development
      • Open Development Community
  • 11. The Bloat Problem
      • Data Footprint Can Be Critical To Performance
      • Size On Disk Drives The Requirements For RAM, Disk Speed, And Storage
      • “Bloat” is unused, wasted disk space, taken up by the database, but not needed for actual data storage
      • Why?
  • 12. MVCC Architecture: Multiversion Concurrency Control (MVCC) allows Postgres to offer high concurrency even during significant database read/write activity. MVCC specifically offers behavior where "readers never block writers, and writers never block readers".
  • 14. MVCC Architecture
      • Oracle
      • MySQL (InnoDB)
      • Informix
      • Firebird
      • MSSQL (optional)
      • CouchDB
  • 15. “Bloat” Manifests Differently, But Is Common
      • MongoDB (deletes, some updates)
          • dump/restore
          • mongod --repair
          • db.runCommand( { compact : 'mycollectionname' } )
      • Lucene (updates)
      • Hadoop / HDFS (small files)
  • 16. Postgres MVCC Architecture
      • Implemented Postgres 6.5
          • 1999, Vadim Mikheev
      • MVCC Unmasked
          • http://momjian.us/main/writings/pgsql/mvcc.pdf
  • 18. Postgres MVCC Architecture
      • Postgres maintains global transaction counters
      • Each row records the transaction counter value of its
          • creating transaction
          • removing transaction
      • Using these counters, Postgres allows different transactions to see different rows, based on visibility rules.
      • A Transaction Reading An Old Row Doesn’t Block A Transaction Writing A Row
  • 20. MVCC Architecture
      • INSERT: user_id X42, Create 32, Expire (not set)
      • DELETE: user_id X42, Create 32, Expire 38
  • 21. MVCC Architecture (UPDATE)
      • OLD (delete): user_id X69, Create 43, Expire 56
      • NEW (insert): user_id X69, Create 43, Expire (not set)
  • 23. MVCC Architecture (Clean Up / Bloat)
      • DEAD ROW: user_id X69, Create 43, Expire 56
      • VISIBLE ROW: user_id X69, Create 43, Expire (not set)
      • Speed Up SQL Commands By Dealing With Clean Up Later
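
The create/expire bookkeeping on these slides can be sketched in a few lines of Python. This is a deliberately simplified model, not Postgres's actual visibility logic: it assumes every transaction has committed and treats a snapshot as a single transaction counter value. The names `RowVersion`, `is_visible`, and `update`, and the transaction numbers, are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    """One physical copy of a row, tagged the way the slides describe."""
    user_id: str
    create_xid: int                   # transaction that created this version
    expire_xid: Optional[int] = None  # transaction that removed it, if any


def is_visible(row: RowVersion, snapshot_xid: int) -> bool:
    """Simplified rule that assumes all transactions committed.

    A version is visible if it was created at or before the snapshot
    and has not been expired at or before the snapshot.
    """
    created = row.create_xid <= snapshot_xid
    expired = row.expire_xid is not None and row.expire_xid <= snapshot_xid
    return created and not expired


def update(table: list, old: RowVersion, xid: int) -> RowVersion:
    """An UPDATE expires the old version and appends a new one."""
    old.expire_xid = xid
    new = RowVersion(old.user_id, create_xid=xid)
    table.append(new)
    return new


table = [RowVersion("X1", create_xid=100)]
new_version = update(table, table[0], xid=105)

# A reader whose snapshot predates the update still sees the old version,
# so it never has to wait for the writer.
old_reader = [r for r in table if is_visible(r, snapshot_xid=102)]
new_reader = [r for r in table if is_visible(r, snapshot_xid=110)]

# Expired versions stay in the table until something cleans them up.
dead_rows = [r for r in table if r.expire_xid is not None]
```

The expired version left behind in `table` is the "dead row" from the slide: it costs nothing at write time, but it occupies space until a later cleanup pass reclaims it.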
  • 25. How Postgres Deals With Bloat
      • Heap-Only-Tuples (HOT)
          • On-The-Fly, Per Page Cleanup
          • Marks Given Row’s Space Reusable
          • Update Only
      • VACUUM
          • Non-Blocking Bulk Cleanup
          • Removes End-Of-File Pages
          • “autovacuum” Process Monitors Tables
  • 27. Problems With Automatic Cleanup
      • HOT
          • Update Only
          • Doesn’t Work When Changing Index Data
      • VACUUM
          • Must Wait For Long Transactions To Complete
          • Costs I/O, Can Only Work So Fast
          • Can’t Remove Non End-Of-File Pages
          • Leaves A “High Water Mark”
  • 29. Dealing With Bloat - The Hard Way
      • VACUUM FULL / CLUSTER
      • The Good
          • Reclaims All “Dead Rows”
      • The Bad
          • Exclusive Lock
          • Rewrites All Data In Tables
          • Needs Working Space
          • Heavy I/O
  • 30. Monitoring Your Bloat
      • check_postgres.pl
          • Nagios plugin
          • Compares physical size to row size estimates
          • http://bucardo.org/wiki/Check_postgres
      • “bloat report”
          • Script to measure table/index bloat
          • Compares physical size to row size estimates
          • http://labs.omniti.com/labs/pgtreats/browser/trunk/tools/
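
Both tools are described as comparing physical size to row size estimates. A rough sketch of that comparison, under loudly stated assumptions, might look like the following. It is not the logic of check_postgres.pl or the bloat report script; the function name `estimate_bloat` and the sample numbers are hypothetical, and the model ignores page headers, alignment padding, and fillfactor.

```python
import math

PAGE_SIZE = 8192  # default Postgres block size in bytes


def estimate_bloat(actual_pages: int, row_count: int, avg_row_bytes: int) -> float:
    """Return the fraction of the table's pages that look like wasted space.

    Rough model only: a real bloat report also has to account for page
    headers, alignment padding, and fillfactor.
    """
    rows_per_page = max(1, PAGE_SIZE // avg_row_bytes)
    expected_pages = math.ceil(row_count / rows_per_page)
    if actual_pages <= expected_pages:
        return 0.0
    return (actual_pages - expected_pages) / actual_pages


# Hypothetical table: 1,000,000 rows of about 128 bytes stored in 40,000 pages.
bloat = estimate_bloat(actual_pages=40_000, row_count=1_000_000, avg_row_bytes=128)
```

With these made-up inputs the table would need roughly 15,625 pages, so a bit over 60% of its 40,000 pages would count as bloat.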
  • 32. Dealing With Bloat In Userspace
      • Solving MVCC Bloat Is A “Hard Problem”
          • Even a good solution would be hard to implement in core
      • Can we build a tool in user space?
          • Develop solution quicker
          • Easier to deploy and maintain
          • Provide a prototype for future development
  • 37. Dealing With Bloat Redux
      • Updating A Row Rewrites Data To New Location
      • Use Vacuum To Mark Old Rows “Reusable”
      • Update Row To Rewrite Data At “Front” Of Page
      • Use Vacuum To Reclaim Space From End Of File
      • Put A Script On It
          • https://labs.omniti.com/pgtreats/trunk/tools/compact_table
  • 38. Dealing With Bloat Redux
      • “Compact Table”
          • Requires Lots of Time, I/O
          • Often Causes Heavy Index Bloat
          • Heavy Concurrency Bloats Faster Than We Can Recover It
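
The update-then-vacuum idea can be illustrated with a toy model of a table file. This is only a sketch of the concept, not the compact_table script itself: pages are Python lists, a `None` slot stands for space that vacuum has already marked reusable, and the function names and sample data are assumptions made for illustration.

```python
from typing import List, Optional

Page = List[Optional[str]]  # each slot holds a row id, or None if reusable


def vacuum_truncate(pages: List[Page]) -> None:
    """Drop empty pages from the end of the file only (the high water mark)."""
    while pages and all(slot is None for slot in pages[-1]):
        pages.pop()


def compact(pages: List[Page]) -> None:
    """Move rows from the tail of the file into free slots nearer the front."""
    for src in range(len(pages) - 1, 0, -1):
        for i, row in enumerate(pages[src]):
            if row is None:
                continue
            # Look for a reusable slot in any earlier page.
            target = next(
                ((p, s) for p in range(src)
                 for s, slot in enumerate(pages[p]) if slot is None),
                None,
            )
            if target is None:
                return  # no free space left in earlier pages; stop
            p, s = target
            pages[p][s] = row      # the "update" rewrites the row up front
            pages[src][i] = None   # the old copy becomes reusable space


# Hypothetical 4-page table: dead space in the middle, live rows near the end.
pages = [["r1", None], [None, None], [None, "r2"], [None, "r3"]]

vacuum_truncate(pages)
pages_after_plain_vacuum = len(pages)   # the last page still holds r3

compact(pages)
vacuum_truncate(pages)
pages_after_compaction = len(pages)
```

Plain vacuum cannot shrink this file because one live row sits on the final page. Once the rows are moved forward, the trailing pages are empty and the same truncation step returns them.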
  • 44. Dealing With Bloat For Real!
      • Enter “pg_reorg”
          • Vacuum / Cluster Replacement
          • Command Line Tool
          • Online Table Rewrite
          • Uses Minimal Locking
      • Developed By NTT
          • BSD Licensed
          • C Code
          • http://pgfoundry.org/projects/reorg/
  • 46. How pg_reorg Works
      • Create a log table for changes
      • Create triggers on the old table to log changes (INSERT/UPDATE/DELETE)
      • Create a new table with a copy of all data in the old table
      • Create all indexes on the new table
      • Apply all changes from the log table to the new table
      • MODIFY THE SYSTEM CATALOG INFORMATION ABOUT THE TABLE FILES (!!!)
      • Drop the old table, leaving the new table in its place
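
The log-and-replay sequence in this list can be mimicked with in-memory dictionaries. The sketch below is a toy model under stated assumptions, not pg_reorg's code: `ToyTable`, `replay`, and the sample rows are hypothetical, the "trigger" is just a Python list, and index builds and the catalog swap are reduced to comments.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, int, Optional[str]]  # (operation, key, new value)


class ToyTable:
    """Dict-backed stand-in for a table with a change-logging 'trigger'."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self.rows = dict(rows)
        self.log: Optional[List[Change]] = None  # None means no trigger installed

    def write(self, op: str, key: int, value: Optional[str] = None) -> None:
        if op == "D":
            self.rows.pop(key, None)
        else:  # "I" or "U"
            self.rows[key] = value
        if self.log is not None:
            self.log.append((op, key, value))


def replay(target: Dict[int, str], log: List[Change]) -> None:
    """Apply logged changes, in order, to the rebuilt copy."""
    for op, key, value in log:
        if op == "D":
            target.pop(key, None)
        else:
            target[key] = value


old = ToyTable({1: "a", 2: "b", 3: "c"})

old.log = []                   # log table plus triggers on the old table
new_rows = dict(old.rows)      # copy the data into a fresh, compact table

# Writes that arrive while the copy (and index builds) are still running.
old.write("U", 2, "b2")
old.write("D", 3)
old.write("I", 4, "d")

replay(new_rows, old.log)      # catch up from the log
swapped = new_rows             # stand-in for the catalog swap and drop
```

The point of the pattern is that writers keep working against the old table throughout; only the final swap needs a brief lock, which is where the catalog modification comes in.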
  • 47. Dealing With Bloat For Real!
      • Open Source Code
      • The Power Is In Your Hands
      • Look At Code
      • Examine the SQL (User Space Is Really Visible)
      • TEST!
  • 49. Dealing With Bloat For Real!
      • What Does Testing Look Like?
      • Create Some Tables, Create Artificial Bloat, Run pg_reorg
      • WIN!
  • 52. Dealing With Bloat For Real!
      • Test In “Prod”
      • Find Some Bloated Tables, Make Backup Of Tables, Cross Fingers, pg_reorg
      • WIN!
  • 53. Dealing With Bloat For Real! Eventually You Have To Use It On Something That Matters
  • 55. pg_reorg In The Real World
      • Production Database (OLTP)
          • 540GB Size
          • 2000 TPS (off-peak time, multiple statements)
          • Largest Table (pre-reorg) 127GB
      • Rebuild Stats
          • 5.75 Hours To Rebuild
          • Reclaimed 52GB Disk Space
          • No outages reported for Website/APIs
  • 56-60. pg_reorg In The Real World (five slides with no transcribed text)
  • 61. pg_reorg In The Real World YAY!
  • 62. Return Of The Jedi
  • 63. “your overconfidence is your weakness.” -Luke Skywalker
  • 64. “your faith in your friends is yours.” -Emperor Palpatine
  • 65. Sometimes You Can Have Both: Trust in NTT’s Code == faith in friends; Success in production == overconfidence
  • 66. When Good pg_reorgs Go Bad! WARNING:  unexpected attrdef record found for attr 61 of rel orders WARNING:  1 attrdef record(s) missing for rel orders
  • 67. When Good pg_reorgs Go Bad! WARNING:  unexpected attrdef record found for attr 61 of rel orders WARNING:  1 attrdef record(s) missing for rel orders Yes, On A Production System. Yes, Trying To Take 1000s of Orders Per Second.
  • 68. When Good pg_reorgs Go Bad! create table test ( a int4, b int4 default 2112, c bool );
  • 69. When Good pg_reorgs Go Bad! create table test ( a int4, b int4 default 2112, c bool ); Postgres internals track defaults / constraints based on column position “2”, not column name “b”.
  • 70. When Good pg_reorgs Go Bad! create table test ( a int4, b int4 default 2112, c bool ); Postgres internals track defaults / constraints based on column position “2”, not column name “b”. If you drop column “a” and then do pg_reorg, column “c” is now column “2”, and default 2112 is on a boolean.
  • 71. When Good pg_reorgs Go Bad! create table test ( a int4, b int4 default 2112, c bool ); Postgres internals track defaults / constraints based on column position “2”, not column name “b”. If you drop column “a” and then do pg_reorg, column “c” is now column “2”, and default 2112 is on a boolean. This Is Fair - pg_reorg hacks the system tables.
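The failure described on this slide can be reproduced in a small model. The sketch below is a deliberately simplified picture, not Postgres or pg_reorg code: it assumes the catalog is just two dictionaries, one mapping column position to column name and one mapping column position to default expression. All function names are hypothetical.

```python
def drop_column(columns, name):
    """Mark a column as dropped; the other columns keep their positions."""
    return {pos: (None if col == name else col) for pos, col in columns.items()}


def rebuild_keeping_default_positions(columns, defaults):
    """Model the faulty rebuild: live columns are renumbered from 1,
    but the defaults are carried over at their old positions."""
    live = [col for _, col in sorted(columns.items()) if col is not None]
    new_columns = {pos: col for pos, col in enumerate(live, start=1)}
    return new_columns, dict(defaults)


def defaults_by_name(columns, defaults):
    """Resolve each position-keyed default to the column it now lands on."""
    return {columns[pos]: expr for pos, expr in defaults.items() if columns.get(pos)}


# create table test ( a int4, b int4 default 2112, c bool );
columns = {1: "a", 2: "b", 3: "c"}
defaults = {2: "2112"}  # keyed by position 2, not by the name "b"

before = defaults_by_name(columns, defaults)

columns_after, defaults_after = rebuild_keeping_default_positions(
    drop_column(columns, "a"), defaults
)
after = defaults_by_name(columns_after, defaults_after)

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
```

In this model the default starts out attached to `b` and ends up attached to `c`, the boolean column, which is the mismatch the slide describes.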
  • 72. When Good pg_reorgs Go Bad! Basic Fix: Drop All Defaults And Recreate.
  • 73. When Good pg_reorgs Go Bad! Basic Fix: Drop All Defaults And Recreate. Alternative Fix: Hack System Catalogs Some More.
  • 74. When Good pg_reorgs Go Bad! Basic Fix: Drop All Defaults And Recreate. Alternative Fix: Hack System Catalogs Some More. Haven’t we had enough system catalog hacking for now?
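The basic fix can be illustrated with the same kind of toy model. This is a sketch under assumptions rather than the procedure actually run in production: the catalog is again modeled as position-keyed dictionaries, and `intended` stands for the defaults as recorded by column name from the original table definition. Both names, and the function `drop_and_recreate_defaults`, are hypothetical.

```python
def drop_and_recreate_defaults(columns, intended):
    """Discard all position-keyed defaults, then recreate each by column name."""
    position_of = {col: pos for pos, col in columns.items()}
    return {
        position_of[col]: expr
        for col, expr in intended.items()
        if col in position_of  # skip defaults for columns that no longer exist
    }


# State after the faulty rebuild: "b" is now position 1 and "c" is position 2,
# but the default is still stored under position 2.
columns = {1: "b", 2: "c"}
broken_defaults = {2: "2112"}

# ASSUMPTION: the intended defaults are known by name from the table definition.
intended = {"b": "2112"}

repaired_defaults = drop_and_recreate_defaults(columns, intended)

if __name__ == "__main__":
    print("broken:  ", broken_defaults)
    print("repaired:", repaired_defaults)
```

Recreating the defaults by name sidesteps the stale position numbers entirely, which is why it is the less risky of the two fixes compared with editing the catalogs by hand.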
  • 75. When Good pg_reorgs Go Bad! “now, if you'll excuse me, I'll go away and have a heart attack.”
  • 76. What Next? Report Problem To Mailing List. Submit A Patch. Ultimately The Problem Is Fixed. Everyone’s Happy?
  • 77. Hackers Discussion: Postgres Development Community Is Funny. Sometimes Hard To Get Them To Recognize Problems. Not Everyone Sees Online Rebuild As A Big Problem.
  • 78. Hackers Discussion: Postgres Development Community Is Funny. Sometimes Hard To Get Them To Recognize Problems. Not Everyone Sees Online Rebuild As A Big Problem. In All Fairness, Not Everyone Has This Problem.
  • 79. Hackers Discussion: Hackers Meeting 2011, Discussion On Internal Queuing System. Could Be Used As Underlying Basis For On-Line Rebuilding. Until Then...
  • 80. pg_reorg Is A Great Tool! Best Option For Difficult Situation. Just Be Careful!
  • 81. THANKS! Highload++, NTT, OmniTI, Postgres Community; Momjian, Depesz, Patel, Kocoloski. xzilla.net @robtreat2 +Robert Treat