      Reading EXPLAIN (Čteme EXPLAIN)
         CSPUG, Prague


Tomáš Vondra (tv@fuzzy.cz)

Czech and Slovak PostgreSQL Users Group


             21 June 2011
Agenda


    What are EXPLAIN and EXPLAIN ANALYZE for?
    How does planning work, and how is the "optimal" plan chosen?
    Basic physical operators: scans, joins, ...
    How can you tell that something is wrong?
    Other useful tools.




                               T. Vondra (CSPUG)   Čteme EXPLAIN
What are EXPLAIN and EXPLAIN ANALYZE for?


 SQL is a declarative language
     An SQL query is not a program; it describes the result (logical algebra).
     There are many ways to evaluate a given query (physical algebra).
     Finding the "optimal" way is the database's job.
     Optimal = least demanding on resources (CPU, I/O, memory, ...)
     It depends on the conditions (number of users, size of work_mem, ...).

 degrees of freedom
     access strategy (sequential scan, index scan, ...)
     join order
     join strategy (merge join, hash join, nested loop)
     aggregation strategy (plain, hash, sorted)




Tree structure of the execution plan


   SELECT * FROM a JOIN b ON (a.id = b.id) LIMIT 100;




Cost calculation


    I want to compare several candidate solutions and pick the "cheapest" one
    an approach common in (non)linear programming
    the number of rows is estimated from statistics
    the plan cost is computed using the "cost" variables
        seq_page_cost = 1.0
        random_page_cost = 4.0
        cpu_tuple_cost = 0.01
        cpu_index_tuple_cost = 0.005
        cpu_operator_cost = 0.0025
        ...
    I compare the costs of the options and pick the one with the lowest cost




Rules of thumb


     I/O traditionally dominates, so minimize I/O operations
     random I/O is more expensive than sequential I/O
     minimize CPU operations
     do not use too much memory
     minimize data flow
     prefer a lower startup cost or a lower total cost (?)


      Cost is roughly time, expressed relative to a sequential read of one page from disk.




Basic physical operators / table access

 sequential scan
     read all rows of the table (and only then filter)
     data (blocks) are read sequentially, each exactly once

 index scan
     find references to the matching rows in the index
     read only the needed blocks from the table (possibly repeatedly)
     a mix of sequential and random I/O

 bitmap index scan
     read the index leaf pages and build a bitmap of rows from them
     read only the table blocks that have a "1" in the bitmap
     sequential I/O, but with a "startup" cost (building the bitmap)
     can combine multiple indexes (OR, AND)
     more flexible than multi-column indexes
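
The bitmap steps can be sketched in Python. This is only a toy model under stated assumptions: each "index" is a plain dict from a value to heap block numbers, and the names `bitmap_for`, `heap_blocks_to_read`, `idx_a` and `idx_b` are made up for illustration. Real bitmaps also track row positions within each block.

```python
# Toy model of a bitmap index scan; not PostgreSQL's actual data structures.
# Assumption: an "index" maps a key value to the heap blocks that hold it.

def bitmap_for(index, value):
    """Return the set of heap blocks that may contain `value`."""
    return set(index.get(value, ()))

def heap_blocks_to_read(bitmap):
    """Visit blocks in physical order, each at most once."""
    return sorted(bitmap)

# Hypothetical indexes on two columns: value -> blocks with matching rows.
idx_a = {1: [7, 2]}
idx_b = {2: [2, 5]}

# WHERE a = 1 OR b = 2: union of the two bitmaps (BitmapOr).
bitmap_or = bitmap_for(idx_a, 1) | bitmap_for(idx_b, 2)
# WHERE a = 1 AND b = 2: intersection of the two bitmaps (BitmapAnd).
bitmap_and = bitmap_for(idx_a, 1) & bitmap_for(idx_b, 2)

print(heap_blocks_to_read(bitmap_or))   # [2, 5, 7]
print(heap_blocks_to_read(bitmap_and))  # [2]
```

Because the blocks are visited in sorted order, the heap is read in one forward pass, which is where the mostly sequential I/O comes from. The rows in each block still have to be rechecked against the condition.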


Example - creating the table


 a table with 100,000 rows
 CREATE TABLE tab (id INT);

 INSERT INTO tab SELECT * FROM generate_series(1, 100000);

 ANALYZE tab;

 SELECT relpages, reltuples FROM pg_class
                            WHERE relname = 'tab';

  relpages | reltuples
 ----------+-----------
       393 |    100000
 (1 row)




Example - sequential vs. index scan


 sequential scan
 EXPLAIN SELECT * FROM tab WHERE id BETWEEN 1000 AND 2000;

                        QUERY PLAN
 ---------------------------------------------------------
  Seq Scan on tab  (cost=0.00..1893.00 rows=927 width=4)
    Filter: ((id >= 1000) AND (id <= 2000))
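
The Seq Scan estimate of 1893.00 can be reproduced from the "cost" variables and the relpages/reltuples values of the example table. The sketch below is a simplified model, not the planner's full formula; the function name `seq_scan_cost` and its parameter names are made up for illustration.

```python
# Simplified sequential-scan cost arithmetic using the "cost" variables
# from the deck. A sketch only; the real planner handles many more cases.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025

def seq_scan_cost(relpages, reltuples, filter_operators=0):
    """Total cost: read every page, process every row, evaluate the filter."""
    io_cost = relpages * SEQ_PAGE_COST
    cpu_cost = reltuples * (CPU_TUPLE_COST + filter_operators * CPU_OPERATOR_COST)
    return io_cost + cpu_cost

# 393 pages and 100000 rows, as reported by pg_class for the example table.
print(round(seq_scan_cost(393, 100000), 2))                      # 1393.0
print(round(seq_scan_cost(393, 100000, filter_operators=2), 2))  # 1893.0
```

The filter `id >= 1000 AND id <= 2000` contributes two operator evaluations per row, which accounts for the extra 500 cost units compared with an unfiltered scan.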



 index scan
 CREATE INDEX idx ON tab (id);
 EXPLAIN ANALYZE SELECT * FROM tab WHERE id BETWEEN 1000 AND 2000;

                                 QUERY PLAN
 ---------------------------------------------------------------------------
  Index Scan using idx on tab  (cost=0.00..39.54 rows=1014 width=4)
                               (actual time=0.108..1.703 rows=1001 loops=1)
    Index Cond: ((id >= 1000) AND (id <= 2000))
  Total runtime: 2.840 ms




Example - bitmap index scan


 bitmap index scan
 EXPLAIN SELECT * FROM tab WHERE (id = 110 OR id = 130);

                                QUERY PLAN
 --------------------------------------------------------------------------
  Bitmap Heap Scan on tab  (cost=8.53..16.14 rows=2 width=4)
    Recheck Cond: ((id = 110) OR (id = 130))
    ->  BitmapOr  (cost=8.53..8.53 rows=2 width=0)
          ->  Bitmap Index Scan on idx  (cost=0.00..4.27 rows=1 width=0)
                Index Cond: (id = 110)
          ->  Bitmap Index Scan on idx  (cost=0.00..4.27 rows=1 width=0)
                Index Cond: (id = 130)




Join strategies




     nested loop
     hash join
     merge join




Nested loop


    very simple: in principle two nested loops
    slow for larger relations, but produces the first row quickly
    the only join usable for CROSS JOIN and non-equijoin conditions
    mostly seen in OLTP systems (working with small numbers of rows)


    FOR a IN vnejsi_relace              (the outer relation)
       FOR b IN vnitrni_relace          (the inner relation)
          RETURN (a,b) if it satisfies the JOIN condition
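
The pseudocode translates almost directly into a runnable Python generator. This is a minimal sketch; the name `nested_loop_join` and the sample lists are made up for illustration, and the inner input is assumed to be re-readable (a list).

```python
def nested_loop_join(outer, inner, condition):
    """Yield every (a, b) pair that satisfies an arbitrary join condition."""
    for a in outer:
        for b in inner:          # the inner relation is re-read for each outer row
            if condition(a, b):
                yield (a, b)

outer = [1, 2, 3]
inner = [2, 3, 4]

# Equijoin.
print(list(nested_loop_join(outer, inner, lambda a, b: a == b)))  # [(2, 2), (3, 3)]

# Non-equijoin; the generator hands back the first match right away.
print(next(nested_loop_join(outer, inner, lambda a, b: a < b)))   # (1, 2)
```

Because the condition is just a predicate, any comparison works, and the first row is available without any preparation. The price is that the work grows with the product of the two input sizes.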




Nested Loop



 CREATE TABLE vnejsi (id INT, val INT UNIQUE);
 CREATE TABLE vnitrni (id INT PRIMARY KEY);

 INSERT INTO vnejsi
 SELECT i, i+1 FROM generate_series(1, 1000) s(i);

 INSERT INTO vnitrni
 SELECT i FROM generate_series(1, 1000) s(i);


 EXPLAIN SELECT 1 FROM vnejsi, vnitrni
                 WHERE vnejsi.id = vnitrni.id
                   AND vnejsi.val = 10;

                                       QUERY PLAN
 ---------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..16.55 rows=1 width=0)
    ->  Index Scan using vnejsi_val_key on vnejsi  (cost=0.00..8.27 rows=1 width=4)
          Index Cond: (val = 10)
    ->  Index Scan using vnitrni_pkey on vnitrni  (cost=0.00..8.27 rows=1 width=4)
          Index Cond: (vnitrni.id = vnejsi.id)
 (5 rows)




Merge Join


    sorts both relations by the join condition (equijoin only)
    then reads row by row and moves forward
    rescans are sometimes needed (duplicates in the outer table)
    very fast for already sorted relations, otherwise an expensive startup
    mostly seen in DSS/DWH systems
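
A minimal Python sketch of the merge step follows, including the "rescan" of the inner run when the outer input contains duplicates. The function name `merge_join` and the sample data are made up for illustration; both inputs are sorted inside the function, which stands in for the expensive startup.

```python
def merge_join(outer, inner, key=lambda row: row):
    """Equijoin of two inputs; both are sorted first (the costly startup)."""
    outer = sorted(outer, key=key)
    inner = sorted(inner, key=key)
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        ko, ki = key(outer[i]), key(inner[j])
        if ko < ki:
            i += 1
        elif ko > ki:
            j += 1
        else:
            # Remember where the matching inner run starts; each duplicate
            # outer row rescans the same run from this mark.
            mark = j
            while i < len(outer) and key(outer[i]) == ko:
                j = mark
                while j < len(inner) and key(inner[j]) == ko:
                    result.append((outer[i], inner[j]))
                    j += 1
                i += 1
    return result

print(merge_join([3, 1, 2, 2], [2, 4, 3, 2]))
# [(2, 2), (2, 2), (2, 2), (2, 2), (3, 3)]
```

If the inputs already arrive sorted (for example from an index scan), the two `sorted` calls are the part that could be skipped, which is why the method is so fast in that case.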




Merge Join



 CREATE TABLE vnejsi (id INT);
 CREATE TABLE vnitrni (id INT);

 INSERT INTO vnejsi
 SELECT i FROM generate_series(1, 100000) s(i);

 INSERT INTO vnitrni
 SELECT i FROM generate_series(1, 100000) s(i);


 EXPLAIN SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Merge Join  (cost=19395.64..21395.64 rows=100000 width=0)
    Merge Cond: (vnejsi.id = vnitrni.id)
    ->  Sort  (cost=9697.82..9947.82 rows=100000 width=4)
          Sort Key: vnejsi.id
          ->  Seq Scan on vnejsi  (cost=0.00..1393.00 rows=100000 width=4)
    ->  Sort  (cost=9697.82..9947.82 rows=100000 width=4)
          Sort Key: vnitrni.id
          ->  Seq Scan on vnitrni  (cost=0.00..1393.00 rows=100000 width=4)




Hash Join


     1    read the smaller (inner) relation and build a hash table from it (on the join key)
     2    read the outer table and look up matches in the hash table by the hash key
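
The two steps can be sketched in Python with a dictionary as the hash table. The name `hash_join`, the `key` parameter and the sample rows are made up for illustration.

```python
from collections import defaultdict

def hash_join(outer, inner, key=lambda row: row):
    """Equijoin: build a hash table on `inner`, then probe it with `outer`."""
    # Step 1: build the hash table from the (smaller) inner relation.
    table = defaultdict(list)
    for b in inner:
        table[key(b)].append(b)
    # Step 2: read the outer relation once and probe the hash table.
    for a in outer:
        for b in table.get(key(a), ()):
            yield (a, b)

outer = [(1, "a"), (2, "b"), (3, "c")]
inner = [(2, "x"), (3, "y"), (3, "z")]
print(list(hash_join(outer, inner, key=lambda row: row[0])))
# [((2, 'b'), (2, 'x')), ((3, 'c'), (3, 'y')), ((3, 'c'), (3, 'z'))]
```

No row can be returned before the whole inner relation has been hashed, which matches the non-zero startup cost of the Hash Join node in a plan.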


 CREATE TABLE vnejsi (id INT);
 CREATE TABLE vnitrni (id INT);

 INSERT INTO vnejsi SELECT i FROM generate_series(1, 100000) s(i);
 INSERT INTO vnitrni SELECT i FROM generate_series(1, 100000) s(i);


 EXPLAIN SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Hash Join  (cost=2985.00..7029.00 rows=100000 width=0)
    Hash Cond: (vnejsi.id = vnitrni.id)
    ->  Seq Scan on vnejsi  (cost=0.00..1393.00 rows=100000 width=4)
    ->  Hash  (cost=1393.00..1393.00 rows=100000 width=4)
          ->  Seq Scan on vnitrni  (cost=0.00..1393.00 rows=100000 width=4)
 (5 rows)




Hash Join / batches

 What if the hash table does not fit into memory (work_mem)?
     1    split the smaller table into parts so that each part's hash table fits into memory
     2    for each part, build the hash table and join it with the "large" table
     3    less efficient (repeated reads of the outer table)
     4    recognizable by "Batches" in the plan
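
The batching idea can be sketched as follows. The sketch follows the steps above literally and re-reads the outer input once per batch; a real executor may organize the passes differently. The name `batched_hash_join` and the parameter `nbatches` are made up for illustration.

```python
from collections import defaultdict

def batched_hash_join(outer, inner, nbatches, key=lambda row: row):
    """Join in `nbatches` passes so each hash table holds only part of `inner`."""
    results = []
    for batch in range(nbatches):
        # Build a hash table for just this slice of the inner relation.
        table = defaultdict(list)
        for b in inner:
            if hash(key(b)) % nbatches == batch:
                table[key(b)].append(b)
        # Re-read the whole outer relation for every batch (the extra cost).
        for a in outer:
            for b in table.get(key(a), ()):
                results.append((a, b))
    return results

outer = list(range(1, 11))   # 1..10
inner = list(range(5, 15))   # 5..14
one = batched_hash_join(outer, inner, nbatches=1)
four = batched_hash_join(outer, inner, nbatches=4)
print(sorted(one) == sorted(four))  # True
print(len(four))                    # 6
```

The result is the same regardless of the number of batches; only the amount of repeated work changes, which is why fewer batches are usually better.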


 EXPLAIN ANALYZE SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                                   QUERY PLAN
 ------------------------------------------------------------------------------- ...
  Hash Join  (cost=2985.00..7029.00 rows=100000 width=0) (actual time=277.886..792 ...
    Hash Cond: (vnejsi.id = vnitrni.id)
    ->  Seq Scan on vnejsi  (cost=0.00..1393.00 rows=100000 width=4) (actual time= ...
    ->  Hash  (cost=1393.00..1393.00 rows=100000 width=4) (actual time=277.836..27 ...
          Buckets: 8192  Batches: 4  Memory Usage: 589kB
          ->  Seq Scan on vnitrni  (cost=0.00..1393.00 rows=100000 width=4) (actua ...
  Total runtime: 900.664 ms
 (7 rows)




          increase work_mem (the fewer batches, the better, in most cases)


Comparison of join methods


 Nested Loop
     works poorly for two large relations
     ideal for a small outer relation + a fast lookup into the inner one (index scan)
     the only method for non-equijoins :-(

 Merge Join
     ideal for already sorted relations (e.g. CLUSTER + index scan)
     a problem if it needs an extra sort (especially a large on-disk sort)

 Hash Join
     needs no sorting, but has to build a hash table
     needs enough memory, though (work_mem for the hash table)
     if the hash table is too large, it is split into batches (slower)




Sort & Limit


    ORDER BY, but also many other operations (DISTINCT, GROUP BY, UNION)
    three options
         quicksort (in memory, limited by work_mem)
         merge sort (on disk)
         index scan (a sufficiently correlated index, e.g. after CLUSTER)
    LIMIT says "I only want a few rows, prefer plans that start quickly"
    a small startup time usually means a large total time
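
One way to picture how LIMIT shifts the choice is to interpolate between startup and total cost according to the fraction of rows actually fetched. This is a simplified model, not the planner's exact logic, and the two plans and their cost figures are hypothetical.

```python
def cost_for_fraction(startup, total, fraction):
    """Estimated cost of fetching only `fraction` (0..1) of a plan's rows."""
    return startup + fraction * (total - startup)

# Hypothetical (startup, total) costs for two ways to return sorted rows.
plans = {
    "sort": (9697.82, 9947.82),      # expensive startup, cheap afterwards
    "index scan": (0.00, 30000.00),  # starts at once, costly in total
}

def cheapest(plans, fraction):
    """Name of the plan with the lowest cost for the given row fraction."""
    return min(plans, key=lambda name: cost_for_fraction(*plans[name], fraction))

print(cheapest(plans, 1.0))    # sort        (all rows wanted)
print(cheapest(plans, 0.001))  # index scan  (e.g. LIMIT 100 of 100000 rows)
```

The same model shows the risk: if the row estimate behind the fraction is wrong, the quick-start plan can end up running to completion at its high total cost.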


   EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

    Sort (...) (actual time=446.089..591.714 rows=100000 loops=1)
      Sort Key: id
      Sort Method: external sort  Disk: 1368kB
      ->  Seq Scan on tab (...) (actual time=0.016..129.756 rows=100000 loops=1)




Sort


 in memory
 SET work_mem = '8MB';

 EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Sort (...) (actual time=312.709..432.410 rows=100000 loops=1)
    Sort Key: id
    Sort Method: quicksort  Memory: 4392kB
    ->  Seq Scan on tab (...) (actual time=0.020..146.975 rows=100000 loops=1)


 with a well-correlated index
 CREATE INDEX idx ON tab (id);

 EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Index Scan using idx on tab  (cost=0.00..2780.26 rows=100000 width=4)
                               (actual time=0.088..162.377 rows=100000 loops=1)
  Total runtime: 272.881 ms




Node types - others


     aggregation (GROUP BY, DISTINCT)
     LIMIT
     table modification (INSERT, UPDATE, DELETE)
     set operations (INTERSECT, EXCEPT)
     subplan (for correlated subselects), initplan (uncorrelated)
     CTE, window functions
     materialization
     row locking
     append (inheritance)
     ...




Wrong row-count estimate (for a relation, or for the rows matching a condition).


 What is the problem?
     The planner thinks the table is small, but it is actually large.
     The planner thinks only a few rows match the condition, but actually many do.
     or the other way around ...

 How does it show up?
     an unsuitable table access method is chosen (index vs. sequential scan)
     an unsuitable join method is chosen (nested loop instead of hash/merge join)

 What causes it?
     stale statistics (e.g. right after a load)
     wrong statistics: sometimes the number of distinct values, or poorly formulated conditions
     conditions on correlated columns (there are no cross-column statistics yet)
     LIMIT usually makes the situation considerably worse (it prefers plans with a cheap startup)



Example - seemingly high selectivity

 based on a "race condition": the query runs before the statistics have been recomputed
 CREATE TABLE tab (id INT);
 CREATE INDEX idx ON tab (id);
 INSERT INTO tab SELECT * FROM generate_series(1, 100000);
 ANALYZE tab;

 DELETE FROM tab;
 INSERT INTO tab SELECT 1111 FROM generate_series(1, 100000);



 EXPLAIN ANALYZE SELECT * FROM tab WHERE id = 1111;
                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Index Scan using idx on tab  (cost=0.00..8.29 rows=1 width=4)
                               (actual time=0.049..166.562 rows=100000 loops=1)
    Index Cond: (id = 1111)
 (3 rows)


 ... wait ....


 EXPLAIN ANALYZE SELECT * FROM tab WHERE id = 1111;
                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Seq Scan on tab  (cost=0.00..2035.00 rows=100000 width=4)
                   (actual time=0.949..158.568 rows=100000 loops=1)
    Filter: (id = 1111)



Example - correlated columns


 CREATE TABLE tab (a INT, b INT);
 INSERT INTO tab SELECT i, i FROM generate_series(1, 100000) s(i);
 ANALYZE tab;



 EXPLAIN ANALYZE SELECT * FROM tab WHERE a >= 50000 AND b <= 50000;

                              QUERY PLAN
 ---------------------------------------------------------------------
  Seq Scan on tab  (cost=0.00..1943.00 rows=25000 width=8)
                   (actual time=26.196..58.715 rows=1 loops=1)
    Filter: ((a >= 50000) AND (b <= 50000))
  Total runtime: 58.762 ms
 (3 rows)
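
The estimate of 25000 rows is what you get when the two conditions are treated as independent and their selectivities are multiplied. The Python sketch below reproduces that arithmetic on the same data; it computes exact selectivities, whereas the planner works from sampled statistics, so treat it as an illustration of the assumption only.

```python
# Same data as the example table: a = b = i for i in 1..100000.
rows = [(i, i) for i in range(1, 100001)]

# Per-column selectivities, each computed on its own.
sel_a = sum(1 for a, b in rows if a >= 50000) / len(rows)
sel_b = sum(1 for a, b in rows if b <= 50000) / len(rows)

# Independence assumption: multiply the selectivities.
estimated = int(sel_a * sel_b * len(rows))

# What the combined condition really matches on perfectly correlated columns.
actual = sum(1 for a, b in rows if a >= 50000 and b <= 50000)

print(estimated, actual)  # 25000 1
```

Each condition alone matches about half of the table, so the product predicts a quarter of it, while only the single row with a = b = 50000 satisfies both.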




Other problem areas


 Unsuitable settings of the "cost" variables
     the defaults are based on a "typical" system
     which does not necessarily match yours
     e.g. with SSDs the difference between random and sequential I/O largely disappears
     with fast disks (15k SAS) it also partly does, though not as markedly
     a small effective_cache_size penalizes indexes

 Black holes
     triggers
     referential integrity (foreign keys without indexes)




EXPLAIN cookbook


 Check the nodes where the row-count estimate is off.
     Small differences do not matter; order-of-magnitude differences are a problem.
     If the estimate is wrong, the choice of plan cannot be reliable.
     Try updating the statistics, rewriting the conditions, ...

 Look at the nodes with the largest proportional difference between cost and time.
     Are you sure the cost variables are set sensibly?
     Change the settings (in the session) and watch how the plan and the query performance change.

 When optimizing, focus on the nodes with the highest cost / actual time.
     Where the most time is spent, optimization can gain you the most.
     Could you, for example, add an index or increase work_mem?
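
The first check in the cookbook can be automated with a few lines of Python. This is a rough sketch, assuming each node's estimate and actual figures sit on one line of EXPLAIN ANALYZE text output; the name `misestimated_nodes` and the threshold parameter `factor` are made up for illustration.

```python
import re

# Captures the estimated and the actual "rows=" of one EXPLAIN ANALYZE node.
NODE_RE = re.compile(
    r"\(cost=[\d.]+ rows=(\d+) width=\d+\)\s*"
    r"\(actual time=[\d.]+ rows=(\d+) loops=\d+\)"
)

def misestimated_nodes(plan_text, factor=10):
    """Yield (line, estimated, actual) where row counts differ by >= factor."""
    for line in plan_text.splitlines():
        m = NODE_RE.search(line)
        if not m:
            continue
        est, act = int(m.group(1)), int(m.group(2))
        # Actual rows are reported per loop; loops are ignored in this sketch.
        ratio = max(est, act, 1) / max(min(est, act), 1)
        if ratio >= factor:
            yield line.strip(), est, act

plan = """
Index Scan using idx on tab (cost=0.00..8.29 rows=1 width=4) (actual time=0.049..166.562 rows=100000 loops=1)
Index Scan using idx on tab (cost=0.00..39.54 rows=1014 width=4) (actual time=0.108..1.703 rows=1001 loops=1)
"""
for line, est, act in misestimated_nodes(plan):
    print(est, act)  # 1 100000
```

Only the first node is reported: its estimate is off by five orders of magnitude, while the second node (1014 estimated vs. 1001 actual) is well within tolerance.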




explain.depesz.com




    http://explain.depesz.com
    an excellent tool for visualizing and analyzing explain plans
    great for sending a plan e.g. to mailing lists (it does not get mangled)

explain.depesz.com




    How long did a given step take (on its own / including its child nodes)?
    How accurate was the row-count estimate?
    How many rows were produced?

explain.depesz.com

 Unique  (cost=30938464.86..31166982.10 rows=30468966 width=89) (actual time=249353.521..250273.108 rows=342107 loops=1)
   ->  Sort  (cost=30938464.86..31014637.27 rows=30468966 width=89) (actual time=249353.518..250155.187 rows=342108 loops=1)
         Sort Key: (lower(u.samaccountname[1])), (g.cn[1])
         Sort Method: external merge  Disk: 13176kB
         ->  Append  (cost=0.00..19340392.34 rows=30468966 width=89) (actual time=44.687..242695.135 rows=342108 loops=1)
               ->  Nested Loop  (cost=0.00..19031015.08 rows=30385836 width=89) (actual time=44.685..240132.584 rows=2535 loops=1)
                     Join Filter: ((u.primarygroupid[1] = ANY (tmp_g.primarygrouptoken)) OR (u.gidnumber[1] = ANY (tmp_g.gidnumber)) OR (tmp_g.dn = ANY (u.memberof)) OR (tmp_g.cn[1] = ANY (u.memberof)) OR (tmp_g.dn = ANY (u.groupmembership)) OR (tmp_g.cn[1] = ANY (u.groupmembership)) OR (u.samaccountname[1] = ANY (tmp_g.memberuid)) OR (u.dn = ANY (tmp_g.member)) OR (u.cn[1] = ANY (tmp_g.member)))
                     ->  Nested Loop  (cost=0.00..1421.74 rows=1350 width=986) (actual time=0.054..116.528 rows=1350 loops=1)
                           ->  Nested Loop  (cost=0.00..734.12 rows=1350 width=1023) (actual time=0.038..76.647 rows=1350 loops=1)
                                 ->  Seq Scan on ldap_group_inheritance i  (cost=0.00..46.50 rows=1350 width=166) (actual time=0.015..1.633 rows=1350 loops=1)
                                 ->  Index Scan using ldap_import_groups_dn_key on ldap_import_groups tmp_g  (cost=0.00..0.50 rows=1 width=940) (actual time=0.048..0.049 rows=1 loops=1350)
                                       Index Cond: (tmp_g.dn = i.groupdn)
                           ->  Index Scan using ldap_import_groups_dn_key on ldap_import_groups g  (cost=0.00..0.50 rows=1 width=129) (actual time=0.022..0.026 rows=1 loops=1350)
                                 Index Cond: (g.dn = i.parentdn)
                     ->  Seq Scan on ldap_import_users u  (cost=0.00..3856.30 rows=83130 width=372) (actual time=0.006..26.162 rows=83130 loops=1350)
               ->  Seq Scan on ldap_import_users u  (cost=0.00..4687.60 rows=83130 width=126) (actual time=0.098..2499.336 rows=339573 loops=1)
 Total runtime: 250301.001 ms
 (17 rows)




pgadmin3




    http://www.pgadmin.org/
    a GUI that, among other things, can also visualize SQL query plans




auto_explain & explanation


 auto_explain
     a kind of complement to log_min_duration_statement
     lets you log EXPLAIN (or EXPLAIN ANALYZE) output for long-running queries
     http://developer.postgresql.org/pgdocs/postgres/auto-explain.html

 explanation
     more flexible handling of plan information directly in SQL
     http://www.pgxn.org/dist/explanation/doc/explanation.html

    SELECT node_type, strategy, actual_startup_time, actual_total_time
      FROM explanation(
          query    := $$ SELECT * FROM pg_class WHERE relname = 'users' $$,
          analyzed := true
      );


     node_type  | strategy | actual_startup_time | actual_total_time
    ------------+----------+---------------------+-------------------
     Index Scan |          | 00:00:00.000017     | 00:00:00.000017




Links


    Query Execution Techniques in PostgreSQL, Neil Conway, 2007
    http://neilconway.org/talks/executor.pdf

    Čtení prováděcích plánů v PostgreSQL, Pavel Stěhule, 2008
    http://www.root.cz/clanky/cteni-provadecich-planu-v-postgresql/

    Using EXPLAIN @ wiki
    http://wiki.postgresql.org/wiki/Using_EXPLAIN

    Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT @ wiki
    http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT

    Explaining EXPLAIN, R. Treat, G. S. Mullane, AndrewSN, Magnifikus, B. Encina,
    N. Conway, 2008
    http://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf




Čtení explain planu (CSPUG 21.6.2011)

  • 1.
    ˇ Cteme EXPLAIN CSPUG, Praha Tom´ˇ Vondra (tv@fuzzy.cz) as Czech and Slovak PostgreSQL Users Group 21.6.2011
  • 2.
    Agenda K ˇemu slouˇ´ EXPLAIN a EXPLAIN ANALYZE? c zı Jak funguje pl´nov´n´ jak se vyb´ a “optim´ln´ pl´n? a a ı, ır´ a ı” a Z´kladn´ fyzick´ oper´tory : scany, joiny, ... a ı e a Jak poznat ˇe je nˇco ˇpatnˇ? z e s e Dalˇ´ uˇiteˇn´ n´stroje. sı z c e a T. Vondra (CSPUG) ˇ Cteme EXPLAIN
  • 3.
    K ˇemu slouˇ´EXPLAIN a EXPLAIN ANALYZE? c zı SQL je deklarativn´ jazyk ı SQL dotaz nen´ program, popisuje v´sledek (logick´ algebra). ı y a Existuje mnoho zp˚sob˚ jak dan´ dotaz vyhodnotit (fyzick´ algebra). u u y a Nalezen´ “optim´ln´ ı a ıho” zp˚sobu je starost´ datab´ze. u ı a Optim´ln´ = nejm´nˇ n´roˇn´ na zdroje (CPU, I/O, pamˇˇ, ...) a ı e e a c y et Z´vis´ na podm´ ach (poˇet uˇivatel˚, velikost work mem, ...). a ı ınk´ c z u stupnˇ volnosti e access strategy (sequential scan, index scan, ...) join order join strategy (merge join, hash join, nested loop) aggregation strategy (plain, hash, sorted) T. Vondra (CSPUG) ˇ Cteme EXPLAIN
  • 4.
    Stromov´ struktura exekuˇn´pl´nu a c ıho a SELECT * FROM a JOIN b ON ( a . id = b . id ) LIMIT 100; T. Vondra (CSPUG) ˇ Cteme EXPLAIN
  • 5.
    V´poˇet ceny yc chci porovnat nˇkolik variant ˇeˇen´ a vybrat tu “nejlevnˇjˇ´ e r s ı e sı” pˇıstup obvykl´ v (ne)line´rn´ programov´n´ r´ y a ım a ı ze statistik se odhadne poˇet ˇ´dek c ra s vyuˇit´ “cost” promˇnn´ch se spoˇte cena pl´nu z ım e y c a seq page cost = 1.0 random page cost = 4.0 cpu tuple cost = 0.01 cpu index tuple cost = 0.005 cpu operator cost = 0.0025 ... porovn´m ceny moˇnost´ vyberu tu s nejniˇˇ´ cenou a z ı, zsı T. Vondra (CSPUG) ˇ Cteme EXPLAIN
  • 6.
    Rules of thumb
        I/O traditionally dominates: minimize the number of I/O operations
        random I/O is more expensive than sequential I/O
        minimize CPU operations
        do not use too much memory
        minimize the flow of data
        prefer a lower startup cost or a lower total cost (?)
    The cost is roughly time, expressed relative to one sequential read of a page from disk.
  • 7.
    Basic physical operators / table access
    sequential scan
        read all rows of the table (and only then filter)
        data (blocks) are read sequentially, each exactly once
    index scan
        find references to the matching rows in the index
        read only the needed blocks from the table (possibly repeatedly)
        a combination of sequential and random I/O
    bitmap index scan
        read the index leaves and build a bitmap of rows from them
        read only those table blocks for which the bitmap contains "1"
        sequential I/O, but with a "startup" cost (building the bitmap)
        several indexes can be combined (OR, AND)
        more flexible than multi-column indexes
  • 8.
    Example - creating the table
    a table with 100,000 rows
        CREATE TABLE tab (id INT);
        INSERT INTO tab SELECT * FROM generate_series(1, 100000);
        ANALYZE tab;

        SELECT relpages, reltuples FROM pg_class WHERE relname = 'tab';

         relpages | reltuples
        ----------+-----------
              393 |    100000
        (1 row)
  • 9.
    Example - sequential vs. index scan
    sequential scan
        EXPLAIN SELECT * FROM tab WHERE id BETWEEN 1000 AND 2000;

                               QUERY PLAN
        ----------------------------------------------------------
         Seq Scan on tab (cost=0.00..1893.00 rows=927 width=4)
           Filter: ((id >= 1000) AND (id <= 2000))
    index scan
        CREATE INDEX idx ON tab (id);
        EXPLAIN ANALYZE SELECT * FROM tab WHERE id BETWEEN 1000 AND 2000;

                               QUERY PLAN
        ----------------------------------------------------------
         Index Scan using idx on tab (cost=0.00..39.54 rows=1014 width=4)
                                     (actual time=0.108..1.703 rows=1001 loops=1)
           Index Cond: ((id >= 1000) AND (id <= 2000))
         Total runtime: 2.840 ms
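    The Seq Scan cost of 1893.00 can be recomputed by hand from the cost variables and the table statistics shown earlier (393 pages, 100000 rows). The sketch below assumes a simplified form of the planner's formula: every page costs `seq_page_cost`, every row costs `cpu_tuple_cost`, and each of the two comparisons in the filter costs `cpu_operator_cost` per row. With the default values this gives 393 * 1.0 + 100000 * 0.01 + 100000 * 2 * 0.0025 = 393 + 1000 + 500 = 1893.

```sql
-- Temporary copy of the example table, so the sketch runs on its own.
CREATE TEMP TABLE tab (id INT);
INSERT INTO tab SELECT * FROM generate_series(1, 100000);
ANALYZE tab;

-- pages * seq_page_cost
--   + rows * cpu_tuple_cost
--   + rows * 2 comparisons * cpu_operator_cost
SELECT c.relpages * current_setting('seq_page_cost')::numeric
     + c.reltuples::numeric * current_setting('cpu_tuple_cost')::numeric
     + c.reltuples::numeric * 2 * current_setting('cpu_operator_cost')::numeric
       AS estimated_seq_scan_cost
FROM pg_class c
WHERE c.oid = 'tab'::regclass;
```

    The result matches the slide only if the table occupies the same number of pages and the cost parameters are at their defaults.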
  • 10.
    Example - bitmap index scan
    bitmap index scan
        EXPLAIN SELECT * FROM tab WHERE (id = 110 OR id = 130);

                               QUERY PLAN
        ----------------------------------------------------------
         Bitmap Heap Scan on tab (cost=8.53..16.14 rows=2 width=4)
           Recheck Cond: ((id = 110) OR (id = 130))
           -> BitmapOr (cost=8.53..8.53 rows=2 width=0)
                 -> Bitmap Index Scan on idx (cost=0.00..4.27 rows=1 width=0)
                       Index Cond: (id = 110)
                 -> Bitmap Index Scan on idx (cost=0.00..4.27 rows=1 width=0)
                       Index Cond: (id = 130)
  • 11.
    Join strategies
        nested loop
        hash join
        merge join
  • 12.
    Nested loop
        very simple: in principle two nested loops
        slow for larger relations, but produces the first row quickly
        the only join usable for CROSS JOIN and non-equijoin conditions
        mostly seen in OLTP systems (which work with small numbers of rows)

        FOR a IN vnejsi_relace            (outer relation)
          FOR b IN vnitrni_relace         (inner relation)
            RETURN (a, b) if it satisfies the JOIN condition
  • 13.
    Nested Loop
        CREATE TABLE vnejsi (id INT, val INT UNIQUE);
        CREATE TABLE vnitrni (id INT PRIMARY KEY);

        INSERT INTO vnejsi SELECT i, i+1 FROM generate_series(1, 1000) s(i);
        INSERT INTO vnitrni SELECT i FROM generate_series(1, 1000) s(i);

        EXPLAIN SELECT 1 FROM vnejsi, vnitrni
                 WHERE vnejsi.id = vnitrni.id AND vnejsi.val = 10;

                               QUERY PLAN
        ----------------------------------------------------------
         Nested Loop (cost=0.00..16.55 rows=1 width=0)
           -> Index Scan using vnejsi_val_key on vnejsi (cost=0.00..8.27 rows=1 width=4)
                 Index Cond: (val = 10)
           -> Index Scan using vnitrni_pkey on vnitrni (cost=0.00..8.27 rows=1 width=4)
                 Index Cond: (vnitrni.id = vnejsi.id)
        (5 rows)
  • 14.
    Merge Join
        sorts both relations by the join condition (equijoin only)
        then reads them row by row and keeps moving forward
        sometimes rescans are needed (duplicates in the outer table)
        very fast for already sorted relations, otherwise an expensive startup
        mostly seen in DSS/DWH systems
  • 15.
    Merge Join
        CREATE TABLE vnejsi (id INT);
        CREATE TABLE vnitrni (id INT);

        INSERT INTO vnejsi SELECT i FROM generate_series(1, 100000) s(i);
        INSERT INTO vnitrni SELECT i FROM generate_series(1, 100000) s(i);

        EXPLAIN SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                               QUERY PLAN
        ----------------------------------------------------------
         Merge Join (cost=19395.64..21395.64 rows=100000 width=0)
           Merge Cond: (vnejsi.id = vnitrni.id)
           -> Sort (cost=9697.82..9947.82 rows=100000 width=4)
                 Sort Key: vnejsi.id
                 -> Seq Scan on vnejsi (cost=0.00..1393.00 rows=100000 width=4)
           -> Sort (cost=9697.82..9947.82 rows=100000 width=4)
                 Sort Key: vnitrni.id
                 -> Seq Scan on vnitrni (cost=0.00..1393.00 rows=100000 width=4)
  • 16.
    Hash Join
        1 read the smaller (inner) relation and build a hash table from it (on the join key)
        2 read the outer table and look each row up in the hash table by the hash of the join key

        CREATE TABLE vnejsi (id INT);
        CREATE TABLE vnitrni (id INT);

        INSERT INTO vnejsi SELECT i FROM generate_series(1, 100000) s(i);
        INSERT INTO vnitrni SELECT i FROM generate_series(1, 100000) s(i);

        EXPLAIN SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                               QUERY PLAN
        ----------------------------------------------------------
         Hash Join (cost=2985.00..7029.00 rows=100000 width=0)
           Hash Cond: (vnejsi.id = vnitrni.id)
           -> Seq Scan on vnejsi (cost=0.00..1393.00 rows=100000 width=4)
           -> Hash (cost=1393.00..1393.00 rows=100000 width=4)
                 -> Seq Scan on vnitrni (cost=0.00..1393.00 rows=100000 width=4)
        (5 rows)
  • 17.
    Hash Join / batches
    What if the hash table does not fit into memory (work_mem)?
        1 split the smaller table into parts so that the hash table for each part fits into memory
        2 for each part build the hash table and join it with the "large" table
        3 less efficient (data of the outer table has to be read again for the later batches)
        4 recognizable by "Batches" in the plan

        EXPLAIN ANALYZE SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

                               QUERY PLAN
        ----------------------------------------------------------
         Hash Join (cost=2985.00..7029.00 rows=100000 width=0) (actual time=277.886..792 ...
           Hash Cond: (vnejsi.id = vnitrni.id)
           -> Seq Scan on vnejsi (cost=0.00..1393.00 rows=100000 width=4) (actual time= ...
           -> Hash (cost=1393.00..1393.00 rows=100000 width=4) (actual time=277.836..27 ...
                 Buckets: 8192 Batches: 4 Memory Usage: 589 kB
                 -> Seq Scan on vnitrni (cost=0.00..1393.00 rows=100000 width=4) (actua ...
         Total runtime: 900.664 ms
        (7 rows)

    increase work_mem (the fewer batches, the better, as a rule)
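    The effect of work_mem on the number of batches can be observed with a sketch like the following. The two work_mem values (1MB and 32MB) are assumptions chosen for illustration, and the planner is free to pick a different join method, in which case no "Batches" line appears at all.

```sql
-- Temporary copies of the tables from the slide, so the sketch runs on its own.
CREATE TEMP TABLE vnejsi (id INT);
CREATE TEMP TABLE vnitrni (id INT);
INSERT INTO vnejsi SELECT i FROM generate_series(1, 100000) s(i);
INSERT INTO vnitrni SELECT i FROM generate_series(1, 100000) s(i);
ANALYZE vnejsi;
ANALYZE vnitrni;

-- Small work_mem: if a Hash Join is chosen, look for "Batches" above 1.
SET work_mem = '1MB';
EXPLAIN ANALYZE SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

-- Larger work_mem: the hash table has a better chance to fit in one batch.
SET work_mem = '32MB';
EXPLAIN ANALYZE SELECT 1 FROM vnejsi JOIN vnitrni USING (id);

-- Return to the configured default for the rest of the session.
RESET work_mem;
```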
  • 18.
    Comparison of join methods
    Nested Loop
        works poorly for two large relations
        ideal for a small outer relation + a fast lookup in the inner one (index scan)
        the only method for non-equijoins :-(
    Merge Join
        ideal for already sorted relations (e.g. CLUSTER + index scan)
        a problem if it requires extra sorting (especially large on-disk sorts)
    Hash Join
        requires no sorting, but has to build a hash table
        needs enough memory (work_mem for the hash table)
        if the hash table is too large, it is split into batches (slower)
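    One way to compare the join methods on the same query is to switch individual methods off for a single transaction and look at what the planner chooses instead. The sketch below reuses the table definitions from the Nested Loop example; `enable_nestloop` is a standard planner setting, and `SET LOCAL` keeps the change confined to the transaction. This is meant as an experiment, not as a recommended production setting.

```sql
-- Temporary copies of the tables from the Nested Loop example.
CREATE TEMP TABLE vnejsi (id INT, val INT UNIQUE);
CREATE TEMP TABLE vnitrni (id INT PRIMARY KEY);
INSERT INTO vnejsi SELECT i, i + 1 FROM generate_series(1, 1000) s(i);
INSERT INTO vnitrni SELECT i FROM generate_series(1, 1000) s(i);
ANALYZE vnejsi;
ANALYZE vnitrni;

BEGIN;
-- Plan chosen with all join methods available.
EXPLAIN SELECT 1 FROM vnejsi, vnitrni
         WHERE vnejsi.id = vnitrni.id AND vnejsi.val = 10;

-- Discourage nested loops and compare the cost of the alternative.
SET LOCAL enable_nestloop = off;
EXPLAIN SELECT 1 FROM vnejsi, vnitrni
         WHERE vnejsi.id = vnitrni.id AND vnejsi.val = 10;
ROLLBACK;  -- SET LOCAL is undone here
```

    Comparing the top-level cost of both plans shows how much more expensive the planner considers the alternative for this small outer relation.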
  • 19.
    Sort & Limit
        ORDER BY, but also many other operations (DISTINCT, GROUP BY, UNION)
        three options
            quicksort (in memory, limited by work_mem)
            merge sort (on disk)
            index scan (a sufficiently correlated index, e.g. CLUSTERED)
        LIMIT says "I want just a few rows, prefer plans that start quickly"
        a small startup time usually means a large total time

        EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

         Sort (...) (actual time=446.089..591.714 rows=100000 loops=1)
           Sort Key: id
           Sort Method: external sort Disk: 1368 kB
           -> Seq Scan on tab (...) (actual time=0.016..129.756 rows=100000 loops=1)
  • 20.
    In-memory sort
        SET work_mem = '8MB';
        EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

                               QUERY PLAN
        ----------------------------------------------------------
         Sort (...) (actual time=312.709..432.410 rows=100000 loops=1)
           Sort Key: id
           Sort Method: quicksort Memory: 4392 kB
           -> Seq Scan on tab (...) (actual time=0.020..146.975 rows=100000 loops=1)
    with a well-correlated index
        CREATE INDEX idx ON tab (id);
        EXPLAIN ANALYZE SELECT * FROM tab ORDER BY id;

                               QUERY PLAN
        ----------------------------------------------------------
         Index Scan using idx on tab (cost=0.00..2780.26 rows=100000 width=4)
                                     (actual time=0.088..162.377 rows=100000 loops=1)
         Total runtime: 272.881 ms
  • 21.
    Node types - other
        aggregation (GROUP BY, DISTINCT)
        LIMIT
        table modification (INSERT, UPDATE, DELETE)
        set operations (INTERSECT, EXCEPT)
        subplan (for correlated subselects), initplan (for uncorrelated ones)
        CTE, window functions
        materialization
        row locking
        append (inheritance)
        ...
  • 22.
    Wrong estimate of the number of rows (in a relation, or matching a condition).
    What is the problem?
        The planner thinks the table is small, but in reality it is large.
        The planner thinks only a few rows match the condition, but in reality many do.
        or the other way round ...
    How does it show?
        an unsuitable table access method is chosen (index vs. sequential scan)
        an unsuitable join method is chosen (nested loop instead of hash/merge join)
    What causes it?
        stale statistics (e.g. right after a load)
        wrong statistics: sometimes the number of distinct values, unsuitably formulated conditions
        conditions on correlated columns (there are no cross-column statistics yet)
    LIMIT usually makes the situation considerably worse (it prefers plans with a cheap startup)
  • 23.
    Example - seemingly high selectivity
    based on a "race condition": the query is run before the statistics have been recomputed
        CREATE TABLE tab (id INT);
        CREATE INDEX idx ON tab (id);

        INSERT INTO tab SELECT * FROM generate_series(1, 100000);
        ANALYZE tab;

        DELETE FROM tab;
        INSERT INTO tab SELECT 1111 FROM generate_series(1, 100000);

        EXPLAIN ANALYZE SELECT * FROM tab WHERE id = 1111;

                               QUERY PLAN
        ----------------------------------------------------------
         Index Scan using idx on tab (cost=0.00..8.29 rows=1 width=4)
                                     (actual time=0.049..166.562 rows=100000 loops=1)
           Index Cond: (id = 1111)
        (3 rows)

        ... wait ....

        EXPLAIN ANALYZE SELECT * FROM tab WHERE id = 1111;

                               QUERY PLAN
        ----------------------------------------------------------
         Seq Scan on tab (cost=0.00..2035.00 rows=100000 width=4)
                         (actual time=0.949..158.568 rows=100000 loops=1)
           Filter: (id = 1111)
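    Instead of waiting for the statistics to be recomputed in the background, they can be refreshed by hand and inspected in the standard `pg_stats` view. The sketch below repeats the scenario on a temporary table (an assumption made so that it runs on its own; autovacuum does not process temporary tables, so here the manual ANALYZE is the only way the statistics get updated).

```sql
CREATE TEMP TABLE tab (id INT);
CREATE INDEX idx ON tab (id);
INSERT INTO tab SELECT * FROM generate_series(1, 100000);
ANALYZE tab;

DELETE FROM tab;
INSERT INTO tab SELECT 1111 FROM generate_series(1, 100000);

-- The statistics still describe the old contents (all values distinct).
SELECT n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname LIKE 'pg_temp%' AND tablename = 'tab' AND attname = 'id';

-- Refresh the statistics manually.
ANALYZE tab;

-- Now the statistics should show a single, very common value.
SELECT n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname LIKE 'pg_temp%' AND tablename = 'tab' AND attname = 'id';

-- The row estimate should now be close to the real 100000 rows.
EXPLAIN SELECT * FROM tab WHERE id = 1111;
```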
  • 24.
    Example - correlated columns
        CREATE TABLE tab (a INT, b INT);
        INSERT INTO tab SELECT i, i FROM generate_series(1, 100000) s(i);
        ANALYZE tab;

        EXPLAIN ANALYZE SELECT * FROM tab WHERE a >= 50000 AND b <= 50000;

                               QUERY PLAN
        ----------------------------------------------------------
         Seq Scan on tab (cost=0.00..1943.00 rows=25000 width=8)
                         (actual time=26.196..58.715 rows=1 loops=1)
           Filter: ((a >= 50000) AND (b <= 50000))
         Total runtime: 58.762 ms
        (3 rows)
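    The estimate of 25000 rows follows from treating the two conditions as independent: each matches about half of the table, so the planner multiplies 100000 * 0.5 * 0.5. The following sketch recomputes both the "independent" estimate and the real count directly from the data; the column aliases are made up for illustration.

```sql
CREATE TEMP TABLE tab (a INT, b INT);
INSERT INTO tab SELECT i, i FROM generate_series(1, 100000) s(i);

SELECT count(*)                               AS total_rows,
       avg((a >= 50000)::int)                 AS selectivity_a,
       avg((b <= 50000)::int)                 AS selectivity_b,
       -- what you get when the selectivities are simply multiplied
       round(count(*) * avg((a >= 50000)::int)
                      * avg((b <= 50000)::int)) AS rows_if_independent,
       -- what the data really contain
       count(*) FILTER (WHERE a >= 50000 AND b <= 50000) AS rows_actual
FROM tab;
```

    Because a and b are always equal, only the row with a = b = 50000 satisfies both conditions, while the multiplied selectivities give roughly 25000.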
  • 25.
    Other problem areas
    Unsuitable settings of the "cost" variables
        the default values are based on a "typical" system
        they do not necessarily match yours
        e.g. with SSDs the difference between random and sequential I/O largely disappears
        with fast disks (15k SAS) it partly does too, though not as markedly
        a small effective_cache_size puts indexes at a disadvantage
    Black holes
        triggers
        referential integrity (foreign keys without indexes)
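    The influence of a cost variable can be tried out safely inside a transaction. In the sketch below the value 1.1 for `random_page_cost` is only an assumption standing in for "SSD-like" storage, and the table is a temporary copy of the earlier example. The costs of index-based nodes should change; whether the chosen plan changes as well depends on the data and the other settings.

```sql
CREATE TEMP TABLE tab (id INT);
INSERT INTO tab SELECT * FROM generate_series(1, 100000);
CREATE INDEX idx ON tab (id);
ANALYZE tab;

BEGIN;
-- Plan and cost with the current settings.
EXPLAIN SELECT * FROM tab WHERE id BETWEEN 1000 AND 20000;

-- Assumed value for storage where random reads are nearly as cheap
-- as sequential ones; valid only until the end of the transaction.
SET LOCAL random_page_cost = 1.1;
EXPLAIN SELECT * FROM tab WHERE id BETWEEN 1000 AND 20000;
ROLLBACK;
```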
  • 26.
    EXPLAIN cookbook
        Check the nodes where the row count estimate is off.
            Small differences do not matter; differences of an order of magnitude are a problem.
            If the estimate is wrong, the choice of plan cannot be reliable.
            Try updating the statistics, reformulating the conditions, ...
        Look at the nodes with the largest proportional difference between cost and time.
            Are you sure the cost variables are set sensibly?
            Change the settings (in the session) and watch how the plan and the query performance change.
        When optimizing, focus on the nodes with the highest cost / actual time.
            Where most of the time is spent, optimization can gain the most.
            Could you, for example, add an index or increase work_mem?
  • 27.
    explain.depesz.com
        http://explain.depesz.com
        an excellent tool for visualizing and analyzing an explain plan
        great for sending a plan e.g. to mailing lists (it does not get mangled)
  • 28.
    explain.depesz.com
        How long did a given step take (on its own / including its child nodes)?
        How accurate was the row count estimate?
        How many rows were produced?
  • 29.
    explain.depesz.com
        Unique (cost=30938464.86..31166982.10 rows=30468966 width=89) (actual time=249353.521..250273.108 rows=342107 loops=1)
          -> Sort (cost=30938464.86..31014637.27 rows=30468966 width=89) (actual time=249353.518..250155.187 rows=342108 loops=1)
                Sort Key: (lower(u.samaccountname[1])), (g.cn[1])
                Sort Method: external merge Disk: 13176kB
                -> Append (cost=0.00..19340392.34 rows=30468966 width=89) (actual time=44.687..242695.135 rows=342108 loops=1)
                      -> Nested Loop (cost=0.00..19031015.08 rows=30385836 width=89) (actual time=44.685..240132.584 rows=2535 loops=1)
                            Join Filter: ((u.primarygroupid[1] = ANY (tmp_g.primarygrouptoken)) OR (u.gidnumber[1] = ANY (tmp_g.gidnumber)) OR (tmp_g.dn = ANY (u.memberof)) OR (tmp_g.cn[1] = ANY (u.memberof)) OR (tmp_g.dn = ANY (u.groupmembership)) OR (tmp_g.cn[1] = ANY (u.groupmembership)) OR (u.samaccountname[1] = ANY (tmp_g.memberuid)) OR (u.dn = ANY (tmp_g.member)) OR (u.cn[1] = ANY (tmp_g.member)))
                            -> Nested Loop (cost=0.00..1421.74 rows=1350 width=986) (actual time=0.054..116.528 rows=1350 loops=1)
                                  -> Nested Loop (cost=0.00..734.12 rows=1350 width=1023) (actual time=0.038..76.647 rows=1350 loops=1)
                                        -> Seq Scan on ldap_group_inheritance i (cost=0.00..46.50 rows=1350 width=166) (actual time=0.015..1.633 rows=1350 loops=1)
                                        -> Index Scan using ldap_import_groups_dn_key on ldap_import_groups tmp_g (cost=0.00..0.50 rows=1 width=940) (actual time=0.048..0.049 rows=1 loops=1350)
                                              Index Cond: (tmp_g.dn = i.groupdn)
                                  -> Index Scan using ldap_import_groups_dn_key on ldap_import_groups g (cost=0.00..0.50 rows=1 width=129) (actual time=0.022..0.026 rows=1 loops=1350)
                                        Index Cond: (g.dn = i.parentdn)
                            -> Seq Scan on ldap_import_users u (cost=0.00..3856.30 rows=83130 width=372) (actual time=0.006..26.162 rows=83130 loops=1350)
                      -> Seq Scan on ldap_import_users u (cost=0.00..4687.60 rows=83130 width=126) (actual time=0.098..2499.336 rows=339573 loops=1)
        Total runtime: 250301.001 ms
        (17 rows)
  • 30.
    pgadmin3
        http://www.pgadmin.org/
        a GUI that, among other things, can also visualize SQL queries
  • 31.
    auto_explain & explanation
    auto_explain
        a sort of complement to log_min_duration_statement
        allows logging EXPLAIN (or EXPLAIN ANALYZE) output for long-running queries
        http://developer.postgresql.org/pgdocs/postgres/auto-explain.html
    explanation
        more flexible work with plan information directly in SQL
        http://www.pgxn.org/dist/explanation/doc/explanation.html

        SELECT node_type, strategy, actual_startup_time, actual_total_time
          FROM explanation(
                 query := $$ SELECT * FROM pg_class WHERE relname = 'users' $$,
                 analyzed := true);

         node_type  | strategy | actual_startup_time | actual_total_time
        ------------+----------+---------------------+-------------------
         Index Scan |          | 00:00:00.000017     | 00:00:00.000017
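    For a quick trial, auto_explain can be switched on for a single session instead of the whole server. The sketch below assumes the module is installed and that the session has the privileges needed for LOAD (typically a superuser); the 250 ms threshold and the sample query are arbitrary choices for illustration.

```sql
-- Load the module into the current session only.
LOAD 'auto_explain';

-- Log the plan of every statement that runs longer than the threshold.
SET auto_explain.log_min_duration = '250ms';

-- Include actual times and row counts, as EXPLAIN ANALYZE would.
SET auto_explain.log_analyze = on;

-- Any statement slower than the threshold now gets its plan
-- written to the server log.
SELECT count(*) FROM pg_class c JOIN pg_attribute a ON a.attrelid = c.oid;
```

    The plans end up in the server log, not in the client output, so access to the log is needed to read them.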
  • 32.
    Links
        Query Execution Techniques in PostgreSQL, Neil Conway, 2007
            http://neilconway.org/talks/executor.pdf
        Čtení prováděcích plánů v PostgreSQL (Reading execution plans in PostgreSQL), Pavel Stěhule, 2008
            http://www.root.cz/clanky/cteni-provadecich-planu-v-postgresql/
        Using EXPLAIN @ wiki
            http://wiki.postgresql.org/wiki/Using_EXPLAIN
        Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT @ wiki
            http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT
        Explaining EXPLAIN, R. Treat, G. S. Mullane, AndrewSN, Magnifikus, B. Encina, N. Conway, 2008
            http://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf