SQL Top-N and Pagination Pattern     Session 403   Maxym Kharchenko

SQL Top-N and pagination pattern (IOUG)


SQL Top-N and Pagination pattern
IOUG Collaborate 2013, Denver CO, Wednesday April 10

Notes
  • This feels a bit 1990s … Let’s see some examples that are more 21st century
  • Nowadays every website has a search button. Users can search for anything. When you search you get some results, but these are normally huge. You need to cut the results and display a few that are most interesting and fit the page (aka: top-N), and then you need to be able to move on to less interesting results (aka: paginate). “Most interesting” can be defined in several different ways.
  • Notice no SORT step
  • Likely: return a few rows, ‘freeze’, then return a few rows again, etc. Filtering in “plain English”: reading junk.
  • Notice no TABLE ACCESS BY INDEX ROWID – all data is in the index
  • For pagination, equality filter and ORDER BY conditions are tradeable. You can either “fix” the condition (where a=…) or include it in the ORDER BY.
  • Can work if number of “filtered” conditions is small
  • … And then we run a select
  • Alternatively, the statement can extract the current version from the index itself:
      SELECT * FROM (
        SELECT name, population FROM cities2
        WHERE budget_surplus='Y'
          AND version=(SELECT max(version) FROM cities2 WHERE budget_surplus='Y')
        ORDER BY population DESC
      ) WHERE rownum <= 5;
  • SQL Top-N and pagination pattern (IOUG)

    1. SQL Top-N and Pagination Pattern, Session 403, Maxym Kharchenko
    2. What is top-N
         • Give me the top 10 salaries in the “Sales” dept
         • Give me the top 10 best selling books
         • Give me the 10 latest orders
    3. Lolcats!
    4. What is top-N
         SELECT picture
         FROM images
         WHERE subject='lolcats'
         /
       Slide annotations: “The internet”, “Lolcats”, sorted by: funny, view more: next >
    5. Setup
         SQL> @desc cities
          Name                        Null?     Type
          --------------------------  --------  ---------------
          NAME                        NOT NULL  VARCHAR2(100)
          STATE                       NOT NULL  VARCHAR2(100)
          POPULATION                  NOT NULL  NUMBER
         PCTFREE 99 PCTUSED 1
         http://www.census.gov
    6. Naïve Top-N
       Give me the top 5 cities by population
         SELECT name, population
         FROM cities
         WHERE rownum <= 5
         ORDER BY population DESC;

         NAME                    Pop
         ----------------------  ------
         Robertsdale city         5,276
         Glen Allen town (pt.)      458
         Boligee town               328
         Riverview town             184
         Altoona town (pt.)          30

       Statistics: 7 consistent gets
    7. Naïve Top-N explained
         -----------------------------------------------------------------
         | Id | Operation           | Name   | Rows | Bytes | Time     |
         -----------------------------------------------------------------
         |  0 | SELECT STATEMENT    |        |    5 |   110 | 00:00:01 |
         |  1 |  SORT ORDER BY      |        |    5 |   110 | 00:00:01 |
         |* 2 |   COUNT STOPKEY     |        |      |       |          |
         |  3 |    TABLE ACCESS FULL| CITIES |   10 |   220 | 00:00:01 |
         -----------------------------------------------------------------
    8. Correct top-N query
       >= 12c:
         SELECT name, population
         FROM cities
         ORDER BY population DESC
         FETCH FIRST 5 ROWS ONLY
       <= 11g:
         SELECT * FROM (
           SELECT name, population
           FROM cities
           ORDER BY population DESC
         ) WHERE rownum <= 5
    9. Correct top-N query: Execution
         SELECT * FROM (
           SELECT name, population
           FROM cities
           ORDER BY population DESC
         ) WHERE rownum <= 5;

         NAME                  Pop
         --------------------  ----------
         Los Angeles city       3,792,621
         Chicago city (pt.)     2,695,598
         Chicago city (pt.)     2,695,598
         Chicago city           2,695,598
         New York city (pt.)    2,504,700

       Statistics: 56024 consistent gets
    10. Reading, filtering and sorting
         ---------------------------------------------------------------------
         | Id | Operation               | Name   | Rows  |TempSpc| Time     |
         ---------------------------------------------------------------------
         |  0 | SELECT STATEMENT        |        |     5 |       | 00:01:58 |
         |* 1 |  COUNT STOPKEY          |        |       |       |          |
         |  2 |   VIEW                  |        | 56072 |       | 00:01:58 |
         |* 3 |    SORT ORDER BY STOPKEY|        | 56072 | 1768K | 00:01:58 |
         |  4 |     TABLE ACCESS FULL   | CITIES | 56072 |       | 00:01:54 |
         ---------------------------------------------------------------------
    11. Reading, filtering and sorting
         ----------------------------------------------------------------------
         | Id | Operation               | Name    | Rows  |TempSpc| Time     |
         ----------------------------------------------------------------------
         |  0 | SELECT STATEMENT        |         |     5 |       | 00:01:58 |
         |* 1 |  COUNT STOPKEY          |         |       |       |          |
         |  2 |   VIEW                  |         | 56072 |       | 00:01:58 |
         |* 3 |    SORT ORDER BY STOPKEY|         | 56072 | 1768K | 00:01:58 |
         |  4 |     TABLE ACCESS FULL   | O_CITIES| 56072 |       | 00:01:54 |
         ----------------------------------------------------------------------
    12. Proper data structure
        Ordered By: Population
         CREATE INDEX i_pop ON cities(population);
         --------------------------------------------------------------------
         | Id | Operation                      | Name   | Rows  | Time     |
         --------------------------------------------------------------------
         |  0 | SELECT STATEMENT               |        |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                 |        |       |          |
         |  2 |   VIEW                         |        |    10 | 00:00:01 |
         |  3 |    TABLE ACCESS BY INDEX ROWID | CITIES | 56072 | 00:00:01 |
         |  4 |     INDEX RANGE SCAN DESCENDING| I_POP  |    10 | 00:00:01 |
         --------------------------------------------------------------------
        Statistics: 12 consistent gets
    13. Why indexes work
        Ordered By: Population
         CREATE INDEX i_pop ON cities(population);
         • Colocation
         • Can stop after reading N rows
         • No Sort
    14. More elaborate top-N
        Give me the top 5 cities by population in Florida
         SELECT * FROM (
           SELECT name, population
           FROM cities
           WHERE state='Florida'
           ORDER BY population DESC
         ) WHERE rownum <= 5;

         NAME                  Pop
         --------------------  ----------
         Jacksonville city        821,784
         Miami city               399,457
         Tampa city               335,709
         St. Petersburg city      244,769
         Orlando city             238,300

        Statistics: 264 consistent gets
    15. Uncertain nature of filtering
        Ordered By: Population
         WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 5;
           Statistics: 264 consistent gets
         WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 200;
           Statistics: 19747 consistent gets
    16. Multi column indexes
         CREATE INDEX i_state_pop ON cities(state, population);
        [Diagram: index entries grouped by State (AL, AK, AZ, CO, FL, MA, WA). Across the whole index: *NOT* ordered by population. Within where state='FL': ordered by population.]
    17. Multicolumn indexes
         -------------------------------------------------------------------------
         | Id | Operation                      | Name        | Rows  | Time     |
         -------------------------------------------------------------------------
         |  0 | SELECT STATEMENT               |             |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                 |             |       |          |
         |  2 |   VIEW                         |             |    11 | 00:00:01 |
         |  3 |    TABLE ACCESS BY INDEX ROWID | CITIES      |  1099 | 00:00:01 |
         |* 4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |    11 | 00:00:01 |
         -------------------------------------------------------------------------
         Predicate Information (identified by operation id):
            1 - filter(ROWNUM<=5)
            4 - access("STATE"='Florida')
        Statistics: 12 consistent gets
    18. Trips to the table
         -------------------------------------------------------------------------
         | Id | Operation                      | Name        | Rows  | Time     |
         -------------------------------------------------------------------------
         |  0 | SELECT STATEMENT               |             |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                 |             |       |          |
         |  2 |   VIEW                         |             |    11 | 00:00:01 |
         |  3 |    TABLE ACCESS BY INDEX ROWID | CITIES      |  1099 | 00:00:01 |
         |* 4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |    11 | 00:00:01 |
         -------------------------------------------------------------------------
         Predicate Information (identified by operation id):
            1 - filter(ROWNUM<=5)
            4 - access("STATE"='Florida')
        Statistics: 12 consistent gets
    19. Index range scan: cost math
        [Figure labels: ~4, ~10, 500. Window: 500 records]
    20. Covering index
         CREATE INDEX i_state_pop ON cities (state, population);
           Statistics: 12 consistent gets
         CREATE INDEX i_state_pop_c ON cities (state, population, name);
           Statistics: 7 consistent gets
         --------------------------------------------------------------------------
         | Id | Operation                    | Name          | Rows  | Time     |
         --------------------------------------------------------------------------
         |  0 | SELECT STATEMENT             |               |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY               |               |       |          |
         |  2 |   VIEW                       |               |    10 | 00:00:01 |
         |* 3 |    INDEX RANGE SCAN DESCENDING| I_STATE_POP_C|   506 | 00:00:01 |
         --------------------------------------------------------------------------
    21. Ideal top-N
         • Use the index
         • Make the best index
         • And read only from the index
    22. Less than ideal top-N
         • Effect of query conditions
         • Effect of deletes and updates
         • Technicalities
    23. Condition better!
         CREATE TABLE orders (
           …
           active char(1) NOT NULL CHECK (active IN ('Y', 'N'))
         );
         WHERE active != 'N' ORDER BY order_date DESC) WHERE rownum <= 10;
           Statistics: 12345 consistent gets
         WHERE active = 'Y' ORDER BY order_date DESC) WHERE rownum <= 10;
           Statistics: 10 consistent gets
    24. Trade WHERE for ORDER BY
         CREATE INDEX t_idx ON t(a, b, c);
         SELECT * FROM (SELECT * FROM t WHERE a=12 ORDER BY c)
         WHERE rownum <= 10;

         WHERE a=12 ORDER BY c
           Statistics: 1200 consistent gets
         WHERE a=12 ORDER BY b, c
           Statistics: 12 consistent gets
         WHERE a=12 AND b=0 ORDER BY c
           Statistics: 12 consistent gets
    25. Tolerate filtering
         SELECT * FROM (
           SELECT name, population
           FROM cities
           WHERE state != 'Florida'
           ORDER BY population DESC
         ) WHERE rownum <= 10;
        Statistics: 28 consistent gets
    26. Tolerate filtering
         --------------------------------------------------------------------
         | Id | Operation                      | Name   | Rows  | Time     |
         --------------------------------------------------------------------
         |  0 | SELECT STATEMENT               |        |    11 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                 |        |       |          |
         |  2 |   VIEW                         |        |    11 | 00:00:01 |
         |* 3 |    TABLE ACCESS BY INDEX ROWID | CITIES | 55566 | 00:00:01 |
         |  4 |     INDEX RANGE SCAN DESCENDING| I_POP  |    12 | 00:00:01 |
         --------------------------------------------------------------------
         Predicate Information (identified by operation id):
         ---------------------------------------------------
            1 - filter(ROWNUM<=10)
            3 - filter("STATE"<>'Florida')
    27. Updates and Deletes
         SQL> @desc cities2
          Name                    Null?     Type
          ----------------------  --------  ----------------
          NAME                    NOT NULL  VARCHAR2(100)
          STATE                   NOT NULL  VARCHAR2(100)
          POPULATION              NOT NULL  NUMBER
          BUDGET_SURPLUS          NOT NULL  VARCHAR2(1)
         CREATE INDEX i2_pop
         ON cities2(budget_surplus, population, name);
    28. Updates and Deletes
         SELECT * FROM (
           SELECT name, population FROM cities2
           WHERE budget_surplus='Y' ORDER BY population DESC
         ) WHERE rownum <= 5;
         -------------------------------------------------------------------
         | Id | Operation                     | Name   | Rows  | Time     |
         -------------------------------------------------------------------
         |  0 | SELECT STATEMENT              |        |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                |        |       |          |
         |  2 |   VIEW                        |        |    12 | 00:00:01 |
         |* 3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
         -------------------------------------------------------------------
        Statistics: 7 consistent gets
    29. Updates and Deletes
         UPDATE cities2 SET budget_surplus='N'
         WHERE rowid IN (
           SELECT * FROM (
             SELECT rowid FROM cities2 ORDER BY population DESC
           ) WHERE rownum <= 200);
         -------------------------------------------------------------------
         | Id | Operation                     | Name   | Rows  | Time     |
         -------------------------------------------------------------------
         |  0 | SELECT STATEMENT              |        |     5 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                |        |       |          |
         |  2 |   VIEW                        |        |    12 | 00:00:01 |
         |* 3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
         -------------------------------------------------------------------
        Statistics: 207 consistent gets
    30. Updates and Deletes
    31. Updates and Deletes
         ALTER TABLE cities2 ADD (version number default 0 NOT NULL);
         CREATE INDEX i2_vpop ON cities2(budget_surplus, version, population);
         UPDATE cities2 SET version=1
         WHERE budget_surplus='Y' AND version=0;
        [Diagram labels: Budget_surplus (Y, Y); Budget_surplus, Version (0, Y, 1), Population]
    32. Updates and Deletes
         SELECT * FROM (
           SELECT name, population FROM cities2
           WHERE budget_surplus='Y' AND version=1
           ORDER BY population DESC
         ) WHERE rownum <= 5;
         --------------------------------------------------------------------
         | Id | Operation                     | Name    | Rows  | Time     |
         --------------------------------------------------------------------
         |  0 | SELECT STATEMENT              |         |     1 | 00:00:01 |
         |* 1 |  COUNT STOPKEY                |         |       |          |
         |  2 |   VIEW                        |         |     1 | 00:00:01 |
         |* 3 |    INDEX RANGE SCAN DESCENDING| I2_VPOP |     1 | 00:00:01 |
         --------------------------------------------------------------------
        Statistics: 9 consistent gets
    33. Pagination
        First page:
         SELECT * FROM (
           SELECT name, population
           FROM cities
           WHERE state='Florida'
           ORDER BY population DESC
         ) WHERE rownum <= 10;
        Second page:
         SELECT * FROM (
           SELECT * FROM (
             SELECT name, population, rownum AS rn
             FROM cities
             WHERE state='Florida'
             ORDER BY population DESC
           ) WHERE rownum <= 20
         ) WHERE rn > 10;
    34. Dumb Pagination
         ) WHERE rownum <= 20
         ) WHERE rn > 10;
           Statistics: 22 consistent gets
         ) WHERE rownum <= 30
         ) WHERE rn > 20;
           Statistics: 32 consistent gets
    35. Smart pagination
        Offset by row number:
         SELECT * FROM (
           SELECT * FROM (
             SELECT name, population, rownum AS rn
             FROM cities
             WHERE state='Florida'
             ORDER BY population DESC
           ) WHERE rownum <= 20
         ) WHERE rn > 10;
           Statistics: 22 consistent gets
        Continue from the last value seen:
         SELECT * FROM (
           SELECT name, population
           FROM cities
           WHERE state='Florida'
             AND population < 154750
           ORDER BY population DESC
         ) WHERE rownum <= 10;
           Statistics: 12 consistent gets
    36. Top-N with joins: Rules
         • ORDER BY only the LEADING table
         • Use NESTED LOOPS
         • Build indexes for STREAMING
    37. Top-N with joins
         SELECT * FROM (
           SELECT c.name as city, c.population, s.capital
           FROM cities c, states s
           WHERE c.state_id = s.id
             AND c.state='Florida'
           ORDER BY c.population DESC
         ) WHERE rownum <= 5
         /
        Driving table: Filter state, Order By population, Join state_id, Select name
        Joined to table: Join id, Select capital
    38. Top-N with joins: Good
         -------------------------------------------------------
         | Id | Operation           | Name | Rows | Time     |
         -------------------------------------------------------
         |  0 | SELECT STATEMENT    |      |    5 | 00:00:13 |
         |* 1 |  COUNT STOPKEY      |      |      |          |
         |  2 |   VIEW              |      |   10 | 00:00:13 |
         |  3 |    NESTED LOOPS     |      |   10 | 00:00:13 |
         |* 4 |     INDEX RANGE SCAN| I_C  |  506 | 00:00:07 |
         |* 5 |     INDEX RANGE SCAN| I_S  |    1 | 00:00:01 |
         -------------------------------------------------------
    39. Top-N with joins: Bad
         -----------------------------------------------------------
         | Id | Operation               | Name | Rows | Time     |
         -----------------------------------------------------------
         |  0 | SELECT STATEMENT        |      |    5 | 00:00:07 |
         |* 1 |  COUNT STOPKEY          |      |      |          |
         |  2 |   VIEW                  |      |   10 | 00:00:07 |
         |* 3 |    SORT ORDER BY STOPKEY|      |   10 | 00:00:07 |
         |* 4 |     HASH JOIN           |      |   10 | 00:00:07 |
         |* 5 |      INDEX RANGE SCAN   | I_C  |  506 | 00:00:07 |
         |* 6 |      INDEX RANGE SCAN   | I_S  |    1 | 00:00:01 |
         -----------------------------------------------------------
    40. Gotchas? TMI: “Too many indexes”
    41. Thank you!
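
The "smart pagination" idea on slide 35 (continue from the last value seen instead of counting rows) hard-codes the boundary value 154750. A minimal sketch of how it might be parameterized follows. It is not taken from the deck: the bind variables :last_population and :last_name are hypothetical, and it assumes the covering index i_state_pop_c (state, population, name) from slide 20. The name column is added as a tie-breaker so that cities with equal population are not skipped at a page boundary. Because name is not unique in the sample data (slide 9 shows repeated rows), a real implementation would probably need a unique column as the final tie-breaker. Whether the optimizer still produces an index-only, sort-free plan for the OR predicate should be checked with an execution plan. The second statement shows the offset-style page from slide 33 written with the 12c row-limiting clause.

```sql
-- Sketch under stated assumptions, not the author's code.
-- :last_population and :last_name are hypothetical bind variables holding
-- the values from the last row of the previously displayed page.
SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state = 'Florida'
    AND (population < :last_population
         OR (population = :last_population AND name < :last_name))
  ORDER BY population DESC, name DESC
) WHERE rownum <= 10;

-- 12c and later: the second page by offset, using the row-limiting clause.
-- Like the rownum version, it still has to read the rows it skips.
SELECT name, population
FROM cities
WHERE state = 'Florida'
ORDER BY population DESC
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
```

The first form reads roughly one page of index entries no matter how deep the user has paged, which is the effect the 12 versus 22 consistent gets on slide 35 illustrates. The second form is shorter to write, but its cost grows with the offset, as on slide 34.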
