SQL Top-N and pagination pattern (IOUG)
 

IOUG Collaborate 2013, Denver CO, Wednesday April 10

  • This feels a bit 1990s … Let’s see some examples that are more 21st century
  • Nowadays every website has a search button. Users can search for anything. When you search you get some results, but these are normally huge. You need to cut the results and display a few that are most interesting and fit the page (aka: top-n), and then you need to be able to move on to less interesting results (aka: paginate). “Most interesting” can be defined several different ways.
  • Notice no SORT step
  • Likely: return a few rows, ‘freeze’, then return a few rows again, etc. Filtering in “plain English”: reading junk.
  • Notice no TABLE ACCESS BY INDEX ROWID – all data is in the index
  • For pagination, equality filter and ORDER BY conditions are tradeable. You can either “fix” the condition (where a=…) or include it in the ORDER BY.
  • Can work if number of “filtered” conditions is small
  • … And then we run a select
  • Alternatively, the statement can extract the current version from the index itself:
      SELECT * FROM (
        SELECT name, population
        FROM cities2
        WHERE budget_surplus='Y'
          AND version=(SELECT max(version) FROM cities2 WHERE budget_surplus='Y')
        ORDER BY population DESC
      ) WHERE rownum <= 5;

Presentation Transcript

  • SQL Top-N and Pagination Pattern
    Session 403
    Maxym Kharchenko
  • What is top-N
      • Give me the top 10 salaries in the “Sales” dept
      • Give me the top 10 best selling books
      • Give me the 10 latest orders
  • Lolcats!
  • What is top-N (“The internet”, “Lolcats”)
      SELECT picture
      FROM images
      WHERE subject='lolcats'
      /
    sorted by: funny
    view more: next >
  • Setup
      SQL> @desc cities
       Name                       Null?    Type
       -------------------------- -------- ---------------
       NAME                       NOT NULL VARCHAR2(100)
       STATE                      NOT NULL VARCHAR2(100)
       POPULATION                 NOT NULL NUMBER
      PCTFREE 99 PCTUSED 1
    http://www.census.gov
  • Naïve Top-N
    Give me the top 5 cities by population
      SELECT name, population
      FROM cities
      WHERE rownum <= 5
      ORDER BY population DESC;

      NAME                    Pop
      ----------------------  ------
      Robertsdale city         5,276
      Glen Allen town (pt.)      458
      Boligee town               328
      Riverview town             184
      Altoona town (pt.)          30
    Statistics: 7 consistent gets
  • Naïve Top-N explained
      -----------------------------------------------------------------
      | Id  | Operation           | Name   | Rows | Bytes | Time     |
      -----------------------------------------------------------------
      |   0 | SELECT STATEMENT    |        |    5 |   110 | 00:00:01 |
      |   1 |  SORT ORDER BY      |        |    5 |   110 | 00:00:01 |
      |*  2 |   COUNT STOPKEY     |        |      |       |          |
      |   3 |    TABLE ACCESS FULL| CITIES |   10 |   220 | 00:00:01 |
      -----------------------------------------------------------------
  • Correct top-N query
    >= 12c:
      SELECT name, population
      FROM cities
      ORDER BY population DESC
      FETCH FIRST 5 ROWS ONLY
    <= 11g:
      SELECT * FROM (
        SELECT name, population
        FROM cities
        ORDER BY population DESC
      ) WHERE rownum <= 5
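The difference between the naïve and the correct form can be reproduced outside Oracle. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in: `LIMIT` plays the role of `rownum` / `FETCH FIRST`, and the table contents are hypothetical sample rows, not the census data from the slides.

```python
import sqlite3

# Hypothetical sample rows (not the census data used in the deck).
ROWS = [
    ("Smallville", 30),
    ("Midtown", 5276),
    ("Big City", 3792621),
    ("Hamlet", 184),
    ("Metropolis", 2695598),
    ("Village", 328),
    ("Port Town", 458),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT NOT NULL, population INTEGER NOT NULL)")
conn.executemany("INSERT INTO cities VALUES (?, ?)", ROWS)

# Naive: cut the set down to 3 arbitrary rows first, then sort only those 3.
naive = conn.execute(
    "SELECT name, population FROM "
    "(SELECT name, population FROM cities LIMIT 3) "
    "ORDER BY population DESC"
).fetchall()

# Correct: sort the whole set, then keep the first 3 rows.
correct = conn.execute(
    "SELECT name, population FROM cities ORDER BY population DESC LIMIT 3"
).fetchall()

print("naive:  ", naive)
print("correct:", correct)
```

The naïve result is sorted, but it is only guaranteed to be some three rows of the table; the correct form is guaranteed to return the three largest populations.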
  • Correct top-N query: Execution
      SELECT * FROM (
        SELECT name, population
        FROM cities
        ORDER BY population DESC
      ) WHERE rownum <= 5;

      NAME                  Pop
      --------------------  ----------
      Los Angeles city       3,792,621
      Chicago city (pt.)     2,695,598
      Chicago city (pt.)     2,695,598
      Chicago city           2,695,598
      New York city (pt.)    2,504,700
    Statistics: 56024 consistent gets
  • Reading, filtering and sorting
      ---------------------------------------------------------------------
      | Id  | Operation               | Name   | Rows  |TempSpc| Time     |
      ---------------------------------------------------------------------
      |   0 | SELECT STATEMENT        |        |     5 |       | 00:01:58 |
      |*  1 |  COUNT STOPKEY          |        |       |       |          |
      |   2 |   VIEW                  |        | 56072 |       | 00:01:58 |
      |*  3 |    SORT ORDER BY STOPKEY|        | 56072 | 1768K | 00:01:58 |
      |   4 |     TABLE ACCESS FULL   | CITIES | 56072 |       | 00:01:54 |
      ---------------------------------------------------------------------
  • Reading, filtering and sorting
      -----------------------------------------------------------------------
      | Id  | Operation               | Name     | Rows  |TempSpc| Time     |
      -----------------------------------------------------------------------
      |   0 | SELECT STATEMENT        |          |     5 |       | 00:01:58 |
      |*  1 |  COUNT STOPKEY          |          |       |       |          |
      |   2 |   VIEW                  |          | 56072 |       | 00:01:58 |
      |*  3 |    SORT ORDER BY STOPKEY|          | 56072 | 1768K | 00:01:58 |
      |   4 |     TABLE ACCESS FULL   | O_CITIES | 56072 |       | 00:01:54 |
      -----------------------------------------------------------------------
  • Proper data structure
    Ordered By: Population
      CREATE INDEX i_pop ON cities(population);
      ----------------------------------------------------------------------
      | Id  | Operation                     | Name   | Rows  | Time     |
      ----------------------------------------------------------------------
      |   0 | SELECT STATEMENT              |        |     5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY                |        |       |          |
      |   2 |   VIEW                        |        |    10 | 00:00:01 |
      |   3 |    TABLE ACCESS BY INDEX ROWID| CITIES | 56072 | 00:00:01 |
      |   4 |     INDEX RANGE SCAN DESCENDING| I_POP |    10 | 00:00:01 |
      ----------------------------------------------------------------------
    Statistics: 12 consistent gets
  • Why indexes work
    Ordered By: Population
      CREATE INDEX i_pop ON cities(population);
      • Colocation
      • Can stop after reading N rows
      • No Sort
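The "No Sort" point can be observed in any engine that exposes its plan. Here is a small sketch, assuming SQLite rather than Oracle: `EXPLAIN QUERY PLAN` reports a temporary B-tree when a sort step is needed, which is used as a rough analogue of Oracle's SORT ORDER BY. The `plan` and `needs_sort` helpers are hypothetical names introduced for this illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def needs_sort(plan_lines):
    """SQLite mentions a temporary B-tree whenever it has to sort."""
    return any("TEMP B-TREE" in line for line in plan_lines)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities ("
    "name TEXT NOT NULL, state TEXT NOT NULL, population INTEGER NOT NULL)"
)

TOP5 = "SELECT name, population FROM cities ORDER BY population DESC LIMIT 5"

# Without an index the engine has to read everything and sort it.
plan_without_index = plan(conn, TOP5)

# With an index on the ORDER BY column it can walk the index backwards
# and stop after five entries.
conn.execute("CREATE INDEX i_pop ON cities(population)")
plan_with_index = plan(conn, TOP5)

print(plan_without_index, needs_sort(plan_without_index))
print(plan_with_index, needs_sort(plan_with_index))
```

The exact plan wording differs between SQLite versions, so the sketch checks only for the presence of a sort step.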
  • More elaborate top-N
    Give me the top 5 cities by population in Florida
      SELECT * FROM (
        SELECT name, population
        FROM cities
        WHERE state='Florida'
        ORDER BY population DESC
      ) WHERE rownum <= 5;

      NAME                  Pop
      --------------------  ----------
      Jacksonville city        821,784
      Miami city               399,457
      Tampa city               335,709
      St. Petersburg city      244,769
      Orlando city             238,300
    Statistics: 264 consistent gets
  • Uncertain nature of filtering
    Ordered By: Population
      WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 5;
        Statistics: 264 consistent gets
      WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 200;
        Statistics: 19747 consistent gets
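Why the cost of a filtered top-N is hard to predict when the index is ordered by population only can be sketched with a plain-Python simulation. This is not Oracle's cost model: it just walks a list sorted by population and counts how many entries must be read before N rows of the wanted state turn up. The data is synthetic (50 hypothetical states, seeded random populations) and `entries_read_for_top_n` is a name invented for this example.

```python
import random

def entries_read_for_top_n(index_entries, wanted_state, n):
    """Walk entries in index order; return how many were read by the time
    n entries with wanted_state were found (all of them if fewer exist)."""
    found = 0
    for read, (_population, state) in enumerate(index_entries, start=1):
        if state == wanted_state:
            found += 1
            if found == n:
                return read
    return len(index_entries)

rng = random.Random(0)  # fixed seed so runs are repeatable

# Synthetic "index" on population: (population, state), largest first.
entries = [(rng.randrange(1, 4_000_000), f"S{i % 50}") for i in range(50_000)]
entries.sort(reverse=True)

read_for_5 = entries_read_for_top_n(entries, "S7", 5)
read_for_200 = entries_read_for_top_n(entries, "S7", 200)

print("entries read for top 5:  ", read_for_5)
print("entries read for top 200:", read_for_200)
```

Every entry that belongs to another state is read and discarded, so the work depends on how the wanted rows happen to be spread through the index, not only on N.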
  • Multi column indexes
      CREATE INDEX i_state_pop ON cities(state, population);
    [Diagram: index entries grouped by State (AL, AK, AZ, CO, FL, MA, WA). Across the whole index, Population is *NOT* ordered; within one state (where state='FL'), entries are ordered by Population.]
  • Multicolumn indexes
      ---------------------------------------------------------------------------
      | Id  | Operation                      | Name        | Rows | Time     |
      ---------------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |             |    5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY                 |             |      |          |
      |   2 |   VIEW                         |             |   11 | 00:00:01 |
      |   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      | 1099 | 00:00:01 |
      |*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |   11 | 00:00:01 |
      ---------------------------------------------------------------------------
      Predicate Information (identified by operation id):
         1 - filter(ROWNUM<=5)
         4 - access("STATE"='Florida')
    Statistics: 12 consistent gets
  • Trips to the table
      ---------------------------------------------------------------------------
      | Id  | Operation                      | Name        | Rows | Time     |
      ---------------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |             |    5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY                 |             |      |          |
      |   2 |   VIEW                         |             |   11 | 00:00:01 |
      |   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      | 1099 | 00:00:01 |
      |*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |   11 | 00:00:01 |
      ---------------------------------------------------------------------------
      Predicate Information (identified by operation id):
         1 - filter(ROWNUM<=5)
         4 - access("STATE"='Florida')
    Statistics: 12 consistent gets
  • Index range scan: cost math
    [Diagram: ~4, ~10, 500. Window: 500 records]
  • Covering index
      CREATE INDEX i_state_pop ON cities (state, population);
        Statistics: 12 consistent gets
      CREATE INDEX i_state_pop_c ON cities (state, population, name);
        Statistics: 7 consistent gets
      --------------------------------------------------------------------------
      | Id  | Operation                    | Name          | Rows | Time     |
      --------------------------------------------------------------------------
      |   0 | SELECT STATEMENT             |               |    5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY               |               |      |          |
      |   2 |   VIEW                       |               |   10 | 00:00:01 |
      |*  3 |    INDEX RANGE SCAN DESCENDING| I_STATE_POP_C |  506 | 00:00:01 |
      --------------------------------------------------------------------------
  • Ideal top-N
      • Use the index
      • Make the best index
      • And read only from the index
  • Less than ideal top-N
      • Effect of query conditions
      • Effect of deletes and updates
      • Technicalities
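The step from a two-column index to a covering index can be sketched the same way. The example below assumes SQLite as a stand-in for Oracle: its plan says `COVERING INDEX` when the query is answered from the index alone, which roughly corresponds to the missing TABLE ACCESS BY INDEX ROWID step. The index names follow the slides; the `plan` helper is hypothetical.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities ("
    "name TEXT NOT NULL, state TEXT NOT NULL, population INTEGER NOT NULL)"
)

QUERY = (
    "SELECT name, population FROM cities "
    "WHERE state = 'Florida' ORDER BY population DESC LIMIT 5"
)

# (state, population): filter and order come from the index,
# but each name still needs a trip to the table.
conn.execute("CREATE INDEX i_state_pop ON cities(state, population)")
plan_two_columns = plan(conn, QUERY)

# (state, population, name): everything the query needs is in the index.
conn.execute("DROP INDEX i_state_pop")
conn.execute("CREATE INDEX i_state_pop_c ON cities(state, population, name)")
plan_covering = plan(conn, QUERY)

print(plan_two_columns)
print(plan_covering)
```

Neither plan should contain a sort step; only the second one should be reported as covering.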
  • Condition better!
      CREATE TABLE orders ( … active char(1) NOT NULL CHECK (active IN ('Y', 'N'))
      WHERE active != 'N' ORDER BY order_date DESC) WHERE rownum <= 10;
        Statistics: 12345 consistent gets
      WHERE active = 'Y' ORDER BY order_date DESC) WHERE rownum <= 10;
        Statistics: 10 consistent gets
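A sketch of why the equality form is friendlier to an index, again assuming SQLite instead of Oracle. The slide does not show the index, so `i_active_date` on `(active, order_date)` and the `id` column are assumptions made for this illustration, as is the `plan` helper.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "active TEXT NOT NULL CHECK (active IN ('Y', 'N')), "
    "order_date TEXT NOT NULL)"
)
# Assumed index: equality column first, ORDER BY column second.
conn.execute("CREATE INDEX i_active_date ON orders(active, order_date)")

# Logically equivalent under the CHECK constraint, but the inequality
# cannot be used to position inside the index.
NOT_EQUAL = (
    "SELECT id, order_date FROM orders "
    "WHERE active != 'N' ORDER BY order_date DESC LIMIT 10"
)
EQUAL = (
    "SELECT id, order_date FROM orders "
    "WHERE active = 'Y' ORDER BY order_date DESC LIMIT 10"
)

plan_not_equal = plan(conn, NOT_EQUAL)
plan_equal = plan(conn, EQUAL)

print(plan_not_equal)
print(plan_equal)
```

With the equality predicate the engine can jump to the `active = 'Y'` part of the index and read it in date order; with `!=` it falls back to reading and sorting.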
  • Trade WHERE for ORDER BY
      CREATE INDEX t_idx ON t(a, b, c);
      SELECT * FROM (SELECT * FROM t WHERE a=12 ORDER BY c)
      WHERE rownum <= 10;

      WHERE a=12 ORDER BY c
        Statistics: 1200 consistent gets
      WHERE a=12 ORDER BY b, c
        Statistics: 12 consistent gets
      WHERE a=12 AND b=0 ORDER BY c
        Statistics: 12 consistent gets
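The three variants on this slide can be compared with a short sketch, assuming SQLite in place of Oracle and `LIMIT` in place of `rownum`. A sort step in the plan is used as a proxy for the expensive case; the `plan` and `needs_sort` helpers are hypothetical names.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def needs_sort(plan_lines):
    """SQLite mentions a temporary B-tree whenever it has to sort."""
    return any("TEMP B-TREE" in line for line in plan_lines)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER NOT NULL, b INTEGER NOT NULL, c INTEGER NOT NULL)")
conn.execute("CREATE INDEX t_idx ON t(a, b, c)")

# Column b sits between the filter column and the sort column,
# so the index order (a, b, c) does not deliver rows ordered by c alone.
gap = plan(conn, "SELECT * FROM t WHERE a = 12 ORDER BY c LIMIT 10")

# Option 1: add b to the ORDER BY (this changes the result order).
order_by_b = plan(conn, "SELECT * FROM t WHERE a = 12 ORDER BY b, c LIMIT 10")

# Option 2: fix b with an equality filter.
fix_b = plan(conn, "SELECT * FROM t WHERE a = 12 AND b = 0 ORDER BY c LIMIT 10")

for name, p in [("gap", gap), ("order_by_b", order_by_b), ("fix_b", fix_b)]:
    print(name, needs_sort(p), p)
```

Note that the second and third variants are not the same query as the first: they return different rows or a different order, which is the trade the slide title refers to.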
  • Tolerate filtering
      SELECT * FROM (
        SELECT name, population
        FROM cities
        WHERE state != 'Florida'
        ORDER BY population DESC
      ) WHERE rownum <= 10;
    Statistics: 28 consistent gets
  • Tolerate filtering
      ----------------------------------------------------------------------
      | Id  | Operation                      | Name   | Rows  | Time     |
      ----------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |        |    11 | 00:00:01 |
      |*  1 |  COUNT STOPKEY                 |        |       |          |
      |   2 |   VIEW                         |        |    11 | 00:00:01 |
      |*  3 |    TABLE ACCESS BY INDEX ROWID | CITIES | 55566 | 00:00:01 |
      |   4 |     INDEX RANGE SCAN DESCENDING| I_POP  |    12 | 00:00:01 |
      ----------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter(ROWNUM<=10)
         3 - filter("STATE"<>'Florida')
  • Updates and Deletes
      SQL> @desc cities2
       Name                   Null?    Type
       ---------------------- -------- ----------------
       NAME                   NOT NULL VARCHAR2(100)
       STATE                  NOT NULL VARCHAR2(100)
       POPULATION             NOT NULL NUMBER
       BUDGET_SURPLUS         NOT NULL VARCHAR2(1)

      CREATE INDEX i2_pop
      ON cities2(budget_surplus, population, name);
  • Updates and Deletes
      SELECT * FROM (
        SELECT name, population
        FROM cities2
        WHERE budget_surplus='Y'
        ORDER BY population DESC
      ) WHERE rownum <= 5;
      ---------------------------------------------------------------------
      | Id  | Operation                    | Name   | Rows  | Time     |
      ---------------------------------------------------------------------
      |   0 | SELECT STATEMENT             |        |     5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY               |        |       |          |
      |   2 |   VIEW                       |        |    12 | 00:00:01 |
      |*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
      ---------------------------------------------------------------------
    Statistics: 7 consistent gets
  • Updates and Deletes
      UPDATE cities2 SET budget_surplus='N'
      WHERE rowid IN (
        SELECT * FROM (
          SELECT rowid FROM cities2 ORDER BY population DESC
        ) WHERE rownum <= 200
      );
      ---------------------------------------------------------------------
      | Id  | Operation                    | Name   | Rows  | Time     |
      ---------------------------------------------------------------------
      |   0 | SELECT STATEMENT             |        |     5 | 00:00:01 |
      |*  1 |  COUNT STOPKEY               |        |       |          |
      |   2 |   VIEW                       |        |    12 | 00:00:01 |
      |*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
      ---------------------------------------------------------------------
    Statistics: 207 consistent gets
  • Updates and Deletes
  • Updates and Deletes
      ALTER TABLE cities2 ADD (version number default 0 NOT NULL);
      CREATE INDEX i2_vpop ON cities2(budget_surplus, version, population);
      UPDATE cities2 SET version=1
      WHERE budget_surplus='Y' AND version=0;
    [Diagram: index entries keyed by Budget_surplus (Y), Version (0, 1) and Population]
  • Updates and Deletes
      SELECT * FROM (
        SELECT name, population
        FROM cities2
        WHERE budget_surplus='Y' AND version=1
        ORDER BY population DESC
      ) WHERE rownum <= 5;
      ---------------------------------------------------------------------
      | Id  | Operation                    | Name    | Rows | Time     |
      ---------------------------------------------------------------------
      |   0 | SELECT STATEMENT             |         |    1 | 00:00:01 |
      |*  1 |  COUNT STOPKEY               |         |      |          |
      |   2 |   VIEW                       |         |    1 | 00:00:01 |
      |*  3 |    INDEX RANGE SCAN DESCENDING| I2_VPOP |    1 | 00:00:01 |
      ---------------------------------------------------------------------
    Statistics: 9 consistent gets
  • Pagination
    First page:
      SELECT * FROM (
        SELECT name, population
        FROM cities
        WHERE state='Florida'
        ORDER BY population DESC
      ) WHERE rownum <= 10;
    Second page:
      SELECT * FROM (
        SELECT * FROM (
          SELECT name, population, rownum AS rn
          FROM cities
          WHERE state='Florida'
          ORDER BY population DESC
        ) WHERE rownum <= 20
      ) WHERE rn > 10;
  • Dumb Pagination
      ) WHERE rownum <= 20
      ) WHERE rn > 10;
        Statistics: 22 consistent gets
      ) WHERE rownum <= 30
      ) WHERE rn > 20;
        Statistics: 32 consistent gets
  • Smart pagination
    Dumb:
      SELECT * FROM (
        SELECT * FROM (
          SELECT name, population, rownum AS rn
          FROM cities
          WHERE state='Florida'
          ORDER BY population DESC
        ) WHERE rownum <= 20
      ) WHERE rn > 10;
        Statistics: 22 consistent gets
    Smart:
      SELECT * FROM (
        SELECT name, population
        FROM cities
        WHERE state='Florida'
          AND population < 154750
        ORDER BY population DESC
      ) WHERE rownum <= 10;
        Statistics: 12 consistent gets
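The two pagination styles can be compared side by side with a small sketch. It assumes SQLite rather than Oracle (`LIMIT`/`OFFSET` instead of nested `rownum` filters) and uses hypothetical rows with distinct populations; `page_by_offset` and `page_after` are names invented for this example.

```python
import sqlite3

PAGE_SIZE = 10

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities ("
    "name TEXT NOT NULL, state TEXT NOT NULL, population INTEGER NOT NULL)"
)
# Hypothetical data: 35 Florida cities, all with different populations.
conn.executemany(
    "INSERT INTO cities VALUES (?, 'Florida', ?)",
    [(f"city{i}", 1000 * i) for i in range(1, 36)],
)
conn.execute("CREATE INDEX i_state_pop_c ON cities(state, population, name)")

def page_by_offset(page_no):
    """'Dumb' pagination: re-read and discard all rows of earlier pages."""
    return conn.execute(
        "SELECT name, population FROM cities WHERE state = 'Florida' "
        "ORDER BY population DESC LIMIT ? OFFSET ?",
        (PAGE_SIZE, PAGE_SIZE * (page_no - 1)),
    ).fetchall()

def page_after(last_population):
    """'Smart' pagination: start right after the last row already shown."""
    return conn.execute(
        "SELECT name, population FROM cities WHERE state = 'Florida' "
        "AND population < ? ORDER BY population DESC LIMIT ?",
        (last_population, PAGE_SIZE),
    ).fetchall()

first_page = page_by_offset(1)
second_page_dumb = page_by_offset(2)
second_page_smart = page_after(first_page[-1][1])

print(second_page_dumb == second_page_smart)
```

The smart form needs the caller to remember the last value shown, and it assumes that value is unique: if several cities shared the boundary population, a strict `<` would skip them, so a tie-breaking column would have to be added to both the ORDER BY and the condition.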
  • Top-N with joins: Rules
      • ORDER BY only the LEADING table
      • Use NESTED LOOPS
      • Build indexes for STREAMING
  • Top-N with joins
      SELECT * FROM (
        SELECT c.name as city, c.population, s.capital
        FROM cities c, states s
        WHERE c.state_id = s.id
          AND c.state='Florida'
        ORDER BY c.population DESC
      ) WHERE rownum <= 5
      /
    Driving table: Filter state, Order By population, Join state_id, Select name
    Joined to table: Join id, Select capital
  • Top-N with joins: Good
      -------------------------------------------------------
      | Id  | Operation          | Name | Rows | Time     |
      -------------------------------------------------------
      |   0 | SELECT STATEMENT   |      |    5 | 00:00:13 |
      |*  1 |  COUNT STOPKEY     |      |      |          |
      |   2 |   VIEW             |      |   10 | 00:00:13 |
      |   3 |    NESTED LOOPS    |      |   10 | 00:00:13 |
      |*  4 |     INDEX RANGE SCAN| I_C  |  506 | 00:00:07 |
      |*  5 |     INDEX RANGE SCAN| I_S  |    1 | 00:00:01 |
      -------------------------------------------------------
  • Top-N with joins: Bad
      -----------------------------------------------------------
      | Id  | Operation              | Name | Rows | Time     |
      -----------------------------------------------------------
      |   0 | SELECT STATEMENT       |      |    5 | 00:00:07 |
      |*  1 |  COUNT STOPKEY         |      |      |          |
      |   2 |   VIEW                 |      |   10 | 00:00:07 |
      |*  3 |    SORT ORDER BY STOPKEY|      |   10 | 00:00:07 |
      |*  4 |     HASH JOIN          |      |   10 | 00:00:07 |
      |*  5 |      INDEX RANGE SCAN  | I_C  |  506 | 00:00:07 |
      |*  6 |      INDEX RANGE SCAN  | I_S  |    1 | 00:00:01 |
      -----------------------------------------------------------
  • Gotchas? TMI: “Too many indexes”
  • Thank you!