SQL Top-N and Pagination Pattern (IOUG) - Whitepaper

Track: Database, Session # 403

SQL TOP-N AND PAGINATION PATTERN

Maxym Kharchenko

ABSTRACT
Pagination (of which top-N is a special use case) is a very common SQL query technique. It deals with extracting a limited number of “most interesting” records from a potentially large qualifying result set. While pagination requests seem simple (and the next generation of ORACLE database made them even simpler), executing these queries efficiently requires some groundwork in both schema design and query coding.

TARGET AUDIENCE
DBAs and developers who design and tune SQL queries will benefit from this whitepaper. Attendees are expected to be familiar with (ORACLE) SQL query syntax and be able to interpret and understand SQL plans and execution statistics.

EXECUTIVE SUMMARY
“Top-N” queries and their close cousins, “pagination” queries, are a special class of SQL queries that have a very unique requirement: Do NOT return all the data that qualifies!

Instead, these queries impose an additional “data window” restriction that is designed to return “no more than N” most interesting records to the user.

While this additional restriction seems simple, it takes effort to implement it so that top-N and pagination queries are efficient every time they run. Making top-N and pagination queries efficient is the focus of this whitepaper.

Learner will be able to:
- Design efficient and well performing top-N and pagination SQL queries.
- Design database objects (i.e. indexes or additional columns) to support efficient pagination.
- Spot potential problems with pagination queries and address them.

BACKGROUND
Most SQL queries are designed to answer user questions completely. That is, whatever restrictions a user puts in a WHERE clause fully define the data that the user gets in the end, be it 10 records, 1000 records or 10 million records.

However, in some cases, knowing (and extracting) the full set of data may be overkill.

Think of a typical search query on websites such as google.com or reddit.com. In these searches, users usually search on very generic keywords, such as “LOLcats” or “snow chains”, that can result in thousands or millions of data hits. Now a website search engineer has a few problems to contend with while deciding how to display the data that qualified.

1. First of all, it would be plainly impossible to “fit” all the qualified data on one screen. Even though computer monitors are getting increasingly larger, they are nowhere near the capacity to hold millions of “items of interest”.
2. Displaying everything wouldn’t be a very wise thing to do anyway, because users (or, at least, human users) do not have the mental capacity to grasp “millions of items”. According to medical research, after a few seconds of focused attention, “it is likely that a person will look away, return to a previous task, or think about something else”.

These are, obviously, major issues, and the very common technique to address them is called “pagination”, where query results are organized and presented in “pages”: a user immediately sees the first “page” that contains the most relevant data (we’ll talk about what this means shortly) and is given the ability to look at other “pages” (aka: “paginate”) to see progressively less interesting data.

As pagination queries are very likely supported by some sort of back end database (such as ORACLE), let’s look at the problem from a query designer’s perspective.

One obvious thing is the definition of “most interesting data”. This implies that the data must be ordered or pre-ordered (from the most interesting to the least interesting) at some point during query execution.

Another obvious thing is that “fact based WHERE restrictions” are not the whole story here. While it is possible to extract all the qualified data by SQL and then order and select the most interesting records by some other means outside the database, it would obviously be very inefficient.

So, our first requirement is to do ordering and selection in the database: design SQL queries that will return only the limited subset of “the most interesting data” (“Give me the top 10 results”, aka: “top-N queries”) or only the limited subsets of “a bit less interesting data” (“Give me only the results from 20 to 30”, aka: “pagination queries”).

Our second requirement is to make top-N and pagination queries consistently efficient. In a nutshell, such a query:
- Has to be fast (duh!)
- Its timing has to be constant, regardless of whether 10 records or 10 million records qualified through the WHERE clause
- Its timing cannot depend on whether it is “page 1” or “page 1000” in a range of most interesting records

Writing efficient pagination queries is a challenge (in other words, it is very easy to write an inefficient pagination query). Fortunately, there are a number of query techniques that can help us meet our efficiency goals and it is these techniques I will focus on in this whitepaper.

TECHNICAL DISCUSSIONS AND EXAMPLES

SETUP
For these exercises, I’m going to use 2011 US census data, which you can download here:
http://www.census.gov/popest/data/cities/totals/2011/files/SUB-EST2011_AL_MO.csv
http://www.census.gov/popest/data/cities/totals/2011/files/SUB-EST2011_MT_WY.csv

CREATE DIRECTORY zips_dir AS 'directory path';
GRANT read, write ON DIRECTORY zips_dir TO your_user;

CREATE TABLE ext_sub_est2011_al_mo (
  SUMLEV varchar2(100),
  STATE varchar2(2),
  COUNTY varchar2(4),
  PLACE varchar2(100),
  COUSUB varchar2(100),
  CONCIT varchar2(100),
  NAME varchar2(100),
  STNAME varchar2(100),
  CENSUS2010POP varchar2(100),
  ESTIMATESBASE2010 varchar2(100),
  POPESTIMATE2010 varchar2(100),
  POPESTIMATE2011 varchar2(100)
)
organization external (
  type oracle_loader
  default directory zips_dir
  access parameters (
    records delimited by newline
    skip 2
    fields terminated by ','
    missing field values are null
  )
  location ('SUB-EST2011_AL_MO.csv')
)
reject limit unlimited
/

select count(1) from ext_sub_est2011_al_mo
/

drop table ext_sub_est2011_mt_wy
/

create table ext_sub_est2011_mt_wy (
  SUMLEV varchar2(100),
  STATE varchar2(2),
  COUNTY varchar2(4),
  PLACE varchar2(100),
  COUSUB varchar2(100),
  CONCIT varchar2(100),
  NAME varchar2(100),
  STNAME varchar2(100),
  CENSUS2010POP varchar2(100),
  ESTIMATESBASE2010 varchar2(100),
  POPESTIMATE2010 varchar2(100),
  POPESTIMATE2011 varchar2(100)
)
organization external (
  type oracle_loader
  default directory zips_dir
  access parameters (
    records delimited by newline
    skip 2
    fields terminated by ','
    missing field values are null
  )
  location ('SUB-EST2011_MT_WY.csv')
)
reject limit unlimited
/

select count(1) from ext_sub_est2011_mt_wy
/

drop table cities
/

create table cities (
  name not null,
  state not null,
  population not null
) pctfree 99 pctused 1
as
select name, stname, to_number(census2010pop)
from ext_sub_est2011_al_mo
where regexp_like(census2010pop, '\d+')
  and name <> stname
  and name NOT LIKE '%County'
/

insert /*+ append */ into cities (
  select name, stname, to_number(census2010pop)
  from ext_sub_est2011_mt_wy
  where regexp_like(census2010pop, '\d+')
    and name <> stname
    and name NOT LIKE '%County'
)
/

commit;

select count(1) from cities
/

drop table ext_sub_est2011_al_mo
/

drop table ext_sub_est2011_mt_wy
/

exec dbms_stats.gather_table_stats(user, 'CITIES');

In the end, you should have the CITIES table with this simple structure:

SQL> @desc CITIES
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                               VARCHAR2(100)
 STATE                                              VARCHAR2(100)
 POPULATION                                         NUMBER

And a fair size for our purposes:

SQL> SELECT segment_name, segment_type, round(bytes/1024/1024/1024, 2) as size_gb
     FROM dba_segments WHERE segment_name='CITIES';

SEGMENT_NAME                   SEGMENT_TYPE             SIZE_GB
------------------------------ -------------------- -----------
CITIES                         TABLE                         .4

SQL> SELECT count(1) FROM cities;

  COUNT(1)
----------
     75727

NAÏVE TOP-N
We will begin our quest with this simple top-N query:

GIVE ME THE TOP 5 MOST POPULOUS CITIES IN THE UNITED STATES.

Despite its simplicity, this is a stumbling block for many developers (which makes it a good interview question :-) ). For many, the first approach to this query looks like:

SELECT name, population
FROM cities
WHERE rownum <= 5
ORDER BY population DESC
/

NAME            POPULATION
--------------- ----------
Alabaster city       30352
Adamsville city       4522
Abbeville city        2688
Addison town           758
Akron town             356

--------------------------------------------------------------------------------------
| Id  | Operation           | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |        |     5 |   115 |       | 15503   (1)| 00:03:22 |
|   1 |  SORT ORDER BY      |        |     5 |   115 |  2576K| 15503   (1)| 00:03:22 |
|*  2 |   COUNT STOPKEY     |        |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL| CITIES | 81698 |  1835K|       | 14950   (1)| 00:03:15 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(ROWNUM<=5)

Statistics
----------------------------------------------------------
          6  consistent gets

The query is blazingly fast (only 6 consistent gets), but … can you notice anything wrong with the results?

Of course! As much as I love “Alabaster City”, I doubt that it is the most populous city in the United States.

What happened here can be explained by the fact that in ORACLE SQL, the WHERE condition is evaluated before ORDER BY. Thus, the WHERE rownum <= 5 data is selected first (taking the first 5 records from the first data block) and then these “random results” are sorted.

This is obviously not what we wanted.

CORRECT TOP-N
Since WHERE is processed before ORDER BY, we have to modify our query to get correct results.

In many databases, top-N requests can be coded quite simply. For example, this is a top-N query in MySQL:

SELECT name, population
FROM cities
ORDER BY population DESC
LIMIT 5;

In MongoDB (mongo shell):

db.cities.find({}, {name: 1, population: 1, _id: 0})
  .sort({population: -1})
  .limit(5)

And in ORACLE:

SELECT name, population
FROM cities
ORDER BY population DESC
FETCH FIRST 5 ROWS ONLY;

The ORACLE query actually does not look too bad. Unfortunately, this syntax is only available in the (as of yet unreleased) ORACLE 12c. In ORACLE 11g and before, the top-N syntax is more complicated and consists of two queries:
- the inner query that does the ordering
- the outer query that does the limiting

SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5;

CORRECT TOP-N QUERY: EXECUTION
Let’s execute this query and see how it performs:

set timi on
set autotrace on

SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5
/

NAME               POPULATION
------------------ ----------
New York city         8175133
Los Angeles city      3792621
Los Angeles city      3792621
Chicago city          2695598
Chicago city (pt.)    2695598

Elapsed: 00:00:04.44

------------------------------------------------------------------------------------------
| Id  | Operation               | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |        |     5 |   325 |       | 14975   (1)| 00:03:15 |
|*  1 |  COUNT STOPKEY          |        |       |       |       |            |          |
|   2 |   VIEW                  |        | 86662 |  5501K|       | 14975   (1)| 00:03:15 |
|*  3 |    SORT ORDER BY STOPKEY|        | 86662 |  5501K|  6488K| 14975   (1)| 00:03:15 |
|   4 |     TABLE ACCESS FULL   | CITIES | 86662 |  5501K|       | 13646   (1)| 00:02:58 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - filter(ROWNUM<=5)

Statistics
----------------------------------------------------------
      52432  consistent gets

The results look right, but wow! 4.4 seconds is A LOT of time for this query. The reason becomes clear if we look at the execution plan: a full table scan is being performed to get the results!

Logically, what is happening is this:

We have 5 records that we are really looking for here and they are scattered in some random places in the table segment. ORACLE thus must scan the entire segment to find them.

But wait! Perhaps we were just unlucky. What if the records that we searched for were found in the first few blocks that we searched, something like this:

Would it make a difference? Let’s find out:

CREATE TABLE ordered_cities pctfree 99 pctused 1
AS SELECT * FROM cities ORDER BY population DESC
/

SELECT * FROM (
  SELECT name, population
  FROM ordered_cities
  ORDER BY population DESC
) WHERE rownum <= 5
/

Elapsed: 00:00:06.49

--------------------------------------------------------------------------------------------------
| Id  | Operation               | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                |     5 |   325 |       | 14579   (1)| 00:03:10 |
|*  1 |  COUNT STOPKEY          |                |       |       |       |            |          |
|   2 |   VIEW                  |                | 66254 |  4205K|       | 14579   (1)| 00:03:10 |
|*  3 |    SORT ORDER BY STOPKEY|                | 66254 |  4205K|  4968K| 14579   (1)| 00:03:10 |
|   4 |     TABLE ACCESS FULL   | ORDERED_CITIES | 66254 |  4205K|       | 13562   (1)| 00:02:57 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
      52089  consistent gets

There is no difference! The problem here is that even though all the results ARE in the first searched block, ORACLE does not know that this is true.

Data in a regular (heap) ORACLE table may or may not be sorted, but the important point is that the sort order is not guaranteed. Thus ORACLE can never be sure that there are no more “better qualifying” records somewhere down the road, and so it has to scan everything and then sort everything.

Obviously the results would be drastically different if we could be sure that the records are truly sorted. If only there was a storage object in ORACLE that would enforce this …

“GUARANTEED ORDER” DATA STRUCTURE (AKA: “AN INDEX”)
The storage object that we are looking for is called an index. Let’s build one on the CITIES.POPULATION column and see what happens:

SQL> CREATE INDEX i_pop ON cities(population) pctfree 99;

Index created.

SQL> SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5
/

NAME               POPULATION
------------------ ----------
New York city         8175133
Los Angeles city      3792621
Los Angeles city      3792621
Chicago city (pt.)    2695598
Chicago city          2695598

Elapsed: 00:00:00.02

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 |   325 |    22   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |        |       |       |            |          |
|   2 |   VIEW                        |        |    10 |   650 |    22   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| CITIES | 75727 |  1626K|    22   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN DESCENDING| I_POP  |    10 |       |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
         12  consistent gets

Huge difference: only 12 consistent gets instead of 52432, more than 3 orders of magnitude fewer!

Note: don’t be freaked out by INDEX FULL SCAN. In this plan it behaves like a range scan, because COUNT STOPKEY stops it after the first few index entries; the operation name is simply misleading. The important metric here is the number of consistent gets, and you can easily verify what is happening by running a SQL trace.
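
The effect can also be reproduced outside ORACLE. The following is a minimal sketch, not the setup used in this whitepaper: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in database, synthetic rows instead of the census data, and `ORDER BY ... LIMIT` in place of the ROWNUM wrapper. SQLite's planner and its EXPLAIN QUERY PLAN text differ from ORACLE's, so it only illustrates the general idea that an index on the ordering column lets the database skip the sort.

```python
import sqlite3

# Top-5 query; LIMIT stands in for the "WHERE rownum <= 5" wrapper (assumption).
TOP_N = """
    SELECT name, population
    FROM cities
    ORDER BY population DESC
    LIMIT 5
"""


def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, state TEXT, population INTEGER)")

# Synthetic rows (not census data). Populations are distinct, so the
# top 5 is unambiguous and both runs must return identical rows.
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [(f"city_{i}", f"state_{i % 50}", (i * 7919) % 100003) for i in range(10000)],
)

# Run 1: no index, the database has to read every row and sort.
before_plan = plan(conn, TOP_N)
before_rows = conn.execute(TOP_N).fetchall()

# Run 2: an index on the ordering column supplies rows already sorted.
conn.execute("CREATE INDEX i_pop ON cities (population)")
after_plan = plan(conn, TOP_N)
after_rows = conn.execute(TOP_N).fetchall()

print("without index:", before_plan)
print("with index:   ", after_plan)
```

Both runs return the same five rows; what changes is the plan. Without the index SQLite reports a temporary B-tree for the ORDER BY, and with it the plan walks i_pop instead, which is the SQLite counterpart of the SORT ORDER BY step disappearing from the ORACLE plan.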

WHY INDEXES WORK
There are 3 reasons:
1) Because the index is sorted, all the data that we are looking for is co-located, so ORACLE only needs to scan a few pages to get all results
2) More importantly, because of the order guarantee, ORACLE can stop after reading 5 records from the index, as there cannot be any qualified data left
3) Finally, notice that the SORT ORDER BY operation is gone, and this is not a small thing by itself.

UNCERTAIN NATURE OF FILTERING
Let’s try a more elaborate top-N query by asking for the most populous cities located in Florida.

GIVE ME THE TOP 5 MOST POPULOUS CITIES IN FLORIDA.

We are going to use the same index on POPULATION.

SELECT * FROM (
  SELECT name, population
  FROM cities c
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5
/

5 rows selected.

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 |   325 |  1015   (0)| 00:00:14 |
|*  1 |  COUNT STOPKEY                |        |       |       |            |          |
|   2 |   VIEW                        |        |    10 |   650 |  1015   (0)| 00:00:14 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| CITIES |  1485 | 44550 |  1015   (0)| 00:00:14 |
|   4 |     INDEX FULL SCAN DESCENDING| I_POP  |   510 |       |   512   (0)| 00:00:07 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - filter("STATE"='Florida')

Statistics
----------------------------------------------------------
        284  consistent gets

SELECT * FROM (
  SELECT name, population
  FROM cities c
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 200
/

200 rows selected.

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |    10 |   650 |  1015   (0)| 00:00:14 |
|*  1 |  COUNT STOPKEY                |        |       |       |            |          |
|   2 |   VIEW                        |        |    10 |   650 |  1015   (0)| 00:00:14 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| CITIES |  1485 | 44550 |  1015   (0)| 00:00:14 |
|   4 |     INDEX FULL SCAN DESCENDING| I_POP  |   510 |       |   512   (0)| 00:00:07 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=200)
   3 - filter("STATE"='Florida')

Statistics
----------------------------------------------------------
      10010  consistent gets

I think you can clearly see a problem here: 284 consistent gets for 5 rows, but 10010 for 200. The index is ordered by POPULATION only, so the STATE filter has to be applied to every entry we walk past (in other words, we have to read through some junk and throw it away). It might be ok for a small data window (depending on the data), but as pagination windows get larger, the problem becomes increasingly worse.

A typical scenario here is that your query will return a few rows, “freeze”, return a few rows again, etc. Clearly, it is not a good situation to be in.

MULTICOLUMN INDEXES
There is a trick that we can do here and it is related to the fact that we have an equality condition on STATE.

Consider a multicolumn index on (STATE, POPULATION).

It is ordered on STATE and also on STATE+POPULATION, but is not ordered on POPULATION directly.

However, if we “fix the STATE with equality” (state = 'Florida'), we now have an effective subindex, which IS ordered by POPULATION for Florida. And our top-N becomes efficient again.

SQL> CREATE INDEX i_state_pop ON cities (state, population) pctfree 99;

Index created.

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5
/

----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |     5 |   325 |    22   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |             |       |       |            |          |
|   2 |   VIEW                         |             |    10 |   650 |    22   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      |  1485 | 44550 |    22   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |    10 |       |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("STATE"='Florida')

Statistics
----------------------------------------------------------
         12  consistent gets

TRIPS TO THE TABLE
One final consideration for efficient top-N is trips to the table, and it is a big one!

Consider that top-N queries are, essentially, index range scans. A typical index range scan is executed in 3 steps:
1) Descend through the index structure to the beginning of the range
2) Traverse leaf blocks in the range until the (top-N) condition is satisfied
3) If additional filtering (or additional data) is needed that is not in the index, make a trip to the table for each index entry.

Let’s look at the costs of a range scan, assuming our window size is 500 records (where rownum <= 500).

Step 1 is usually very lightweight as ORACLE indexes are shallow. Typically we’ll read 3-4 blocks here.

Step 2 is heavier, but usually not by much, as all the data we are interested in is (hopefully) close together and, on top of that, index entries are usually pretty small and you can pack a lot of them in one block. We are probably looking at something like 5-10 logical reads here (assuming the index is well packed and we do not filter a lot in the index itself).

But step 3 is different. Since we need a separate trip to the table for every index entry that qualifies, we are looking at 500 separate trips to the table (and logical reads). If we are lucky and the index clustering factor is small (that is: table and index are ordered the same way), it is likely that the actual number of blocks being read would be fairly small (we still have to perform 500 separate logical reads, but since we will be reusing blocks a lot, this might not be too bad). If we are unlucky and the table and index are out of sync as far as ordering is concerned, we will have to read A LOT of table blocks.

The really bad part here is that unless your data is pretty small (or, alternatively, memory is very large), it is likely that a significant portion of table blocks will NOT be cached, and thus we are slowing ourselves down even further.

Think of it this way: in our (otherwise efficient) index range scan for 500 records, 500 out of 1002 logical reads (or ~ 50%) come from step 3. This, by the way, is a highly skewed result, caused by the fact that for this exercise we built indexes with PCTFREE 99.
The typical ratio for Step 3 is much worse, usually as high as 75-95% of all the reads.

SELECT * FROM (
  SELECT name, population
  FROM cities c
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 500
/

500 rows selected.

----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    10 |   650 |    22   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |             |       |       |            |          |
|   2 |   VIEW                         |             |    10 |   650 |    22   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      |  1485 | 44550 |    22   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |    10 |       |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=500)
   4 - access("STATE"='Florida')

Statistics
----------------------------------------------------------
       1002  consistent gets

COVERING INDEXES
The good part here is that Step 3 is entirely optional. As long as all the data that is required by the query is already in the index, we do not need to visit the table at all.

SQL> CREATE INDEX i_state_pop_c on cities (state, population, name) pctfree 99;

Index created.

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5
/

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     5 |   325 |    13   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |    10 |   650 |    13   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I_STATE_POP_C |  1485 | 44550 |    13   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - access("STATE"='Florida')

Statistics
----------------------------------------------------------
          8  consistent gets

As we predicted, having all the data required by the query in the same index makes a lot of sense: 8 consistent gets instead of 12, with no TABLE ACCESS step in the plan.

IDEAL TOP-N
Paraphrasing a well known data modeling motto, the ideal top-N query is achieved when you:
1) Use the index
2) Make the best index
3) And read only from the index

LESS THAN IDEAL TOP-N
Having an ideal situation to support your top-N queries (or anything) is, well, ideal. Unfortunately, it does not always happen in the real world.

There are several cases where a top-N or pagination scenario becomes less than ideal and I’m going to talk about 4 notable ones:
1) Effect of query conditions
2) Effect of DESC/ASC
3) Effect of deletes/updates
4) Driver “technicalities”

EFFECT OF QUERY CONDITIONS
Here is a simple test case. Let’s say that you have a table with an ACTIVE column, which can only take 2 values: 'Y' or 'N'. Suppose you need to get the first 5 'ACTIVE' records (ordered by, say, a sequence).

Would it matter if we select these records as ACTIVE = 'Y' or ACTIVE != 'N'?

CREATE TABLE t (n, active NOT NULL CHECK (active IN ('Y', 'N')))
PCTFREE 99 PCTUSED 1
AS SELECT level, CASE WHEN 0 = mod(level, 10) THEN 'Y' ELSE 'N' END
FROM dual CONNECT BY level <= 10000
/

Table created.

SQL> CREATE INDEX t_i ON t(active, n) PCTFREE 99;

Index created.

SELECT * FROM (
  SELECT * FROM t
  WHERE active = 'Y'
  ORDER BY n
) WHERE rownum <= 5
/

5 rows selected.

Elapsed: 00:00:00.01

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     5 |    80 |    13   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY     |      |       |       |            |          |
|   2 |   VIEW             |      |    10 |   160 |    13   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN| T_I  |    10 |    60 |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - access("ACTIVE"='Y')

Statistics
----------------------------------------------------------
          7  consistent gets

SELECT * FROM (
  SELECT * FROM t
  WHERE active != 'N'
  ORDER BY n
) WHERE rownum <= 5
/

5 rows selected.

Elapsed: 00:00:00.01

--------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     5 |    80 |   444   (1)| 00:00:06 |
|*  1 |  COUNT STOPKEY          |      |       |       |            |          |
|   2 |   VIEW                  |      |  1000 | 16000 |   444   (1)| 00:00:06 |
|*  3 |    SORT ORDER BY STOPKEY|      |  1000 |  6000 |   444   (1)| 00:00:06 |
|*  4 |     TABLE ACCESS FULL   | T    |  1000 |  6000 |   443   (1)| 00:00:06 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - filter(ROWNUM<=5)
   4 - filter("ACTIVE"<>'N')

Statistics
----------------------------------------------------------
       1672  consistent gets
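
As a rough cross-check, the same experiment can be repeated on a different engine. The sketch below is an illustration under stated assumptions, not part of the original test: it uses SQLite through Python's standard sqlite3 module, `LIMIT` instead of the ROWNUM wrapper, and SQLite's EXPLAIN QUERY PLAN text, which does not correspond one-to-one to ORACLE plan operations. The table and index mirror T and T_I from the test case above.

```python
import sqlite3


def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (n INTEGER, "
    "active TEXT NOT NULL CHECK (active IN ('Y', 'N')))"
)
# Same data shape as the test case: every 10th row is active.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(i, "Y" if i % 10 == 0 else "N") for i in range(1, 10001)],
)
conn.execute("CREATE INDEX t_i ON t (active, n)")

# Logically equivalent predicates, given the CHECK constraint.
EQUALITY = "SELECT n, active FROM t WHERE active = 'Y' ORDER BY n LIMIT 5"
INEQUALITY = "SELECT n, active FROM t WHERE active != 'N' ORDER BY n LIMIT 5"

eq_rows = conn.execute(EQUALITY).fetchall()
ne_rows = conn.execute(INEQUALITY).fetchall()
eq_plan = plan(conn, EQUALITY)
ne_plan = plan(conn, INEQUALITY)

print("active = 'Y'  :", eq_plan)
print("active != 'N' :", ne_plan)
```

Both statements return the same five rows, but only the equality form lets the (active, n) index supply them already ordered by n; the inequality form falls back to scanning and sorting. The optimizers are different, yet the asymmetry between the two predicates shows up in both.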
It matters a great deal to ORACLE. An equality condition usually means a very specific and narrow range of records that the optimizer can access and read directly, while an inequality means “anything but that range”, which usually translates into reading the entire data set and filtering out unneeded data. The latter means that if we order on a column below the filter, we have to read everything and sort.

You can argue that in this case ACTIVE != 'N' is equivalent to ACTIVE = 'Y' due to the way that we defined the table, but the ORACLE optimizer does not see it that way (yet).

Bottom line: if in doubt, always choose equality.

EFFECT OF DESC/ASC

Let's say we need to select the top 10 most populous cities in all states from Florida forwards, or STATE >= 'Florida' (which, admittedly, is an unusual request, but anything to prove a point).

SQL> CREATE INDEX i_s_pop ON cities(state, population) PCTFREE 99;
Index created.

SELECT * FROM (
  SELECT * FROM cities WHERE state >= 'Florida' ORDER BY state, population DESC
) WHERE rownum <= 10
/
10 rows selected.
Elapsed: 00:00:04.69

Execution Plan
----------------------------------------------------------
Plan hash value: 2951925630

-------------------------------------------------------------------------------------------
| Id  | Operation               | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |        |    10 |  1170 |       | 14132   (1)| 00:03:04 |
|*  1 |  COUNT STOPKEY          |        |       |       |       |            |          |
|   2 |   VIEW                  |        | 60040 |  6860K|       | 14132   (1)| 00:03:04 |
|*  3 |    SORT ORDER BY STOPKEY|        | 60040 |  1758K|  2368K| 14132   (1)| 00:03:04 |
|*  4 |     TABLE ACCESS FULL   | CITIES | 60040 |  1758K|       | 13646   (1)| 00:02:58 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=10)
   3 - filter(ROWNUM<=10)
   4 - filter("STATE">='Florida')

Statistics
----------------------------------------------------------
      52208  consistent gets

What happened? Why is there a full table scan here?

Because we are asking for POPULATION in DESC order (while STATE stays ascending), the index data in the range STATE >= 'Florida' is no longer in the order that we requested. Yes, it is ordered by population (individually) for 'Florida' and it is ordered by population (individually) for 'Georgia', but for the combined range: No. A single pass over the index, in either direction, cannot return STATE ascending and POPULATION descending at the same time.

Yet, if we remove the DESC requirement on population, our range is ordered by STATE, POPULATION again.

SELECT * FROM (
  SELECT * FROM cities WHERE state >= 'Florida' ORDER BY state, population
) WHERE rownum <= 10
/
10 rows selected.
Elapsed: 00:00:00.43

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |   10 |  1170 |    26   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |         |      |       |            |          |
|   2 |   VIEW                         |         |   12 |  1404 |    26   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES  |   12 |   360 |    26   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN           | I_S_POP |      |       |    14   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=10)
   4 - access("STATE">='Florida')

Statistics
----------------------------------------------------------
         53  consistent gets

Lesson to learn here: watch out for DESC/ASC and build indexes appropriately. In this case, an index on (STATE, POPULATION DESC) would help the query.

EFFECT OF DELETES OR UPDATES

Even if indexes and top-N queries are efficient originally, the inevitable wear and tear over time cuts into that efficiency. A very common factor that degrades top-N and pagination queries is the effect of deletes and updates.

Let's look at the DELETE operation first, as it is the simpler of the two.

When a record gets deleted, its entry is marked 'empty' in both the table and all corresponding index structures.

Empty space in a (heap) table can be reused immediately (with some caveats on block PCTUSED etc.) by ANY new record that gets inserted into the table.

Not so with an index, since an index is a structure that enforces order. Empty space in an index can be reused only if new data 'fits' within that space (by order). Thus, index 'holes' are much more difficult to fill in, unless you are constantly reinserting the same data.

As for UPDATES, they actually work very differently in indexes and tables.

In a table, an update modifies the target record in place and, unless something drastic happens, the record will likely stay in the same data block.

It is an entirely different story in an index: since the index structure is ordered, an updated index entry HAS TO move to the new place that fits the sort order, very likely in a different block. Thus an UPDATE of an indexed column is really a DELETE + INSERT in the index, which leaves a 'hole' in the original index block, just like a DELETE would.

Bottom line: as more and more 'holes' accumulate in an index, top-N queries become progressively less efficient.
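The effect of holes on a top-N scan can be sketched with a toy model in Python. This is only an illustration under loud assumptions: each "leaf block" is a plain list holding a single entry (an exaggeration in the spirit of PCTFREE 99), and `blocks_visited` is a made-up helper, not anything ORACLE exposes.

```python
def blocks_visited(leaf_blocks, n):
    """Walk leaf blocks from the high end until n live entries are found."""
    found = 0
    visited = 0
    for block in reversed(leaf_blocks):
        visited += 1
        found += sum(1 for live in block if live)
        if found >= n:
            break
    return visited

# 1000 leaf blocks in key order, one live entry (True) per block.
leaf_blocks = [[True] for _ in range(1000)]
before = blocks_visited(leaf_blocks, 5)

# "Delete" the 200 highest keys: the entries are gone,
# but the emptied blocks are still in the scan path.
for block in leaf_blocks[-200:]:
    block[0] = False
after = blocks_visited(leaf_blocks, 5)

print(before, after)
```

In this model the same top-5 request goes from 5 block visits to 205, because the scan has to walk through every emptied block before it reaches live entries.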
Let's look at the example.

SQL> CREATE TABLE cities2 (name, state, population, budget_surplus)
PCTFREE 99 PCTUSED 1
AS SELECT name, state, population, 'Y' FROM cities
/
Table created.

SQL> CREATE INDEX i2_pop ON cities2(budget_surplus, population, name) PCTFREE 99
/
Index created.

SELECT * FROM (
  SELECT name, population FROM cities2 WHERE budget_surplus = 'Y' ORDER BY population DESC
) WHERE rownum <= 5
/
5 rows selected.
Elapsed: 00:00:00.00

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 |   325 |    14   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |        |       |       |            |          |
|   2 |   VIEW                        |        |    12 |   780 |    14   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 75720 |  1774K|    14   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - access("BUDGET_SURPLUS"='Y')

Statistics
----------------------------------------------------------
          7  consistent gets

-- This statement, by the way, is executed very inefficiently, ignoring the top-N
-- optimization. I haven't been able to figure out why yet.
UPDATE cities2 SET budget_surplus = 'N' WHERE rowid IN (
  SELECT r FROM (
    SELECT rowid r FROM cities2 WHERE budget_surplus = 'Y' ORDER BY population DESC
  ) WHERE rownum <= 200
);
200 rows updated.

commit;

SELECT * FROM (
  SELECT name, population FROM cities2 WHERE budget_surplus = 'Y' ORDER BY population DESC
) WHERE rownum <= 5
/
5 rows selected.
Elapsed: 00:00:00.01

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 |   325 |    14   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |        |       |       |            |          |
|   2 |   VIEW                        |        |    12 |   780 |    14   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 75720 |  1774K|    14   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - access("BUDGET_SURPLUS"='Y')

Statistics
----------------------------------------------------------
        207  consistent gets

During the first execution of the statement, we are looking at the top 5 most populous cities that have a budget surplus. The execution is very efficient because we can quickly find the records.

We then remove 200 records from the range (by flipping their BUDGET_SURPLUS to 'N'), which leaves 200 holes in the index.
When we re-run our top-5 query now, we suddenly have to read a lot more blocks (through the holes) to get to the data that we need.

You might point out that empty blocks would eventually get reused by ORACLE, which would resolve this issue, and you would, of course, be right. However, there are a couple of caveats here.

First of all, only the blocks that are completely empty will get reused. In this (very artificial) case, most of them are, but in the real world, where deletes and updates tend to be more random, even one valid key entry remaining in the block will prevent it from being reused.

Also, even if the block is completely empty, it can be reused only after we insert a new key into the index AND ORACLE happens to pick up this block from the free list chain. In other words, if the data is fairly static, completely empty blocks can stay “inside the range” for quite a while.

So, what is the solution?

Well, even though it pains me to say it, you can see how an index rebuild (or coalesce) will actually help here by packing up index entries and removing holes. The problem, of course, is that rebuilds or coalesces are BIG operations that can run for hours (days?) on real production databases and will affect the system pretty significantly.

Is there a better way? Can we resolve the problem in minutes rather than hours?

There might be a better way if we “narrow down” the fix to a specific sub range of the index (i.e. data for “only one customer”) and are willing to tolerate a bit of dead space in the index.

Here is how it works: we need to “version” our index sub trees. I.e. if originally we had an index on:
    BUDGET_SURPLUS
    POPULATION

The “versioned” index looks like:
    BUDGET_SURPLUS
    VERSION
    POPULATION

SQL> ALTER TABLE cities2 ADD (version number DEFAULT 0 NOT NULL);
Table altered.
SQL> CREATE INDEX i_pop_v ON cities2 (budget_surplus, version, population) pctfree 99;
Index created.

-- Let's run the original query, slightly modified to accept version = 0
SQL> SELECT * FROM (
  SELECT name, population FROM cities2
  WHERE budget_surplus = 'Y' AND version = 0
  ORDER BY population DESC
) WHERE rownum <= 5
/
Elapsed: 00:00:00.01

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |    5 |   325 |    12   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |         |      |       |            |          |
|   2 |   VIEW                         |         |   11 |   715 |    12   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES2 |  757 | 24224 |    12   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_POP_V |    4 |       |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("BUDGET_SURPLUS"='Y' AND "VERSION"=0)

Statistics
----------------------------------------------------------
         14  consistent gets

-- Let's now run the update that screws things up
SQL> UPDATE cities2 SET budget_surplus = 'N' WHERE rowid IN (
  SELECT r FROM (
    SELECT rowid r FROM cities2 WHERE budget_surplus = 'Y' ORDER BY population DESC
  ) WHERE rownum <= 200
);
200 records updated.

SQL> commit;
Commit complete.

-- And verify that the problem with over-reading the range appears
SELECT * FROM (
  SELECT name, population FROM cities2
  WHERE budget_surplus = 'Y' AND version = 0
  ORDER BY population DESC
) WHERE rownum <= 5
/
Elapsed: 00:00:00.09

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |    5 |   325 |    12   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |         |      |       |            |          |
|   2 |   VIEW                         |         |   11 |   715 |    12   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES2 |  757 | 24224 |    12   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_POP_V |    4 |       |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("BUDGET_SURPLUS"='Y' AND "VERSION"=0)

Statistics
----------------------------------------------------------
        212  consistent gets

-- And it does: we are now reading 212 blocks instead of 14
-- Let's “fix” the problem with a version.
-- This update affects the entire table, but that is not a requirement:
-- we can easily construct better WHERE conditions in real life
SQL> UPDATE cities2 SET version = 1 WHERE budget_surplus = 'Y' AND version = 0;

-- And now we use version 1, where index data is already compacted
SQL> SELECT * FROM (
  SELECT name, population FROM cities2
  WHERE budget_surplus = 'Y' AND version = 1
  ORDER BY population DESC
) WHERE rownum <= 5
/
Elapsed: 00:00:00.18

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |    5 |   325 |    12   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |         |      |       |            |          |
|   2 |   VIEW                         |         |   11 |   715 |    12   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES2 |  757 | 24224 |    12   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_POP_V |    4 |       |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("BUDGET_SURPLUS"='Y' AND "VERSION"=1)

Statistics
----------------------------------------------------------
         17  consistent gets

As you can see, the newly “versioned” data is compact enough that reads returned to (almost) the pre-update level, so this process is a great way to resolve the “hole” problem for a specific part of the index (i.e. “only one customer” or “a few orders”). It can normally be done much faster than, say, an index rebuild and, of course, it is plain DML, so it can be done online.

The one thing you probably noticed is that we supplied the version number in our query explicitly. How did we know it? In some cases you just do, or we can keep version numbers elsewhere, in a simpler table or entirely outside ORACLE.

It is also a fairly lightweight modification to extract the current version number using a SELECT max() query, i.e.:

SELECT max(version) FROM cities2 WHERE budget_surplus = 'Y'
/

PAGINATION

If an ORACLE 11g top-N query looks a bit ugly, a traditional 11g pagination query looks a bit uglier still.

SELECT * FROM (
  SELECT * FROM (
    SELECT name, population, rownum AS rn FROM cities
    WHERE state = 'Florida' ORDER BY population DESC
  ) WHERE rownum <= 20
) WHERE rn > 10
/

Notice that we are now dealing with 3 queries:
    1) The inner query that restricts and orders data
    2) The intermediate query that restricts the upper bound of the pagination window (WHERE rownum <= 20)
    3) The outer query that restricts the lower bound of the pagination window (WHERE rn > 10)

Let's look through a couple of data pages to see how this query executes.

SELECT * FROM (
  SELECT * FROM (
    SELECT name, population, rownum AS rn FROM cities
    WHERE state = 'Florida' ORDER BY population DESC
  ) WHERE rownum <= 20
) WHERE rn > 10
/

Statistics
----------------------------------------------------------
         42  consistent gets

SELECT * FROM (
  SELECT * FROM (
    SELECT name, population, rownum AS rn FROM cities
    WHERE state = 'Florida' ORDER BY population DESC
  ) WHERE rownum <= 30
) WHERE rn > 20
/

Statistics
----------------------------------------------------------
         62  consistent gets

SELECT * FROM (
  SELECT * FROM (
    SELECT name, population, rownum AS rn FROM cities
    WHERE state = 'Florida' ORDER BY population DESC
  ) WHERE rownum <= 500
) WHERE rn > 490
/

Statistics
----------------------------------------------------------
       1002  consistent gets

Notice a curious thing here: as we move along to later windows, even though our window size remains the same (that is, we are getting 10 records every time), we have to read more and more data to get them.

This is because, with this type of pagination query, ORACLE does not know exactly where the window starts. It only knows where the start of “all windows” is (as defined by the WHERE conditions).

In other words, if we need to access page 1, this query is super efficient, as ORACLE descends to the actual window start and reads only page 1 records.

When we need page 2, ORACLE is a bit less efficient: it will still descend to the start of page 1, read the leaf blocks for page 1 and page 2, and then throw away the blocks for page 1.

When we need page 5000, well, you can see where I am going with this.

Because of this waste effect, this type of pagination query is sometimes called “dumb pagination”.

SMART PAGINATION

How can we make the pagination query efficient every time?

Remember that the reason “dumb pagination” gets inefficient is that ORACLE does not know where the actual data page (the one we are retrieving) starts and has to resort to counting records to find it. What if we could somehow supply “page start” information to ORACLE?

This is easier to do than you think, once you realize that one almost never requests an individual page (say, page 20) in the middle of the range; rather, pages are accessed in succession: first page 1, then page 2, page 3, etc.

The great insight here is that the previous page “knows” where it ends (and thus, where the next one begins), and we can supply this information back to the database.

SELECT * FROM (
  SELECT name, population FROM cities
  WHERE state = 'Florida' ORDER BY population DESC
) WHERE rownum <= 5
/

NAME                 POPULATION
-------------------- ----------
Jacksonville city        821784
Jacksonville city        821784
Miami city               399457
Miami city               399457
Tampa city               335709

Statistics
----------------------------------------------------------
         12  consistent gets

Notice that 'Tampa city' is the last item on this page and, more importantly, that it has POPULATION = 335709.

Let's feed this information back to the database and request a 2nd page:

SELECT * FROM (
  SELECT name, population FROM cities
  WHERE state = 'Florida' AND population < 335709
  ORDER BY population DESC
) WHERE rownum <= 5
/

NAME                 POPULATION
-------------------- ----------
St. Petersburg city      244769
St. Petersburg city      244769
Orlando city             238300
Orlando city             238300
Hialeah city             224669

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |    5 |   325 |    23   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |         |      |       |            |          |
|   2 |   VIEW                         |         |   10 |   650 |    23   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES  |   61 |  1830 |    23   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_S_POP |   10 |       |    13   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("STATE"='Florida' AND "POPULATION"<335709)

Statistics
----------------------------------------------------------
         13  consistent gets

Notice that we are reading essentially the same number of blocks to answer the “page 2” query as the “page 1” query (the 1 additional read in this case is an artifact of the somewhat random data layout caused by PCTFREE 99).

And the reads will stay constant for all subsequent pages. You can easily prove it to yourself by supplying a random number to compare with POPULATION.

Of course, there is a slight problem with the “page 1” query (“what should be the MAX population + 1?”), but it is easily solvable if you know your data and get a little creative. I.e. I'm fairly certain that we do not have any cities in the US exceeding 1 billion inhabitants.

POPULATION in this case is what is sometimes called a “pagination token”, and the process of applying tokens is known as “tokenized pagination” or, to put it simply, “smart pagination”.

One common question that many people ask is: “what happens if I have duplicate entries in my pagination tokens?”, i.e. what would happen if 10 cities have the exact same population and it just happens to fall on the edge of a page?

There are several workarounds here:
    1) First of all, you can select a bigger range, change < to <= and add some additional logic to the application
    2) Or, you can simply choose a guaranteed unique “sequence based” column from the table (surrogate primary keys are a prime example here) and use it as a pagination token.

TOP-N WITH JOINS

Even though so far we've only looked at “single table” top-N and pagination queries, the pattern can also be applied to joins, although you have to be very careful about how the joins are executed. Let's look at an example.

First of all, we'll construct a “second table”. Let's not be too sophisticated here and just make up something usable for an example from the data that we already have, i.e. a STATES table.

CREATE TABLE states(state NOT NULL, capital NOT NULL) pctfree 99 pctused 1
AS SELECT state, max(name) FROM cities GROUP BY state
/
Table created.

-- And add an index to look up the state
CREATE INDEX s_idx ON states(state, capital) pctfree 99
/
Index created.
Let's now run our top-N query that will select the top-5 cities in Florida AND their state capital.

SELECT * FROM (
  SELECT /*+ leading(c) use_nl(s) */ c.name as city, c.state, c.population, s.capital
  FROM cities c, states s
  WHERE c.state = s.state AND c.state = 'Florida'
  ORDER BY c.state, c.population DESC
) WHERE rownum <= 5
/

CITY                 STATE           POPULATION CAPITAL
-------------------- --------------- ---------- ------------------------------
Jacksonville city    Florida             821784 Zolfo Springs town
Jacksonville city    Florida             821784 Zolfo Springs town
Miami city           Florida             399457 Zolfo Springs town
Miami city           Florida             399457 Zolfo Springs town
Tampa city           Florida             335709 Zolfo Springs town

5 rows selected.

-------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |    5 |   845 |    23   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |               |      |       |            |          |
|   2 |   VIEW                         |               |   10 |  1690 |    23   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |               |   10 |   550 |    23   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP_C | 1485 | 44550 |    13   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN           | S_IDX         |    1 |    25 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   4 - access("C"."STATE"='Florida')
   5 - access("S"."STATE"='Florida')

Statistics
----------------------------------------------------------
         19  consistent gets

And, apart from the fact that the state capital of Florida is apparently “Zolfo Springs town”, I think you can agree that this query was very efficient.
There are a number of requirements for successful top-N queries with joins that you need to follow, and here are some of the main ones:
    1) All the sorting has to come from one table (more precisely: one index). This means that in ORDER BY there can only be columns from the join's leading table (the only exception to that is the join columns themselves)
    2) You have to use the NESTED LOOPS join type. It is the only join type that can stop after reading N rows that qualify.
    3) Indexes on the leading table must be built as: <Index filters (WHERE)>, <Order By>, <Join columns>, <Other baggage, i.e. select>
    4) Indexes on “table 2” can be built a few different ways as long as they work efficiently with NESTED LOOPS coming from “table 1” (in our example, SELECT state, capital FROM states WHERE state=:x, which requires an index on either <State> or, more efficiently, on <State>, <Capital>)
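Requirement 2 can be pictured with a short Python sketch. This is a toy model, not how ORACLE implements the join: the sorted list stands in for the leading-table index, the dict stands in for the lookup through S_IDX, and the names `scan_cities_desc`, `nested_loops` and `reads` are made up for illustration. The city and capital values are the ones shown in the outputs above.

```python
from itertools import islice

# Stand-in for the leading-table index: (state, population, name), kept sorted.
cities_index = sorted([
    ("Florida", 821784, "Jacksonville city"),
    ("Florida", 399457, "Miami city"),
    ("Florida", 335709, "Tampa city"),
    ("Florida", 244769, "St. Petersburg city"),
    ("Florida", 238300, "Orlando city"),
    ("Florida", 224669, "Hialeah city"),
])

# Stand-in for the "table 2" index lookup: state -> capital.
states_index = {"Florida": "Zolfo Springs town"}

reads = {"cities": 0, "states": 0}

def scan_cities_desc(state):
    """Descending range scan: rows come out already in ORDER BY order."""
    for st, population, name in reversed(cities_index):
        if st == state:
            reads["cities"] += 1
            yield name, st, population

def nested_loops(state):
    """For each leading row, probe the second table; rows are produced lazily."""
    for name, st, population in scan_cities_desc(state):
        reads["states"] += 1
        yield name, st, population, states_index[st]

# COUNT STOPKEY: take the first 3 joined rows and stop pulling.
top3 = list(islice(nested_loops("Florida"), 3))
print(top3)
print(reads)
```

Because the join produces rows one at a time in the order of the leading index, taking the first 3 rows touches only 3 leading entries and 3 lookups. A join that has to consume its whole input before producing output could not stop this early.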
