Pagination Done the Right Way

68,700
-1

Published on

The SQL OFFSET keyword is evil. It basically behaves like SLEEP in other programming langauges: the bigger the number, the slower the execution.
Fetching results in a page-by-page fashion in SQL doesn't require OFFSET at all but an even simpler SQL clause. Besides being faster, you don't have to cope with drifting results if new data is inserted between two page fetches.

Pagination Done the Right Way

  1. 1. © 2013 by Markus Winand iStockPhoto/Mitshu Pagination done the Right Way @MarkusWinand @SQLPerfTips
  2. 2. © 2013 by Markus Winand Takeaways ‣OFFSET kills performance ‣Paging can be done without OFFSET ‣It’s faster and has fewer side effects
  3. 3. © 2013 by Markus Winand Note In this presentation index means B-tree index.
  4. 4. A Trivial Example A query to fetch the 10 most recent news: se!e"# * $%om news whe%e #op&" ' 1234 o!de! b" da#e des$, %d des$ &%m%# 10; $!ea#e %ndex .. on news(#op%$); Using o%de% b( to get the most recent first and !&m&# to fetch only the first 10. Alternative SQL-2008 syntax (since PostgreSQL 8.4) $e#"h $&%s# 10 %ows on!(
  5. 5. Worst Case: No Index for o%de% b( L&m&# (a$#ua& !ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (!ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (!ows'10000) Index *ond: (#op&" ' 1234)
  6. 6. Worst Case: No Index for order by The limiting factor is the number of rows that match the whe%e clause (“Base-Set Size”). The database might use an index to satisfy the whe%e clause, but must still fetch all matching rows to “sort” them.
  7. 7. Another Benchmark: Fetch Next Page Fetching the next page is easy using the o$$se# keyword: se!e"# * $%om news whe%e #op&" ' 1234 o%de% b( da#e des", &d des" o))se# 10 !&m&# 10;
  8. 8. Worst Case: No Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'20) So%# Me#hod: #op-N heapso!# Memo!": 19(B -> B&#map Heap S"an (a"#ua! %ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (a"#ua! %ows'10000) Index *ond: (#op&" ' 1234)
  9. 9. Worst Case: No Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'30) So%# Me#hod: #op-N heapso!# Memo!": 20(B -> B&#map Heap S"an (a"#ua! %ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (a"#ua! %ows'10000) Index *ond: (#op&" ' 1234)
  10. 10. Worst Case: No Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'40) So%# Me#hod: #op-N heapso!# Memo!": 22(B -> B&#map Heap S"an (a"#ua! %ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (a"#ua! %ows'10000) Index *ond: (#op&" ' 1234)
  11. 11. Worst Case: No Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10000) So%# Me#hod: ex#e!na& me!*e D%s(: 1200(B -> B&#map Heap S"an (a"#ua! %ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (a"#ua! %ows'10000) Index *ond: (#op&" ' 1234)
  12. 12. Worst Case: No Index for order by Sorting might become the limiting factor when browsing farther back. Fetching the last page can take considerably longer than fetching the first page.
  13. 13. Improvement 1: Indexed order by se!e"# * $%om news whe%e #op&" ' 1234 o%de% b( da#e des", &d des" o$$se# 10 !&m&# 10; $!ea#e %ndex .. on news (#op%$, da#e, %d); A single index to support the whe%e and o%de% b( clauses.
  14. 14. Improvement 1: Indexed order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: (#op&" ' 0)
  15. 15. Improvement 1: Indexed order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'20) Index *ond: (#op&" ' 0)
  16. 16. Improvement 1: Indexed order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'30) Index *ond: (#op&" ' 0)
  17. 17. Improvement 1: Indexed order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'40) Index *ond: (#op&" ' 0)
  18. 18. Improvement 1: Indexed order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10000) So%# Me#hod: ex#e!na& me!*e D%s(: 1200(B -> B&#map Heap S"an (a"#ua! %ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (a"#ua! %ows'10000) Index *ond: (#op&" ' 1234)
  19. 19. Improvement 1: Indexed order by Fetching the first page is not affected by the Base-Set size! Fetching the next page is also faster. However, PostgreSQL might take a Bitmap Index Scan when browsing to the end.
  20. 20. © 2013 by Markus Winand We can do better!
  21. 21. © 2013 by Markus Winand Don’t touch what you don’t need
  22. 22. Improvement 2: The Seek Method Instead of o$$se#, use a whe%e filter to remove the rows from previous pages. se!e"# * $%om news whe%e #op&" ' 1234 and (da#e, %d) < (p!e+_da#e, p!e+_%d) o%de% b( da#e des", &d des" !&m&# 10; Only select the rows “before” (=earlier date) the last!row from the previous page. A definite sort order is really required!
  23. 23. © 2013 by Markus Winand Side Note: Row Values Besides scalar values, SQL also defines “row values” or “composite values.” ‣ In the SQL standard since ages (SQL-92) ‣ All comparison operators are well defined ‣ E.g.: (x, y) > (a, b) is true iff (x > a or (x=a and y>b)) ‣ In other words, when (x,y) sorts after (a,b) ‣ Great PostgreSQL support since 8.0!
  24. 24. © 2013 by Markus Winand Row Values: How it works
  25. 25. © 2013 by Markus Winand Row Values: How it works
  26. 26. © 2013 by Markus Winand Row Values: How it works
  27. 27. © 2013 by Markus Winand Row Values: Emulating them ‣ There is only a single database that has sufficient Row-Values support for this trick (PostgreSQL). ‣ With all other database, you must emulate it according to this scheme: !"#$#%&'()*+',)%-.%/ %%%%012%3 %%%%%%%%%%%3&'()*+',)%-%/4 %%%%%%%%%5$ %%%%%%%%%%%3&'()*+',)%.%/%012%&'()*6+%-%/4 %%%%%%%%4 See: http://use-the-index-luke.com/sql/partial-results/fetch-next-page#sb-equivalent-logic
  28. 28. Seek-Method without Optimal Index se!e"# * $%om news whe%e #op&" ' 1234 and (da#e, &d) < (p%e+_da#e, p%e+_&d) o%de% b( da#e des", &d des" !&m&# 10; $!ea#e %ndex .. on news (#op%$);
  29. 29. Seek Method w/o Index for order by L&m&# (a$#ua& !ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (!ows'10000) Re"he") *ond: (#op&" ' 1234) -> B&#map Index S"an (!ows'10000) Index *ond: (#op&" ' 1234)
  30. 30. Seek Method w/o Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (a$#ua& !ows'9990) Rows Remo+ed b" F%&#e!: 10 (new &n 9.2) -> B&#map Index S"an (a$#ua& !ows'10000) Index *ond: (#op&" ' 1234)
  31. 31. Seek Method w/o Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (a$#ua& !ows'9980) Rows Remo+ed b" F%&#e!: 20 (new &n 9.2) -> B&#map Index S"an (a$#ua& !ows'10000) Index *ond: (#op&" ' 1234)
  32. 32. Seek Method w/o Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (a$#ua& !ows'9970) Rows Remo+ed b" F%&#e!: 30 (new &n 9.2) -> B&#map Index S"an (a$#ua& !ows'10000) Index *ond: (#op&" ' 1234)
  33. 33. Seek Method w/o Index for order by L&m&# (a"#ua! %ows'10) -> So%# (a$#ua& !ows'10) So%# Me#hod: #op-N heapso!# Memo!": 18(B -> B&#map Heap S"an (a$#ua& !ows'10) Rows Remo+ed b" F%&#e!: 9990 (new &n 9.2) -> B&#map Index S"an (a$#ua& !ows'10000) Index *ond: (#op&" ' 1234)
  34. 34. Seek Method w/o Index for order by Always needs to retrieve the full base set, but the top-n sort buffer needs to hold only 10 rows. The response time remains the same even when browsing to the last page. And the memory footprint is very low!
  35. 35. Seek-Method with Optimal Index se!e"# * $%om news whe%e #op&" ' 1234 and (da#e, &d) < (p%e+_da#e, p%e+_&d) o%de% b( da#e des", &d des" !&m&# 10; $!ea#e %ndex .. on news (#op%$, da#e, %d);
  36. 36. Seek Method with index for order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: (#op&" ' 1234)
  37. 37. Seek Method with index for order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: ((#op&" ' 1234) AND (ROW(d#, &d) < ROW(‘...’, 23456)))
  38. 38. Seek Method with index for order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: ((#op&" ' 1234) AND (ROW(d#, &d) < ROW(‘...’, 34567)))
  39. 39. Seek Method with index for order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: ((#op&" ' 1234) AND (ROW(d#, &d) < ROW(‘...’, 45678)))
  40. 40. Seek Method with index for order by L&m&# (a$#ua& !ows'10) -> Index S"an Ba")wa%d (a$#ua& !ows'10) Index *ond: ((#op&" ' 1234) AND (ROW(d#, &d) < ROW(‘...’, 56789)))
  41. 41. Seek Method with index for order by Successively browsing back doesn’t slow down. Neither the size of the base set(*) nor the fetched page number affects the response time. (*) the index tree depth still affects the response time.
  42. 42. ComparisonOffsetSeek W/O index for o%de% b( With index for o%de% b(
  43. 43. © 2013 by Markus Winand Too good to be true? The Seek Method has serious limitations ‣You cannot directly navigate to arbitrary pages ‣because you need the values from the previous page ‣Bi-directional navigation is possible but tedious ‣you need to revers the o%de% b( direction and RV comparison ‣Works best with full row values support ‣Workaround is possible, but ugly and less performant ‣Framework support? ‣jOOQ 3.3 introduced native support for the Seek-Method ‣order_query for Ruby by @glebm: http://github.com/glebm/order_query
  44. 44. © 2013 by Markus Winand A Perfect Match for Infinite Scrolling The “Infinite Scrolling” UI doesn’t need to ... ‣navigate to arbitrary pages ‣there are no buttons ‣Browse backwards ‣previous pages are still in the browser ‣Stable results ‣New rows don’t affect the result No need for de-duplication in JS
  45. 45. Also a Perfect Match for PostgreSQL o%de% b( support matrix row values support matrix Popular Popular Advanced Advanced
  46. 46. © 2013 by Markus Winand About Markus Winand Tuning developers for high SQL performance Training & co (one-man show): winand.at Geeky blog: use-the-index-luke.com Author of: SQL Performance Explained

×