libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
1. libinjection
New Techniques in Detecting SQLi Attacks
iSEC Partners Open Forum
Gilt Group, New York, September 6 2012
Nick Galbreath @ngalbreath nickg@client9.com
3. First presented at
Black Hat USA 2012
http://client9.com/20120725
iSEC Partners party at Bellagio
4. The Next 15 Minutes
• You know what SQLi is and why it's
important
• Why detecting SQLi is a hard problem
• Why current solutions aren't so good
• The libinjection algorithm and library
5. It's Easy to Get Started
with Regular Expressions
s/UNION\s+(ALL)?/i
‣ At least two open source WAFs
use regular expressions.
‣ Failure cases in closed-source
WAFs also point to regular expressions.
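A minimal sketch of why a pattern like the one above is fragile. It assumes POSIX `regex.h`; since POSIX extended regular expressions have no portable `\s`, `[[:space:]]` stands in for it. The function name `naive_union_match` is made up for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the naive UNION pattern matches input, 0 if it does not,
 * and -1 if the pattern fails to compile. */
int naive_union_match(const char *input)
{
    regex_t re;
    int rc;

    assert(input != NULL);
    /* [[:space:]] replaces \s, which POSIX ERE does not define. */
    if (regcomp(&re, "UNION[[:space:]]+(ALL)?",
                REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0) {
        return -1;
    }
    rc = regexec(&re, input, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this sketch, the benign input `credit union hours` matches (a false positive), while `1 UNION/**/ALL SELECT 1`, where a comment takes the place of whitespace, does not match (a false negative).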
10. Money Literals
• MSSQL has a money type.
• -$45.12
• $123.0
• +$1,000,000.00 (commas ignored)
• Haven't experimented with this yet.
• Does it auto-cast to a float or int type?
11. Ridiculous Operators
• != not equals, standard
• <=> mysql
• <> mssql
• ^= oracle
• !>, !< not greater than, not less than (mssql)
• / oracle
• !! factorial (pgsql)
• |/ square root (pgsql)
• ||/ cube root (pgsql)
• ** exponents (oracle)
• # bitwise xor (pgsql, conflicts with mysql comment)
14. Proven Fail
‣ At Black Hat USA 2005, Hansen and
Patterson presented:
Guns and Butter: Towards Formal Axioms of
Input Validation (http://bit.ly/OBe7mJ)
‣ …formally proved that for any regex validator, we
could construct either a safe query which would be
flagged as dangerous, or a dangerous query which
would be flagged as correct.
‣ (summary from libdejector documentation)
17. Key Insight
‣ A SQLi attack must be parsed as SQL within
the original query.
‣ SQL has a rigid syntax
‣ it works, or it's a syntax error.
‣ Compare this to HTML/XSS rules
‣ "Is it a SQLi attack?" becomes
"Could it be a SQL snippet?"
18. Only 3 Contexts
User input is only "injected" into SQL in three
ways:
‣ As-Is
‣ Inside a single quoted string
‣ Inside a double quoted string
This means we have to parse input three times.
Compare to XSS
19. Identification of
SQL snippets
without context is hard
‣ 1-917-660-3400 my phone number or an
arithmetic expression in SQL?
‣ @ngalbreath my twitter account or a SQL
variable?
‣ English-like syntax and common keywords:
union, group, natural, left, right, join, top,
table, create, in, is, not, before, begin, between
20. Existing SQL Parsers
‣ Only parse their flavor of SQL
‣ Not well designed to handle snippets
‣ Hard to extend
‣ Worried about correctness
... so I wrote my own!
21. Tokenization
‣ Converts input into a stream of tokens
‣ Uses "master list" of keywords and functions
across all databases.
‣ Handles comments, strings, literals, weirdos.
22. 5000224' UNION USER_ID>0--
[ ('...5000224', string),
('UNION', union operator),
('USER_ID', name),
('>', operator),
('0', number),
('--.....', comment) ]
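The token stream above can be reproduced with a toy tokenizer. This is a sketch under stated assumptions, not the library's code: `toy_tokenize` is a made-up name, the keyword list contains only UNION, and the type letters 'n' (name), 'o' (operator), and 'c' (comment) are assumed labels; 's', 'U', and '1' follow the slides. Input is expected to be upper-cased already.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Tokenizes s and writes one type character per token into out
 * (NUL-terminated, at most outsz - 1 tokens). If quote is '\'' or '"',
 * parsing starts as if already inside a string opened by that quote.
 * Returns the number of tokens written. */
size_t toy_tokenize(const char *s, char quote, char *out, size_t outsz)
{
    size_t i = 0, n = 0, len;

    assert(s != NULL && out != NULL && outsz > 0);
    len = strlen(s);

    if (quote != 0) {
        /* Leading string token: everything up to the closing quote. */
        while (i < len && s[i] != quote) i++;
        if (i < len) i++;                 /* skip the closing quote */
        if (n + 1 < outsz) out[n++] = 's';
    }
    while (i < len && n + 1 < outsz) {
        unsigned char c = (unsigned char)s[i];
        if (isspace(c)) {
            i++;
        } else if (c == '-' && i + 1 < len && s[i + 1] == '-') {
            out[n++] = 'c';               /* comment runs to end of input */
            i = len;
        } else if (isdigit(c)) {
            while (i < len && isdigit((unsigned char)s[i])) i++;
            out[n++] = '1';
        } else if (isalpha(c) || c == '_') {
            size_t start = i;
            while (i < len &&
                   (isalnum((unsigned char)s[i]) || s[i] == '_')) i++;
            /* A real keyword list is much larger; UNION is enough here. */
            out[n++] = (i - start == 5 &&
                        strncmp(s + start, "UNION", 5) == 0) ? 'U' : 'n';
        } else if (c == '\'' || c == '"') {
            i++;
            while (i < len && s[i] != (char)c) i++;
            if (i < len) i++;
            out[n++] = 's';
        } else {
            out[n++] = 'o';               /* anything else: an operator */
            i++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Calling `toy_tokenize("5000224' UNION USER_ID>0--", '\'', buf, sizeof buf)` yields six tokens in the same order as the slide: string, union, name, operator, number, comment. Passing `0` as the quote parses the same input as-is and gives a different result, which is why each context has to be parsed separately.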
23. Meet the Tokens
‣ none/name
‣ variable
‣ string
‣ regular operator
‣ unknown
‣ number
‣ comment
‣ keyword
‣ group-like operation
‣ union-like operator
‣ logical operator
‣ function
‣ comma
‣ semi-colon
‣ left parens
‣ right parens
24. Merging,
Specialization,
Disambiguation
‣ "IS", "NOT" => "IS NOT" (single operator)
‣ "NATURAL", "JOIN" => "NATURAL JOIN"
‣ ("+", operator) => ("+", unary operator)
‣ (COS, function), (1, number) =>
(COS, name), (1, number)
Functions must be followed by a
parenthesis!
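Two of the rewrites above, sketched as a single pass over a token array. The `struct token` layout, the function name `merge_tokens`, and the type letters ('o' operator, 'f' function, 'n' name, '(' left parenthesis) are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token record used only in this sketch. */
struct token {
    char type;
    char val[32];
};

/* Rewrites toks in place and returns the new token count:
 *  - adjacent "IS", "NOT" become one "IS NOT" operator
 *  - a function not followed by '(' is demoted to a plain name */
size_t merge_tokens(struct token *toks, size_t n)
{
    size_t in, out = 0;

    assert(toks != NULL || n == 0);
    for (in = 0; in < n; in++) {
        struct token t = toks[in];
        if (in + 1 < n && strcmp(t.val, "IS") == 0 &&
            strcmp(toks[in + 1].val, "NOT") == 0) {
            t.type = 'o';
            strcpy(t.val, "IS NOT");    /* fits easily in val[32] */
            in++;                       /* consume the NOT token too */
        } else if (t.type == 'f' &&
                   (in + 1 >= n || toks[in + 1].type != '(')) {
            t.type = 'n';
        }
        toks[out++] = t;
    }
    return out;
}
```

Because the output index never passes the input index, the array can be rewritten in place without a second buffer.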
25. Folding
‣ This step isn't needed for detection, but
it is needed to reduce false positives.
‣ Converts simple arithmetic expressions into a
single value (without trying to evaluate them).
‣ 1-917-660-3400 -> "1"
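A minimal sketch of folding, working on a string of token-type letters rather than on real tokens. '1' is a number, as in the slides; 'o' for an operator and the name `fold_arithmetic` are assumptions. Nothing is evaluated: a run of number, operator, number simply collapses to one number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Folds runs of number-operator-number in a token-type string, in place.
 * For example "1o1o1o1" collapses to "1". */
void fold_arithmetic(char *types)
{
    size_t in = 0, out = 0, len;

    assert(types != NULL);
    len = strlen(types);
    while (in < len) {
        types[out++] = types[in];
        if (types[in] == '1') {
            /* Swallow every following "o1" pair. */
            while (in + 2 < len &&
                   types[in + 1] == 'o' && types[in + 2] == '1') {
                in += 2;
            }
        }
        in++;
    }
    types[out] = '\0';
}
```

The phone number 1-917-660-3400 has the type string `1o1o1o1` under these labels, so it folds to a single number and no longer looks like a long SQL expression.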
26. Knows nothing about SQLi
‣ So far this is purely a parsing problem.
‣ Knows nothing about SQLi (which is evolving)
‣ Can be 100% tested against any SQL input
(not SQLi) for correctness.
‣ Language independent test cases
$ cat test-tokens-numbers-floats-003.txt
--TEST--
floating-point parsing test
--INPUT--
SELECT .0;
--EXPECTED--
k SELECT
1 .0
; ;
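Because the test files are plain text, a port in any language only needs a small reader. A sketch in C, assuming the three markers shown above; the function name `test_section` and its signature are made up here, and a marker string that also appears inside the SQL input would confuse this simple version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the text between the marker `begin` and the marker `end` into
 * out, without trailing newlines. Pass end == NULL to read to the end of
 * text. Returns 0 on success, -1 if a marker is missing or out is too
 * small. */
int test_section(const char *text, const char *begin, const char *end,
                 char *out, size_t outsz)
{
    const char *start, *stop;
    size_t n;

    assert(text != NULL && begin != NULL && out != NULL);
    start = strstr(text, begin);
    if (start == NULL) return -1;
    start += strlen(begin);
    if (*start == '\n') start++;              /* skip the marker's newline */

    stop = (end != NULL) ? strstr(start, end) : start + strlen(start);
    if (stop == NULL) return -1;
    n = (size_t)(stop - start);
    while (n > 0 && start[n - 1] == '\n') n--;   /* trim trailing newlines */
    if (n + 1 > outsz) return -1;
    memcpy(out, start, n);
    out[n] = '\0';
    return 0;
}
```

A test runner would read the `--INPUT--` section, tokenize it, print one "type value" pair per line, and compare the result with the `--EXPECTED--` section.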
27. Fingerprints
‣ The token types of a user input form a hash or
a fingerprint.
‣ -6270" UNION ALL SELECT 5594, 5594, 5594, 5594, 5594, 5594,
5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594,
5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594,
5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594, 5594,
5594, 5594, 5594, 5594, 5594# AND "JWWQ"="JWWQ
‣ becomes "sUk1,1,1,1,1,1,1,1,&"
‣ Now let's generate fingerprints from
Real World Data.
‣ Can we distinguish between SQLi and benign
input?
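Once fingerprints exist, detection can be a plain table lookup. A sketch assuming a small sorted table and the C standard library's `bsearch`; the table contents are illustrative only (the first entry is the fingerprint shown above, the second uses assumed type letters), and `is_known_sqli` is a made-up name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative table only, kept sorted for bsearch. A real table would
 * be generated from the training data. */
static const char *const known_sqli[] = {
    "sUk1,1,1,1,1,1,1,1,&",
    "sUno1c",
};

/* bsearch passes the key first and a pointer to a table element second. */
static int cmp_fingerprint(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if fingerprint is in the known-SQLi table, 0 otherwise. */
int is_known_sqli(const char *fingerprint)
{
    assert(fingerprint != NULL);
    return bsearch(fingerprint, known_sqli,
                   sizeof known_sqli / sizeof known_sqli[0],
                   sizeof known_sqli[0], cmp_fingerprint) != NULL;
}
```

Keeping the table as data rather than code means new attack fingerprints can be added without touching the parser.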
28. Training on SQLi
‣ Parse known SQLi attacks from
‣ SQLi vulnerability scanners
‣ Published reports
‣ SQLi How-Tos
‣ > 32,000 total
‣ Since Black Hat, donations from
‣ ModSecurity
‣ Qualys
‣ > 50,000 total
29. Training on Real Input
‣ Hundreds of millions of user inputs from Etsy's
access logs were also parsed.
‣ Large enough to get a good sample (Top 50
USA site)
‣ Old enough to have lots of odd ways of query
string formatting.
‣ Full text search with a diverse subject
domain
30. How many tokens are
needed to determine if
user input is SQLi or not?
33. The Library
On GitHub Now
~500 Lines of Code
One file + data
No memory allocation
No threads
No external dependencies
Fixed stack size
>100k checks a second
34. tada
#include "sqlparse.h"
#include <string.h>

int main()
{
    const char* ucg = "1 OR 1=1";
    // input should be normalized, upper-cased
    // You can use sqli_normalize
    // if you don't have your own function
    sfilter sf;
    return is_sqli(&sf, ucg, strlen(ucg));
}
$ gcc -Wall -Wextra sample.c sqlparse.c
$ ./a.out
$ echo $?
1
35. What's Next?
• Change API to allow passing in
fingerprint data or a function. Allows
upgrades without code changes.
• Can we reduce the number of tokens?
String, variables, numbers are all just
values.
• Folding of comma-separated values?
1,2,3,4 => 1
• Can we just eliminate all parentheses?
36. Help!
• More SQLi from the field please!
• False positives welcome
• More test cases with exotic SQL to test
parser.
• Ports to other languages (the language-
neutral test framework should make this
easier).
• Compiling on Windows (mostly tested on
Mac OS X and Linux)
37. Slides and Source Code:
http://www.client9.com/libinjection/
Nick Galbreath
@ngalbreath
nickg@client9.com
Sept 20 OWASP NY
SQL obfuscation and libinjection
Oct 25, OWASP USA, Austin, Texas
Continuous Deployment and Security
Thanks