sqlmap - Under the Hood


These are the slides from a talk "sqlmap - Under the Hood" held at PHDays 2013 conference (Russia / Moscow 23rd–24th May 2013) by Miroslav Stampar.

Published in: Technology

  1. sqlmap – Under the Hood, Miroslav Štampar
  2. BigArray [PHDays 2013, Moscow (Russia), May 23, 2013]
       • Support for huge table dumps (e.g. millions of rows)
       • Raw data needs to be held somewhere before being processed (and eventually stored)
       • In-memory storage was a good enough choice until recent years (user appetites grew)
       • Avoidance of MemoryError
       • Memory is mapped into smaller chunks/pages (e.g. 4096 entries)
       • Temporary files are used for storing chunks
       • O(1) read/write access (page table principle)
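
The page-table idea on the BigArray slide can be sketched roughly as follows. This is a minimal illustration and not sqlmap's actual implementation: the class name `ChunkedArray`, its attributes, and the single-chunk cache are assumptions made for the example; only the 4096-entry page size and the use of temporary files come from the slide.

```python
import os
import pickle
import tempfile

CHUNK_LENGTH = 4096  # entries per page, the figure given on the slide


class ChunkedArray:
    """Hypothetical append-only list: full chunks go to temp files, one stays cached."""

    def __init__(self, chunk_length=CHUNK_LENGTH):
        self.chunk_length = chunk_length
        self.chunk_files = []    # "page table": chunk index -> temp file path
        self.tail = []           # chunk that is currently being filled
        self.cache_index = None  # index of the flushed chunk held in memory
        self.cache = None

    def append(self, value):
        self.tail.append(value)
        if len(self.tail) >= self.chunk_length:
            # Flush the full chunk to its own temporary file.
            fd, path = tempfile.mkstemp(suffix=".chunk")
            with os.fdopen(fd, "wb") as handle:
                pickle.dump(self.tail, handle, pickle.HIGHEST_PROTOCOL)
            self.chunk_files.append(path)
            self.tail = []

    def __len__(self):
        return len(self.chunk_files) * self.chunk_length + len(self.tail)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Constant-time address translation: chunk number plus offset.
        chunk, offset = divmod(index, self.chunk_length)
        if chunk == len(self.chunk_files):
            return self.tail[offset]
        if chunk != self.cache_index:
            with open(self.chunk_files[chunk], "rb") as handle:
                self.cache = pickle.load(handle)
            self.cache_index = chunk
        return self.cache[offset]

    def close(self):
        # Remove the temporary files once the data is no longer needed.
        for path in self.chunk_files:
            os.remove(path)
        self.chunk_files = []
        self.cache = self.cache_index = None
```

Sequential reads touch each temporary file once, so memory use stays bounded by roughly two chunks regardless of how many rows are dumped.
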
  3. HashDB
       • Storage of resumable session data at a centralized place (local SQLite3 database)
       • Non-ASCII values are automatically serialized/deserialized (pickle)
       • INSERT INTO storage VALUES (LONG(MD5(target_url || key || MILESTONE_SALT)[:8]), stored_value)
       • MILESTONE_SALT is changed whenever a change in the HashDB mechanism breaks compatibility with previous versions
       • key uniquely describes stored_value for a given target_url (e.g. KB_INJECTIONS, SELECT banner FROM v$version WHERE ROWNUM=1, etc.)
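
The HashDB keying scheme can be illustrated with a small sketch. Several details here are assumptions rather than sqlmap's code: the `SessionStore` class, the salt value, reading `[:8]` as the first eight hex digits of the MD5 digest, and pickling every value (the slide says only non-ASCII values are serialized).

```python
import hashlib
import pickle
import sqlite3

MILESTONE_SALT = "example-salt"  # hypothetical value; the real salt is not given on the slide


def hash_key(target_url, key):
    """Derive an integer row id from target URL, key and salt (assumed: first 8 hex digits)."""
    material = (target_url + key + MILESTONE_SALT).encode("utf-8")
    return int(hashlib.md5(material).hexdigest()[:8], 16)


class SessionStore:
    """Hypothetical resumable-session store backed by a local SQLite3 database."""

    def __init__(self, path=":memory:"):
        self.connection = sqlite3.connect(path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS storage (id INTEGER PRIMARY KEY, value BLOB)")

    def write(self, target_url, key, value):
        # Simplification: every value is pickled, not only non-ASCII ones.
        self.connection.execute(
            "INSERT OR REPLACE INTO storage VALUES (?, ?)",
            (hash_key(target_url, key), pickle.dumps(value)))
        self.connection.commit()

    def read(self, target_url, key):
        row = self.connection.execute(
            "SELECT value FROM storage WHERE id = ?",
            (hash_key(target_url, key),)).fetchone()
        return pickle.loads(row[0]) if row else None
```

Because the salt is part of the hashed material, changing it makes every previously stored row unreachable, which is how an incompatible session format would be invalidated without deleting the file.
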
  4. Payloads
       • XML format (xml/payloads.xml)
       • Tag type <boundary> stores all possible prefix and suffix formations (<prefix>, <suffix>) together with context-sensitive information (subtags <level>, <clause>, <where> and <ptype>)
       • Tag type <test> stores the data required for successful testing and usage of each SQL injection payload type (subtags <title>, <stype>, <level>, <risk>, <clause>, <where>, <vector>, <request> and <response>)
  5. Payloads (2)
       <boundary>
           <level>1</level>
           <clause>1</clause>
           <where>1,2</where>
           <ptype>1</ptype>
           <prefix>)</prefix>
           <suffix>AND ([RANDNUM]=[RANDNUM]</suffix>
       </boundary>
  6. Payloads (3)
       <test>
           <title>Microsoft SQL Server/Sybase AND error-based - WHERE or HAVING clause (IN)</title>
           <stype>2</stype>
           <level>2</level>
           <risk>0</risk>
           <clause>1</clause>
           <where>1</where>
           <vector>AND [RANDNUM] IN (([DELIMITER_START]+([QUERY])+[DELIMITER_STOP]))</vector>
           <request>
               <payload>AND [RANDNUM] IN (([DELIMITER_START]+(SELECT (CASE WHEN ([RANDNUM]=[RANDNUM]) THEN 1 ELSE 0 END))+[DELIMITER_STOP]))</payload>
           </request>
           <response>
               <grep>[DELIMITER_START](?P&lt;result&gt;.*?)[DELIMITER_STOP]</grep>
           </response>
           <details>
               <dbms>Microsoft SQL Server</dbms>
               <dbms>Sybase</dbms>
               <os>Windows</os>
           </details>
       </test>
  7. Queries
       • XML format (xml/queries.xml)
       • Tag type <dbms> stores all DBMS-specific SQL formations required for successful enumeration (subtags <users>, <passwords>, <dbs>, <tables>, <columns>, <dump_table>, etc.) and for (pre)processing of the resulting data (subtags <cast>, <length>, <isnull>, <count>, <substring>, <concatenate>, etc.)
       • Each enumeration subtag has an <inband> and a <blind> form used in the respective techniques
  8. Queries (2)
       <dbms value="MySQL">
           <cast query="CAST(%s AS CHAR)"/>
           <length query="CHAR_LENGTH(%s)"/>
           <isnull query="IFNULL(%s, )"/>
           <delimiter query=","/>
           <limit query="LIMIT %d,%d"/>
           …
           <passwords>
               <inband query="SELECT user,password FROM mysql.user" condition="user"/>
               <blind query="SELECT DISTINCT(password) FROM mysql.user WHERE user=%s LIMIT %d,1" count="SELECT COUNT(DISTINCT(password)) FROM mysql.user WHERE user=%s"/>
           </passwords>
           …
  9. Multithreading
       • Multithreading is implemented wherever applicable (option --threads)
       • Techniques covered: boolean-based blind, error-based and partial UNION query
       • Deliberately turned off for time-based and stacked techniques (lots of reasons)
       • In boolean-based blind, each thread covers a part of the value
       • In other techniques, each thread covers one enumerated entry
       • Also implemented for brute-force column/table name search and for crawling
  10. Direct connection
       • Direct connection to the DBMS (option -d)
       • python -d "mysql://root:password123@"
       • Support for: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, SQLite, Microsoft Access, Firebird, SAP MaxDB, Sybase, IBM DB2
       • Third-party connectors are used (e.g. python-pymssql, pymysql, cx_Oracle, python-psycopg2, etc.)
       • SQLAlchemy is used as an alternative
  11. Load request(s) from file
       • Load HTTP request(s) from a text file (option -r)
       • Supports the RAW request format (any MITM proxy can be used to capture one)
       • Particularly useful for requests with a large content body (e.g. POST)
       • Load and parse log files (option -l)
       • Supports Burp and WebScarab log formats
       • Unlimited number of parsed HTTP requests (only unique ones are used)
  12. Content type detection
       • Automatic detection of (specialized) request content types
       • Supports SOAP, JSON and (generic) XML
       • For example:
           --data="{ "pid": 4412, "id":1, "action": "do"}"
           --data="<request><pid>4412</pid><id>1</id><action>do</action></request>"
       • Appropriate exploitation of parameter values
       • In case of non-supported format(s), a custom injection mark (*) can be used
  13. Site crawling/form searching
       • Collects usable (on-site) target links (option --crawl)
       • The user defines the crawling depth (e.g. 3), limiting the search by distance from the starting page
       • Optional form searching at visited pages (switch --forms)
       • Arbitrary filling of missing form data
       • Repair of non-HTML-compliant pages for easier processing
  14. Mnemonics
       • Usage of mnemonics for faster setting up of sqlmap options and switches (option -z)
       • Longer (original): python --flush-session --threads=4 --ignore-proxy --batch --banner -u …
       • Shorter (using mnemonics): python -z "flu,thre=4,ign,bat,ban" -u …
       • Highly generic prefix-based recognition (e.g. -z "flu,bat,ban" is interpreted the same as -z "flush,batc,bann")
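
Prefix-based recognition of mnemonics could look something like the sketch below. The function `expand_mnemonics` and the short `OPTIONS` list are hypothetical and cover only the flags named on the slide; sqlmap's own matching is described as "highly generic" and is likely more tolerant than a plain `startswith` test.

```python
# Hypothetical subset of long option names, taken from the slide's example.
OPTIONS = ["flush-session", "threads", "ignore-proxy", "batch", "banner"]


def expand_mnemonics(mnemonics, options=OPTIONS):
    """Expand a string such as 'flu,thre=4' into a list of long command-line flags."""
    expanded = []
    for item in mnemonics.split(","):
        name, _, value = item.strip().partition("=")
        matches = [option for option in options if option.startswith(name)]
        if len(matches) != 1:
            # Either nothing matched or the prefix is ambiguous (e.g. 'ba').
            raise ValueError("mnemonic %r matches %s" % (name, matches or "nothing"))
        flag = "--" + matches[0]
        expanded.append(flag + "=" + value if value else flag)
    return expanded
```

Rejecting ambiguous prefixes is a design choice of this sketch; it keeps a short mnemonic from silently selecting the wrong option.
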
  15. Keep-alive
       • HTTP persistent connection (switch --keep-alive)
       • As opposed to a new connection for every single request/response pair
       • Uses a slightly adapted third-party module, keepalive, adjusted for multithreading
       • Connection pool: reuse of existing target connection(s) where applicable
       • Reduced network congestion (fewer TCP connections), reduced latency (no handshaking), faster enumeration, etc.
  16. Tor
       • Support for The Onion Router (Tor) online anonymity network (switch --tor)
       • Conceals identity and network activity
       • Used against surveillance and (targeted) traffic sniffing
       • Configurable Tor proxy type (option --tor-type) and port number (option --tor-port)
       • DNS leakage is prevented (no DNS requests outside of Tor)
       • A safety check for proper usage of Tor is available (switch --check-tor)
  17. Domain name resolution caching
       • By default, a DNS resolution request is made for each HTTP request (by Python's dedicated HTTP modules, e.g. httplib)
       • Noticeable slowdown in some cases (e.g. excessive network latency)
       • Problem noticed and reported by (nagging) users (looking into Wireshark traffic captures)
       • Problem patched at the lowest level (method socket.getaddrinfo(*args, **kwargs) is wrapped for caching)
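
Wrapping `socket.getaddrinfo` for caching, as the slide describes, might be done along these lines. The helper names (`install_dns_cache`, `_cached_getaddrinfo`) and the unbounded dictionary cache without expiry are assumptions of this sketch, not details from the talk.

```python
import socket
import threading

_cache = {}
_cache_lock = threading.Lock()
_original_getaddrinfo = socket.getaddrinfo


def _cached_getaddrinfo(*args, **kwargs):
    """Return a stored result for identical arguments, resolving only on a miss."""
    key = (args, tuple(sorted(kwargs.items())))
    with _cache_lock:
        if key in _cache:
            return _cache[key]
    # Resolve outside the lock so that slow lookups do not block other threads.
    result = _original_getaddrinfo(*args, **kwargs)
    with _cache_lock:
        return _cache.setdefault(key, result)


def install_dns_cache():
    """Replace socket.getaddrinfo process-wide with the caching wrapper."""
    socket.getaddrinfo = _cached_getaddrinfo
```

Since the standard HTTP modules resolve host names through `socket.getaddrinfo`, patching it once covers every later connection made by the process.
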
  18. Authentication methods
       • Implemented support for authentication methods: basic, digest, NTLM and certificate (options --auth-type, --auth-cred and --auth-cert)
       • python -u "" --auth-type=basic --auth-cred="testuser:testpass"
       • Handling of HTTP status code 401 (Unauthorized)
       • Authorization headers are cached (where applicable)
  19. Reflection detection and removal
       • Noisy response resulting from request reflection
       • Query results for: 1%20AND%201%3D1
       • Can cause problems in the detection phase
       • Particularly problematic for the boolean-based blind technique (fuzzy page comparison)
       • Automatic detection of the reflected payload value and marking with a predefined constant value
       • Query results for: __REFLECTED_VALUE__
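
A simplified take on reflection removal is shown below. The marker string comes from the slide; the function `remove_reflection` and the choice to match only the raw and percent-encoded forms of the payload are assumptions, and a real implementation would likely need to tolerate more encodings.

```python
from urllib.parse import quote, unquote

REFLECTED_VALUE_MARKER = "__REFLECTED_VALUE__"  # constant shown on the slide


def remove_reflection(page, payload):
    """Replace reflected copies of the payload in a response with a fixed marker."""
    decoded = unquote(payload)
    variants = {payload, decoded, quote(decoded), quote(decoded, safe="")}
    # Longest variants first, so an encoded form is never partially replaced.
    for variant in sorted(variants, key=len, reverse=True):
        page = page.replace(variant, REFLECTED_VALUE_MARKER)
    return page
```

After this step, two responses that differ only in the echoed payload compare as identical, which is what the fuzzy comparison needs.
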
  20. Dynamicity detection and removal
       • Noisy response resulting from sporadically changing content (e.g. ads, banners, etc.)
       • Can cause problems in both the detection and the enumeration phase
       • Particularly problematic for the boolean-based blind technique
       • Automatic detection and marking of dynamic parts (info held in the internal knowledge base)
       • In the best case, automatic recognition and usage of a string value appearing only in True responses (option --string)
  21. Content filtering
       • Pages are occasionally bloated with non-textual content (CSS styles, comments, JavaScript, HTML tags, embedded objects, etc.)
       • Changes relevant to the boolean-based blind technique usually affect only one small textual part (e.g. a table entry)
       • Optional filtering of non-textual content (switch --text-only)
       • For example: <html>...<td>Toothfairy</td>...</html> is filtered to ...Toothfairy...
       • Better detection and less trash(y) results
  22. Wizard mode
       • For beginner users and script kiddies (switch --wizard)
       • Questions asked:
           Target URL
           POST data (if any)
           Injection difficulty (Normal/Medium/Hard)
           Enumeration (Basic/Intermediate/All)
       • Infamous for the Comodo Brazil breach (March 2011): attackers posted wizard mode console output to Pastebin
  23. Level/risk of detection
       • The number of requests per parameter in the testing phase can grow from 10 up to 10K
       • To prevent unnecessary noise and to speed up testing, tests are classified by level and risk
       • Level (option --level) represents the likelihood that the test case passes or is usable (a higher level means a lower likelihood)
       • Risk (option --risk) represents the potential damage that the test case can cause (a higher risk means higher potential damage)
  24. Heuristic SQL injection checks
       • Recognition of the backend DBMS if an error message can be provoked with an arbitrary invalid SQL sequence (e.g. ())"(")
       • If the parameter value is an integer and the response for (e.g.) 1 is the same as for (2-1), there is a good chance that the target is vulnerable
       • If the boolean-based blind technique is detected, DBMS-specific queries are used (e.g. (SELECT 0x616263)=0x616263) to potentially move the focus to a particular DBMS in further tests
  25. Type casting detection
       • Type casting is an efficient way of dealing with SQL injection on numeric values
       • $query = "SELECT * FROM log WHERE id=" . intval($_GET[id]);
       • Automatic detection of such cases is implemented
       • If the parameter value is an integer and the response for (e.g.) 1 is the same as for 1foobar, there is a good chance that the target is using integer casting
       • The user is warned of a potentially "futile" run
  26. Fingerprinting
       • The web server is fingerprinted by known HTTP headers, cookie values, etc.
       • The DBMS is fingerprinted through error message parsing, banner parsing and tests with version-specific payloads (obtained from release notes and reference manuals)
       • For example, the cookie value ASP.NET_SessionId is specific to the ASP.NET/IIS/Windows platform, while the TO_SECONDS(950501)>0 check should work only on MySQL >= 5.5.0
       • A detailed DBMS version check is done only if switch -f/--fingerprint is used
  27. Suhosin-patch detection
       • Open source patch for PHP, protecting the web server from "insecure PHP practices"
       • suhosin.get.max_value_length (default: 512), etc.
       • Causes problems in the enumeration phase when payloads are big (e.g. enumerating column names)
       • After the detection phase, a single payload (depending on the detected techniques) with a size greater than 512 is sent (e.g. 1 AND 6525= … 6525)
       • The user is warned in case of a False response
  28. WAF/IDS/IPS detection
       • One "suspicious" request (in the form of a dummy parameter value) is sent and the response is checked for change(s) compared to the original (switch --check-waf)
       • WAF scripts (switch --identify-waf) perform thorough checks, each focusing on the peculiarities of a particular product
       • For example, WebKnight responds with HTTP status code 999 on detected suspicious activity
       • Currently there are 29 WAF scripts
  29. WAF/IDS/IPS bypass
       • Tamper scripts (option --tamper) modify the injected payload before it is sent
       • The user has to choose the appropriate one(s) based on collected knowledge of the target's behavior and/or the detected WAF/IDS/IPS product
       • If required, a chain of tamper scripts can be used (e.g. --tamper="between,ifnull2ifisnull")
       • Currently there are 36 tamper scripts
  30. String value escaping
       • Each string value inside a payload is automatically escaped (quoteless format) depending on the targeted DBMS
       • For example: 1 ... AND username="root"-- is, in the case of MySQL, escaped to 1 ... AND username=0x726f6f74--
       • Avoidance of filter-based escaping functions (e.g. addslashes)
       • Adds an implicit dependence on the targeted DBMS
       • Payload obfuscation (harder to notice in target log files)
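
The MySQL case from the slide (a quoted string rewritten as a hexadecimal literal) can be reproduced with a few lines. The function name and the simple regular expression are illustrative assumptions; the sketch does not handle escaped quotes inside a literal, and other DBMSes would need different quoteless forms.

```python
import re

# A quoted run: opening quote, shortest possible content, same closing quote.
STRING_LITERAL = re.compile(r"(['\"])(.+?)\1")


def escape_mysql_strings(payload):
    """Rewrite each quoted string in the payload as a MySQL 0x... hexadecimal literal."""
    def to_hex(match):
        return "0x" + match.group(2).encode("utf-8").hex()
    return STRING_LITERAL.sub(to_hex, payload)
```

For instance, `escape_mysql_strings('1 AND username="root"-- ')` yields `1 AND username=0x726f6f74-- `, matching the value on the slide.
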
  31. Evaluation of custom code
       • Custom Python code can be evaluated before each request (option --eval)
       • In such code, each request parameter is accessible as a local variable
       • All resulting variable values are included in the request as new parameter values
       • --eval="import hashlib;hash=hashlib.md5(id).hexdigest()"
       • AND 1=1&hash=7f134e52836a00e26493e690ed8aa735
  32. Fuzzy page comparison
       • Used (mostly) in the boolean-based blind technique
       • Gestalt pattern matching (Ratcliff-Obershelp algorithm)
       • Supported by the standard Python module difflib
       • Class SequenceMatcher
       • Method ratio() (or the faster quick_ratio()) gives a measure of the sequences' similarity as a float in the range [0, 1]
       • True result if ratio() > 0.98 when compared with the original page
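
The comparison rule on this slide maps directly onto `difflib`. In the sketch below the function name `looks_true` is hypothetical, and disabling `autojunk` plus checking `quick_ratio()` first are choices made for the example rather than details from the talk; the 0.98 threshold is the one quoted on the slide.

```python
from difflib import SequenceMatcher

RATIO_THRESHOLD = 0.98  # threshold quoted on the slide


def looks_true(original_page, candidate_page, threshold=RATIO_THRESHOLD):
    """Treat the candidate response as True if it is nearly identical to the original page."""
    matcher = SequenceMatcher(None, original_page, candidate_page, autojunk=False)
    # quick_ratio() is a cheap upper bound on ratio(), so it can rule pages out early.
    if matcher.quick_ratio() <= threshold:
        return False
    return matcher.ratio() > threshold
```

Calling `ratio()` only when the cheap bound passes avoids the expensive matching step for responses that are obviously different, such as error pages.
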
  33. Definite page comparison
       • Used mostly in the boolean-based blind technique
       • For cases when fuzzy page comparison fails (e.g. too much page dynamicity) and the user is able to distinguish True from False responses themselves (non-n**b)
       • String to match when the result should be recognized as True (option --string)
       • Regular expression to match … (option --regex)
       • Compare HTTP codes (switch --code)
       • Compare HTML titles (switch --title)
  34. Null connection
       • Sometimes there is no need to retrieve the whole page content (its size can be enough)
       • Boolean-based blind technique
       • 3 methods: Range, HEAD and "skip-read"
       • Request header Range: bytes=-1, response header Content-Range: bytes 4789-4790/4790
       • Request line HEAD /search.aspx HTTP/1.1, response header Content-Length: 4790
       • Both result (if applicable) in either an empty or a 1-character-long response
       • Method "skip-read" retrieves only the HTTP headers, looking for Content-Length
  35. False positive detection
       • False positives are highly undesirable
       • Specific to boolean-based blind and time-based blind techniques
       • False positive tests are done in cases when only one of those techniques is detected
       • A set of trivial mathematical checks is performed to see if the target can "respond" correctly
       • For example:
           (123+447)=570
           319>(519+110)
           (654+267)>854
  36. Delay detection
       • Detection of "artificial" delay
       • Statistical comparison with normal response times
       • Response time must fit under the Gaussian bell curve to be marked as "normal"
       • Is <current_response_time> > avg(<normal_response_times>) + 7*stdev(<normal_response_times>)?
       • If the answer is yes, the probability that we are dealing with an "artificial" delay is 99.9999999997440%
       • Especially useful when heavy queries are used (not knowing the expected delay value)
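
The threshold test from the slide can be written out in a few lines. The function name, the use of the sample standard deviation from the `statistics` module, and the example timings are assumptions; only the formula (average plus seven standard deviations) is taken from the talk.

```python
from statistics import mean, stdev

TIME_STDEV_COEFFICIENT = 7  # multiplier used in the slide's formula


def is_artificial_delay(current_time, normal_times, coefficient=TIME_STDEV_COEFFICIENT):
    """Return True if a response time lies far outside the distribution of normal times."""
    if len(normal_times) < 2:
        raise ValueError("at least two normal response times are needed")
    threshold = mean(normal_times) + coefficient * stdev(normal_times)
    return current_time > threshold


# Hypothetical baseline measurements (seconds) collected before injection.
baseline = [0.20, 0.22, 0.19, 0.21, 0.23, 0.20]
```

With this baseline, a 5.2-second response is flagged while a 0.25-second one is not, without the tester having to know in advance how long the injected delay should take.
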
  37. Delay detection (2)
  38. UNION query column #
       • UNION query requires knowledge of the number of columns (N) of the vulnerable SQL statement
       • Two methods are used: ORDER BY and statistical (same principle as in delay detection)
       • The response for ORDER BY N+1 should be noticeably different (preferably an error message) from the response for ORDER BY N (found by binary search)
       • In the statistical method, responses for candidates (UNION SELECT NULL, NULL,...) are compared to the original (not injected) response
       • The right one is the one that seems "not normal" (having a ratio outside the Gaussian bell curve)
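
The ORDER BY method amounts to a binary search for the largest N that still produces a normal response. In this sketch, `order_by_succeeds` is a hypothetical callback standing in for "send ORDER BY n and judge the response", and the upper bound of 50 is an arbitrary assumption.

```python
def find_column_count(order_by_succeeds, upper_bound=50):
    """Binary-search the largest n for which 'ORDER BY n' still behaves normally.

    Returns None when the method is not applicable: ORDER BY 1 already fails,
    or even the upper bound succeeds.
    """
    if not order_by_succeeds(1) or order_by_succeeds(upper_bound):
        return None
    low, high = 1, upper_bound  # invariant: low succeeds, high fails
    while high - low > 1:
        middle = (low + high) // 2
        if order_by_succeeds(middle):
            low = middle
        else:
            high = middle
    return low
```

Compared with trying 1, 2, 3, ... in turn, the search needs only about log2(upper_bound) requests after the two boundary checks.
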
  39. Output prediction
       • Inference techniques (boolean-based blind and time-based blind) require optimization wherever and whenever possible
       • In certain cases prediction(s) can be made
       • Checking whether the entry currently being retrieved shares a prefix with previously retrieved entries
       • For example, DROP ANY ROLE has the same prefix as DROP ANY RULE (one request per checked character compared to bit-by-bit retrieval)
       • Common output values are used too (e.g. information_schema, phpmyadmin, etc.)
  40. Brute forcing identifier names
       • If the schema is missing (e.g. deleted information_schema), a brute-force search is required (e.g. 1=(SELECT 1 FROM users))
       • Searching for common table names (switch --common-tables)
       • Searching for common column names (switch --common-columns)
       • The word lists come from an automated search for chosen Google dorks (e.g. ext:sql "CREATE TABLE") and parsing of the resulting SQL files
       • The most frequent 3.3K table names and 2.5K column names were collected
  41. Pivot dump table
       • Some DBMSes (e.g. Microsoft SQL Server) don't have an OFFSET/LIMIT query mechanism, which makes enumeration problematic in non-UNION query techniques
       • The column with the most DISTINCT values is automatically chosen as the pivot column
       • The pivot's first value bigger than the previous one (e.g. SELECT MIN(id) WHERE id > ) is retrieved
       • Entries for the other columns (e.g. SELECT name WHERE id=1) are retrieved using the current pivot value
       • Iterative process
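
The pivot iteration can be tried out against any SQL engine. The sketch below assumes an integer pivot column and a hypothetical `query_value` callback that returns the single value a query produces, which mirrors how one value at a time is extracted through an injection; it is an illustration of the loop, not sqlmap's code.

```python
def pivot_dump(query_value, table, pivot, columns):
    """Dump a table row by row without OFFSET/LIMIT, walking an integer pivot column."""
    rows = []
    # Smallest pivot value first.
    current = query_value("SELECT MIN(%s) FROM %s" % (pivot, table))
    while current is not None:
        row = {pivot: current}
        for column in columns:
            # Fetch the other columns using the current pivot value.
            row[column] = query_value(
                "SELECT %s FROM %s WHERE %s=%d" % (column, table, pivot, current))
        rows.append(row)
        # Next pivot: the smallest value strictly greater than the current one.
        current = query_value(
            "SELECT MIN(%s) FROM %s WHERE %s>%d" % (pivot, table, pivot, current))
    return rows
```

The loop ends when `MIN()` returns NULL, meaning no pivot value greater than the last one exists. If the pivot column contains duplicates, rows sharing a value cannot be told apart, which is why the column with the most DISTINCT values is preferred.
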
  42. International letters
       • Добрый день Россия
       • Page encoding is parsed from the Content-Type HTTP header or the Content-Type HTML meta tag, or is heuristically detected (third-party module chardet)
       • The RAW target response is automatically decoded to Unicode (using the detected page encoding)
       • In inband techniques (UNION query and error-based), results with international letters are already supported if decoding went properly
  43. International letters (2)
       • In inference techniques (boolean-based blind and time-based blind), characters are inferred already in their Unicode form
       • Potential problems occur when the stored data and/or the database connector use a different (non-compatible) charset than the target's response
       • In case of unsuccessful decoding of international letters (e.g. gibberish output), the charset can be enforced (option --charset)
  44. Hex encoding retrieved data
       • All supported DBMSes are capable of encoding the resulting data to hexadecimal format (switch --hex)
       • Most useful in cases when (parts of) results are potentially lost (e.g. binary data in inband techniques)
       • Retrieved data is automatically decoded to its original (non-hexadecimal) format
       • Such binary content is checked for known formats (using the third-party module magic) and (if recognized) stored to output files
  45. Dump format
       • Dumped table content can be stored in 3 different formats: CSV (default), HTML and SQLite (option --dump-format)
       • In CSV format, each row is represented by one line and column entries are separated by a predefined separator character (e.g. ,)
       • In HTML format, the dump is stored as a visually recognizable (browser) table
       • In SQLite format, the dump is "replicated" to a locally stored SQLite3 database, making it possible (among other things) to run queries against it
  46. Password cracking
       • Implemented support for detection and wordlist-based cracking of 14 different commonly used hash algorithms
       • MySQL (newer and older), MsSQL (newer and older), Oracle (newer and older), PostgreSQL, MD5, SHA1, etc.
       • Automatic analysis of retrieved passwords (--passwords) and table dumps (--dump)
       • (Optional) common suffix forms (1, 123, etc.)
       • Multiprocessed attack (# of CPUs)
       • 1M MySQL hash guesses in under 10 seconds on a 4-core Intel Xeon W3550 @ 3.07GHz
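
A wordlist attack with common suffixes can be sketched for the newer MySQL hash format, which is an asterisk followed by the uppercase hex SHA1 of the binary SHA1 of the password. The function names and the tiny suffix tuple are assumptions, and the sketch runs in a single process rather than distributing candidates across CPUs as the slide describes.

```python
import hashlib

COMMON_SUFFIXES = ("", "1", "123")  # the slide mentions suffix forms such as 1 and 123


def mysql_hash(password):
    """Compute the newer (4.1+) MySQL password hash: '*' + SHA1(SHA1(password))."""
    inner = hashlib.sha1(password.encode("utf-8")).digest()
    return "*" + hashlib.sha1(inner).hexdigest().upper()


def crack(target_hash, wordlist, suffixes=COMMON_SUFFIXES):
    """Try each word with each suffix; return the matching password or None."""
    target = target_hash.upper()
    for word in wordlist:
        for suffix in suffixes:
            candidate = word + suffix
            if mysql_hash(candidate) == target:
                return candidate
    return None
```

Because `wordlist` is only iterated, it can be a generator that reads a large dictionary lazily instead of a list held in memory.
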
  47. Large dictionary support
       • Distributed access in a multiprocessing environment
       • Support for huge dictionaries (chunk read)
       • Support for dictionary lists
       • Support for ZIP-compressed dictionaries
       • Includes a custom-built and compressed dictionary (1.2M entries) based on highly popular and publicly available dumps, like RockYou, Gawker, Yahoo, etc.
  48. Stagers and backdoors
       • Stagers are used for uploading arbitrary (binary) files (e.g. UDF files, backdoors, etc.)
       • Backdoors are used for OS command execution (switches --os-cmd and --os-shell)
       • The prerequisite is that one of the known SQL file write methods can be used (e.g. INTO DUMPFILE, EXEC xp_cmdshell debug.exe < dump.src, etc.)
       • 4 different platforms supported: ASP, ASP.NET, JSP and PHP
       • Stored in a "cloaked" format (preventing local AV triggering) inside the shell directory
  49. Metasploit integration
       • Automated creation, upload and run of a Metasploit shellcode payload (switch --os-pwn)
       • The user can choose the payload (Meterpreter, shell or VNC), the connection (reverse TCP, reverse HTTP, etc.) and the encoder type (no encoder, Call+4 Dword XOR Encoder, etc.)
       • shellcodeexec(.exe) is uploaded along with the (non-compiled) Metasploit shellcode payload using a stager or other means
       • Metasploit CLI is run on the host machine
       • The payload is executed on the target machine, connecting back to the host machine
  50. Second order SQL injection
       • Occurs when user-provided data stored in one place is used in a vulnerable SQL statement in another place
       • Similar to permanent XSS
       • The user can explicitly set the location where to look for the response (option --second-order)
       • Effectively doubles the number of required requests
  51. DNS exfiltration
       • Out-of-band SQL injection technique using the DNS resolution mechanism (option --dns-domain)
       • A fake DNS server instance is automatically created on the host machine
       • The SQL injection payloads being sent deliberately provoke the DNS resolution mechanism on the target machine
       • Provoked DNS requests carry the results of a query
       • The fake DNS server instance intercepts the requests and responds with dummy resolution answers
       • Requires registration of a nameserver for the used domain, pointing to the host machine
  52. Output purging
       • The output directory can be (optionally) "safely" removed (switch --purge-output)
       • The content of all contained files (sessions, logs, dumps, etc.) is overwritten with random data
       • Files are truncated and renamed to random values
       • (Sub)directories are renamed to random values
       • At the end, the whole output directory tree is removed
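
The purge sequence listed on this slide (overwrite, truncate, rename, remove) could be sketched as follows. The function names are hypothetical, the whole file is overwritten in one `os.urandom` call for brevity, and no claim is made that this defeats recovery on journaling file systems or SSDs.

```python
import os
import random
import shutil
import string


def _random_name(directory, length=8):
    """Pick a random name that does not yet exist in the given directory."""
    while True:
        name = "".join(random.choice(string.ascii_lowercase) for _ in range(length))
        if not os.path.exists(os.path.join(directory, name)):
            return name


def purge(directory):
    """Overwrite, truncate and rename everything under directory, then remove the tree."""
    # Bottom-up walk, so children are handled before their parent is renamed.
    for root, dirs, files in os.walk(directory, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            with open(path, "r+b") as handle:
                handle.write(os.urandom(size))  # overwrite content with random data
                handle.flush()
                os.fsync(handle.fileno())
                handle.truncate(0)              # then drop the content entirely
            os.rename(path, os.path.join(root, _random_name(root)))
        for name in dirs:
            os.rename(os.path.join(root, name), os.path.join(root, _random_name(root)))
    shutil.rmtree(directory)
```

Renaming before removal means that even if directory entries are recovered later, the original file and directory names are not among them.
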
  53. Questions?