Miroslav Stampar. Sqlmap — Under the Hood.

Published in: Technology

  • 1. sqlmap – Under the Hood. Miroslav Štampar
  • 2. PHDays 2013, Moscow (Russia), May 23, 2013. BigArray: Support for huge table dumps (e.g. millions of rows). Raw data needs to be held somewhere before being processed (and eventually stored). In-memory storage was a good enough choice until recent years (user appetites grew). Avoidance of MemoryError. Memory mapping into smaller chunks/pages (e.g. 4096 entries). Temporary files are used for storing chunks. O(1) read/write access (page table principle).
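
The chunking idea on the BigArray slide can be sketched roughly as follows. This is a minimal illustration, not sqlmap's actual BigArray class: the name ChunkedList, the tiny chunk size used in the demo, and the choice of pickle for chunk files are all assumptions made for the example.

```python
import os
import pickle
import tempfile


class ChunkedList:
    """Append-only list that spills full chunks to temporary files (hypothetical sketch)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.filenames = []          # "page table": chunk index -> temporary file
        self.current = []            # chunk still being filled, kept in memory
        self.cache = (None, None)    # (chunk index, chunk loaded from disk)

    def append(self, value):
        self.current.append(value)
        if len(self.current) >= self.chunk_size:
            # The chunk is full: write it to a temporary file and start a new one.
            handle, filename = tempfile.mkstemp(suffix=".chunk")
            with os.fdopen(handle, "wb") as f:
                pickle.dump(self.current, f)
            self.filenames.append(filename)
            self.current = []

    def __len__(self):
        return len(self.filenames) * self.chunk_size + len(self.current)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Constant-time address translation: which chunk, and where inside it.
        chunk_index, offset = divmod(index, self.chunk_size)
        if chunk_index == len(self.filenames):
            return self.current[offset]
        if self.cache[0] != chunk_index:
            with open(self.filenames[chunk_index], "rb") as f:
                self.cache = (chunk_index, pickle.load(f))
        return self.cache[1][offset]

    def close(self):
        """Remove the temporary chunk files."""
        for filename in self.filenames:
            os.remove(filename)
        self.filenames = []
        self.cache = (None, None)


rows = ChunkedList(chunk_size=4)
for i in range(10):
    rows.append(i * i)
```

Only one on-disk chunk is held in memory at a time, so memory use stays bounded by the chunk size regardless of how many rows are appended.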
  • 3. HashDB: Storage of resumable session data at a centralized place (local SQLite3 database). Non-ASCII values are automatically serialized/deserialized (pickle). INSERT INTO storage VALUES (LONG(MD5(target_url || key || MILESTONE_SALT)[:8]), stored_value). MILESTONE_SALT is changed whenever a change in the HashDB mechanism introduces incompatibility with previous versions. key uniquely describes stored_value for a given target_url (e.g. KB_INJECTIONS, SELECT banner FROM v$version WHERE ROWNUM=1, etc.).
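
The keying scheme on the HashDB slide might look like the sketch below. It is an illustration only: the salt value, the SessionStore class, and the reading of "[:8]" as the first 8 hex characters of the digest are assumptions, and for brevity every value is pickled (the slide says only non-ASCII values are).

```python
import hashlib
import pickle
import sqlite3

MILESTONE_SALT = "example-salt"  # hypothetical value; bumped on incompatible changes


def hash_key(target_url, key):
    """Integer row id derived from MD5(target_url || key || MILESTONE_SALT)."""
    digest = hashlib.md5((target_url + key + MILESTONE_SALT).encode("utf-8")).hexdigest()
    return int(digest[:8], 16)  # assumption: first 8 hex characters


class SessionStore:
    """Tiny key/value session store on top of SQLite3 (hypothetical sketch)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS storage (id INTEGER PRIMARY KEY, value BLOB)"
        )

    def write(self, target_url, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO storage VALUES (?, ?)",
            (hash_key(target_url, key), pickle.dumps(value)),
        )
        self.conn.commit()

    def read(self, target_url, key):
        row = self.conn.execute(
            "SELECT value FROM storage WHERE id=?", (hash_key(target_url, key),)
        ).fetchone()
        return pickle.loads(row[0]) if row else None


store = SessionStore()
store.write("http://target.example/?id=1", "KB_INJECTIONS", ["boolean-based blind"])
resumed = store.read("http://target.example/?id=1", "KB_INJECTIONS")
```

Because the salt is part of the hashed key, changing it makes all previously stored rows unreachable, which is how an incompatible format change invalidates old sessions without a migration step.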
  • 4. Payloads: XML format (xml/payloads.xml). Tag type <boundary> is used for storage of all possible prefix and suffix formations (<prefix>, <suffix>) together with context-sensitive information (subtags <level>, <clause>, <where> and <ptype>). Tag type <test> is used for storage of data required for successful testing and usage of each SQL injection payload type (subtags <title>, <stype>, <level>, <risk>, <clause>, <where>, <vector>, <request> and <response>).
  • 5. Payloads (2): <boundary><level>1</level><clause>1</clause><where>1,2</where><ptype>1</ptype><prefix>)</prefix><suffix>AND ([RANDNUM]=[RANDNUM]</suffix></boundary>
  • 6. Payloads (3): <test><title>Microsoft SQL Server/Sybase AND error-based - WHERE or HAVING clause (IN)</title><stype>2</stype><level>2</level><risk>0</risk><clause>1</clause><where>1</where><vector>AND [RANDNUM] IN (([DELIMITER_START]+([QUERY])+[DELIMITER_STOP]))</vector><request><payload>AND [RANDNUM] IN (([DELIMITER_START]+(SELECT (CASE WHEN ([RANDNUM]=[RANDNUM]) THEN 1 ELSE 0 END))+[DELIMITER_STOP]))</payload></request><response><grep>[DELIMITER_START](?P&lt;result&gt;.*?)[DELIMITER_STOP]</grep></response><details><dbms>Microsoft SQL Server</dbms><dbms>Sybase</dbms><os>Windows</os></details></test>
  • 7. Queries: XML format (xml/queries.xml). Tag type <dbms> is used for storage of all DBMS-specific SQL formations required for successful enumeration (subtags <users>, <passwords>, <dbs>, <tables>, <columns>, <dump_table>, etc.) and resulting data (pre)processing (subtags <cast>, <length>, <isnull>, <count>, <substring>, <concatenate>, etc.). Each enumeration subtag has an <inband> and a <blind> form used in the respective techniques.
  • 8. Queries (2): <dbms value="MySQL"><cast query="CAST(%s AS CHAR)"/><length query="CHAR_LENGTH(%s)"/><isnull query="IFNULL(%s, )"/><delimiter query=","/><limit query="LIMIT %d,%d"/>…<passwords><inband query="SELECT user,password FROM mysql.user" condition="user"/><blind query="SELECT DISTINCT(password) FROM mysql.user WHERE user=%s LIMIT %d,1" count="SELECT COUNT(DISTINCT(password)) FROM mysql.user WHERE user=%s"/></passwords>…
  • 9. Multithreading: Multithreading is implemented wherever applicable (option --threads). Techniques covered: boolean-based blind, error-based and partial UNION query. Deliberately turned off for the time-based and stacked techniques (lots of reasons). In case of boolean-based blind, each thread covers a part of the value. In other techniques, each thread covers one enumerated entry. Also implemented for brute-force column/table name search and crawling.
  • 10. Direct connection: Direct connection to the DBMS (option -d). Example: python -d "mysql://root:password123@". Support for: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, SQLite, Microsoft Access, Firebird, SAP MaxDB, Sybase, IBM DB2. Uses 3rd party connectors (e.g. python-pymssql, pymysql, cx_Oracle, python-psycopg2, etc.). SQLAlchemy is used as an alternative.
  • 11. Load request(s) from file: Load HTTP request(s) from a text file (option -r). Supports the RAW request format (any MITM proxy can be used to catch one). Particularly useful for requests with a large content body (e.g. POST). Load and parse log files (option -l). Supports Burp and WebScarab log formats. Unlimited number of parsed HTTP requests (using only unique ones).
  • 12. Content type detection: Automatic detection of (specialized) request content types. Supports SOAP, JSON and (generic) XML. For example: --data="{ "pid": 4412, "id":1, "action": "do"}" and --data="<request><pid>4412</pid><id>1</id><action>do</action></request>". Appropriate exploitation of parameter values. In case of non-supported format(s), a custom injection mark (*) can be used.
  • 13. Site crawling/form searching: Collect usable (on-site) target links (option --crawl). The user defines the crawling depth (e.g. 3), limiting the search based on distance from the starting page. Optional form searching at visited pages (switch --forms). Arbitrary filling of missing form data. Repair of non-HTML-compliant pages for easier processing.
  • 14. Mnemonics: Usage of mnemonics for faster setting up of sqlmap options and switches (option -z). Longer (original): python --flush-session --threads=4 --ignore-proxy --batch --banner -u … Shorter (using mnemonics): python -z "flu,thre=4,ign,bat,ban" -u … Highly generic prefix-based recognition (e.g. -z "flu,bat,ban" is interpreted the same as -z "flush,batc,bann").
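
Prefix-based recognition of mnemonics could be sketched along these lines. This is not sqlmap's parser: the function name expand_mnemonics, the short option list, and the rule that the shortest matching option wins on ambiguity are assumptions made for the example.

```python
def expand_mnemonics(mnemonics, known_options):
    """Expand comma-separated option prefixes into full long options (hypothetical sketch)."""
    expanded = []
    for item in mnemonics.split(","):
        name, _, value = item.strip().partition("=")
        # Hyphens are ignored so that "flushs" would still match "flush-session".
        matches = [opt for opt in known_options if opt.replace("-", "").startswith(name)]
        if not matches:
            raise ValueError("unknown mnemonic: %r" % name)
        option = min(matches, key=len)  # assumption: shortest match wins on ambiguity
        expanded.append("--%s=%s" % (option, value) if value else "--%s" % option)
    return expanded


OPTIONS = ["flush-session", "threads", "ignore-proxy", "batch", "banner"]
short_form = expand_mnemonics("flu,thre=4,ign,bat,ban", OPTIONS)
```

With the option list above, both "flu,bat,ban" and "flush,batc,bann" expand to the same three switches, mirroring the equivalence stated on the slide.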
  • 15. Keep-alive: HTTP persistent connection (switch --keep-alive), as opposed to a new connection for every single request/response pair. Slightly adapted 3rd party module keepalive, adjusted for multithreading. Connection pool: reuse of existing target connection(s) where applicable. Reduced network congestion (fewer TCP connections), reduced latency (no handshaking), faster enumeration, etc.
  • 16. Tor: Support for The Onion Router (Tor) online anonymity network (switch --tor). Concealing identity and network activity. Used against surveillance and (targeted) traffic sniffing. Configurable Tor proxy type (option --tor-type) and port number (option --tor-port). DNS leakage is prevented (no DNS requests outside of Tor). Available safety check for proper usage of Tor (switch --check-tor).
  • 17. Domain name resolution caching: By default, a DNS resolution request is made for each HTTP request (from Python's dedicated HTTP modules, e.g. httplib). Noticeable slowdown in some cases (e.g. excessive network latency). Problem noticed and reported by (nagging) users (looking into Wireshark traffic captures). Problem patched at the lowest level (method socket.getaddrinfo(*args, **kwargs) is wrapped for caching).
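
Wrapping socket.getaddrinfo for caching, as the slide describes, could be done roughly as shown below. The helper names cached and install_dns_cache are hypothetical, and the fake resolver exists only so the example runs without network access.

```python
import socket


def cached(resolver):
    """Return a wrapper that remembers results per distinct argument set."""
    cache = {}

    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = resolver(*args, **kwargs)
        return cache[key]

    return wrapper


def install_dns_cache():
    """Patch the socket module so later lookups go through the cache."""
    socket.getaddrinfo = cached(socket.getaddrinfo)


# Demonstration with a stand-in resolver (no real DNS traffic is generated).
calls = []


def fake_resolver(host, port, *args, **kwargs):
    calls.append(host)
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", port))]


lookup = cached(fake_resolver)
first = lookup("target.example", 80)
second = lookup("target.example", 80)
```

Patching at the socket level means every HTTP module built on top of it benefits without modification. The sketch never expires entries, which is acceptable for a short-lived scan but would not suit a long-running process.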
  • 18. Authentication methods: Implemented support for authentication methods: basic, digest, NTLM and certificate (options --auth-type, --auth-cred and --auth-cert). Example: python -u "" --auth-type=basic --auth-cred="testuser:testpass". Handling of HTTP status code 401 (Unauthorized). Authorization headers are cached (where applicable).
  • 19. Reflection detection and removal: Noisy response resulting from request reflection, e.g. "Query results for: 1%20AND%201%3D1". Can cause problems in the detection phase. Particularly problematic for the boolean-based blind technique (fuzzy page comparison). Automatic detection of the reflected payload value and marking it with a predefined constant value, e.g. "Query results for: __REFLECTED_VALUE__".
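
A simplified version of reflection removal is sketched below. It only handles literal reflections of the payload in its URL-encoded and decoded forms, which is an assumption of this example; the function name remove_reflection is hypothetical, and real pages can reflect payloads in further encodings.

```python
from urllib.parse import unquote

REFLECTED_VALUE_MARKER = "__REFLECTED_VALUE__"


def remove_reflection(page, payload):
    """Replace literal reflections of the payload with a constant marker."""
    candidates = {payload, unquote(payload)}
    # Longest form first, so a shorter form never breaks up a longer one.
    for candidate in sorted(candidates, key=len, reverse=True):
        page = page.replace(candidate, REFLECTED_VALUE_MARKER)
    return page


payload = "1%20AND%201%3D1"
encoded_page = remove_reflection("Query results for: 1%20AND%201%3D1", payload)
decoded_page = remove_reflection("Query results for: 1 AND 1=1", payload)
```

After this step two responses that differ only in the echoed payload compare as identical, so the reflection no longer distorts the page comparison.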
  • 20. Dynamicity detection and removal: Noisy response resulting from sporadically changing content (e.g. ads, banners, etc.). Can cause problems in both the detection and the enumeration phase. Particularly problematic for the boolean-based blind technique. Automatic detection and marking of dynamic parts (info held in the internal knowledge base). In the best case, automatic recognition and usage of a string value appearing only in True responses (option --string).
  • 21. Content filtering: Occasionally pages are bloated with non-textual content (CSS styles, comments, JavaScript, HTML tags, embedded objects, etc.). Changes relevant to the boolean-based blind technique usually affect only one small textual part (e.g. a table entry). Optional filtering of non-textual content (switch --text-only). For example: <html>...<td>Toothfairy</td>...</html> is filtered to ...Toothfairy... Better detection and less trash(y) results.
  • 22. Wizard mode: For beginner users and script kiddies (switch --wizard). Questions asked: target URL; POST data (if any); injection difficulty (Normal/Medium/Hard); enumeration (Basic/Intermediate/All). Infamous for the Comodo Brazil breach (March 2011), where attackers posted wizard mode console output to Pastebin.
  • 23. Level/risk of detection: The number of requests per parameter in the testing phase can grow from 10 up to 10K. To prevent unnecessary noise and to shorten testing time, tests are classified by level and risk. Level (option --level) represents the likelihood/usability of the test case passing (higher level means lower likelihood). Risk (option --risk) represents the potential damage that the test case can cause (higher risk means higher potential damage).
  • 24. Heuristic SQL injection checks: Recognition of the backend DBMS if an error message can be provoked with an arbitrary invalid SQL sequence (e.g. ())”(”). In case the parameter value is an integer and the response for (e.g.) 1 is the same as for (2-1), there is a good chance that the target is vulnerable. In case of a detected boolean-based blind technique, DBMS-specific queries are used (e.g. (SELECT 0x616263)=0x616263) to potentially move focus to a particular DBMS in further tests.
  • 25. Type casting detection: Type casting is an efficient way of dealing with SQL injection on numeric values, e.g. $query = "SELECT * FROM log WHERE id=" . intval($_GET[id]); Automatic detection of such cases is implemented. In case the parameter value is an integer and the response for (e.g.) 1 is the same as for 1foobar, there is a good chance that the target is using integer casting. The user is warned of a potentially "futile" run.
  • 26. Fingerprinting: The web server is fingerprinted by known HTTP headers, cookie values, etc. The DBMS is fingerprinted through error message parsing, banner parsing and tests with version-specific payloads (obtained from release notes and reference manuals). For example, the cookie value ASP.NET_SessionId is specific to the ASP.NET/IIS/Windows platform, while the TO_SECONDS(950501)>0 check should work only on MySQL >= 5.5.0. A detailed DBMS version check is done only if switch -f/--fingerprint is used.
  • 27. Suhosin-patch detection: Open source patch for PHP, protecting the web server from "insecure PHP practices". Settings such as suhosin.get.max_value_length (default: 512), etc. Causes problems in the enumeration phase when payloads are big (e.g. enumerating column names). After the detection phase, a single payload (depending on detected techniques) with a size greater than 512 is sent (e.g. 1 AND 6525= … 6525). The user is warned in case of a False response.
  • 28. WAF/IDS/IPS detection: Sending one "suspicious" request (in the form of a dummy parameter value) and checking for response change(s) when compared to the original (switch --check-waf). WAF scripts (switch --identify-waf) do a thorough check, each focusing on the peculiarities of a particular product. For example, WebKnight responds with HTTP status code 999 on detected suspicious activity. Currently there are 29 WAF scripts.
  • 29. WAF/IDS/IPS bypass: Tamper scripts (option --tamper) change the injected payload before it is sent. The user has to choose the appropriate one(s) based on collected knowledge of the target's behavior and/or the detected WAF/IDS/IPS product. If required, a chain of tamper scripts can be used (e.g. --tamper="between,ifnull2ifisnull"). Currently there are 36 tamper scripts.
  • 30. String value escaping: Each string value inside the payload is automatically escaped (quoteless format) depending on the targeted DBMS. For example, 1 ... AND username="root"-- is in case of MySQL escaped to 1 ... AND username=0x726f6f74--. Avoidance of filter-based escaping functions (e.g. addslashes). Adds an implicit dependence on the targeted DBMS. Payload obfuscation (harder to notice in target log files).
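
For MySQL, the quoteless form from the slide can be produced with a short sketch like this one. The function name escape_mysql and the regular expression are assumptions for illustration; other DBMSes need different encodings, and this version does not handle quotes escaped inside a literal.

```python
import re

# A quoted literal: an opening quote, at least one character, the same closing quote.
STRING_LITERAL = re.compile(r"""(["'])(.+?)\1""")


def escape_mysql(payload):
    """Rewrite quoted string literals as MySQL hexadecimal literals."""

    def to_hex(match):
        return "0x" + match.group(2).encode("utf-8").hex()

    return STRING_LITERAL.sub(to_hex, payload)


escaped = escape_mysql('1 AND username="root"-- ')
```

Here "root" becomes 0x726f6f74, matching the slide's example, and the payload no longer contains any quote character for a filter such as addslashes to act on.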
  • 31. Evaluation of custom code: Custom Python code can be evaluated before each request (option --eval). In such code, each request parameter is accessible as a local variable. All resulting variable values are included in the request as new parameter values. Example: --eval="import hashlib;hash=hashlib.md5(id).hexdigest()", with the resulting request fragment AND 1=1&hash=7f134e52836a00e26493e690ed8aa735.
  • 32. Fuzzy page comparison: Used (mostly) in the boolean-based blind technique. Gestalt pattern matching (Ratcliff-Obershelp algorithm). Supported by the standard Python module difflib: class SequenceMatcher, method ratio() (or the faster quick_ratio()), giving a measure of the sequences' similarity as a float in the range [0, 1]. True result if ratio() > 0.98 when compared with the original page.
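
The comparison rule from the slide translates almost directly into difflib calls. In the minimal sketch below the function name is_true_response is hypothetical, and autojunk=False is a choice made for this example so that long pages are compared character for character.

```python
from difflib import SequenceMatcher

TRUE_THRESHOLD = 0.98  # threshold quoted on the slide


def is_true_response(original_page, page, threshold=TRUE_THRESHOLD):
    """Treat the response as True if it is nearly identical to the original page."""
    matcher = SequenceMatcher(None, original_page, page, autojunk=False)
    return matcher.ratio() > threshold


original = "x" * 100
nearly_same = "x" * 99 + "y"      # 99 of 100 characters match, ratio 0.99
clearly_different = "x" * 90 + "y" * 10  # 90 of 100 characters match, ratio 0.90

results = (
    is_true_response(original, original),
    is_true_response(original, nearly_same),
    is_true_response(original, clearly_different),
)
```

ratio() is computed as 2*M/T, where M is the number of matching characters and T is the combined length of both sequences, which is where the 0.99 and 0.90 figures in the comments come from.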
  • 33. Definite page comparison: Used mostly in the boolean-based blind technique, when fuzzy page comparison fails (e.g. too much page dynamicity) and the user is able to distinguish True from False responses on their own (non-n**b). String to match when the result should be recognized as True (option --string). Regular expression to match … (option --regex). Compare HTTP codes (switch --code). Compare HTML titles (switch --title).
  • 34. Null connection: Sometimes there is no need to retrieve the whole page content (its size can be enough). Used in the boolean-based blind technique. 3 methods: Range, HEAD and "skip-read". Range: request header Range: bytes=-1, response header Content-Range: bytes 4789-4790/4790. HEAD: request HEAD /search.aspx HTTP/1.1, response header Content-Length: 4790. Both result (if applicable) in either an empty or a 1-character-long response. Method "skip-read" retrieves only the HTTP headers, looking for Content-Length.
  • 35. False positive detection: False positives are highly undesirable. Specific to the boolean-based blind and time-based blind techniques. False positive tests are done in cases when only one of those techniques is detected. A set of trivial mathematical checks is performed to see if the target can "respond" correctly. For example: (123+447)=570; 319>(519+110); (654+267)>854.
  • 36. Delay detection: Detection of "artificial" delay. Statistical comparison with normal response times. The response time must fit under the Gaussian bell curve to be marked as "normal". Check: is <current_response_time> > avg(<normal_response_times>) + 7*stdev(<normal_response_times>)? If the answer is yes, the probability that we are dealing with an "artificial" delay is 99.9999999997440%. Especially useful when heavy queries are used (not knowing the expected delay value).
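
The threshold check on the slide can be written in a few lines. This is a sketch only: the function name is_artificial_delay and the sample timings are invented for illustration, and the use of the sample standard deviation (statistics.stdev) is an assumption.

```python
from statistics import mean, stdev


def is_artificial_delay(current, normal_times, sigmas=7):
    """True if the response time lies more than `sigmas` deviations above the mean."""
    return current > mean(normal_times) + sigmas * stdev(normal_times)


# Hypothetical baseline of response times in seconds.
normal_times = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11]

delayed = is_artificial_delay(5.1, normal_times)
jitter = is_artificial_delay(0.15, normal_times)
```

Because the threshold adapts to the measured baseline, no expected delay value has to be known in advance, which matters when the delay comes from a heavy query rather than a fixed sleep.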
  • 37. Delay detection (2)
  • 38. UNION query column #: A UNION query requires knowledge of the number of columns (N) in the vulnerable SQL statement. Two methods are used: ORDER BY and statistical (same principle as in delay detection). ORDER BY N+1 should produce a noticeably different response (preferably with an error message) than ORDER BY N (binary searched). In the statistical method, responses for candidates (UNION SELECT NULL, NULL,...) are compared to the original (not injected) response. The right one is the one that seems "not normal" (having a ratio outside the Gaussian bell curve).
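
The ORDER BY method can be illustrated with a binary search over a yes/no oracle. In this sketch a local in-memory SQLite query stands in for the remote target, and the names find_column_count and order_by_works, the users table, and the upper bound of 50 are assumptions.

```python
import sqlite3


def find_column_count(order_by_works, upper=50):
    """Largest N in [1, upper] for which ORDER BY N still succeeds, or None."""
    if not order_by_works(1):
        return None
    low, high = 1, upper
    while low < high:
        mid = (low + high + 1) // 2
        if order_by_works(mid):
            low = mid        # ORDER BY mid works, so N >= mid
        else:
            high = mid - 1   # ORDER BY mid fails, so N < mid
    return low


# Stand-in for the vulnerable statement: a three-column SELECT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")


def order_by_works(n):
    try:
        conn.execute("SELECT id, name, email FROM users WHERE id=1 ORDER BY %d" % n)
        return True
    except sqlite3.Error:
        return False  # e.g. ORDER BY term out of range


columns = find_column_count(order_by_works)
```

Against a real target the oracle would compare HTTP responses instead of catching an exception, and the search needs about log2(upper) requests. If the statement has more than `upper` columns, this sketch simply returns `upper`.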
  • 39. Output prediction: Inference techniques (boolean-based blind and time-based blind) require optimization wherever and whenever possible. In certain cases prediction(s) can be made. Checking if the currently retrieved entry shares the same prefix with previously retrieved entr(ies). For example, DROP ANY ROLE has the same prefix as DROP ANY RULE (one request per checked character compared to bit-by-bit retrieval). Common output values are used too (e.g. information_schema, phpmyadmin, etc.).
  • 40. Brute forcing identifier names: In case of a missing schema (e.g. deleted information_schema), a brute-force search is required (e.g. 1=(SELECT 1 FROM users)). Searching for common table names (switch --common-tables). Searching for common column names (switch --common-columns). Conducted an automated search and parsing of resulting SQL files for chosen Google dorks (e.g. ext:sql "CREATE TABLE"). Collected the most frequent 3.3K table names and 2.5K column names.
  • 41. Pivot dump table: Some DBMSes (e.g. Microsoft SQL Server) don't have an OFFSET/LIMIT query mechanism, making enumeration problematic in non-UNION query techniques. The column with the most DISTINCT values is automatically chosen as the pivot column. The pivot's first value bigger than the previous one (e.g. SELECT MIN(id) WHERE id > ) is retrieved. Entries for other columns (e.g. SELECT name WHERE id=1) are retrieved using the current pivot value. Iterative process.
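
The pivot iteration can be tried out locally. In this sketch SQLite stands in for a DBMS without OFFSET/LIMIT, and the function names choose_pivot and pivot_dump plus the sample users table are assumptions made for illustration.

```python
import sqlite3


def choose_pivot(conn, table, columns):
    """Pick the column with the most DISTINCT values."""
    def distinct_count(column):
        query = "SELECT COUNT(DISTINCT %s) FROM %s" % (column, table)
        return conn.execute(query).fetchone()[0]
    return max(columns, key=distinct_count)


def pivot_dump(conn, table, pivot, columns):
    """Walk the table one pivot value at a time, without OFFSET/LIMIT."""
    rows, previous = [], None
    while True:
        if previous is None:
            query, params = "SELECT MIN(%s) FROM %s" % (pivot, table), ()
        else:
            query = "SELECT MIN(%s) FROM %s WHERE %s > ?" % (pivot, table, pivot)
            params = (previous,)
        value = conn.execute(query, params).fetchone()[0]
        if value is None:  # no pivot value bigger than the previous one: done
            break
        row = {pivot: value}
        for column in columns:
            query = "SELECT %s FROM %s WHERE %s = ?" % (column, table, pivot)
            row[column] = conn.execute(query, (value,)).fetchone()[0]
        rows.append(row)
        previous = value
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (3, "bob"), (7, "bob")])

pivot = choose_pivot(conn, "users", ["id", "name"])
dump = pivot_dump(conn, "users", pivot, ["name"])
```

Choosing the column with the most distinct values matters because rows sharing a pivot value collapse into one: with name as the pivot, the two "bob" rows above would be dumped only once.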
  • 42. International letters: Добрый день Россия. Page encoding is parsed from the Content-Type HTTP header, the Content-Type meta HTML header, or heuristically detected (3rd party module chardet). The RAW target response is automatically decoded to Unicode (using the detected page encoding). In case of inband techniques (UNION query and error-based), results with international letters are already supported if decoding went properly.
  • 43. International letters (2): In case of inference techniques (boolean-based blind and time-based blind), characters are inferred already in their Unicode form. Potential problems occur when stored data and/or the database connector use a different (non-compatible) charset than the target's response. In case of unsuccessful decoding of international letters (e.g. gibberish output), the charset can be enforced (option --charset).
  • 44. Hex encoding retrieved data: All supported DBMSes are capable of encoding resulting data to hexadecimal format (switch --hex). Most useful in cases when (parts of) results are potentially lost (e.g. binary data in inband techniques). Retrieved data is automatically decoded to its original (non-hexadecimal) format. Such binary content is checked for known formats (using 3rd party module magic) and (if recognized) stored to output files.
  • 45. Dump format: Dumped table content can be stored in 3 different formats: CSV (default), HTML and SQLite (option --dump-format). In CSV format each row is represented by one line and each column entry is separated by a predefined separator character (e.g. ,). In HTML format the dump is stored into a visually recognizable (browser) table. In SQLite format the dump is "replicated" to a locally stored SQLite3 database, giving the possibility of (among others) running queries against it.
  • 46. Password cracking: Implemented support for detection and wordlist-based cracking of 14 different commonly used hash algorithms: MySQL (newer and older), MsSQL (newer and older), Oracle (newer and older), PostgreSQL, MD5, SHA1, etc. Automatic analysis of retrieved passwords (--passwords) and table dumps (--dump). (Optional) common suffix forms (1, 123, etc.). Multiprocessed attack (# of CPUs). 1M MySQL hash guesses in under 10 seconds on a 4-core Intel Xeon W3550 @ 3.07GHz.
  • 47. Large dictionary support: Distributed access in a multiprocessing environment. Support for huge dictionaries (chunk read). Support for dictionary lists. Support for ZIP compressed dictionaries. Included custom-built and compressed dictionary (1.2M entries) based on highly popular and publicly available dumps, like RockYou, Gawker, Yahoo, etc.
  • 48. Stagers and backdoors: Stagers are used for uploading arbitrary (binary) files (e.g. UDF files, backdoors, etc.). Backdoors are used for OS command execution (switches --os-cmd and --os-shell). The prerequisite is that one of the known SQL file write methods can be used (e.g. INTO DUMPFILE, EXEC xp_cmdshell debug.exe < dump.src, etc.). 4 different platforms supported: ASP, ASP.NET, JSP and PHP. Stored in "cloaked" format (preventing local AV triggering) inside the shell directory.
  • 49. Metasploit integration: Automated creation, upload and run of a Metasploit shellcode payload (switch --os-pwn). The user can choose the payload (Meterpreter, shell or VNC), connection (reverse TCP, reverse HTTP, etc.) and encoder type (no encoder, Call+4 Dword XOR Encoder, etc.). shellcodeexec(.exe) is uploaded along with the (non-compiled) Metasploit shellcode payload using a stager or other means. Metasploit CLI is run at the host machine. The payload is executed at the target machine, connecting back to the host machine.
  • 50. Second order SQL injection: Occurs when user-provided data stored at one place is used in a vulnerable SQL statement at another place. Similar to permanent XSS. The user can explicitly set the location where to look for the response (option --second-order). Effectively doubles the number of required requests.
  • 51. DNS exfiltration: Out-of-band SQL injection technique using the DNS resolution mechanism (option --dns-domain). A fake DNS server instance is automatically started at the host machine. The SQL injection payloads being sent deliberately provoke the DNS resolution mechanism at the target machine. Provoked DNS requests carry the results of a query. The fake DNS server instance intercepts requests and responds with dummy resolution answers. Requires registration of a nameserver for the used domain, pointing to the host machine.
  • 52. Output purging: The output directory can be (optionally) "safely" removed (switch --purge-output). The content of all contained files (sessions, logs, dumps, etc.) is overwritten with random data. Files are truncated and renamed to random values. (Sub)directories are renamed to random values. At the end, the whole output directory tree is removed.
  • 53. Questions?