sqlmap internalssqlmap internals
Miroslav Stampar
(mstampar@zsis.hr; miroslav@sqlmap.org)
sqlmap internalssqlmap internals
Miroslav Stampar
(mstampar@zsis.hr; miroslav@sqlmap.org)
SecAdmin, Sevilla (Spain) November 24th, 2017 2
IntroductionIntroduction
Free and open source penetration testing tool
that automates the process of detecting and
exploiting SQL injection flaws and taking over
of database server(s)
Written in Python (2)
11 years old (July 25th
2006)
2 authors / core developers (Bernardo Damele
and Miroslav Stampar)
65K LoC (Lines of Code)
100% accuracy and 0% false-positives by
WAVSEP benchmark of 64 Web Application
Scanners (sectoolmarket.com)
SecAdmin, Sevilla (Spain) November 24th, 2017 3
CapabilitiesCapabilities
78 switches (e.g. --tor) and 91 options (e.g.
--url=”...”) in 15 categories (Target,
Request, Optimization, Injection, etc.)
Full coverage for (relational DBMS-es): MySQL,
Oracle, PostgreSQL, Microsoft SQL Server,
Microsoft Access, IBM DB2, SQLite, Firebird,
Sybase, SAP MaxDB, HSQLDB and Informix
Full support for SQLi techniques: boolean-
based blind, time-based blind, error-based,
UNION query-based and stacked queries
Database enumeration, file-system
manipulation, out-of-band communication, etc.
SecAdmin, Sevilla (Spain) November 24th, 2017 4
Sample runSample run
SecAdmin, Sevilla (Spain) November 24th, 2017 5
Socket pre-connect (1)Socket pre-connect (1)
TCP three-way handshake (SYN, SYN-ACK,
ACK) is inherently slow (“necessary evil”)
Each HTTP request requires a completed
TCP handshake procedure
sqlmap runs a “pre-connect” thread in
background filling a pool of (e.g. 3)
connections with TCP handshake done
Overrides Python’s socket.connect()
25% speed-up of a program’s run on
average
SecAdmin, Sevilla (Spain) November 24th, 2017 6
Socket pre-connect (2)Socket pre-connect (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 7
NULL connection (1)NULL connection (1)
In boolean-based blind SQLi response sizes
should suffice (e.g. >1000 bytes → TRUE)
“NULL” naming because of skipping the
retrieval of complete HTTP response
Range: bytes=-1
Content-Range: bytes 4789-4789/4790
HEAD /search.aspx HTTP/1.1
Content-Length: 4790
Both are resulting (if applicable) with empty
HTTP body (faster retrieval of responses)
By looking into “length” headers we can
differentiate TRUE from FALSE answers
SecAdmin, Sevilla (Spain) November 24th, 2017 8
NULL connection (2)NULL connection (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 9
HashDB (1)HashDB (1)
Storage of resumable session data at
centralized place (local SQLite3 database)
Non-ASCII values are being automatically
serialized/deserialized (pickle)
INSERT INTO storage VALUES
(INT(MD5(target_url, uid, MILESTONE_SALT)
[:8]), stored_value)
uid uniquely describes stored_value for a
given target_url (e.g.: KB_INJECTIONS, SELECT
VERSION(), etc.)
MILESTONE_SALT changed whenever there is an
incompatible update of HashDB mechanism
SecAdmin, Sevilla (Spain) November 24th, 2017 10
HashDB (2)HashDB (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 11
BigArray (1)BigArray (1)
Support for huge table dumps (e.g. millions of
rows)
Raw data needs to be held somewhere before
being processed (and eventually stored)
In memory storage was a good enough choice
until user appetites went bigger (!)
Memory mapping into smaller chunks (1MB) –
memory pages
Temporary files store (compressed) chunks
In-memory caching of currently used chunk
O(1) read/write access
SecAdmin, Sevilla (Spain) November 24th, 2017 12
BigArray (2)BigArray (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 13
Heuristics (1)Heuristics (1)
“Educational shortcuts to ease the cognitive
load of making a decision”
Resulting with a solution which is not
guaranteed to be optimal (though very helpful)
Type casting (e.g. ?id=1foobar)
DBMS error reporting (e.g. ?id=1())'”(”')
Character filtering (e.g. ?id=1 AND 7=(7))
Length constraining (e.g. id=1 AND 3182=
3182)
(quick) DBMS detection (e.g. ?id=1 AND
(SELECT 0x73716c)=0x73716c)
SecAdmin, Sevilla (Spain) November 24th, 2017 14
Heuristics (2)Heuristics (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 15
Boolean inference (1)Boolean inference (1)
Binary search using greater-than operator
O(Log2n) complexity compared to sequential
search with O(n)
Faster than bit-by-bit extraction (on average 6
requests compared to 8 requests)
For example:
Sample initial table ['A','B',...'Z']
AND (...) > 'M' → TRUE → ['N',...'Z']
AND (...) > 'S' → FALSE → ['N',...'S']
AND (...) > 'O' → TRUE → ['P', 'R', 'S']
AND (...) > 'R' → FALSE → ['P', 'R']
AND (...) > 'P' → FALSE → 'P' (result)
SecAdmin, Sevilla (Spain) November 24th, 2017 16
Boolean inference (2)Boolean inference (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 17
Boundaries / levels / risks (1)Boundaries / levels / risks (1)
SQLi detection requires working payload
(e.g. AND 1=1) together with proper
boundaries (e.g. ?query=test’ AND 1=1
AND ‘x’=’x)
Number of tested prefix/suffix boundaries is
constrained with option --level (e.g.
“)))))
Number of tested payloads is constrained
with option --risk (e.g. OR 1=1)
Greater the level and risk, greater the
number of testing cases
SecAdmin, Sevilla (Spain) November 24th, 2017 18
Boundaries / levels / risks (2)Boundaries / levels / risks (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 19
Statistics (1)Statistics (1)
Network latency (or lagging) is the main
problem of time-based blind technique
For example, used deliberate delay is 1 sec,
normal response times are >0.5 and <2.0 secs,
what we can conclude for 1.5 sec response?
sqlmap learns what's normal and what's not
from non-delay based payload responses (e.g.
boolean-based blind payloads)
Normal distribution is being calculated
(Gaussian bell-shaped curve)
Everything inside is considered as “normal”,
outside as “not normal”
SecAdmin, Sevilla (Spain) November 24th, 2017 20
Statistics (2)Statistics (2)
Everything that's normal (i.e. not deliberately
delayed) should fit under the curve
μ(t) represents a mean, while σ(t) represents
a standard deviation of response times
99.99% of normal response times fall under the
upper border value μ(t) + 7σ(t)
SecAdmin, Sevilla (Spain) November 24th, 2017 21
False-positive detection (1)False-positive detection (1)
Detection of “error” in SQLi detection engine
Giving false sense of certainty while in reality
there is nothing exploitable at the other side
Almost exclusive to boolean-based blind and
time-based blind cases
Simple tests are being done after the detection
Comparing responses to boolean operations
with expected results (e.g. id=1 AND 95=27)
If any of results is contrary to the expected
value, SQLi is discarded as a false-positive (or
unexploitable)
SecAdmin, Sevilla (Spain) November 24th, 2017 22
False-positive detection (2)False-positive detection (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 23
WAF/IDS/IPS detection (1)WAF/IDS/IPS detection (1)
Sending deliberately suspicious payloads and
checking response(s) for unique characteristics
(e.g.) ?id=1&bwXY=5253 AND 1=1 UNION ALL
SELECT 1,NULL,'<script>alert("XSS")
</script>',table_name FROM
information_schema.tables WHERE
2>1--/**/; EXEC xp_cmdshell('cat ../../
../etc/passwd')#
ModSecurity returns HTTP error code 501 on
detected attack, F5 BIG-IP adds its own X-
Cnection HTTP header, etc.
Fingeprinting 59 different WAF/IDS/IPS products
SecAdmin, Sevilla (Spain) November 24th, 2017 24
WAF/IDS/IPS detection (2)WAF/IDS/IPS detection (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 25
Tamper scripts (1)Tamper scripts (1)
Auxiliary python scripts modifying the payload
before being sent (e.g. ?id=1 AND 2>1 to
?id=1 AND 2 NOT BETWEEN 0 AND 1)
Currently 54 tamper scripts (between.py,
space2randomblank.py, versionedkeywords.py,
etc.)
User has to choose appropriate one(s) based
on collected knowledge of target's behavior
and/or detected WAF/IDS/IPS product
Chain of tamper scripts (if required) can be
used (e.g. --tamper=”between,
ifnull2ifisnull”)
SecAdmin, Sevilla (Spain) November 24th, 2017 26
Tamper scripts (2)Tamper scripts (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 27
Brute-forcing identifiers (1)Brute-forcing identifiers (1)
In some cases system tables are unreadable
(e.g. because of lack of permissions)
Hence, no way to retrieve identifier names
(tables and columns)
sqlmap does guessing by brute-forcing
availability of most common identifiers (e.g.
?id=1 AND EXISTS(SELECT 123 FROM users))
Identifiers (3369 table and 2601 column
names) have been collected and frequency-
sorted by retrieving and parsing thousands
of online SQL scripts
SecAdmin, Sevilla (Spain) November 24th, 2017 28
Brute-forcing identifiers (2)Brute-forcing identifiers (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 29
Hash cracking (1)Hash cracking (1)
Automatic recognition and dictionary
cracking of 30 different hash algorithms
(e.g. mysql, mssql, md5_generic,
sha1_generic, etc.)
Included dictionary with 1.4 million wordlist
entries (RockYou, MySpace, Gawker, etc.)
Multiprocessing (# of cores)
Blazing fast (e.g. under 10 seconds for
whole dictionary pass with mysql routine)
Stores uncracked hashes to file for eventual
further processing (with other tools)
SecAdmin, Sevilla (Spain) November 24th, 2017 30
Hash cracking (2)Hash cracking (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 31
Stagers / backdoors (1)Stagers / backdoors (1)
Stager uploaded in a first (dirty) stage (e.g.
possibility of a query junk in case of INTO
OUTFILE method)
Stager has a functionality of uploading
arbitrary files
Backdoor (or any binary) uploaded in second
(clean) stage by using stager
Backdoor has a functionality of executing
arbitrary OS commands
Supported platforms: PHP, ASP, ASPX, JSP
SecAdmin, Sevilla (Spain) November 24th, 2017 32
Stagers / backdoors (2)Stagers / backdoors (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 33
DNS exfiltration (1)DNS exfiltration (1)
In some cases it's possible to incorporate
SQL (sub)query results into DNS resolution
requests
Microsoft SQL Server, Oracle, MySQL and
PostgreSQL
Dozens of resulting characters can be
transferred per single request (compared to
boolean-based blind and time-based blind)
Domain name server entry (e.g.
ns1.attacker.com) has to point to IP
address of machine running sqlmap
SecAdmin, Sevilla (Spain) November 24th, 2017 34
DNS exfiltration (2)DNS exfiltration (2)
SecAdmin, Sevilla (Spain) November 24th, 2017 35
DNS exfiltration (3)DNS exfiltration (3)
SecAdmin, Sevilla (Spain) November 24th, 2017 36
Questions?Questions?

sqlmap internals

  • 1.
    sqlmap internalssqlmap internals MiroslavStampar (mstampar@zsis.hr; miroslav@sqlmap.org) sqlmap internalssqlmap internals Miroslav Stampar (mstampar@zsis.hr; miroslav@sqlmap.org)
  • 2.
    SecAdmin, Sevilla (Spain)November 24th, 2017 2 IntroductionIntroduction Free and open source penetration testing tool that automates the process of detecting and exploiting SQL injection flaws and taking over of database server(s) Written in Python (2) 11 years old (July 25th 2006) 2 authors / core developers (Bernardo Damele and Miroslav Stampar) 65K LoC (Lines of Code) 100% accuracy and 0% false-positives by WAVSEP benchmark of 64 Web Application Scanners (sectoolmarket.com)
  • 3.
    SecAdmin, Sevilla (Spain)November 24th, 2017 3 CapabilitiesCapabilities 78 switches (e.g. --tor) and 91 options (e.g. --url=”...”) in 15 categories (Target, Request, Optimization, Injection, etc.) Full coverage for (relational DBMS-es): MySQL, Oracle, PostgreSQL, Microsoft SQL Server, Microsoft Access, IBM DB2, SQLite, Firebird, Sybase, SAP MaxDB, HSQLDB and Informix Full support for SQLi techniques: boolean- based blind, time-based blind, error-based, UNION query-based and stacked queries Database enumeration, file-system manipulation, out-of-band communication, etc.
  • 4.
    SecAdmin, Sevilla (Spain)November 24th, 2017 4 Sample runSample run
  • 5.
    SecAdmin, Sevilla (Spain)November 24th, 2017 5 Socket pre-connect (1)Socket pre-connect (1) TCP three-way handshake (SYN, SYN-ACK, ACK) is inherently slow (“necessary evil”) Each HTTP request requires a completed TCP handshake procedure sqlmap runs a “pre-connect” thread in background filling a pool of (e.g. 3) connections with TCP handshake done Overrides Python’s socket.connect() 25% speed-up of a program’s run on average
  • 6.
    SecAdmin, Sevilla (Spain)November 24th, 2017 6 Socket pre-connect (2)Socket pre-connect (2)
  • 7.
    SecAdmin, Sevilla (Spain)November 24th, 2017 7 NULL connection (1)NULL connection (1) In boolean-based blind SQLi response sizes should suffice (e.g. >1000 bytes → TRUE) “NULL” naming because of skipping the retrieval of complete HTTP response Range: bytes=-1 Content-Range: bytes 4789-4789/4790 HEAD /search.aspx HTTP/1.1 Content-Length: 4790 Both are resulting (if applicable) with empty HTTP body (faster retrieval of responses) By looking into “length” headers we can differentiate TRUE from FALSE answers
  • 8.
    SecAdmin, Sevilla (Spain)November 24th, 2017 8 NULL connection (2)NULL connection (2)
  • 9.
    SecAdmin, Sevilla (Spain)November 24th, 2017 9 HashDB (1)HashDB (1) Storage of resumable session data at centralized place (local SQLite3 database) Non-ASCII values are being automatically serialized/deserialized (pickle) INSERT INTO storage VALUES (INT(MD5(target_url, uid, MILESTONE_SALT) [:8]), stored_value) uid uniquely describes stored_value for a given target_url (e.g.: KB_INJECTIONS, SELECT VERSION(), etc.) MILESTONE_SALT changed whenever there is an incompatible update of HashDB mechanism
  • 10.
    SecAdmin, Sevilla (Spain)November 24th, 2017 10 HashDB (2)HashDB (2)
  • 11.
    SecAdmin, Sevilla (Spain)November 24th, 2017 11 BigArray (1)BigArray (1) Support for huge table dumps (e.g. millions of rows) Raw data needs to be held somewhere before being processed (and eventually stored) In memory storage was a good enough choice until user appetites went bigger (!) Memory mapping into smaller chunks (1MB) – memory pages Temporary files store (compressed) chunks In-memory caching of currently used chunk O(1) read/write access
  • 12.
    SecAdmin, Sevilla (Spain)November 24th, 2017 12 BigArray (2)BigArray (2)
  • 13.
    SecAdmin, Sevilla (Spain)November 24th, 2017 13 Heuristics (1)Heuristics (1) “Educational shortcuts to ease the cognitive load of making a decision” Resulting with a solution which is not guaranteed to be optimal (though very helpful) Type casting (e.g. ?id=1foobar) DBMS error reporting (e.g. ?id=1())'”(”') Character filtering (e.g. ?id=1 AND 7=(7)) Length constraining (e.g. id=1 AND 3182= 3182) (quick) DBMS detection (e.g. ?id=1 AND (SELECT 0x73716c)=0x73716c)
  • 14.
    SecAdmin, Sevilla (Spain)November 24th, 2017 14 Heuristics (2)Heuristics (2)
  • 15.
    SecAdmin, Sevilla (Spain)November 24th, 2017 15 Boolean inference (1)Boolean inference (1) Binary search using greater-than operator O(Log2n) complexity compared to sequential search with O(n) Faster than bit-by-bit extraction (on average 6 requests compared to 8 requests) For example: Sample initial table ['A','B',...'Z'] AND (...) > 'M' → TRUE → ['N',...'Z'] AND (...) > 'S' → FALSE → ['N',...'S'] AND (...) > 'O' → TRUE → ['P', 'R', 'S'] AND (...) > 'R' → FALSE → ['P', 'R'] AND (...) > 'P' → FALSE → 'P' (result)
  • 16.
    SecAdmin, Sevilla (Spain)November 24th, 2017 16 Boolean inference (2)Boolean inference (2)
  • 17.
    SecAdmin, Sevilla (Spain)November 24th, 2017 17 Boundaries / levels / risks (1)Boundaries / levels / risks (1) SQLi detection requires working payload (e.g. AND 1=1) together with proper boundaries (e.g. ?query=test’ AND 1=1 AND ‘x’=’x) Number of tested prefix/suffix boundaries is constrained with option --level (e.g. “))))) Number of tested payloads is constrained with option --risk (e.g. OR 1=1) Greater the level and risk, greater the number of testing cases
  • 18.
    SecAdmin, Sevilla (Spain)November 24th, 2017 18 Boundaries / levels / risks (2)Boundaries / levels / risks (2)
  • 19.
    SecAdmin, Sevilla (Spain)November 24th, 2017 19 Statistics (1)Statistics (1) Network latency (or lagging) is the main problem of time-based blind technique For example, used deliberate delay is 1 sec, normal response times are >0.5 and <2.0 secs, what we can conclude for 1.5 sec response? sqlmap learns what's normal and what's not from non-delay based payload responses (e.g. boolean-based blind payloads) Normal distribution is being calculated (Gaussian bell-shaped curve) Everything inside is considered as “normal”, outside as “not normal”
  • 20.
    SecAdmin, Sevilla (Spain)November 24th, 2017 20 Statistics (2)Statistics (2) Everything that's normal (i.e. not deliberately delayed) should fit under the curve μ(t) represents a mean, while σ(t) represents a standard deviation of response times 99.99% of normal response times fall under the upper border value μ(t) + 7σ(t)
  • 21.
    SecAdmin, Sevilla (Spain)November 24th, 2017 21 False-positive detection (1)False-positive detection (1) Detection of “error” in SQLi detection engine Giving false sense of certainty while in reality there is nothing exploitable at the other side Almost exclusive to boolean-based blind and time-based blind cases Simple tests are being done after the detection Comparing responses to boolean operations with expected results (e.g. id=1 AND 95=27) If any of results is contrary to the expected value, SQLi is discarded as a false-positive (or unexploitable)
  • 22.
    SecAdmin, Sevilla (Spain)November 24th, 2017 22 False-positive detection (2)False-positive detection (2)
  • 23.
    SecAdmin, Sevilla (Spain)November 24th, 2017 23 WAF/IDS/IPS detection (1)WAF/IDS/IPS detection (1) Sending deliberately suspicious payloads and checking response(s) for unique characteristics (e.g.) ?id=1&bwXY=5253 AND 1=1 UNION ALL SELECT 1,NULL,'<script>alert("XSS") </script>',table_name FROM information_schema.tables WHERE 2>1--/**/; EXEC xp_cmdshell('cat ../../ ../etc/passwd')# ModSecurity returns HTTP error code 501 on detected attack, F5 BIG-IP adds its own X- Cnection HTTP header, etc. Fingeprinting 59 different WAF/IDS/IPS products
  • 24.
    SecAdmin, Sevilla (Spain)November 24th, 2017 24 WAF/IDS/IPS detection (2)WAF/IDS/IPS detection (2)
  • 25.
    SecAdmin, Sevilla (Spain)November 24th, 2017 25 Tamper scripts (1)Tamper scripts (1) Auxiliary python scripts modifying the payload before being sent (e.g. ?id=1 AND 2>1 to ?id=1 AND 2 NOT BETWEEN 0 AND 1) Currently 54 tamper scripts (between.py, space2randomblank.py, versionedkeywords.py, etc.) User has to choose appropriate one(s) based on collected knowledge of target's behavior and/or detected WAF/IDS/IPS product Chain of tamper scripts (if required) can be used (e.g. --tamper=”between, ifnull2ifisnull”)
  • 26.
    SecAdmin, Sevilla (Spain)November 24th, 2017 26 Tamper scripts (2)Tamper scripts (2)
  • 27.
    SecAdmin, Sevilla (Spain)November 24th, 2017 27 Brute-forcing identifiers (1)Brute-forcing identifiers (1) In some cases system tables are unreadable (e.g. because of lack of permissions) Hence, no way to retrieve identifier names (tables and columns) sqlmap does guessing by brute-forcing availability of most common identifiers (e.g. ?id=1 AND EXISTS(SELECT 123 FROM users)) Identifiers (3369 table and 2601 column names) have been collected and frequency- sorted by retrieving and parsing thousands of online SQL scripts
  • 28.
    SecAdmin, Sevilla (Spain)November 24th, 2017 28 Brute-forcing identifiers (2)Brute-forcing identifiers (2)
  • 29.
    SecAdmin, Sevilla (Spain)November 24th, 2017 29 Hash cracking (1)Hash cracking (1) Automatic recognition and dictionary cracking of 30 different hash algorithms (e.g. mysql, mssql, md5_generic, sha1_generic, etc.) Included dictionary with 1.4 million wordlist entries (RockYou, MySpace, Gawker, etc.) Multiprocessing (# of cores) Blazing fast (e.g. under 10 seconds for whole dictionary pass with mysql routine) Stores uncracked hashes to file for eventual further processing (with other tools)
  • 30.
    SecAdmin, Sevilla (Spain)November 24th, 2017 30 Hash cracking (2)Hash cracking (2)
  • 31.
    SecAdmin, Sevilla (Spain)November 24th, 2017 31 Stagers / backdoors (1)Stagers / backdoors (1) Stager uploaded in a first (dirty) stage (e.g. possibility of a query junk in case of INTO OUTFILE method) Stager has a functionality of uploading arbitrary files Backdoor (or any binary) uploaded in second (clean) stage by using stager Backdoor has a functionality of executing arbitrary OS commands Supported platforms: PHP, ASP, ASPX, JSP
  • 32.
    SecAdmin, Sevilla (Spain)November 24th, 2017 32 Stagers / backdoors (2)Stagers / backdoors (2)
  • 33.
    SecAdmin, Sevilla (Spain)November 24th, 2017 33 DNS exfiltration (1)DNS exfiltration (1) In some cases it's possible to incorporate SQL (sub)query results into DNS resolution requests Microsoft SQL Server, Oracle, MySQL and PostgreSQL Dozens of resulting characters can be transferred per single request (compared to boolean-based blind and time-based blind) Domain name server entry (e.g. ns1.attacker.com) has to point to IP address of machine running sqlmap
  • 34.
    SecAdmin, Sevilla (Spain)November 24th, 2017 34 DNS exfiltration (2)DNS exfiltration (2)
  • 35.
    SecAdmin, Sevilla (Spain)November 24th, 2017 35 DNS exfiltration (3)DNS exfiltration (3)
  • 36.
    SecAdmin, Sevilla (Spain)November 24th, 2017 36 Questions?Questions?