sqlmap internals

Miroslav Stampar
(mstampar@zsis.hr; miroslav@sqlmap.org)

SecAdmin, Sevilla (Spain), November 24th, 2017
Introduction
Free and open source penetration testing tool
that automates the process of detecting and
exploiting SQL injection flaws and taking over
database server(s)
Written in Python (2)
11 years old (since July 25th, 2006)
2 authors / core developers (Bernardo Damele
and Miroslav Stampar)
65K LoC (Lines of Code)
100% accuracy and 0% false positives in the
WAVSEP benchmark of 64 web application
scanners (sectoolmarket.com)

Capabilities
78 switches (e.g. --tor) and 91 options (e.g.
--url="...") in 15 categories (Target,
Request, Optimization, Injection, etc.)
Full support for relational DBMSes: MySQL,
Oracle, PostgreSQL, Microsoft SQL Server,
Microsoft Access, IBM DB2, SQLite, Firebird,
Sybase, SAP MaxDB, HSQLDB and Informix
Full support for SQLi techniques: boolean-based
blind, time-based blind, error-based,
UNION query-based and stacked queries
Database enumeration, file-system
manipulation, out-of-band communication, etc.

Sample run

Socket pre-connect (1)
TCP three-way handshake (SYN, SYN-ACK,
ACK) is inherently slow (“necessary evil”)
Each HTTP request requires a completed
TCP handshake procedure
sqlmap runs a “pre-connect” thread in the
background, filling a pool of (e.g. 3)
connections with the TCP handshake already done
Overrides Python’s socket.connect()
25% speed-up of a program’s run on
average
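
The pre-connect idea above can be sketched as follows. This is a minimal illustration, not sqlmap's code: the class name PreconnectPool and its methods are hypothetical, and it hands out sockets explicitly instead of overriding socket.connect() as the slide describes.

```python
import queue
import socket
import threading
import time


class PreconnectPool:
    """Hypothetical pool that keeps a few already-connected TCP sockets ready."""

    def __init__(self, address, size=3):
        self.address = address
        self._pool = queue.Queue(maxsize=size)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._fill, daemon=True)
        self._thread.start()

    def _fill(self):
        # Background loop: whenever the pool has room, complete one more
        # TCP handshake ahead of time and park the connected socket.
        while not self._stop.is_set():
            if self._pool.full():
                time.sleep(0.01)
                continue
            try:
                sock = socket.create_connection(self.address, timeout=5)
            except OSError:
                time.sleep(0.1)
                continue
            try:
                self._pool.put_nowait(sock)
            except queue.Full:
                sock.close()

    def get(self, timeout=1.0):
        # Prefer a pre-connected socket; fall back to a regular connect.
        try:
            return self._pool.get(timeout=timeout)
        except queue.Empty:
            return socket.create_connection(self.address, timeout=5)

    def close(self):
        self._stop.set()
        self._thread.join()
        while not self._pool.empty():
            self._pool.get_nowait().close()
```

A caller would take a socket with pool.get() right before sending an HTTP request, so the handshake cost has already been paid in the background.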

Socket pre-connect (2)

NULL connection (1)
In boolean-based blind SQLi the response size
alone should suffice (e.g. >1000 bytes → TRUE)
Named “NULL” because retrieval of the complete
HTTP response is skipped
Range request: Range: bytes=-1 → response
Content-Range: bytes 4789-4789/4790
HEAD request: HEAD /search.aspx HTTP/1.1 →
response Content-Length: 4790
Both result (where the server supports them) in
an empty or near-empty HTTP body (faster
retrieval of responses)
By looking at the “length” headers we can
differentiate TRUE from FALSE answers
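
A minimal sketch of reading the page size from the “length” headers only. The function names and the 1000-byte threshold parameter are illustrative assumptions (the threshold mirrors the example on this slide); this is not sqlmap's implementation.

```python
import re


def response_length(headers):
    """Return the full page length advertised by the headers, or None."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Range reply: the total size follows the slash in Content-Range.
    # Content-Length would only describe the partial body here.
    match = re.search(r"/(\d+)\s*$", lowered.get("content-range", ""))
    if match:
        return int(match.group(1))

    # HEAD reply: Content-Length describes the body that was not sent.
    length = lowered.get("content-length")
    if length is not None and length.strip().isdigit():
        return int(length)
    return None


def looks_true(headers, threshold=1000):
    """Classify a response as TRUE when the page is larger than threshold."""
    length = response_length(headers)
    return length is not None and length > threshold
```

With the headers from the slide, both {"Content-Range": "bytes 4789-4789/4790"} and {"Content-Length": "4790"} yield a length of 4790.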

NULL connection (2)

HashDB (1)
Storage of resumable session data in a
centralized place (local SQLite3 database)
Non-ASCII values are automatically
serialized/deserialized (pickle)
INSERT INTO storage VALUES (INT(MD5(target_url,
uid, MILESTONE_SALT)[:8]), stored_value)
uid uniquely describes stored_value for a
given target_url (e.g. KB_INJECTIONS, SELECT
VERSION(), etc.)
MILESTONE_SALT is changed whenever there is an
incompatible update of the HashDB mechanism
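
The storage scheme above can be sketched with the standard sqlite3, hashlib and pickle modules. Everything here is an assumption made for illustration: the salt value, the "|" separator, the table layout, reading "[:8]" as the first 8 bytes of the digest, and pickling every value rather than only non-ASCII ones.

```python
import hashlib
import pickle
import sqlite3

MILESTONE_SALT = "example-salt"  # hypothetical value, not sqlmap's


def hash_key(target_url, uid):
    """Derive an integer row id from (target_url, uid, MILESTONE_SALT)."""
    raw = "|".join((target_url, uid, MILESTONE_SALT)).encode("utf-8")
    digest = hashlib.md5(raw).digest()
    # First 8 bytes as an integer, masked to fit SQLite's signed 64-bit INTEGER.
    return int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF


class HashDB:
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS storage (id INTEGER PRIMARY KEY, value BLOB)"
        )

    def write(self, target_url, uid, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO storage VALUES (?, ?)",
            (hash_key(target_url, uid), pickle.dumps(value)),
        )
        self.conn.commit()

    def read(self, target_url, uid):
        row = self.conn.execute(
            "SELECT value FROM storage WHERE id = ?", (hash_key(target_url, uid),)
        ).fetchone()
        return pickle.loads(row[0]) if row else None
```

Because the salt is part of the key, changing MILESTONE_SALT makes all previously stored rows unreachable, which is how an incompatible format change invalidates old sessions without any migration.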

HashDB (2)

BigArray (1)
Support for huge table dumps (e.g. millions of
rows)
Raw data needs to be held somewhere before
being processed (and eventually stored)
In-memory storage was a good enough choice
until user appetites grew (!)
Memory mapping into smaller chunks (1MB) –
memory pages
Temporary files store (compressed) chunks
In-memory caching of currently used chunk
O(1) read/write access
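
A simplified sketch of the chunking idea, assuming chunks are bounded by item count rather than by the 1MB size mentioned above, and supporting only append and indexed reads. The class below is illustrative and is not sqlmap's BigArray.

```python
import os
import pickle
import tempfile
import zlib


class BigArray:
    """List-like container that spills full chunks to compressed temp files."""

    def __init__(self, chunk_items=1024):
        self.chunk_items = chunk_items
        self.files = []             # paths of full chunks dumped to disk
        self.tail = []              # chunk currently being filled (in memory)
        self.cache = (None, None)   # (chunk index, items) of the last chunk read

    def append(self, item):
        self.tail.append(item)
        if len(self.tail) >= self.chunk_items:
            fd, path = tempfile.mkstemp(prefix="bigarray")
            with os.fdopen(fd, "wb") as handle:
                handle.write(zlib.compress(pickle.dumps(self.tail)))
            self.files.append(path)
            self.tail = []

    def __len__(self):
        return len(self.files) * self.chunk_items + len(self.tail)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Locating the chunk is plain arithmetic, hence O(1).
        chunk, offset = divmod(index, self.chunk_items)
        if chunk == len(self.files):
            return self.tail[offset]
        if self.cache[0] != chunk:
            with open(self.files[chunk], "rb") as handle:
                self.cache = (chunk, pickle.loads(zlib.decompress(handle.read())))
        return self.cache[1][offset]

    def close(self):
        for path in self.files:
            os.remove(path)
        self.files = []
```

Sequential reads hit the cached chunk almost every time, so a table dump that is written once and then iterated in order touches each temporary file only once.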

BigArray (2)

Heuristics (1)
“Educational shortcuts to ease the cognitive
load of making a decision”
Result in a solution that is not guaranteed
to be optimal (though very helpful)
Type casting (e.g. ?id=1foobar)
DBMS error reporting (e.g. ?id=1())'"("')
Character filtering (e.g. ?id=1 AND 7=(7))
Length constraining (e.g. ?id=1 AND 3182=3182)
(Quick) DBMS detection (e.g. ?id=1 AND
(SELECT 0x73716c)=0x73716c)
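
As an illustration of the DBMS error reporting heuristic, the sketch below matches a response body against a handful of well-known error message fragments. The signature table and the function name are assumptions chosen for this example; a real tool would use a much larger list.

```python
import re

# Small, illustrative signature table (not sqlmap's own list).
ERROR_SIGNATURES = {
    "MySQL": r"You have an error in your SQL syntax",
    "PostgreSQL": r"syntax error at or near",
    "Oracle": r"ORA-\d{5}",
    "Microsoft SQL Server": r"Unclosed quotation mark after the character string",
}


def guess_dbms_from_error(body):
    """Return the first DBMS whose error signature appears in the page body."""
    for dbms, pattern in ERROR_SIGNATURES.items():
        if re.search(pattern, body):
            return dbms
    return None
```

A hit after sending a deliberately malformed value both hints at an injectable parameter and narrows down the back-end DBMS, which lets later tests skip payloads written for other products.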

Heuristics (2)

Boolean inference (1)
Binary search using greater-than operator
O(log2 n) complexity compared to sequential
search with O(n)
Faster than bit-by-bit extraction (on average 6
requests compared to 8 requests)
For example:
Sample initial table ['A','B',...'Z']
AND (...) > 'M' → TRUE → ['N',...'Z']
AND (...) > 'S' → FALSE → ['N',...'S']
AND (...) > 'O' → TRUE → ['P', 'Q', 'R', 'S']
AND (...) > 'R' → FALSE → ['P', 'Q', 'R']
AND (...) > 'P' → FALSE → 'P' (result)
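
The search above can be sketched as a plain binary search over a sorted character table. The oracle callable is a stand-in for sending "AND (...) > 'c'" and classifying the response as TRUE or FALSE; its name and signature are assumptions for this example, and the midpoints it picks differ from the hand-picked ones on the slide.

```python
def infer_char(oracle, charset):
    """Find the hidden character using only 'is it greater than c?' questions.

    oracle(c) must return True when the hidden character is greater than c.
    charset must be sorted and must contain the hidden character.
    Returns (character, number_of_requests).
    """
    low, high = 0, len(charset) - 1
    requests = 0
    while low < high:
        mid = (low + high) // 2
        requests += 1
        if oracle(charset[mid]):
            low = mid + 1      # hidden character lies in the upper half
        else:
            high = mid         # hidden character is charset[mid] or lower
    return charset[low], requests
```

For a 26-letter table the loop never needs more than 5 requests per character, since each answer halves the remaining candidates.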

Boolean inference (2)

Boundaries / levels / risks (1)
SQLi detection requires a working payload
(e.g. AND 1=1) together with proper
boundaries (e.g. ?query=test' AND 1=1
AND 'x'='x)
Number of tested prefix/suffix boundaries is
constrained by the option --level (e.g.
")))))
Number of tested payloads is constrained
by the option --risk (e.g. OR 1=1)
The greater the level and risk, the greater
the number of test cases
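
A small sketch of how boundaries and payloads multiply into test cases. The boundary and payload tables, and the level and risk numbers attached to them, are hypothetical values made up for this example.

```python
# (level, prefix, suffix): hypothetical boundary table
BOUNDARIES = [
    (1, "", ""),
    (1, "'", " AND 'x'='x"),
    (2, "')", " AND ('x'='x"),
    (3, "'))", " AND (('x'='x"),
]

# (risk, payload): hypothetical payload table
PAYLOADS = [
    (1, " AND 1=1"),
    (3, " OR 1=1"),
]


def build_cases(value, level=1, risk=1):
    """Combine every allowed boundary with every allowed payload."""
    cases = []
    for boundary_level, prefix, suffix in BOUNDARIES:
        if boundary_level > level:
            continue
        for payload_risk, payload in PAYLOADS:
            if payload_risk > risk:
                continue
            cases.append(value + prefix + payload + suffix)
    return cases
```

With the default level and risk this produces 2 cases for the value test, including test' AND 1=1 AND 'x'='x; raising both to 3 produces 8, which shows why higher settings cost more requests.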

Boundaries / levels / risks (2)

Statistics (1)
Network latency (or lagging) is the main
problem of the time-based blind technique
For example, if the deliberate delay is 1 sec
and normal response times are between 0.5 and
2.0 secs, what can we conclude from a 1.5 sec
response?
sqlmap learns what's normal and what's not
from non-delay based payload responses (e.g.
boolean-based blind payloads)
A normal distribution is calculated
(Gaussian bell-shaped curve)
Everything inside is considered “normal”,
everything outside “not normal”

Statistics (2)
Everything that's normal (i.e. not deliberately
delayed) should fit under the curve
μ(t) represents a mean, while σ(t) represents
a standard deviation of response times
99.99% of normal response times fall under the
upper border value μ(t) + 7σ(t)
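
The upper border μ(t) + 7σ(t) can be computed with Python's statistics module. The function names are illustrative, and using the population standard deviation (pstdev) is an assumption of this sketch.

```python
import statistics


def delay_threshold(normal_times, k=7):
    """Upper border mu + k * sigma of the observed normal response times."""
    mu = statistics.mean(normal_times)
    sigma = statistics.pstdev(normal_times)
    return mu + k * sigma


def is_delayed(response_time, normal_times):
    """Treat a response as deliberately delayed only above the upper border."""
    return response_time > delay_threshold(normal_times)
```

The more the baseline response times vary, the larger σ(t) becomes and the further the border moves away from the mean, so a noisy connection automatically demands a clearer delay before a response counts as delayed.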

False-positive detection (1)
Detection of “errors” in the SQLi detection
engine
A false positive gives a false sense of
certainty while in reality there is nothing
exploitable on the other side
Almost exclusive to boolean-based blind and
time-based blind cases
Simple tests are run after the detection
Responses to boolean operations are compared
with expected results (e.g. id=1 AND 95=27)
If any result is contrary to the expected
value, the SQLi is discarded as a false
positive (or unexploitable)
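
The confirmation step can be sketched as a few random arithmetic comparisons whose outcome is known in advance. The oracle callable (which stands in for injecting "AND <condition>" and classifying the response) and the number of rounds are assumptions of this example.

```python
import random


def confirm_injection(oracle, rounds=3):
    """Return False as soon as one answer contradicts the expected result.

    oracle(condition) must return True when the target answered TRUE for
    a request carrying 'AND <condition>'.
    """
    for _ in range(rounds):
        a = random.randint(10, 99)
        b = random.randint(10, 99)
        while b == a:
            b = random.randint(10, 99)
        if not oracle("%d=%d" % (a, a)):   # must be TRUE
            return False
        if oracle("%d=%d" % (a, b)):       # must be FALSE (e.g. 95=27)
            return False
    return True
```

A page that always looks TRUE, or always looks FALSE, fails at least one of the two checks in the first round and is discarded.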

False-positive detection (2)

WAF/IDS/IPS detection (1)
Sending deliberately suspicious payloads and
checking response(s) for unique characteristics
(e.g.) ?id=1&bwXY=5253 AND 1=1 UNION ALL
SELECT 1,NULL,'<script>alert("XSS")</script>',
table_name FROM information_schema.tables WHERE
2>1--/**/; EXEC
xp_cmdshell('cat ../../../etc/passwd')#
ModSecurity returns HTTP error code 501 on
a detected attack, F5 BIG-IP adds its own
X-Cnection HTTP header, etc.
Fingerprinting of 59 different WAF/IDS/IPS products
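
A toy fingerprinting function built only from the two characteristics named above (HTTP 501 for ModSecurity, the X-Cnection header for F5 BIG-IP). The function name and the order of the checks are assumptions; real fingerprints combine more signals per product.

```python
def identify_waf(status, headers):
    """Guess the protecting product from the reply to a suspicious payload."""
    names = {name.lower() for name in headers}
    if "x-cnection" in names:
        return "F5 BIG-IP"
    if status == 501:
        return "ModSecurity"
    return None
```

The function is meant to be applied to the response received for the deliberately suspicious request, since a normal request would typically not trigger the blocking behaviour.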

WAF/IDS/IPS detection (2)

Tamper scripts (1)
Auxiliary Python scripts modifying the payload
before it is sent (e.g. ?id=1 AND 2>1 to
?id=1 AND 2 NOT BETWEEN 0 AND 1)
Currently 54 tamper scripts (between.py,
space2randomblank.py, versionedkeywords.py,
etc.)
User has to choose the appropriate one(s) based
on collected knowledge of the target's behavior
and/or the detected WAF/IDS/IPS product
A chain of tamper scripts can be used if
required (e.g.
--tamper="between,ifnull2ifisnull")
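
A simplified sketch of two tamper functions and of chaining them. sqlmap tamper scripts expose a tamper(payload, **kwargs) function; the regex-based bodies below are reduced approximations written for this example, not the code of between.py or ifnull2ifisnull.py, and apply_tampers is a hypothetical helper.

```python
import re


def between(payload, **kwargs):
    """Rewrite 'A>B' as 'A NOT BETWEEN 0 AND B' (simple operands only)."""
    return re.sub(r"(\w+)\s*>\s*(\w+)", r"\1 NOT BETWEEN 0 AND \2", payload)


def ifnull2ifisnull(payload, **kwargs):
    """Rewrite 'IFNULL(A,B)' as 'IF(ISNULL(A),B,A)' (no nested calls)."""
    return re.sub(
        r"IFNULL\(([^,()]+),\s*([^()]+)\)", r"IF(ISNULL(\1),\2,\1)", payload
    )


def apply_tampers(payload, tampers):
    """Run the payload through each tamper function in the given order."""
    for tamper in tampers:
        payload = tamper(payload)
    return payload
```

Order matters in a chain, because each function sees the output of the previous one rather than the original payload.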

Tamper scripts (2)

Brute-forcing identifiers (1)
In some cases system tables are unreadable
(e.g. because of lack of permissions)
Hence, there is no way to retrieve identifier
names (tables and columns)
sqlmap guesses by brute-forcing the existence
of the most common identifiers (e.g.
?id=1 AND EXISTS(SELECT 123 FROM users))
Identifiers (3369 table and 2601 column
names) have been collected and
frequency-sorted by retrieving and parsing
thousands of online SQL scripts
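
The guessing loop can be sketched in a few lines. The exists callable stands in for injecting the condition and classifying the response as TRUE or FALSE; it and the function name are assumptions of this example.

```python
def brute_force_tables(exists, candidates):
    """Return the candidate table names for which the EXISTS test is TRUE.

    exists(condition) must return True when the target answered TRUE for
    a request carrying 'AND <condition>'.
    candidates should be ordered from most to least common.
    """
    found = []
    for name in candidates:
        if exists("EXISTS(SELECT 123 FROM %s)" % name):
            found.append(name)
    return found
```

Trying frequency-sorted names first means the most likely tables are found after only a few requests, even if the run is interrupted early.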

Brute-forcing identifiers (2)

Hash cracking (1)
Automatic recognition and dictionary
cracking of 30 different hash algorithms
(e.g. mysql, mssql, md5_generic,
sha1_generic, etc.)
Bundled dictionary with 1.4 million
entries (RockYou, MySpace, Gawker, etc.)
Multiprocessing (# of cores)
Blazing fast (e.g. under 10 seconds for a
whole dictionary pass with the mysql routine)
Stores uncracked hashes to a file for later
processing (with other tools)
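
A single-process sketch of a dictionary pass for the mysql routine, assuming the MySQL 4.1+ password format ("*" followed by the uppercase hex of SHA1(SHA1(password))). Function names are illustrative, and the multiprocessing split across cores is left out for brevity.

```python
import hashlib


def mysql_hash(password):
    """MySQL 4.1+ style hash: '*' + upper-case hex of SHA1(SHA1(password))."""
    inner = hashlib.sha1(password.encode("utf-8")).digest()
    return "*" + hashlib.sha1(inner).hexdigest().upper()


def dictionary_attack(hashes, wordlist):
    """Return (cracked, remaining): cracked maps hash -> word."""
    remaining = {value.upper() for value in hashes}
    cracked = {}
    for word in wordlist:
        candidate = mysql_hash(word)
        if candidate in remaining:
            cracked[candidate] = word
            remaining.discard(candidate)
            if not remaining:
                break
    return cracked, remaining
```

The remaining set is what would be written to a file for later processing with other tools.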

Hash cracking (2)

Stagers / backdoors (1)
Stager is uploaded in the first (dirty) stage
(e.g. the INTO OUTFILE method can leave query
junk in the written file)
Stager provides the functionality of uploading
arbitrary files
Backdoor (or any binary) is uploaded in the
second (clean) stage by using the stager
Backdoor provides the functionality of
executing arbitrary OS commands
Supported platforms: PHP, ASP, ASPX, JSP

Stagers / backdoors (2)

DNS exfiltration (1)
In some cases it's possible to incorporate
SQL (sub)query results into DNS resolution
requests
Supported for Microsoft SQL Server, Oracle,
MySQL and PostgreSQL
Dozens of resulting characters can be
transferred per single request (far more than
with boolean-based blind and time-based blind)
Domain name server entry (e.g.
ns1.attacker.com) has to point to the IP
address of the machine running sqlmap

Questions?
