Data normalization
weaknesses
@d0znpp
VolgaCTF, 03/09/2013
Intro
• Researcher, bug-hunter, CEO
• Web application security in depth
• @d0znpp personal twitter
• lab.onsec.ru our blog (@ONsec_lab)
What is normalization?
• Transferring and storing data always
involves formatting it
• First normalization, then formatting
• Encoding (different charsets)
• Truncation (limited sizes)
• Trims
• Canonicalization
• ...
Data normalization or input
validation weaknesses?
Web application basics
• Client-Server model
• Client is browser (Chrome, Safari, IE, FF)
• Server is web server software (Nginx,
Apache)
• Application server (FastCGI, Tomcat)
• Database storage (SQL or noSQL)
Web application
example. Depth #1
Browser WebServer
Database
AppServer
HTTP FCGI SQL
Web application
example. Depth #2
Browser WebServer
Database
AppServer
HTTP FCGI SQL
Operating System
File System
FS driver
Web application
example. Depth #3
Browser WebServer
Database
AppServer
HTTP FCGI SQL
OS
File System
FS driver
Network layer
Protocol level
normalization
• Urlencoding - what could be simpler?
• %22 to "
• %23 to #
• %25 to %
• Double URL-encoding is a basic bypass for
many input validators, right?
2+ urlencoding
Why not?!
Browser Frontend Backend
HTTP FCGI
OS
Balancer
HTTP
%252527 → %2527 → %27
Input
validator
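A minimal sketch of the chain above, assuming each hop (balancer, frontend, backend) URL-decodes once. The filter `naive_validator` and the helper `decode_chain` are hypothetical names used only for illustration; the point is that a check placed at one hop sees a value that still looks harmless.

```python
from urllib.parse import unquote


def naive_validator(value: str) -> bool:
    """Hypothetical filter: rejects a raw quote or its single URL-encoded form."""
    return "'" not in value and "%27" not in value.lower()


def decode_chain(value: str, hops: int) -> list[str]:
    """Return the value as it looks after each URL-decoding hop."""
    seen = [value]
    for _ in range(hops):
        value = unquote(value)
        seen.append(value)
    return seen


stages = decode_chain("%252527", 3)
for hop, value in enumerate(stages):
    # The filter only rejects the value once enough hops have decoded it.
    print(hop, repr(value), "passes filter:", naive_validator(value))
```

A validator sitting at hop 0 or hop 1 lets the value through; the quote only appears after the later hops have decoded it.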
Protocol level
normalization
• HTTP parameter pollution
• https://www.owasp.org/images/b/ba/AppsecEU09_CarettoniDiPaola_v0.8.pdf
• ?id=1&id=2 → id=1,2
• HTTP parameter contamination
• http://netsec.rs/files/Http%20Parameter%20Contamination%20-%20Ivan%20Markovic%20NSS.pdf
• ?load[file → ?load_file
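Both effects can be modelled in a few lines. This is a sketch, not any server's real parser: `pick` and `mangle_name` are hypothetical helpers, and the three strategies (first value, last value, comma-joined) stand for the different ways backends are reported to resolve a duplicated parameter.

```python
from urllib.parse import parse_qsl


def pick(query: str, name: str, strategy: str) -> str:
    """Resolve a repeated parameter the way different backends might."""
    values = [v for k, v in parse_qsl(query) if k == name]
    if strategy == "first":
        return values[0]
    if strategy == "last":
        return values[-1]
    if strategy == "join":
        return ",".join(values)
    raise ValueError(f"unknown strategy: {strategy}")


def mangle_name(name: str) -> str:
    """Model of a backend that rewrites ' ', '.' and '[' in names to '_'."""
    return "".join("_" if ch in " .[" else ch for ch in name)


for strategy in ("first", "last", "join"):
    print(strategy, "->", pick("id=1&id=2", "id", strategy))
print(mangle_name("load[file"))
```

If the validator and the application sit on components with different strategies, they validate and use different values.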
Protocol level
normalization
• Something new?
• Why only parameters?
• Let's try to fuzz something else! :)
• GET{F}/{F}HTTP.1.1
• {F} = 0x09, 0x0b, 0x0c, 0x0d, 0x20
• Apache/2.2.22 (Unix)
• GET / bla-bla bla bla bla ehohoh
Valid packet!
File paths normalization
Filesystem name
canonicalization
• Path Traversal
• /../../../../../../etc/passwd
• Normalization
• http://www.ush.it/2009/02/08/php-filesystem-attack-vectors/
• http://onsec.ru/onsec.whitepaper-02.eng.pdf
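Why the order of validation and normalization matters for paths, as a minimal sketch. The base directory `/var/www/uploads` and the helper names are assumptions for illustration; `posixpath.normpath` only collapses `..` and `//` textually and does not resolve symlinks.

```python
import posixpath

BASE = "/var/www/uploads"  # hypothetical upload directory


def naive_check(user_path: str) -> bool:
    """Checks the prefix BEFORE normalization: '..' segments slip through."""
    return posixpath.join(BASE, user_path).startswith(BASE + "/")


def resolved(user_path: str) -> str:
    """The path the filesystem layer would effectively open."""
    return posixpath.normpath(posixpath.join(BASE, user_path))


def safe_check(user_path: str) -> bool:
    """Checks the prefix AFTER normalization."""
    return resolved(user_path).startswith(BASE + "/")


attack = "../../../etc/passwd"
print(naive_check(attack), resolved(attack), safe_check(attack))
```

The naive check accepts the traversal string because it compares the unnormalized form; the same comparison on the normalized form rejects it.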
Filesystem name
canonicalization
• Normalization
• /etc/passwd//////////////////////////////////.php
• C:boot.<<
• C:boot"ini
• C:boot.in>
Database storing
normalization
• Encodings
• Client encoding
• Storing encoding
• Trim
• Size limited truncation
Database storing
normalization
• VARCHAR or BLOB ?
• What is the size limit for CREATE TABLE t1 (login
TEXT)?
• INSERT INTO logins VALUES
(:id, :login, :password)
• $login = "admin aa"
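A pure-Python model of the truncation problem, under loudly stated assumptions: the column is a hypothetical VARCHAR(16), the database truncates silently instead of raising an error, and string comparison ignores trailing spaces. No real database is involved, and the padded login value is made up for illustration.

```python
COLUMN_SIZE = 16  # hypothetical VARCHAR(16) column


def store(value: str, size: int = COLUMN_SIZE) -> str:
    """Model of a database that silently truncates to the column size."""
    return value[:size]


def db_equal(a: str, b: str) -> bool:
    """Model of a comparison that ignores trailing spaces."""
    return a.rstrip(" ") == b.rstrip(" ")


login = "admin" + " " * 11 + "aa"    # 18 chars, made-up padded value

app_thinks_new_user = login != "admin"   # the application compares raw strings
stored = store(login)                     # the database keeps 16 chars
collides = db_equal(stored, "admin")      # ... which now match "admin"

print(app_thinks_new_user, repr(stored), collides)
```

The application sees a brand-new login, while the stored row is indistinguishable from `admin` under the database's own comparison rules.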
Application layer
normalization
• SSRF bible. Cheatsheet
• https://docs.google.com/document/d/1v1TkWZtrhzRLy0bYXBcdLUedXGb9njTNIJXa3u9akHM/#
• PHP fsockopen() url parsing tricks
Application layer
normalization
• Port overwriting, formatting
• localhost:81
• localhost:+81AAAAA
• localhost: 00081 AAA
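The port variants above can be modelled with an atoi-style parser. This is a sketch of the general idea, not PHP's actual `fsockopen()` code: `lenient_port`, `split_target` and the blacklist `BLOCKED` are hypothetical names. The filter compares raw strings, while the connecting code normalizes the port to a number.

```python
import re

BLOCKED = {"localhost:81"}  # hypothetical string-based blacklist


def lenient_port(text: str) -> int:
    """atoi-style parsing: skip leading whitespace, read an optional sign
    and digits, silently ignore whatever follows."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def split_target(target: str) -> tuple[str, int]:
    """Split 'host:port' and normalize the port the lenient way."""
    host, _, port = target.partition(":")
    return host, lenient_port(port)


for target in ("localhost:81", "localhost:+81AAAAA", "localhost: 00081 AAA"):
    print(repr(target), "blocked:", target in BLOCKED, "->", split_target(target))
```

All three strings end up at the same host and port, but only the first one matches the blacklist.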
IT IS ENCODING !!!
Multibyte encodings
• One byte for one char
• More bytes for one char !
• á
• 0xE1
• 0xC3A1 UTF-8, NFC (composed form)
• 0x61CC81 UTF-8, NFD (decomposed form)
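The three byte sequences can be reproduced with the standard library. A short sketch, assuming the single-byte 0xE1 form refers to a Latin-1 style charset:

```python
import unicodedata

ch = "\u00e1"  # the character from the slide: a with acute accent

latin1 = ch.encode("latin-1")                            # one byte
nfc = unicodedata.normalize("NFC", ch).encode("utf-8")   # composed form
nfd = unicodedata.normalize("NFD", ch).encode("utf-8")   # 'a' + combining acute

print(latin1.hex(), nfc.hex(), nfd.hex())

# Byte-wise the two UTF-8 forms differ, yet they render as the same character
# and become equal again once both are normalized to the same form.
same_after_normalization = (
    unicodedata.normalize("NFC", nfd.decode("utf-8")) == ch
)
```

Any comparison, length check or filter that works on bytes treats these as three different strings.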
addslashes() bypass
• http://shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string
• ' to \'
• Replaces byte 0x27 with 0x5c27
• But what about multibyte?
• 0xbf5c is a valid char in GBK encoding
• 0xbf27 -> addslashes() -> 0xbf5c27 -> read as 0xbf5c 0x27
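The byte-level effect can be replayed in Python. The `addslashes` function below is a simplified stand-in written for this sketch (it escapes only quote and backslash bytes, not NUL), and the payload is a made-up example; the decoding step uses Python's built-in `gbk` codec.

```python
def addslashes(raw: bytes) -> bytes:
    """Simplified byte-wise model: prefix ', " and \\ with a backslash (0x5c)."""
    out = bytearray()
    for byte in raw:
        if byte in (0x27, 0x22, 0x5C):
            out.append(0x5C)
        out.append(byte)
    return bytes(out)


payload = b"\xbf\x27 OR 1=1"       # 0xbf followed by a single quote
escaped = addslashes(payload)       # the quote gets a 0x5c in front of it
as_gbk = escaped.decode("gbk")      # a GBK-aware consumer pairs 0xbf with 0x5c

print(escaped.hex(" "))
print(repr(as_gbk))
```

After decoding, the backslash has been absorbed into a two-byte character and the quote stands alone again.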
addslashes() bypass
• http://kuza55.blogspot.ru/2007/06/mysql-injection-encoding-attacks.html
• Find all encodings where 0x5c is a valid second
byte of some char
• big5, [A1-F9]
• sjis, [81-9F], [E0-FC]
• gbk, [81-FE]
• cp932, [81-9F], [E0-FC]
Homework!
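One possible way to start on the homework, as a sketch: brute-force every high lead byte followed by 0x5c through Python's codecs and keep the pairs that decode to exactly one character. The helper name is hypothetical, Python's codec tables may differ in detail from MySQL's or PHP's, and `utf-8` is included only as a control.

```python
def lead_bytes_before_backslash(encoding: str) -> list[int]:
    """Lead bytes L for which the pair (L, 0x5c) decodes to a single char."""
    leads = []
    for lead in range(0x80, 0x100):
        try:
            text = bytes([lead, 0x5C]).decode(encoding)
        except UnicodeDecodeError:
            continue  # not a valid sequence in this encoding
        if len(text) == 1:  # the backslash was consumed as a trail byte
            leads.append(lead)
    return leads


results = {
    enc: lead_bytes_before_backslash(enc)
    for enc in ("big5", "shift_jis", "gbk", "cp932", "utf-8")
}
for enc, leads in results.items():
    summary = f"{min(leads):#x}..{max(leads):#x}" if leads else "none"
    print(f"{enc}: {len(leads)} lead bytes ({summary})")
```

The same loop can be pointed at any other codec name to extend the list.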
escapeshellarg/cmd()
• Note that:
• PHP uses sh, not bash, by default in
system()
• sh has no multibyte encoding support
• escapeshellarg() strips bytes 0x80-0xFF
But... escapeshellarg()
• http://lab.onsec.ru/2013/03/breaking-escapeshellarg-news.html
• The shell sees no difference between:
• ls -la
• ls ''-la''
• ls '-la'
• unzip '-d/var/www' - escaped, but still an option!
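The equivalence is easy to check with Python's `shlex`, which follows POSIX shell quoting rules. A sketch for illustration only; it models word splitting, not PHP's `escapeshellarg()` itself.

```python
import shlex

variants = ["ls -la", "ls ''-la''", "ls '-la'"]
parsed = [shlex.split(command) for command in variants]
for command, argv in zip(variants, parsed):
    print(repr(command), "->", argv)

# Quoting protects against shell metacharacters, but the program still
# receives a word starting with '-' and is free to treat it as an option.
unzip_argv = shlex.split("unzip '-d/var/www'")
print(unzip_argv)
```

Quotes are removed before the program ever sees its arguments, so escaping alone cannot stop a value from being parsed as an option.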
PHP string encoding
http://www.php.net/manual/language.types.string.php#language.types.string.details
• String will be encoded in whatever fashion it is encoded in
the script file
• If Zend Multibyte is enabled, the script may be written in
an arbitrary encoding (which is explicitly declared or is
detected) and then converted to a certain internal
encoding, which is then the encoding that will be used for
the string literals
• State-dependent encodings where the same byte values can
be used in initial and non-initial shift states may be
problematic
Multibyte problems
• Lengths in chars or bytes?
• State-dependent encodings
• 0x0102 char
• 0x0203 char
• 0x01020203 two chars
• But what if 0x0202 is also a valid
char?
• Try to find 0x0202 in this string ;)
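The puzzle can be written down directly, assuming a hypothetical fixed-width encoding where every char takes exactly two bytes. A byte-level search finds 0x0202 straddling the boundary between the two chars; a char-level search does not find it at all.

```python
data = bytes.fromhex("01020203")    # two chars: 0x0102 and 0x0203
needle = bytes.fromhex("0202")      # also a valid char in this made-up encoding

# Byte-oriented search: matches across the char boundary.
byte_offset = data.find(needle)

# Char-oriented search: split on 2-byte boundaries first.
chars = [data[i:i + 2] for i in range(0, len(data), 2)]
char_hit = needle in chars

print("length in bytes:", len(data), "length in chars:", len(chars))
print("byte search offset:", byte_offset, "char search hit:", char_hit)
```

A filter that searches bytes and a consumer that reads chars disagree about whether the string contains 0x0202, and they also disagree about its length.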
Thanks for your attention!
d0znpp@ONsec.ru
@d0znpp
@ONsec_lab
lab.onsec.ru
