Remote file path traversal attacks for fun and profit

Dr. Dharma Ganesan

Disclaimer
● The opinions expressed here are my own and not the views of my employer
● The source code fragments shown here can be reused, but
○ they come without any warranty, and I accept no responsibility for failures
● Do not apply the exploit discussed here to other systems
○ without obtaining authorization from their owners

Goal
● Demonstrate how attackers can steal information from servers
● Present an anti-pattern that enables file path traversal attacks
● Discuss how to prevent file path traversal attacks (in C)
● Present line-count metrics comparing the code before and after patching

Intended Audience
● Anyone interested in foundations of secure programming (in C)
● Exploits discussed here are well-known to the security community
● But I hope it is still informative for newcomers to software security

Context: Client-Server Architectural Style
● Clients send a request for a file to the server
○ Clients can be web browsers, telnet clients, web clients, etc.
● The server sends the requested file to the clients
○ The server can be any program (e.g., a web server) that responds to requests
● Of course, the server should not disclose non-public files to the clients

Context: Client-Server Architectural Style...
[Figure: a client sends "Request (public) File" to the server, which returns "Response File"]
Threat: Attackers could steal files from the private folder of the server.
Caution: This threat applies not only to the web but to any distributed system.

System under attack and Tools
● System under attack: the web server used here comes from Jon Erickson's book Hacking: The Art of Exploitation
● The web server vulnerability presented here is not discussed in the book
○ These slides complement the book by exploiting a different vulnerability
● Tools
○ Telnet as a web client - part of standard Linux installations
■ Telnet has its own security problems but it is fine for this demo
○ You may also use any proxy or browser plugin (e.g., developer tools)

A simple web application - Proof of Concept
What are the attack surfaces for stealing private files?
- No forms to submit
- No JavaScript
Take some time to think before reading the rest of the slides

Right click to view page source of index.html
<html>
<img src="image.jpg">
</html>
So image.jpg must be the smiley face

Under-the-hood steps to just smile
1. Request: “GET / HTTP/1.1” from the browser to the server
2. Response: index.html from the server to the browser
3. Request: “GET /image.jpg HTTP/1.1” from the browser to the server
4. Response: image.jpg contents from the server to the browser
5. Request: the browser sends an implicit request for the site icon (favicon)
[Figure: the browser and the server exchanging the four messages above]

Confirm these steps - take a peek at server’s log
● Just to be sure, let’s look at a fragment of our web server’s log:
Got request from 127.0.0.1:50628 "GET / HTTP/1.1"
Opening ./webroot/index.html't 200 OK
Got request from 127.0.0.1:50630 "GET /image.jpg HTTP/1.1"
Opening ./webroot/image.jpg't 200 OK
● The log shows that the browser actually sent two requests as expected
● The files are delivered relative to the (public) webroot directory

Attack surface to reach the server’s private files
● What if we target the path in one of the GET requests?
● For example, take the second GET request and tamper with the file name
● Core idea: instead of image.jpg, what if we request some other file?

Let’s steal arbitrary file contents (telnet client)
● Instead of a web browser, let’s use a telnet client
○ Telnet has its own security problems but for our purpose it is fine.
● # telnet localhost 80 (i.e., connect to our web server listening at port 80)
...
GET / HTTP/1.1 (I typed this command)
HTTP/1.0 200 OK
Server: Tiny webserver
<html>
<img src="image.jpg">
</html>

Let’s steal server’s /etc/passwd using a telnet client
# telnet localhost 80
...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 200 OK
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...
(Contents of /etc/passwd exposed)

The web server log fragment and analysis
# ./tinyweb
Accepting web requests on port 80
Got request from 127.0.0.1:50750 "GET /../../../etc/passwd HTTP/1.1"
Opening ./webroot/../../../etc/passwd't 200 OK
● The attacker was able to get out of the public root directory!
● The tiny web server is clearly vulnerable to remote file content disclosure
● It appears that the server uses strcat to append the incoming file name to the web root
○ We will confirm by looking into the source code of the web server

Analysis of the web server code fragment
if(strncmp(request, "GET ", 4) == 0) // GET request
ptr = request+4; // ptr is the URL.
...
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // ptr points to the input url string in GET
…
}
● The anti-pattern is that the server concatenates the public web root with the input filename, which is attacker-controlled (and can be evil)
● The resource variable contains the filename as a string but it was not
evaluated/canonicalized before opening the file
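
To make the anti-pattern concrete, here is a minimal sketch of the path construction. It assumes WEBROOT is "./webroot", as the server log suggests. The helper name build_resource_naive is hypothetical and not part of the original server, and it uses snprintf for bounds safety where the original uses strcpy and strcat.

```c
#include <stdio.h>
#include <string.h>

#define WEBROOT "./webroot"   /* assumed value, taken from the server log */

/* Hypothetical helper: extracts the URL path from a request line such as
 * "GET /image.jpg HTTP/1.1" and naively prepends WEBROOT, the way the
 * vulnerable server does. Returns 0 on success, -1 if the request is
 * malformed or the result does not fit into `resource`. */
int build_resource_naive(const char *request, char *resource, size_t size)
{
    const char *path, *end;
    size_t path_len;

    if (strncmp(request, "GET ", 4) != 0)
        return -1;                       /* only GET is handled here */
    path = request + 4;                  /* path starts after "GET " */
    end = strstr(path, " HTTP/");        /* path ends before the version */
    if (end == NULL)
        return -1;
    path_len = (size_t)(end - path);
    if (strlen(WEBROOT) + path_len + 1 > size)
        return -1;
    /* Plain concatenation: any ".." segments are copied through unchanged */
    snprintf(resource, size, "%s%.*s", WEBROOT, (int)path_len, path);
    return 0;
}
```

For the request line "GET /../../../etc/passwd HTTP/1.1" this yields ./webroot/../../../etc/passwd, the same string that appears in the vulnerable server's log. Nothing in the construction notices that the path leaves the web root.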

How to fix this vulnerability? (high-level idea)
● Finding this anti-pattern is not that difficult; fixing it is what matters
○ For large systems, grep for strcpy followed by strcat with variable names (e.g. filename)
○ grep -A 1 "strcpy" -r * | grep "strcat" …
● Canonicalize after combining the public webroot with the input filename
● Evaluate whether the canonicalized file is within the webroot
● If yes, we are safe and can disclose the content
● Otherwise, raise a generic error that the file is not found
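
The four steps above can be sketched as one function. This is a minimal sketch, not the patch from the appendix: it assumes a POSIX system and uses realpath(path, NULL) for canonicalization, and the name open_public_file is hypothetical.

```c
#define _XOPEN_SOURCE 700   /* for realpath() with a NULL buffer (POSIX.1-2008) */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: opens `resource` read-only only if its canonical
 * path lies inside the canonical `webroot`. Returns a file descriptor,
 * or -1 if the path escapes the web root or cannot be resolved, so the
 * caller can answer with a generic "not found" in both cases. */
int open_public_file(const char *webroot, const char *resource)
{
    char *root = realpath(webroot, NULL);   /* malloc'ed, freed below */
    char *path = realpath(resource, NULL);  /* NULL if it does not exist */
    int fd = -1;

    if (root != NULL && path != NULL) {
        size_t n = strlen(root);
        /* Inside means: the root is "/", or path starts with root and the
         * match ends at a path separator or at the end of the string. */
        int inside = strcmp(root, "/") == 0 ||
                     (strncmp(path, root, n) == 0 &&
                      (path[n] == '/' || path[n] == '\0'));
        if (inside)
            fd = open(path, O_RDONLY);      /* open the canonical path */
    }
    free(root);
    free(path);
    return fd;
}
```

With a web root of ./webroot, a call such as open_public_file("./webroot", "./webroot/../../../etc/passwd") resolves the second argument to a path outside the root and returns -1 without opening anything.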

Canonicalize and then validate filenames - core idea
● canonicalize_file_name is a library function (a GNU extension in glibc, equivalent to realpath(path, NULL))
● For example, if the input string is ./webroot/../../../etc/passwd
● the output of canonicalize_file_name is /etc/passwd (in my case)
○ Do not forget to free the memory returned by canonicalize_file_name
○ It returns NULL if the file does not exist, so check the result
● Check whether the canonicalized file name starts with the canonical path of the public dir
○ See starts_with_substr in the appendix
○ C has no dedicated starts-with function, but strncmp(str, prefix, strlen(prefix)) == 0 does the job
❖ If you find bugs in my patch (see the appendix), please contact me
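
One pitfall with a plain prefix test is worth a sketch: if the web root is /srv/webroot, the string /srv/webroot_private/key also starts with that prefix although it is a sibling directory. The variant below, under the assumption that both arguments are already canonical absolute paths, additionally requires the match to end at a path separator. The name has_dir_prefix and the example paths are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if `path` equals `dir` or lies beneath it,
 * 0 if it does not, and -1 if an argument is NULL or `dir` is empty.
 * Both arguments are assumed to be canonical absolute paths. */
int has_dir_prefix(const char *path, const char *dir)
{
    size_t n;

    if (path == NULL || dir == NULL || dir[0] == '\0')
        return -1;
    n = strlen(dir);
    if (strncmp(path, dir, n) != 0)
        return 0;                        /* not even a string prefix */
    if (n == 1 && dir[0] == '/')
        return 1;                        /* everything is beneath "/" */
    /* The prefix must end exactly at a component boundary */
    return (path[n] == '/' || path[n] == '\0') ? 1 : 0;
}
```

A check of this shape accepts /srv/webroot/index.html and rejects /srv/webroot_private/key, which a character-by-character prefix comparison alone would accept.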

The patched web server stopped the exploit
# telnet localhost 80
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 404 NOT FOUND
<html><head><title>404 NOT Found</title></head><body><h1>URL not found</h1></body></html>
Connection closed by foreign host.

The patched web server log fragment and analysis
# ./tinyweb_secure
Accepting web requests on port 80
Got request from 127.0.0.1:50816 "GET /../../../etc/passwd HTTP/1.1"
input file name = /etc/passwd
Unsafe file ./webroot/../../../etc/passwd't 404 Not Found
● The server knows that the attacker is reaching into non-public directories
● The server successfully stopped the attacker

Number of lines before and after patching
# wc -l tinyweb_secure.c tinyweb.c
224 tinyweb_secure.c (after patching)
122 tinyweb.c (before patching)
● To my surprise, the patched version (tinyweb_secure.c) has nearly twice as many lines as the original version (tinyweb.c): 224 vs. 122, about 1.8 times
○ Comments are inlined using “//”, so they do not contribute much to the metrics
● This suggests to me that secure coding (in C) takes roughly twice the coding effort of “traditional” coding
○ If I add my code review and testing effort, it is at least three times more expensive!
● More study is needed on other systems to confirm my claims on effort
○ Maybe the C language and its small standard library lead to more application code
○ Or maybe I did not patch it in a compact way, but I doubt it

Space of inputs for traditional vs secure programs
[Figure: the space of all inputs, divided into valid input values and evil input values]
● Traditional programs handle only valid input values well
● Secure coding requires programs to handle evil input values, too
● The problem is that the threats (evil inputs) have to be identified up front, and
○ software has to be designed to resist and recover

Conclusion and broader applicability
● Using string concat to construct file names can be dangerous
○ This anti-pattern should be avoided
● The server should canonicalize file names and check the resulting filenames
○ Otherwise attackers will get into private directories and steal files
● File path exploitation is not specific to web applications
● Any client-server architecture must close this attack surface
● Usage of TLS between clients and server will not stop the attack
○ In this case, TLS will just help attackers to securely download private files :)
● Firewalls do not usually stop file path exploitation payloads

References
1. HTTP: https://developer.mozilla.org/en-US/docs/Web/HTTP
2. OWASP: https://www.owasp.org/index.php/Path_Traversal
3. Jon Erickson, Hacking: The Art of Exploitation
a. The book exploits this web server through buffer overflows
b. My slides show a different vulnerability that, as far as I know, the book does not discuss
4. Robert Seacord, Secure Coding in C and C++
a. There is a nice chapter dedicated to file system exploits; my slides add a detailed demo
b. The use of strcat to construct and test filenames is also well explained

Questions/Comments
dharmalingam.ganesan11@gmail.com

Appendix - Implementation to fix the vulnerability

How to fix this vulnerability?
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // and join it with resource path pointed by ptr.
if(is_safe_file(resource) == 1) { // Is it inside the web root? (-1 signals an error, so test for 1)
   fd = open(resource, O_RDONLY, 0); // Try to open the file.
   printf("\tOpening \'%s\'\t", resource);
}
else {
   printf("\tUnsafe file \'%s\'\t", resource); // Hacker is attacking us.
   fd = -1;
}

My implementation of is_safe_file
/* Returns -1 if the filename is NULL or cannot be canonicalized.
 * Returns 1 if the canonicalized filename is inside the web root.
 * Returns 0 otherwise.
 */
int is_safe_file(char *filename) {
   char *realFname;
   char *fullwebRootPath;
   int status = -1;
   if(NULL == filename)
      return status;
   fullwebRootPath = getWebrootFullPath();
   if(fullwebRootPath != NULL) {
      realFname = canonicalize_file_name(filename); // NULL if the file does not exist
      if(realFname != NULL) { // do not print or compare a NULL pointer
         status = starts_with_substr(realFname, fullwebRootPath);
         printf("input file name = %s\n", realFname);
         free(realFname);
      }
      free(fullwebRootPath);
   }
   return status;
}

My implementation of starts_with_substr
/* Returns 1 if the given str starts with the given prefix.
 * Returns -1 if the arguments are invalid. Otherwise, returns 0.
 */
int starts_with_substr(char *str, char *prefix) {
   if(NULL == str || NULL == prefix)
      return -1;
   while(*prefix) {
      if(*str++ != *prefix++) return 0;
   }
   return 1;
}

My implementation of the get web root path
/* Returns the full web root file path. The caller must free the returned memory.
 */
char* getWebrootFullPath() {
   long size;
   char *cwd; /* current working dir */
   char *webrootPath = NULL;
   char *webrootCanoPath = NULL;
   size = pathconf(".", _PC_PATH_MAX);
   if (size <= 0) return NULL; /* pathconf failed or reports no limit */
   if ((cwd = (char *)malloc((size_t)size)) == NULL) return NULL;
   if(getcwd(cwd, (size_t)size)) {
      size = strlen(cwd) + strlen(WEBROOT) + 2; // +2 is for '/' and '\0'
      if ((webrootPath = (char*)malloc((size_t)size)) == NULL) {
         free(cwd);
         return NULL;
      }
      strcpy(webrootPath, cwd);
      strcat(webrootPath, "/");
      strcat(webrootPath, WEBROOT);
      webrootCanoPath = canonicalize_file_name(webrootPath); // only when the path was built
   }
   free(cwd);
   free(webrootPath);
   return webrootCanoPath;
}
