Remote File Path Traversal Attacks for Fun and Profit
Dr. Dharma Ganesan
Disclaimer
● The opinions expressed here are my own and not the views of my employer
● The source code fragments shown here can be reused, but
○ without any warranty; I accept no responsibility for failures
● Do not apply the exploit discussed here to other systems
○ without obtaining authorization from their owners
Goal
● Demonstrate how attackers can steal information from servers
● Present an anti-pattern that enables file path traversal attacks
● Discuss how to prevent file path traversal attacks (in C)
● Present some metrics to compare # of lines before and after patching
Intended Audience
● Anyone interested in foundations of secure programming (in C)
● Exploits discussed here are well-known to the security community
● But I hope it is still informative for newcomers to software security
Context: Client-Server Architectural Style
● Clients send a request for a file to the server
○ Clients can be web browsers, telnet clients, web clients, etc.
● Server sends the requested file to the clients
○ Server can be any program (e.g., web server) that responds to requests
● Of course, the server should not disclose non-public files to the clients
Context: Client-Server Architectural Style...
[Diagram: a client sends "Request (public) File" to the server, and the server returns "Response File"]
Threat: Attackers could steal files from the private folder of the server.
Caution: This threat applies not only to the web but to any distributed system
System under attack and Tools
● System under attack: The web server used here comes from the book Hacking: The Art of Exploitation by Jon Erickson
● The web server vulnerability presented here is not discussed in the book
○ Slides complement the book by exploiting a different vulnerability
● Tools
○ Telnet web client - part of standard Linux installations
■ Telnet has its own security problems but it is fine for this demo
○ You may also use a proxy or a browser plugin instead (e.g., developer tools)
A simple web application - Proof of Concept
What are the attack surfaces for stealing private files?
- No forms to submit
- No JavaScript
Take some time to think before viewing the rest of the slides
Right click to view page source of index.html
<html>
<img src="image.jpg">
</html>
So image.jpg must be the smiley face
Under-the-hood steps to just smile
1. Request: “GET / HTTP/1.1” from the browser to the server
2. Response: index.html from the server to the browser
3. Request: “GET /image.jpg HTTP/1.1” from the browser to the server
4. Response: image.jpg contents from the server to the browser
5. Request: Browser sends an implicit request for the URL icon (favicon)
[Diagram: steps 1-4 drawn as arrows between the browser and the server]
Confirm these steps - take a peek at server’s log
● Just to be sure, let’s look at a fragment of our web server’s log:
Got request from 127.0.0.1:50628 "GET / HTTP/1.1"
Opening ./webroot/index.html't 200 OK
Got request from 127.0.0.1:50630 "GET /image.jpg HTTP/1.1"
Opening ./webroot/image.jpg't 200 OK
● The log shows that the browser actually sent two requests as expected
● The files are delivered relative to the (public) webroot directory
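Each logged request line has the shape "GET path HTTP/version". As a minimal sketch of how a server might pull the path out of such a line (the helper name get_request_path is hypothetical and is not taken from the web server's source):

```c
#include <string.h>

/* Hypothetical helper: given a mutable request line such as
 * "GET /image.jpg HTTP/1.1", cuts the line off before " HTTP/" and
 * returns a pointer to the path. Returns NULL if the line is not a
 * well-formed GET request. */
char *get_request_path(char *request) {
    char *version = strstr(request, " HTTP/");
    if (version == NULL) return NULL;                   /* no HTTP version */
    if (strncmp(request, "GET ", 4) != 0) return NULL;  /* not a GET */
    if (version < request + 4) return NULL;             /* no room for a path */
    *version = '\0';                                    /* drop the version */
    return request + 4;                                 /* path starts after "GET " */
}
```

Note that the path comes back exactly as the client typed it; nothing in this step inspects or normalizes it.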
Attack surface to reach the server’s private files
● What if we target the path in one of the GET requests?
● For example, take the second GET request and tamper with the file name
● Core idea: Instead of image.jpg, what if we request some other file?
Let’s steal arbitrary file contents (telnet client)
● Instead of a web browser, let’s use a telnet client
○ Telnet has its own security problems but for our purpose it is fine.
● # telnet localhost 80 (i.e., connect to our web server listening at port 80)
...
GET / HTTP/1.1 (I typed this command)
HTTP/1.0 200 OK
Server: Tiny webserver
<html>
<img src="image.jpg">
</html>
Let’s steal server’s /etc/passwd using a telnet client
# telnet localhost 80
...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 200 OK
Server: Tiny webserver
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...
Contents of /etc/passwd exposed
The web server log fragment and analysis
# ./tinyweb
Accepting web requests on port 80
Got request from 127.0.0.1:50750 "GET /../../../etc/passwd HTTP/1.1"
Opening ./webroot/../../../etc/passwd't 200 OK
● The attacker was able to get out of the public root directory!
● The tiny web server is clearly vulnerable to remote file content disclosure
● It appears that the server uses strcat to append the incoming file name to the web root
○ We will confirm by looking into the source code of the web server
Analysis of the web server code fragment
if(strncmp(request, "GET ", 4) == 0) // GET request
ptr = request+4; // ptr is the URL.
...
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // ptr points to the input url string in GET
…
}
● The anti-pattern is that the server concatenates the public web root with the input filename (which can be evil)
● The resource variable holds the filename as a string, but it is not evaluated/canonicalized before the file is opened
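To make the anti-pattern concrete, here is a small sketch that joins the web root and the request path the way the fragment does. The helper name join_naively is hypothetical, and WEBROOT is assumed to be "./webroot", as the server log suggests. It uses a bounded snprintf rather than strcpy/strcat, which shows that bounding the copy alone does nothing about traversal:

```c
#include <stdio.h>
#include <string.h>

#define WEBROOT "./webroot"  /* assumed value, inferred from the server log */

/* Hypothetical helper: writes WEBROOT followed by url_path into out.
 * Returns 0 on success, -1 if the result does not fit in out_size bytes.
 * The ".." segments of url_path are copied through untouched. */
int join_naively(char *out, size_t out_size, const char *url_path) {
    int n = snprintf(out, out_size, "%s%s", WEBROOT, url_path);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For the request path /../../../etc/passwd the result is ./webroot/../../../etc/passwd, which is the very string the vulnerable server hands to open.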
How to fix this vulnerability? (high-level idea)
● Finding this anti-pattern is not that difficult; fixing it is what matters
○ For large systems, grep for strcpy followed by strcat with variable names (e.g. filename)
○ grep -A 1 "strcpy" -r * | grep "strcat" …
● Canonicalize after combining the public webroot with the input filename
● Evaluate whether the canonicalized file is within the webroot
● If yes, we are safe and can disclose the content
● Otherwise, raise a generic error that the file is not found
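The steps above can be sketched in one function. This is not the patch from the appendix: it is a simplified illustration that uses the POSIX realpath function for canonicalization, and the name resolves_inside_root and the 4096-byte buffer are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700   /* request the POSIX declaration of realpath() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the author's patch.
 * Returns 1 if url_path, appended to root, resolves to an existing file
 * inside root. Returns 0 otherwise, including when a path cannot be
 * resolved, so the caller can answer with a generic "not found". */
int resolves_inside_root(const char *root, const char *url_path) {
    char joined[4096];
    char *real_root, *real_file;
    int inside = 0;
    int n = snprintf(joined, sizeof joined, "%s%s", root, url_path);

    if (n < 0 || (size_t)n >= sizeof joined) return 0;  /* path too long */

    real_root = realpath(root, NULL);    /* malloc'd canonical path or NULL */
    real_file = realpath(joined, NULL);  /* NULL if the file does not exist */
    if (real_root != NULL && real_file != NULL) {
        size_t len = strlen(real_root);
        /* The match must stop at a directory boundary, so that a sibling
         * such as "webroot_private" does not count as inside "webroot". */
        if (strncmp(real_file, real_root, len) == 0 &&
            (real_root[len - 1] == '/' ||
             real_file[len] == '/' || real_file[len] == '\0'))
            inside = 1;
    }
    free(real_root);  /* free(NULL) is a no-op */
    free(real_file);
    return inside;
}
```

A caller would open the file only when the function returns 1 and send the 404 page in every other case.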
Canonicalize and then validate filenames - core idea
● canonicalize_file_name is a library function (a GNU extension in glibc; realpath is its POSIX counterpart)
● For example, if the input string is ./webroot/../../../etc/passwd
● the output of canonicalize_file_name is /etc/passwd (in my case)
○ Do not forget to free the memory returned by canonicalize_file_name
● Check whether the canonicalized file name starts with the path of the public directory
○ See starts_with_substr in the appendix
○ The C standard library has no dedicated prefix-check function, although strncmp with the prefix length does the same job
❖ If you find bugs in my patch (see the appendix), please contact me
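As a minimal illustration of the canonicalize-then-free pattern, the sketch below wraps the POSIX realpath function, which behaves like canonicalize_file_name when its second argument is NULL. The wrapper name canonical_path is hypothetical.

```c
#define _XOPEN_SOURCE 700   /* request the POSIX declaration of realpath() */
#include <stdlib.h>

/* Hypothetical wrapper: returns a malloc'd canonical absolute path with all
 * ".", ".." and symbolic links resolved, or NULL if the path is NULL or
 * cannot be resolved (for example, because the file does not exist).
 * The caller must free() the result. */
char *canonical_path(const char *path) {
    if (path == NULL) return NULL;
    return realpath(path, NULL);
}
```

Because the result is NULL for files that do not exist, a caller has to check for NULL before comparing or printing the returned string.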
The patched web server stopped the exploit
# telnet localhost 80
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 404 NOT FOUND
Server: Tiny webserver
<html><head><title>404 NOT Found</title></head><body><h1>URL not found</h1></body></html>
Connection closed by foreign host
The patched web server log fragment and analysis
# ./tinyweb_secure
Accepting web requests on port 80
Got request from 127.0.0.1:50816 "GET /../../../etc/passwd HTTP/1.1"
input file name = /etc/passwd
Unsafe file ./webroot/../../../etc/passwd't 404 Not Found
● The server knows that the attacker is reaching into non-public directories
● The server successfully stopped the attacker
Number of lines before and after patching
# wc -l tinyweb_secure.c tinyweb.c
224 tinyweb_secure.c (after patching)
122 tinyweb.c (before patching)
● To my surprise, the patched version (tinyweb_secure.c) has nearly twice as much code as the original version (tinyweb.c): 224 vs. 122 lines, a factor of about 1.8
○ Comments are inlined using “//”, so they do not contribute much to the metrics
● This suggests to me that secure coding (in C) takes roughly twice the coding effort of “traditional” coding
○ If I add my code review and testing effort, it is at least three times more expensive!
● More study on other systems is needed to confirm my claims about effort
○ Maybe the C language and its small standard library contribute to the extra application code
○ Or maybe I did not patch it in a compact way, but I doubt it
Space of inputs for traditional vs secure programs
[Diagram: the space of input values, with regions labeled "Evil input values" and "Valid input values"]
● Traditional programs handle only valid input values well
● Secure coding requires programs to handle evil input values, too
● The problem is that the threats (evil inputs) have to be identified up-front, and
○ software has to be designed to resist and recover
Conclusion and broader applicability
● Using string concatenation to construct file names can be dangerous
○ This anti-pattern should be avoided
● The server should canonicalize file names and check the resulting filenames
○ Otherwise attackers can get into private directories and steal files
● File path exploitation is not specific to web applications
● Any client-server architecture must close this attack surface
● Using TLS between clients and server will not stop the attack
○ In this case, TLS will just help attackers to securely download private files :)
● Firewalls do not usually stop file path exploitation payloads
References
1. HTTP: https://developer.mozilla.org/en-US/docs/Web/HTTP
2. OWASP: https://www.owasp.org/index.php/Path_Traversal
3. Jon Erickson, Hacking - The Art of Exploitation
a. The web server of this book is exploited for buffer overflows in the book
b. My slides show a different vulnerability not discussed in the book AFAIK
4. Robert Seacord, Secure coding in C and C++.
a. There is a nice chapter dedicated to file system exploits but my slides show a detailed demo
b. Usage of strcat to construct and test filenames is also well-explained
Questions/Comments
dharmalingam.ganesan11@gmail.com
Appendix - Implementation to fix the vulnerability
How to fix this vulnerability?
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // and join it with resource path pointed by ptr.
if(is_safe_file(resource) == 1) { // Is it inside the web root? Test for 1: the error value -1 is also "true" in C.
    fd = open(resource, O_RDONLY, 0); // Try to open the file.
    printf("\tOpening %s't", resource);
}
else {
    printf("\tUnsafe file %s't", resource); // Hacker is attacking us.
    fd = -1;
}
My implementation of is_safe_file
/* Returns -1 if the filename is NULL or a path cannot be resolved.
 * Returns 1 if the canonicalized filename is inside the web root.
 * Returns 0 otherwise.
 */
int is_safe_file(char *filename) {
    char *realFname;
    char *fullwebRootPath;
    int status = -1;
    if(NULL == filename)
        return status;
    fullwebRootPath = getWebrootFullPath();
    if(fullwebRootPath != NULL) {
        realFname = canonicalize_file_name(filename); // NULL if the file does not exist
        status = starts_with_substr(realFname, fullwebRootPath);
        if(realFname != NULL) // passing NULL to %s is undefined behavior
            printf("input file name = %s\n", realFname);
        free(realFname);
        free(fullwebRootPath);
    }
    return status;
}
My implementation of starts_with_substr
/* Returns 1 if the given str starts with the given prefix.
 * Returns -1 if the arguments are invalid. Otherwise, returns 0.
 */
int starts_with_substr(char *str, char *prefix) {
    if(NULL == str || NULL == prefix)
        return -1;
    while(*prefix) {
        if(*str++ != *prefix++) return 0;
    }
    return 1;
}
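One caveat with a plain prefix test: a web root of /srv/webroot is also a prefix of the sibling directory /srv/webroot_private. A possible refinement, sketched here under the assumption that both arguments are already canonical absolute paths, is to require the match to end at a directory boundary. The function name is_within_dir and the example paths are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if path equals dir or lies beneath it,
 * and 0 otherwise (including for NULL or empty arguments). Both paths are
 * assumed to be canonical and absolute, with no trailing slash except
 * when dir is the root directory "/". */
int is_within_dir(const char *path, const char *dir) {
    size_t n;
    if (path == NULL || dir == NULL) return 0;
    n = strlen(dir);
    if (n == 0 || strncmp(path, dir, n) != 0) return 0;  /* not a prefix */
    if (dir[n - 1] == '/') return 1;                     /* dir is "/" */
    /* path has at least n characters here, so path[n] is safe to read */
    return path[n] == '/' || path[n] == '\0';
}
```

With this check, /srv/webroot/index.html is accepted for the directory /srv/webroot, while /srv/webroot_private/secret is rejected.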
My implementation of the get web root path
/* Returns the full web root file path, or NULL on failure.
 * The caller must free the returned memory.
 */
char* getWebrootFullPath(void) {
    long size;
    char *cwd; /* current working dir */
    char *webrootPath = NULL;
    char *webrootCanoPath = NULL;
    size = pathconf(".", _PC_PATH_MAX);
    if (size <= 0) return NULL; /* pathconf failed or reported no limit */
    if ((cwd = (char *)malloc((size_t)size)) == NULL) return NULL;
    if(getcwd(cwd, (size_t)size)) {
        size = strlen(cwd) + strlen(WEBROOT) + 2; // +2 is for '/' and '\0'
        if ((webrootPath = (char*)malloc((size_t)size)) == NULL) {
            free(cwd);
            return NULL;
        }
        strcpy(webrootPath, cwd);
        strcat(webrootPath, "/");
        strcat(webrootPath, WEBROOT);
        webrootCanoPath = canonicalize_file_name(webrootPath); // only when the path was built
    }
    free(cwd);
    free(webrootPath);
    return webrootCanoPath;
}
