Speaker: Andrey Plastunov
Language: English
Nowadays, one can find many different web servers performing different kinds of tasks: they may serve administrative requests on Wi-Fi hotspots (or any other embedded device), they may be used as gates to corporate intranet resources, etc. That's the logical part, but what do all these servers have under their hoods? Attentive people may note that the zoo of web servers is not limited to popular mainstream solutions. So how large is the zoo, exactly? Do the developers who create their own servers pay enough attention to security-related problems? We will try to answer these questions.
The talk will cover methods of finding binary vulnerabilities in modern web server software using a fuzzing approach. As an example of this method, a custom fuzzing tool will be released. We will also demonstrate a bunch of vulnerabilities found during the research.
CONFidence: http://confidence.org.pl/pl/
[Reverse fuzzing]
Web Server (Fuzzer) ← HTTP REQUEST ← Client
Web Server (Fuzzer) → (FUZZ) HTTP RESPONSE → Client
Difficulties:
➢ There is no possibility to check the client’s health by directly communicating with it
➢ Additional tweaks are needed to re-run the client after each request
Hi guys. Today I will talk about fuzzing modern web servers.
My name is Andrey Plastunov. I am a penetration tester at Digital Security, a company from Saint Petersburg, Russia.
Mostly I do penetration testing and security code review of modern web/mobile applications and related infrastructure.
So let’s start
While performing everyday penetration testing tasks, there is a good chance of running into some totally unknown HTTP-based software, most likely web proxies. At least that is what happened to me the first time I performed such a task: a web proxy that worked on top of IIS 7 and acted as SSL tunneling software. After that, I asked myself: how can one quickly check such software for flaws in its HTTP parsers? Some googling gave me a couple of solutions:
- Famous sulley framework (with its built-in description for generic http requests).
- A tool named pathod/pathoc
- A couple of commercial fuzzers (but actually, i am not rich enough to buy one)
Not as many as I hoped to find. But maybe my googling skills just suck. Anyway, those tools don't meet my requirements for an HTTP fuzzer (I'll try to cover the reasons later), so the only solution that came to my mind was to create yet another crunchy fuzzing tool.
Okaaay
There is a truly great number of different web servers used in different ecosystems for different kinds of tasks.
On this slide I will try to cover some of them.
As I said in the introduction, for me it all started with an opaque HTTP proxy.
So, the first kind of web servers will be proxies.
What do they do?
An HTTP proxy acts as an intermediary between a client and an actual server.
The proxies may be used for
- content-filtering
Such proxies provide functionality to control what the client should and should not see. They may filter content based on URLs, MIME types, or the actual content of requests or responses (all these terms will be discussed later). So, as we can see, there are quite a lot of ways to affect content-filtering proxies.
- Tunneling proxies (as I call them)
These kinds of proxies are mostly used to tunnel plain HTTP traffic inside an encrypted protocol (for example, TLS/SSL), and may allow access, for example, from the internet to a corporate intranet.
The list is definitely not complete, but it gives a good idea of what proxy servers are.
The second group of servers I like to examine is web servers used on embedded systems.
Servers in this category are basically used to perform administrative, monitoring, or other system-related tasks.
And the first group to mention in this category will, of course, be web servers on network devices (from simple routers like the D-Link DIR-300 to monsters like Juniper devices).
You can find such devices almost everywhere. For example, in your favorite Starbucks café, the Wi-Fi access point is managed via a web GUI. Imagine how cool it would be to find an RCE zero-day in one of these.
Following the latest fashion, industrial controller manufacturers also embed web servers into their software stacks to simplify administrative tasks for network engineers.
And so forth and so on
The next category of servers is actually not an independent category at all. I'd like to use this category for any custom or experimental module in the mainstream servers (for example lighttpd, nginx, Apache, and so on). Remember the bugs in the experimental ngx_http_spdy_module in NGINX - CVE-2014-0133 and CVE-2014-0088?
And finally, "other". I put into this category any other type of web server you may find on the internet.
For example:
Most SIEM systems use their own web servers to let users perform all kinds of monitoring or administrative tasks.
Another example of such a server will be:
- A server for streaming video developed by some famous video adapter manufacturer
That's all on web servers for now. But as a small easter egg, I want to add a very different category - the clients!
For example, we can fuzz some curious security scanners =) Actually, it is my dream to penetrate the penetration testers.
Well, I thought I should give a brief description of the protocol we actually want to fuzz.
As we all know, HTTP is a (usually) plaintext protocol based on a simple request-response mechanism.
A standard HTTP request consists of the following segments:
The first line includes the method definition (for example: GET, POST, HEAD, OPTIONS, TRACE, PUT, DELETE, etc.), the relative URI of the target resource (well, not always: in the case of proxies, the URI is represented by its fully qualified value), and the protocol version specification (it may be 0.9, 1.0, or 1.1).
The next segment is the header segment. It consists of several colon-separated name:value pairs, each occupying a separate line.
Common headers included in such requests are Host (the target's host name), User-Agent (some information about the browser version), Accept (supported MIME types of documents), Referer (the originating page of the request), or Cookie (session information or other logically related stuff).
This segment is terminated by a single empty line, which may be followed by any payload the client wants to transmit to the server; the length of the payload must be specified in an additional header, the Content-Length header.
Each line of the request is separated from the others by a single CRLF delimiter.
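The layout above can be sketched in a few lines of Python (a minimal illustration; the header names and values are just examples, not anything from my tool):

```python
# Sketch: assembling a raw HTTP request from its segments
# (request line, headers, empty line, payload), each line ending in CRLF.

def build_request(method, path, headers, body=b"", version="HTTP/1.1"):
    """Return the raw bytes of an HTTP request."""
    lines = [f"{method} {path} {version}"]
    for name, value in headers.items():
        lines.append(f"{name}: {value}")
    raw = "\r\n".join(lines) + "\r\n\r\n"   # empty line terminates the headers
    return raw.encode("latin-1") + body

req = build_request(
    "POST", "/do/not/touch?my=server",
    {"Host": "example.com", "Content-Length": "5"},
    body=b"hello",
)
print(req)
```

Having the request as plain bytes like this is exactly what makes HTTP so convenient to fuzz: every segment is just a substring we can mutate.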
Next we will look at each segment in detail
Let’s examine the first line of http request
POST /do/not/touch?my=server HTTP/1.1\r\n
The first thing to mention is a method definition.
As already mentioned, the method may be one of the following: GET, POST, HEAD, OPTIONS, TRACE, PUT, DELETE. But this list is not complete: we can add a large variety of WebDAV methods (for example: COPY, MOVE, LOCK, UNLOCK, etc.), and even some custom methods, the variety of which depends only on the imagination of the developers.
While web servers definitely parse these methods to decide what they should do, there is always a non-zero probability of bugs in such parsing.
So, I think, fuzzing the method definition will be useful and may give us some profit.
Next, we can see a relative path to some resource.
What can happen while the server parses this path? There may be bugs when parsing extremely long paths, or a path consisting of a large number of separate directories (separated by slashes).
So, path is also a fuzzable thing
A list of parameter=value pairs separated by ampersands (&) follows the path. These parameters definitely need some fuzzing, as they may expose very different functionality not available by any other means. For example, the API of some random binary may be accessed via these parameters.
So...fuzzable!
There is also a value representing the HTTP protocol version to be used.
It may be one of 0.9, 1.0, or 1.1.
Some servers parse these values one way or another, but it is really not that often that an incorrect HTTP version makes the software crash.
So it's up to you whether to fuzz the protocol version or not.
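A request-line fuzzer built from these ideas might look like this minimal Python sketch (the mutation sets are illustrative, not an exhaustive corpus and not my tool's actual lists):

```python
# Sketch: generating mutated request lines - fuzzing the method,
# the path, and the protocol version one dimension at a time.

def request_line_mutations():
    methods = ["GET", "POST", "A" * 10000, "GET\x00", "G%ET", ""]
    paths = ["/", "/" * 5000, "/a" * 2048, "/" + "A" * 65536]   # long paths, many slashes
    versions = ["HTTP/1.1", "HTTP/9.9", "HTTP/-1.0", "XTTP/1.1"]
    for m in methods:
        yield f"{m} / HTTP/1.1"
    for p in paths:
        yield f"GET {p} HTTP/1.1"
    for v in versions:
        yield f"GET / {v}"

cases = list(request_line_mutations())
print(len(cases), "request-line test cases")
```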
...
There is another notable part of the first line, which appears only when the HTTP client connects to the server through an HTTP proxy.
That part is the protocol scheme plus the server name.
Both may be fuzzed, since proxy servers often analyze such names, for example for content filtering.
...
Let's move on to the header section
As I said before, a header is a name:value pair separated by a colon.
Header values may be of different types, for example: integers (both signed and unsigned), strings, lists of strings, or even complex types,
such as cookies, which in turn consist of name=value pairs separated from each other by semicolons. Each value of each cookie may in turn consist of such pairs, and so on.
Each value of each header should be fuzzed, as incorrect header values may lead to security bugs. For example, putting a negative decimal into an unsigned integer field may cause an integer overflow.
And this is not all about the headers.
Also, servers may have problems parsing a large number of headers or duplicate headers, so the name:value pairs themselves should be fuzzed as single entities too. And do not forget to modify the header names with fuzzy values, since that may expose additional bugs.
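As a rough Python sketch of these header mutations (the mutation list and counts are illustrative, not how any particular fuzzer does it):

```python
# Sketch: header mutations - oversized values, negative numbers into
# unsigned fields, duplicated headers, and fuzzed header names.

def header_mutations(name, value):
    """Yield header-list variants derived from one valid name:value pair."""
    yield [(name, value)]                 # baseline, unmodified
    yield [(name, value)] * 100           # duplicated header
    yield [(name, "A" * 65536)]           # oversized value
    yield [(name, "-1")]                  # negative decimal into an unsigned field
    yield [(name + "\x00", value)]        # fuzzed header name
    yield [(name.upper() * 50, value)]    # oversized header name

variants = list(header_mutations("Content-Length", "5"))
print(len(variants), "header variants")
```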
...
The next section is data section
Here we are gonna look at a couple of different types of post data
First of all - default data type - application/x-www-form-urlencoded
In this type of post message, the data is constructed the same way as for GET but is transmitted in the Request payload instead, so it may be used with URL parameters simultaneously.
So as data construction is exactly the same as for GET request, the fuzzed entities are also the same
Same as URL data
The next type of post data is multipart/form-data
This type of data is mostly used to send content of some random file (including binary data)
The resulting request payload consists of a series of short MIME messages, one for each parameter of the request. These messages are delimited by a client-selected random, unique boundary token that should not otherwise appear in the encapsulated data.
So there are plenty of things to examine
First, the Content-Disposition header value. It may be one of the predefined values such as inline, attachment, form-data, et cetera, or it may be a custom value - that is up to the developers.
The parser will definitely analyze this header, so it must be fuzzed.
Second, each MIME message may have a number of parameters, for example, name or filename or whatever else
These parameters will be analyzed by the server too.
So, fuzzable
The last thing worth mentioning in this type of request is the data of each MIME message. It may be represented as plaintext or, for example, an integer,
That, of course, may be fuzzed
but it also may be binary data,
which should be fuzzed a little differently
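A minimal Python sketch of such a multipart body, with the fuzzable spots (boundary, disposition value, parameter values, part data) made explicit; the field names are hypothetical:

```python
# Sketch: building a multipart/form-data body. Every piece below -
# the boundary, the disposition, each parameter, the binary data -
# is a fuzzing target.

def multipart_body(boundary, parts):
    """parts: list of (disposition, params_dict, data_bytes) tuples."""
    out = b""
    for disposition, params, data in parts:
        header = f"Content-Disposition: {disposition}"
        for k, v in params.items():
            header += f'; {k}="{v}"'
        out += f"--{boundary}\r\n{header}\r\n\r\n".encode() + data + b"\r\n"
    out += f"--{boundary}--\r\n".encode()    # closing boundary
    return out

body = multipart_body("deadbeef", [
    # oversized filename parameter plus raw binary data in one part
    ("form-data", {"name": "file", "filename": "x" * 4096}, b"\x00\xff" * 8),
])
```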
Do not forget to fuzz all the types of delimiters encountered in your request.
A generic request may contain the following delimiters: CRLFs, colons, semicolons, equals signs, question marks, and ampersands.
Multiplying, removing, and otherwise manipulating these delimiters may cause the parser to interpret the given request in a wrong way. For example, multiplying the delimiters in a single header such as Accept-Language tells the server that there are N supported languages; if N exceeds the maximum value the developer specified, it may cause an overflow.
Fuzzable!
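A simple sketch of such delimiter mutations in Python (multiplying and removing each delimiter; the factor of 10 is an arbitrary choice for illustration):

```python
# Sketch: for every delimiter present in a raw request, emit one
# variant with the delimiter multiplied and one with it removed.

def delimiter_mutations(request, delimiters=(b"\r\n", b":", b";", b"=", b"?", b"&")):
    for d in delimiters:
        if d in request:
            yield request.replace(d, d * 10)   # multiplied delimiter
            yield request.replace(d, b"")      # removed delimiter

req = b"GET /a?x=1&y=2 HTTP/1.1\r\nHost: h\r\n\r\n"
cases = list(delimiter_mutations(req))
print(len(cases), "delimiter test cases")
```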
The next part of my talk is about choosing the approach to testing web servers
Now we will discuss approaches that, I think, suit the task of fuzzing such different kinds of web servers perfectly.
The first approach is simple, straight, client-originating fuzzing.
In this approach, the fuzzer pretends to be a simple web client, sending a single request to the server, one at a time, probing whether the server fails to parse the request and, if it does not, generating the next fuzzing request.
So the scheme is quite simple
The client sends a fuzzing request to the server and waits for the answer.
If the server answers with a proper response, everything seems OK. If the server fails to answer the request or refuses any connections, there might be a bug.
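One probe of this loop can be sketched in Python as follows (the host, port, and error classification are illustrative, not my fuzzer's actual code):

```python
# Straight fuzzing probe: one mutated request per connection; a refused
# connection, a reset, a timeout, or an empty reply is flagged for
# closer inspection.
import socket

def probe(host, port, raw_request, timeout=3.0):
    """Send one raw request; return ('ok', data) or ('suspect', reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(raw_request)
            data = s.recv(4096)
            if not data:
                return ("suspect", "empty response")
            return ("ok", data)
    except (ConnectionRefusedError, ConnectionResetError, socket.timeout) as exc:
        return ("suspect", type(exc).__name__)
```

The fuzzer then simply calls `probe` in a loop, feeding it one generated request at a time and logging every "suspect" verdict.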
The second approach is used mostly to test clients or proxies. We call this approach reverse fuzzing.
The main concept of reverse fuzzing is to send a fuzzing message only in response to a request that comes from the target. Therefore, reverse fuzzing may apply to testing web clients (for example, curl or wget) or web proxies from the perspective of the server.
The scheme describing this approach is a little bit more complicated than in straight fuzzing and looks as follows:
First, the target (attention: it is a target client, not a fuzzer) sends a request to the fuzzing server;
the server then generates a fuzzing response and sends it back to the client.
The only possible way to determine whether the client is dead or not is to run a monitoring process that checks the target's health. In addition, we will need some tweaks to force our target to send requests again and again.
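A minimal sketch of such a fuzzing server in Python (the sample responses are illustrative mutations, not a real corpus):

```python
# Reverse fuzzing: the server ignores what the client asked for and
# replies with the next mutated response from its queue.
import socket

def fuzz_responses():
    yield b"HTTP/1.1 200 OK\r\nContent-Length: -2\r\n\r\n"   # negative length
    yield b"HTTP/1.1 " + b"9" * 1024 + b" OK\r\n\r\n"        # oversized status code
    yield b"XTTP/1.1 200 OK\r\n\r\n"                         # broken protocol name

def serve_one(listener, responses):
    """Accept one client, read (and ignore) its request, send a mutation."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)                # the request itself does not matter
        conn.sendall(next(responses))  # answer with the next fuzzing response
```

An external monitor (see below) then decides whether the client survived each response.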
As a culmination of these two approaches, a monstrous method arises to test web proxies - and proxies only. I call it double fuzzing.
The idea is simple:
First, send a fuzzing request to a server via the target proxy.
The proxy processes the request and transmits it to the server.
The server totally ignores the request and sends a fuzzing response from its own queue.
This allows us to kill two birds with one stone:
Fuzz the proxy server from the client perspective
Fuzz the proxy server from the end-point server perspective
Now, a few words on the process of detecting crashes and anomalous activity (such as excessive memory consumption) on the target system.
The first thing to mention is traffic analysis
In my fuzzer I didn't implement any traffic analysis in the context of fuzzing, but this detection method should be mentioned anyway.
When performing traffic analysis, one could search for such anomalies as:
TCP RST packets without any actual data being sent
Timeouts in the responses
and so on (you can google a few more)
The second approach to bug detection is to use a local monitoring process.
The way to perform such detection is to install a monitoring process on the target system.
The installed process should then do the following:
Watch for system calls called by the target process
Watch for file system and other resource activities
Watch for unusual signals sent to or by the process (for example, segmentation fault)
Watch for memory allocations (malloc/calloc functions for example)
The third set of techniques detects bugs while directly interacting with the target. Here I place such techniques as:
Analyzing HTTP error codes received from the web server (for example, the 502 or 503 error codes)
Analyzing socket errors (for example, CONNECTION REFUSED, CONNECTION RESET BY PEER, SOFTWARE CAUSED CONNECTION ABORT, and so on)
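These heuristics can be sketched as a tiny verdict function (the verdict strings and the exact mapping are made up for illustration):

```python
# Sketch: mapping direct-interaction observations (HTTP status codes
# and socket error names) to a rough health verdict for the target.

def classify(status_code=None, socket_error=None):
    if socket_error in ("ECONNREFUSED", "ECONNRESET", "ECONNABORTED"):
        return "target possibly down"
    if status_code in (502, 503):
        return "target possibly down"
    if status_code is not None and status_code >= 500:
        return "server error - inspect"
    return "alive"

print(classify(status_code=503))
```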
There is one more approach to monitoring the target while directly interacting with it. Just before performing the fuzz testing, you may try to harvest requests and responses (including error responses, e.g. 404) typical of the analyzed software.
The idea is to compare each response to each fuzzing request with a reference response. If the responses differ, that may be a sign of a bug that needs further manual inspection. My bad - for now, my fuzzer is unable to perform such a comparison, but I'm working on it.
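Such a comparison could be sketched with Python's difflib and an arbitrary similarity threshold (an assumption of mine, not what my tool does, since it does not support this yet):

```python
# Sketch: flag a fuzzing response as anomalous when it is too
# dissimilar from the harvested reference response.
import difflib

def looks_anomalous(reference, response, threshold=0.9):
    """True if similarity to the reference drops below the threshold."""
    ratio = difflib.SequenceMatcher(None, reference, response).ratio()
    return ratio < threshold

ref  = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
same = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
odd  = "HTTP/1.1 500 Internal Server Error\r\n\r\n<stack trace...>"
```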
In this part of my presentation I will introduce my own tool (which is for now still an alpha version with very limited functionality) - or, better said, not the tool, but the concepts I'm trying to put into it.
First of all: which modules should a typical fuzzer have?
of course
1. Generator module
2. Transmitter module
3. Monitoring module
4. Some logging module
Now a closer look at each part
Generator - the main purpose of a generator is to generate data! Isn't it obvious? =)
In my own generator module I used some fuzzing primitives from the famous Sulley framework, for example: the integer generator, string generator, and delimiter generator.
Next, to mutate binary data (for example, images sent to the server), I used a tool named pyZZuf (by @nezlooy), which is a Python implementation of the general-purpose fuzzer zzuf. Now, I'll show some advertisement to honor the developer of that tool.
For now, I assume that the given fuzzing primitives are enough to describe generators for more complex data - for example, headers.
I created some header generators: Accept-Encoding, Content-Encoding (which is similar), Accept-Language, Accept-Charset, Authorization, Range, et cetera.
Each generator takes a valid header value as input and fuzzes it in all possible ways (fuzzing all the ints and strings, adding new values, and cloning existing ones if a header supports multiple values).
The bottom line: I have the following generators: fuzzing-primitive generators (covering the integer, string, delimiter, and blob types),
complex header generators (the ones that may take multiple values at a time, or even multiple values of different types - for example, Cache-Control or Cookie), and a URL path generator, which in turn consists of:
- the path to the resource (for example /path/to/resource); each part of the path acts as a string here, and each slash acts as a delimiter
- the set of parameters (for example a=hello&b=world); here, each parameter is a name:value pair with the equals sign as a delimiter, and each pair is separated from the others by an ampersand (&)
There are also POST data generators, which for now include the following types:
application/x-www-form-urlencoded - one line consisting of name=value pairs with equals signs as delimiters, just like URL parameters
binary objects - which may be used as a complete independent value, or as part of a multipart/ request used to upload some binary data
I also use a so-called whole-request generator, which fuzzes the whole request at once. That generator plays with every kind of delimiter included in a request (slashes, CRLFs, question marks, ampersands, colons, et cetera), duplicating or removing them, duplicating existing headers, extending POST data or URL paths, and so forth.
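A sketch of what such primitives might look like (the mutation values are illustrative, loosely in the spirit of the Sulley primitives, not copies of my generators):

```python
# Sketch: fuzzing primitives - each takes one valid value and yields
# the baseline plus a handful of boundary-pushing mutations.

def fuzz_ints(valid):
    yield from (valid, -1, 0, 2**31 - 1, 2**32 - 1, -2**31)

def fuzz_strings(valid):
    yield from (valid, "", "A" * 65536, valid * 100, "%s%n%x", valid + "\x00")

def fuzz_delimiters(valid):
    yield from (valid, valid * 10, "", " ", "\t")

ints = list(fuzz_ints(5))
print(ints)
```

Composite generators (headers, URL paths, POST data) can then be expressed as combinations of these three.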
Transmitter - the core module of the fuzzer. The transmitter has three roles:
- To receive fuzzing requests from the generator, send them to the target, and get back the answer or a socket error.
- To analyze the response from the target, trying to determine whether the target is out of health. So the transmitter is somewhat similar to the monitor module, as it also watches whether the target is alive.
- To log all requests being sent and, especially, the requests that caused an error or an unusual response.
The Monitor’s primary role is to watch the target's health without interacting with it directly.
There are two solutions for this task
1. Monitoring the target process, so the monitor (or its agent) needs to be on the same physical machine as the target. For that purpose, I mostly use stack traces and a custom wrapper that follows the syscalls of the target process and, if something is not OK, sends the transmitter a message.
2. Monitoring the network flow. A monitor of this kind simply watches for anomalous network activity and sends a message to the transmitter if it detects something bad.
Some other features
fuzzing modes:
Header fuzzing
url-data fuzzing
post-data fuzzing
whole-request fuzzing
Method to fuzz
…
The possibility of using proxy servers - for example, to monitor HTTP traffic, or to fuzz the proxy itself in the double fuzzing approach
some other options:
multithreading
delay - As i discovered, some web servers, especially ones deployed on embedded devices, lack the ability to handle multiple simultaneous connections due to a limited number of socket descriptors
whatever else
Right now I'm in the middle of my research on web server vulnerabilities, and today I want to show you some results of that research. Of course, as soon as the research is completed, I will publish it on the internet.
First of all, I would like to mention the bug that I've found in most of the web servers I fuzzed: improper validation of the Content-Length header. For example:
Some parsers allow Content-Length to be a negative integer, which may cause integer overflows.
Other parsers will gladly accept extremely large values, so the buffer that is prepared to store the given POST data may be overflowed, causing the data to be written outside the specified buffer.
Moreover, the problem lies not only in the validation itself, but also in the incorrect handling of HTTP requests. In the case of Content-Length, a large number of servers will accept and parse the Content-Length header even if the request method is GET.
This bug was found in one popular streaming service, which, sadly, I cannot name right now due to responsible disclosure, but I will in the paper. An attacker could send a request with the Content-Length header set to minus two. While processing such a value, the server converts the negative number into an unsigned int, causing an integer overflow (give the value here). Thereafter, the server tries to allocate this amount of memory, which, in turn, causes a memory consumption vulnerability.
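The arithmetic behind this bug class can be reproduced directly (assuming the server stores the length in a 32-bit unsigned integer; the exact value on the affected server is withheld above):

```python
# Sketch: a negative Content-Length reinterpreted as a 32-bit unsigned
# integer turns into a huge allocation size.

def to_uint32(n):
    """Reinterpret an integer as a 32-bit unsigned value."""
    return n & 0xFFFFFFFF

# The shape of the request used to trigger such bugs:
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: target\r\n"
    b"Content-Length: -2\r\n"
    b"\r\n"
)
print(to_uint32(-2))  # 4294967294
```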
Next, here is a bug, again in Content-Length processing.
The funny thing: the developers used the secure strcpy_s function, which triggers an exception if something goes wrong. That is the good part.
The bad part: the developers forgot to handle these exceptions properly, so when an exception occurs, the web server crashes immediately.
The bug makes even the doge sad
Skip in 1-2 secs
The bug was found in one of the third-party plugins for IIS, developed for some secure tunneling software which is kind of popular on the local market.
An attacker could send a request with the Content-Length header set to minus two. While processing such a value, the server converts the negative number into an unsigned int, causing an integer overflow (give the value here). Thereafter, the server tries to use the strcpy function to write an extremely large value into a limited buffer, which causes a stack buffer overflow.
The bug appears in a router's software.
It arises while parsing a Basic Authorization header with a login length of sixteen kilobytes.
Unfortunately, I was unable to debug the bug, as it appears in a router web server and I simply do not know how to run that thing under a debugger. But if I had to guess, I would say it must be a buffer overflow.
This bug appears in router software too, so, as already mentioned, I can only guess at the reasons why the web server crashes.
It arises while processing a large number of supported languages provided in the header.
And finally, some bugs not actually found by me, but which may give an additional point of view on HTTP software fuzzing.
First of all, the famous bug in HTTP.sys, Microsoft's driver-level web server: MS15-034.
Parsing such Range header values causes an integer overflow vulnerability.
The last bug appears when a long URL is passed to the Kolibri web server in a POST request.
The bug is a stack buffer overflow and may lead to remote code execution.
Also, yesterday the guys from the OWASP track mentioned a vulnerability in AllegroSoft RomPager 4.34, which occurs during parsing of an oversized cookie and causes memory corruption.