Developed by Martin Holst Swende 2010-2011 Twitter: @mhswende [email_address]
<ul><li>This presentation is a quick and steep dive into the Datafiddler. It does not cover much, but hopefully gives some understanding of what the Datafiddler is capable of. </li></ul><ul><li>The Datafiddler operates on data stored by the Hatkit Proxy in a MongoDB database. The proxy is not covered in this presentation. </li></ul><ul><li>Two primary views exist: the table view and the aggregator. </li></ul><ul><li>A third view, 3rd party plugins, is planned but not implemented in the UI. </li></ul>
Dynamic display of data in a table-based layout (1:1 mapping)
This is the data fetched from each document ('row') in the database: the variable 'v1' will contain request.time. These are the column definitions. Each one is Python code which is evaluated, with access to the variables and to a library of 'transformations'. date(millis) takes a UTC timestamp in milliseconds and converts it to a human-readable format. The second column will be titled Date and contain the result of date(v1).
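The evaluation of a column definition can be sketched as follows. This is only an illustration, not the Datafiddler source: the body of `date`, the `row` dict layout, and the use of `eval` are assumptions made for the example.

```python
from datetime import datetime, timezone

def date(millis):
    """Assumed stand-in for the 'date' transformation:
    milliseconds since the epoch -> readable UTC string."""
    seconds = millis / 1000.0
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Variables as fetched from one document ('row'); v1 holds request.time.
row = {"v1": 1300000000000}

# Column title -> Python expression, as typed in the column definition.
columns = {"Date": "date(v1)"}

# Each expression is evaluated with the transformations as globals
# and the row variables as locals.
rendered = {title: eval(expr, {"date": date}, row) for title, expr in columns.items()}
print(rendered)
```

Evaluating user-typed expressions this way is convenient for an analysis tool run by its own operator, but it executes arbitrary code, so column definitions from untrusted sources should not be loaded blindly.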
The v0 parameter is the object id. This column uses 'Coloring', which means that the value is not displayed; instead, a color is calculated from the hash of the value. This is particularly useful when values are long but not interesting in themselves. Cookie values take up a lot of screen real estate, but often the only interesting thing is when they change, which the color shows.
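One way to derive a color from a hash is sketched below. The choice of MD5 and of the first three digest bytes as an RGB value is an assumption for illustration; the slides do not say which hash or mapping Datafiddler uses.

```python
import hashlib

def color_for(value):
    """Map any value to a stable '#rrggbb' color (illustrative helper,
    not a Datafiddler function)."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    # The first six hex digits are the first three bytes: red, green, blue.
    return "#" + digest[:6]

# The same cookie value always gets the same color; a changed value
# almost certainly gets a different one.
before = color_for("JSESSIONID=abc123")
after = color_for("JSESSIONID=def456")
print(before, after)
```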
There are a lot of predefined 'transformers' which can be used when defining the columns. For example, the function below makes it possible to display both URL parameters and POST parameters in the same column.
showparams(url,form)
Sorts parameters by keys. You can send in two dicts and get the combined result. This makes it easier to show both form data and URL data in the same column.
Example:
variable v2: request.url
variable v3: request.data
column: sortparams(v2, v3)
Another version:
variable v1: request
column: sortparams(form=v1.data,url=v1.url)
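A minimal sketch of what such a parameter-combining transformer might do is given below. The name `combined_params`, the output format, and the rule that form values win on a key collision are all assumptions; the real transformer may behave differently.

```python
def combined_params(url=None, form=None):
    """Merge URL and form parameter dicts and list them sorted by key
    (illustrative helper, not the Datafiddler implementation)."""
    merged = {}
    merged.update(url or {})
    # Assumption: on a key collision the form value overrides the URL value.
    merged.update(form or {})
    return "&".join("%s=%s" % (key, merged[key]) for key in sorted(merged))

print(combined_params(url={"page": "2", "id": "7"}, form={"user": "bob"}))
```

Accepting keyword arguments mirrors the second usage style in the slide, where the column passes `form=` and `url=` explicitly.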
It is simple to write the kind of view you need for the particular purpose at hand. Some example scenarios:
- Analysing user interaction using several accounts with different browsers
  * Color cookies
  * Color user-agent
  * Parameters
  * Response content type (?)
- Analysing server infrastructure
  * Color server headers
  * Server header value for X-Powered-By, Server etc.
  * File extension
  * Cookie names
- Searching for reflected content (e.g. for XSS)
  * Parameter values
  * True/False if parameter value is found in response body (simple Python hack)
- Analysing brute-force attempts
  * Request parameter username
  * Request parameter password
  * Response delay
  * Response body size
  * Response code
  * Response body hash
After you have written some good column definitions for a particular purpose, save them for next time.
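The 'simple Python hack' for reflected content could look something like the sketch below. The function name, the `min_length` cutoff, and the shape of the inputs (a dict of parameters and the response body as a string) are assumptions for the example.

```python
def reflected(params, body, min_length=3):
    """Return True if any parameter value appears verbatim in the response body.

    Very short values are skipped (assumed cutoff), since strings such as
    '1' match almost any page and would only add noise.
    """
    return any(
        len(value) >= min_length and value in body
        for value in params.values()
    )

print(reflected({"q": "<script>"}, "<p>You searched for <script></p>"))
```

A plain substring test misses values that the server encodes or truncates before echoing them, so a False result here does not rule out reflection.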
Displays aggregated data in a tree structure (1:N mapping)
Aggregation (grouping) is a feature of MongoDB. It is like a specialized Map/Reduce which is limited to roughly 10 000 results (unique group keys). You provide the framework with a couple of directives, and the database returns the results, which are different kinds of sums. This enables some useful queries whose results can be displayed as a tree.
- Example: a sitemap can easily be generated
- Example: Show all HTTP response codes, sorted by host/path
- Example: Show all unique HTTP header keys, sorted by extension
- Example: Show all request parameter names, grouped by host
- Example: Show all unique request parameter values, grouped by host
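To make the 1:N tree idea concrete, the sketch below performs the 'response codes by host/path' grouping in plain Python on a few in-memory documents. In Datafiddler the grouping is done by MongoDB; the flat field names `host`, `path` and `code` are assumptions and do not reflect the actual document schema.

```python
from collections import defaultdict

def group_codes(docs):
    """Count response codes per host and path: host -> path -> code -> count."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for doc in docs:
        tree[doc["host"]][doc["path"]][doc["code"]] += 1
    # Convert the nested defaultdicts to plain dicts for display.
    return {
        host: {path: dict(codes) for path, codes in paths.items()}
        for host, paths in tree.items()
    }

docs = [
    {"host": "example.com", "path": "/login", "code": 200},
    {"host": "example.com", "path": "/login", "code": 302},
    {"host": "example.com", "path": "/login", "code": 200},
    {"host": "example.com", "path": "/admin", "code": 403},
]
print(group_codes(docs))
```

Each level of the nested dict corresponds to one level of the tree view: hosts at the top, paths beneath them, and the summed counts per response code as leaves.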
Provides capabilities to use existing frameworks, libraries and applications for analysing captured data
3rd party analysis: The idea is to use plugins that take the stored traffic and 'replay' it through other frameworks. Status: API defined, no UI exists. Runnable through console.
W3af plugin: Plugin which uses the 'greppers' in w3af to analyse each request/response pair. Requires w3af to be installed; calls relevant parts of the w3af code directly. Status: Code works, but not feature complete.
Ratproxy plugin: Plugin which starts ratproxy (by lcamtuf) and opens a port (X) for listening. It sets ratproxy to use port X as forward proxy, then replays all traffic through ratproxy while capturing the output from the process. Status: PoC performed, but not nearly finished.
Httprint plugin: Plugin which uses httprint to fingerprint remote servers. Status: Idea stage, unsure if httprint is still alive.
For 'breakers': Datafiddler is very useful for analyzing remote servers and applications, from a low-level infrastructure point of view to high-level application flow.
For 'defenders': Hatkit proxy can be set up as a reverse proxy, logging all incoming traffic. Datafiddler can then be used to analyze user interaction, e.g. to detect malicious activity and perform post-mortem analysis. The proxy is very lightweight on resources (using Rogan Dawes' OWASP Proxy), and the backend (MongoDB) has great potential to scale and can handle massive amounts of data.
<ul><li>Hatkit proxy requirements: </li></ul><ul><li>Java </li></ul><ul><li>(optional** : MongoDB) </li></ul><ul><li>(MongoDB Java drivers included in binary release) </li></ul><ul><li>** Can be used in interception-only mode, where data is not stored. </li></ul><ul><li>Datafiddler requirements (only tested on Linux / Ubuntu): </li></ul><ul><li>Python </li></ul><ul><li>Qt4 </li></ul><ul><li>PyQt4 bindings </li></ul><ul><li>Python MongoDB driver </li></ul><ul><li>MongoDB </li></ul><ul><li>(optional: w3af) </li></ul><ul><li>(optional: ratproxy) </li></ul>
To get up and running, grab Hatkit proxy:
Src: http://martin.swende.se/hgwebdir.cgi/hatkit_proxy/
Bin: http://martin.swende.se/hgwebdir.cgi/hatkit_proxy/raw-file/tip/hatkit.zip
And Datafiddler:
Src: http://martin.swende.se/hgwebdir.cgi/hatkit_fiddler/