Crash Analysis with
Reverse Taint
Powered by Taintgrind
Marek Zmysłowski
V. 12122019
whoami
Security Researcher @
Interested in fuzzing and vulnerability finding
Fan of The Matrix and Hacker movies
Co-organizer of “H4x0r5 @ Warsaw” meetings
Crash Analysis with Reverse Taint
Crash Analysis with Reverse Taint
- What is a crash?
Crash, or system crash, occurs when a computer program (...) stops functioning properly
and exits. *
An application typically crashes when it performs an operation that is not allowed by the
operating system. The operating system then triggers an exception or signal in the
application. Unix applications traditionally responded to the signal by dumping core. *
- Why is it important to identify a crash?
“performs an operation that is not allowed”. This indicates that a bug exists inside
the application. When the crash is identified and reproducible, the bug can be found.
*https://en.wikipedia.org/wiki/Crash_(computing)
Crash Analysis with Reverse Taint
- How to identify a crash?
“operating system then triggers an exception or signal” Different operating
systems have different mechanisms to “collect” crashes. Some perform a core
dump, storing the memory (Linux); others attach a debugger so that the user can
debug the crashed application (Windows).
- How do we get crashes?
“That, Detective, is the ‘right question.’”
“I, Robot” (2004)
Hunting in the wild
- How can a crash be found?
By accident, or by one of the most popular techniques, which will also be
mentioned here: fuzzing. The sooner a bug is found in the production
process, the lower the costs.
- So what is ‘fuzzing’?
The idea behind fuzzing is very simple. Let’s take a malformed input and feed it
to the application. Maybe it will crash. Of course, how the input is “chosen” and
how the crashes are caught is a topic for another presentation (or several).
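The idea can be sketched in a few lines of Python. This is only a minimal illustration of "mutate an input and see if the target dies", not how AFL or any of the fuzzers listed here work. The helper names `mutate` and `crashes` are hypothetical, and the sketch assumes a POSIX system, where a negative return code means the process was killed by a signal.

```python
import random
import subprocess
import tempfile


def mutate(data: bytes, rng: random.Random, flips: int = 4) -> bytes:
    """Return a copy of data with a few randomly chosen bytes replaced."""
    out = bytearray(data)
    for _ in range(min(flips, len(out))):
        out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def crashes(target: list, sample: bytes, timeout: float = 5.0) -> bool:
    """Run `target <file>` on the sample and report whether it was killed by a signal."""
    with tempfile.NamedTemporaryFile(suffix=".bin") as f:
        f.write(sample)
        f.flush()
        try:
            proc = subprocess.run(target + [f.name], capture_output=True,
                                  timeout=timeout)
        except subprocess.TimeoutExpired:
            return False  # hangs are ignored in this sketch
    # On POSIX, a negative return code is the number of the terminating signal.
    return proc.returncode < 0
```

A fuzzing loop would call `mutate` on a seed file repeatedly and keep every sample for which `crashes` returns True.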
Fuzzers
The godfather of all is, of course, AFL.
However, recent years brought multiple
fuzzers used for different purposes. Some
of them are different clones of AFL, some of
them try to do things differently.
Everyone can find something for
themselves.
American Fuzzy Lop
HonggFuzz
AFL++
Angora
QSYM
WinAFL
Real Example - Fuzzing
jhead is used to display and manipulate data contained in the Exif header of
JPEG images from digital cameras. By default, jhead displays the more useful
camera settings from the file in a user-friendly format.
The version used here is 3.03.
http://www.sentex.net/~mwandel/jhead/
Real Example - Fuzzing
Crash
Example
Crash vs Bug
- What is the difference between Crash and Bug?
A crash is the result of incorrectly working code caused by a bug. Sometimes
the place of the crash and the place of the bug are “the same”. And
sometimes not ...
- Is one crash caused by one bug?
No.
Crash vs Bug
Case 1.
One bug causes one crash.
This is the easiest situation as the identification
is straightforward.
Crash vs Bug
Case 2.
One bug can cause a few crashes.
This happens quite often, especially
with simple buffer overflows where
a “size” variable is used. Direct
reads or writes that access different
memory regions cause different
crashes.
Crash vs Bug
Case 3.
A few bugs can cause one crash.
This depends on how we identify a crash. The
simplest example is a frame processor.
For different types of frame, the size
parser works incorrectly and may cause
different crashes for different paths.
Crash vs Bug
In an additional experiment we computed a portion of ground truth. We applied all patches to cxxfilt
from the version we fuzzed up until the present. We grouped together all inputs that a particular patch
caused to now gracefully exit [11], confirming that the patch represented a single conceptual bugfix. We
found that all 57,142 crashing inputs deemed “unique” by coverage profiles were addressed by 9
distinct patches.
Stack hashes did better, but still over-counted bugs. Instead of the bug mapping to, say 500 AFL
coverage-unique crashes in a given trial, it would map to about 46 stack hashes, on average.
Stack hashes were also subject to false negatives: roughly 16% of hashes for crashes from one bug
were shared by crashes from another bug. In five cases, a distinct bug was found by only one crash,
and that crash had a non-unique hash, meaning that evidence of a distinct bug would have been
dropped by “de-duplication.”
“Evaluating Fuzz Testing” https://arxiv.org/pdf/1808.09700.pdf
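To make the stack-hash idea concrete, here is a minimal sketch of bucketing crashes by a hash of the innermost frames of the backtrace. It is an illustration only: the function names, the choice of three frames, and the use of SHA-1 are assumptions, not the scheme used by AFL, crashwalk, or the paper.

```python
import hashlib
from collections import defaultdict


def stack_hash(frames, depth=3):
    """Hash the innermost `depth` frame names of a backtrace."""
    key = "|".join(frames[:depth])
    return hashlib.sha1(key.encode()).hexdigest()[:12]


def bucket(crashes, depth=3):
    """Group crash ids by stack hash. `crashes` maps id -> list of frame names."""
    buckets = defaultdict(list)
    for crash_id, frames in crashes.items():
        buckets[stack_hash(frames, depth)].append(crash_id)
    return dict(buckets)
```

Both weaknesses quoted above follow directly from this construction: one bug reached through different call paths lands in several buckets (over-counting), while two different bugs that crash with the same innermost frames share a bucket (false negative).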
Crash vs Bug
So what is needed?
Crash Analysis with Reverse Taint
Crash Analysis with Reverse Taint
- What is crash analysis and why do we need that?
Crash analysis is the process of evaluating the exploitability of a crash and
identifying its root cause. If you are fuzzing something, the number
of crashes can be huge. Also, the impact and consequences (criticality) that a bug
might have depend on the application technology and the system.
- So what exactly do we analyze?
There are two major things to analyze: the crash and the bug.
Analysis - Crash
- What type of the crash is it?
For example: Out-of-bounds read, NULL Pointer Dereference, Buffer Overflow, etc.
- Is the crash exploitable?
Part of the identification process is to find out whether the crash can be used to achieve
something more than just crashing the application: reading a piece of data, overwriting
memory or executing code.
- Critical or exploitable - what is the difference?
Exploitability relates only to the bug and the crash itself. Criticality relates to
the whole environment. A Safe NULL Pointer Dereference means something different for nuclear
power plant software than for a kids' game.
Analysis - Bug
The second important part of the analysis is to identify the bug. As mentioned
before, there can be different relations between a bug and a crash.
It is also important to know how the inputs (which bits and bytes) correlate with the bug.
This, of course, may influence the crash later and its exploitability.
- What different relations are there?
User data can control the crash directly (e.g. an offset inside the table is calculated
based on user data) or indirectly (e.g. the incorrect branch is taken).
For example: NULL Pointer Dereference vs Safe NULL Pointer Dereference
Analysis - Bug (Direct vs. Indirect)
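char *pointer = NULL;
char table[100];
int index = 0;
char user_data;

if (user_data > 100)
    pointer[index] = 0;    /* indirect: user data only selects the branch */
else
    table[user_data] = 0;  /* direct: user data is used as the index */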
Analysis - Few Interesting Tools
crashwalk
- cwtriage: Runs crash files with instrumentation and outputs results in various formats.
- cwdump: Summarizes crashes in a crashwalk database by major / minor stack hash.
Although AFL (for example) already de-dupes crashes, bucketing summarizes
those crashes by an order of magnitude or more. Crashes that bucket the same
have exactly the same stack contents, so they're likely (not guaranteed) to be
the same bug.
- cwfind: A simple utility to output the filenames of all crashes matching a given hash. I
use it in combination with xargs to bulk delete / move crash files.
https://github.com/bnagy/crashwalk
Analysis - Few Interesting Tools
afl-utils
- afl-collect: Copies all crash sample files from an afl synchronisation
directory (used by multiple afl instances when run in parallel)
into a single location, providing easy access for further crash
analysis. Also executes exploitable on them and removes
uninteresting crashes.
- afl-minimize: Helps to create a minimized corpus from samples of a parallel
fuzzing job.
https://github.com/rc0r/afl-utils
afl-collect
https://github.com/rc0r/afl-utils
Results of the
“exploitable” plugin
afl-minimize
Reducing Input Files
Analysis - Few Interesting Tools
afl-analyze
It takes an input file, attempts to
sequentially flip bytes, and observes
the behavior of the tested program. It
then color-codes the input based on
which sections appear to be critical,
and which are not.
While not bulletproof, it can often offer
quick insights into complex file
formats.
https://lcamtuf.blogspot.com/2016/02/say-hello-to-afl-analyze.html
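The byte-flipping idea can be illustrated with a short sketch. It is not afl-analyze's implementation: `classify_bytes` and `toy_parser` are hypothetical names, and `observe` stands for any caller-supplied way of summarizing how the tested program behaved (exit code, coverage hash, and so on).

```python
def classify_bytes(data: bytes, observe):
    """Flip each byte in turn and report which offsets change the behavior.

    `observe` is a caller-supplied function: bytes -> comparable summary of
    how the tested program behaved on that input.
    """
    baseline = observe(data)
    critical = []
    for i in range(len(data)):
        flipped = bytearray(data)
        flipped[i] ^= 0xFF              # invert every bit of this one byte
        if observe(bytes(flipped)) != baseline:
            critical.append(i)
    return critical


def toy_parser(buf: bytes) -> str:
    # Stand-in for a real program: only the two magic bytes matter.
    return "ok" if buf[:2] == b"\xff\xd8" else "bad magic"


print(classify_bytes(b"\xff\xd8hello", toy_parser))  # prints [0, 1]
```

Only the two magic bytes are reported as critical; the payload bytes can be flipped without changing the observed behavior, which is the kind of insight the color-coded output gives for real formats.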
afl-analyze
*Of course it works, it just does not always give expected results.
Crash Analysis with Reverse Taint
Tainting
- What is tainting?
The purpose of dynamic taint analysis is to track information flow between
sources and sinks. Any program value whose computation depends on data
derived from a taint source is considered tainted. Any other value is considered
untainted.
- What are the types of tainting?
● The direct value is tainted
● Indirect/Control flow
● Address/Pointer relation
https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf
Tainting - Types
Indirect/ Control Flow
if (X > 2)
Y = 5
else
Y = 10
Address/Pointer
Y = A[X]
Direct Value
Y = X + 2
Tainting Propagation (Policy)
Depending on the application, different rules can be used to propagate the taint. The
policy is the set of rules for how the taint of the source operands is propagated to the
destination. In standard taint analysis, the destination operand is typically marked as
tainted if any of the source operands is tainted, regardless of how the specific semantics
of the instruction affect its destination operands.
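A minimal sketch of this standard policy, assuming taint labels are sets of input offsets. The `Tainted` class and `lift` helper are hypothetical names used for illustration and do not come from any of the tools discussed here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """An integer value paired with the set of input offsets it depends on."""
    value: int
    taint: frozenset = field(default_factory=frozenset)

    def __add__(self, other):
        other = lift(other)
        # Standard policy: the result carries the taint of every source operand.
        return Tainted(self.value + other.value, self.taint | other.taint)

    def __and__(self, other):
        other = lift(other)
        # Same rule, even though the result of `x & 0` never depends on x.
        return Tainted(self.value & other.value, self.taint | other.taint)


def lift(v):
    """Wrap a plain int as an untainted value."""
    return v if isinstance(v, Tainted) else Tainted(v)
```

With `x = Tainted(5, frozenset({0}))`, the sum `x + 2` is tainted by offset 0, as expected. The result of `x & 0` is also marked as tainted although its value is always 0, which shows how ignoring instruction semantics leads to imprecision.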
Tainting - Issues
- What is over-tainting?
Over-tainting occurs when code or data identified by the analysis as tainted is
not in fact influenced by any taint source (false-positive).
- What is under-tainting?
Under-tainting occurs when code or data that is influenced by a taint source is
not identified by the analysis as tainted. Such imprecision can be problematic,
especially in systems where the result of the taint analysis is critically important
(false-negative).
Powered by Taintgrind
- What is Valgrind?
Valgrind
It is an instrumentation framework for
building dynamic analysis tools. It comes with
a set of tools each of which performs some
kind of debugging, profiling, or similar task
that helps you improve your programs.
Valgrind's architecture is modular, so new
tools can be created easily and without
disturbing the existing structure.
http://valgrind.org
Valgrind IR
Prior to version 3.0.0, Valgrind had an x86-specific,
part D&R (disassemble-and-resynthesise), part
C&A (copy-and-annotate), assembly-code-like IR in which the
units of translation were basic blocks. Since
then Valgrind has had an architecture-neutral,
D&R, single-static-assignment
(SSA) IR that is more similar to what might
be used in a compiler. IR blocks are
superblocks: single-entry, multiple-exit
stretches of code.
*http://valgrind.org/docs/valgrind2007.pdf
Single-Static-Assignment (SSA)
It is a property of an intermediate representation (IR),
which requires that each variable is assigned exactly
once, and every variable is defined before it is used.
Existing variables in the original IR are split into
versions, new variables typically indicated by the
original name with a subscript in textbooks, so that
every definition gets its own version. In SSA form, use-
def chains are explicit and each contains a single
element.
*https://en.wikipedia.org/wiki/Static_single_assignment_form
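The renaming step can be sketched for straight-line code. This is a simplified illustration, not Valgrind's IR: `to_ssa` is a hypothetical helper, statements are assumed to be strings of the form "dst = expr", and branches (which would need phi-functions) are out of scope. Variables that are used before any definition are given version 0.

```python
import re
from collections import defaultdict


def to_ssa(statements):
    """Rename variables in straight-line code so each is assigned exactly once."""
    version = defaultdict(int)          # variable name -> current version number
    ident = re.compile(r"[A-Za-z_]\w*")
    result = []
    for stmt in statements:
        dst, expr = (part.strip() for part in stmt.split("=", 1))
        # Uses refer to the latest version defined so far.
        expr = ident.sub(lambda m: f"{m.group(0)}{version[m.group(0)]}", expr)
        version[dst] += 1               # every definition gets a fresh version
        result.append(f"{dst}{version[dst]} = {expr}")
    return result


for line in to_ssa(["x = a + 1", "x = x * 2", "y = x + a"]):
    print(line)
# x1 = a0 + 1
# x2 = x1 * 2
# y1 = x2 + a0
```

Because every name now has exactly one definition, following a value backwards is a simple lookup, which is what makes an SSA-style log convenient for reverse tainting.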
- What is Taintgrind?
Taintgrind
Taintgrind is based on Valgrind's MemCheck and Flayer plugin.
Taintgrind borrows the bit-precise shadow memory from MemCheck and only
propagates explicit data flow. This means that Taintgrind will not propagate taint
in control structures such as if-else, for-loops and while-loops. Taintgrind will also
not propagate taint in dereferenced tainted pointers.
http://valgrind.org/docs/memcheck2005.pdf
Taintgrind - Propagation Rules
1. The direct value is tainted
2. Indirect/Control flow
3. Address/Pointer relation
Taintgrind - Propagation Rules
- What are the Taintgrind propagation rules?
The granularity for memory operations is 1 byte.
For register operations it is the size of the operand: even if only one byte
is used, the whole register will still be tainted. In such cases Taintgrind
is over-tainting.
However, because Taintgrind handles only the first type of propagation (direct value), it is also under-tainting.
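The difference in granularity can be shown with a toy model. This is not Taintgrind's shadow memory: the `ShadowState` class is a hypothetical illustration that tracks taint per byte for memory and as a single label set per register.

```python
class ShadowState:
    """Toy taint state: per-byte for memory, all-or-nothing for registers."""

    def __init__(self):
        self.mem = {}    # address -> set of input offsets
        self.reg = {}    # register name -> set of input offsets

    def taint_input(self, addr, offset):
        """Mark one memory byte as coming from input byte `offset`."""
        self.mem[addr] = {offset}

    def load(self, reg, addr, size):
        """Load `size` bytes into a register; the whole register becomes tainted."""
        labels = set()
        for a in range(addr, addr + size):
            labels |= self.mem.get(a, set())
        self.reg[reg] = labels

    def store(self, reg, addr, size):
        """Store a register to memory; every written byte inherits its taint."""
        for a in range(addr, addr + size):
            self.mem[a] = set(self.reg.get(reg, set()))
```

If a single tainted byte is loaded into `rax` and the full 8-byte register is then stored, all 8 destination bytes end up tainted, although only one of them actually came from the input. That is the over-tainting described above.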
Taintgrind - Propagation Rules
WRITE READ
Overtainting bit-byte operation
Taintgrind
Here is an example of the logs and how
the taint is propagated through the file.
The job is to find all the paths from
the end of the file back to the beginning.
One instruction can be tainted by
multiple inputs.
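The backward walk can be sketched as follows. The record format here is an assumption made for illustration and is not Taintgrind's log syntax: each record is a `(dest, sources)` pair in execution order, where a source is either another variable name or an integer input-file offset. `reverse_taint` is a hypothetical name and not the rtaint implementation.

```python
def reverse_taint(records, sink):
    """Walk definitions backwards from `sink` and collect input offsets."""
    wanted = {sink}                     # variables whose definition we still need
    offsets = set()
    for dest, sources in reversed(records):
        if dest not in wanted:
            continue
        wanted.discard(dest)            # this definition is now resolved
        for src in sources:
            if isinstance(src, int):
                offsets.add(src)        # reached a byte of the input file
            else:
                wanted.add(src)         # keep following this variable
    return sorted(offsets)


log = [
    ("t1", [0, 1]),
    ("t2", [5]),
    ("t3", ["t1"]),
    ("t4", ["t3", 9]),
    ("t5", ["t2"]),
]
print(reverse_taint(log, "t4"))  # prints [0, 1, 9]
```

Starting from the variable involved in the crash (`t4` here), only the records on a path to it are visited, so the unrelated chain through `t2` and `t5` never contributes offset 5.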
Taintgrind
The original Taintgrind was not useful for the purpose of the reverse taint. It was
missing a few parts.
- What was changed?
The “Read” function was not showing the size of data that was read.
The “Load” and “Store” functions were also not presenting the size of the operation.
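Taintgrind
[Screenshot: Taintgrind log with the tracked variables and the reported crash highlighted]
GDB
[Screenshot: GDB session showing the function where the crash occurred and the instruction that caused the crash. The crash occurred with the reference to the address stored in the RAX register.]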
GDB
Taintgrind
/work/taint-analysis/valgrind-3.15.0/build/bin/valgrind \
    --tool=taintgrind \
    --file-filter=/work/taint-analysis/CRASH \
    --compact=yes \
    --taint-start=0 \
    --taint-len=1504 \
    /work/taint-analysis/jhead-3.03/jhead \
    /work/taint-analysis/CRASH
Taintgrind
/work/taint-analysis/valgrind-3.15.0/build/bin/valgrind --tool=taintgrind
    Calls the Taintgrind tool.
--file-filter=/work/taint-analysis/CRASH
    The name of the file that needs to be tainted. It must be the FULL path.
--compact=yes
    Makes the log file smaller.
--taint-start=0
    Offset inside the file.
--taint-len=1504
    Taint size.
/work/taint-analysis/jhead-3.03/jhead /work/taint-analysis/CRASH
    Command
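When many crash files have to be processed, the command line can be assembled by a small helper. The sketch below uses only the options shown in the slides. The function name is hypothetical, and setting `--taint-len` to the size of the crash file is an assumption about where the value 1504 comes from.

```python
import os


def taintgrind_command(valgrind, target, crash_file):
    """Build the Taintgrind command line for tainting a whole crash file."""
    crash_file = os.path.abspath(crash_file)   # --file-filter needs the full path
    size = os.path.getsize(crash_file)         # assumption: taint the entire file
    return [
        valgrind,
        "--tool=taintgrind",
        f"--file-filter={crash_file}",
        "--compact=yes",
        "--taint-start=0",
        f"--taint-len={size}",
        target,
        crash_file,
    ]
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual crash file names.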
Crash Analysis with Reverse Taint
Reverse Tainting the Value
An example how the values are tracked.
Reverse Tainting the Value
Parts of one tainted variable diagram
rtaint
https://github.com/Cycura/rtaint
rtaint
- -f
The name of the log file created by Taintgrind. It can be in the compact
format.
- -g
The script can also produce a file in dot format, used to generate a graph.
- -s
The name of the file with the slice. Later, this can be used to display which
operations were tainted with the values.
- -k
The directory path where the Kaitai Struct files will be stored.
Reverse Tainting the Value
These are the indexes from the file that are causing the crash.
Crash File with Kaitai Struct
Reverse Tainting the Value
- With or without the file size?
What is the probability that files of different sizes with the same Kaitai Struct will
have different root causes?
- What is the relation between AFL Unique Algorithm and the Tainted
Input?
It is an open question...
Reverse Tainting the Value - Results
413 total crashes found by 4 instances (1 master and 3 slaves)
- master - 44 crashes
- slave1 - 116 crashes
- slave2 - 124 crashes
- slave3 - 129 crashes
349 crashes were reproduced under Taintgrind
177 crashes had a unique Kaitai structure.
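One way to read these numbers is that crashes are bucketed by the set of input offsets that taint the crashing value. The sketch below illustrates that reading only. It is an assumption about the grouping criterion, not the procedure used to obtain the figures above, and `group_by_tainted_offsets` is a hypothetical name.

```python
from collections import defaultdict


def group_by_tainted_offsets(crashes):
    """Group crash files whose crash depends on the same input offsets.

    `crashes` maps a crash file name to the iterable of tainted offsets
    reported for it (hypothetical input, e.g. from a reverse-taint run).
    """
    groups = defaultdict(list)
    for name, offsets in crashes.items():
        groups[frozenset(offsets)].append(name)
    return groups
```

The number of keys in the returned mapping is the number of distinct tainted-offset signatures, which can then be compared with the number of crashes the fuzzer reported as unique.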
Slicing
It is the computation of the set of program
statements, the program slice, that may
affect the values at some point of interest,
referred to as a slicing criterion.*
https://en.wikipedia.org/wiki/Program_slicing
Graph
Taintgrind and rtaint allow you to
create a dot graph that can be
rendered with the Graphviz
package.
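For readers unfamiliar with the dot format, the following sketch shows how a list of taint edges could be written out as a DOT digraph. It is a generic illustration with a hypothetical `to_dot` helper and made-up node names, not the output format produced by Taintgrind or rtaint.

```python
def to_dot(edges, name="taint"):
    """Render (source, destination) pairs as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


print(to_dot([("in[4]", "t1"), ("t1", "rax")]))
```

Saved to a file such as `taint.dot`, the result can be rendered with Graphviz, for example with `dot -Tpng taint.dot -o taint.png`.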
Simple Example
Crash Analysis with Reverse Taint
Powered by Taintgrind
Powered by ...
Free
Moflow - https://github.com/Cisco-Talos/moflow/tree/master/BAP-0.7-moflow
Triton - https://triton.quarkslab.com/
BARF - https://github.com/programa-stic/barf-project
libdft - http://www.cs.columbia.edu/~vpk/research/libdft/
Commercial
Binary Ninja - https://blog.trailofbits.com/2019/08/29/reverse-taint-analysis-using-binary-ninja/
TETRANE - https://www.tetrane.com/
What Next?
Updates
- Address: The tainting starts from the last line inside the file. This is
useful when there is a crash, but there is no way to taint an
arbitrary instruction if the application doesn’t crash.
- Scripts: An IDA Pro/Ghidra/Binary Ninja script for highlighting the tainted
instructions. This will make it easier to identify the data flow.
- Speed: The way it is currently written makes it slow. Optimization
or a change of language (thinking about Rust) is required.
What Next?
Issues
Currently Taintgrind doesn't work on ARM
processors. This is caused by Valgrind itself, which is missing
some of the ARM conversions. The bug has already been
reported.
Summary
The solution is based on Valgrind/Taintgrind. This means that it supports every
system supported by Valgrind itself (+), but it also suffers from Valgrind's issues
(-).
The process of creating the taint log is time-consuming (-).
rtaint can be used in most cases, making the analysis “faster” and
automated. It is easy to incorporate into other tools (+).
Python may not be the best choice for rtaint. Too slow? (-)
It requires more testing on real-life applications. I’m happy to receive any
feedback :) (+)
References and Interesting Docs
https://github.com/wmkhoo/taintgrind
http://valgrind.org/
http://valgrind.org/docs/memcheck2005.pdf
https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf
https://www2.cs.arizona.edu/~debray/Publications/bit-level-taint.pdf
http://bitblaze.cs.berkeley.edu/papers/dta%2B%2B-ndss11.pdf
http://shell-storm.org/blog/Taint-analysis-and-pattern-matching-with-Pin/
https://www.blackhat.com/docs/eu-15/materials/eu-15-Kim-Triaging-Crashes-With-Backward-Taint-Analysis-For-ARM-Architecture.pdf
Special Thanks
Wei Ming Khoo
Questions
Thank you :)
mzmyslowski@cycura.com
@marekzmyslowski
https://github.com/Cycura/rtaint
https://twitter.com/H4x0r54

More Related Content

What's hot

MariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システムMariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システム
Kouhei Sutou
 
RHEL on Azure、初めの一歩
RHEL on Azure、初めの一歩RHEL on Azure、初めの一歩
RHEL on Azure、初めの一歩
Ryo Fujita
 
VictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - PreviewVictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - Preview
VictoriaMetrics
 
BLS署名の実装とその応用
BLS署名の実装とその応用BLS署名の実装とその応用
BLS署名の実装とその応用
MITSUNARI Shigeo
 
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
devCAT Studio, NEXON
 
趣味と仕事の違い、現場で求められるアプリケーションの可観測性
趣味と仕事の違い、現場で求められるアプリケーションの可観測性趣味と仕事の違い、現場で求められるアプリケーションの可観測性
趣味と仕事の違い、現場で求められるアプリケーションの可観測性
LIFULL Co., Ltd.
 
Tortoise gitで日本語ファイル名を使うときのgitの選択について
Tortoise gitで日本語ファイル名を使うときのgitの選択についてTortoise gitで日本語ファイル名を使うときのgitの選択について
Tortoise gitで日本語ファイル名を使うときのgitの選択について
Kiyoshi SATOH
 
レシピの作り方入門
レシピの作り方入門レシピの作り方入門
レシピの作り方入門
Nobuhiro Iwamatsu
 
何となく勉強した気分になれるパーサ入門
何となく勉強した気分になれるパーサ入門何となく勉強した気分になれるパーサ入門
何となく勉強した気分になれるパーサ入門
masayoshi takahashi
 
Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
Wei-Bo Chen
 
글쓰는 개발자 모임, 글또
글쓰는 개발자 모임, 글또글쓰는 개발자 모임, 글또
글쓰는 개발자 모임, 글또
Seongyun Byeon
 
Azure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
Azure Pipeline Tutorial | Azure DevOps Tutorial | EdurekaAzure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
Azure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
Edureka!
 
HttpClient詳解、或いは非同期の落とし穴について
HttpClient詳解、或いは非同期の落とし穴についてHttpClient詳解、或いは非同期の落とし穴について
HttpClient詳解、或いは非同期の落とし穴について
Yoshifumi Kawai
 
小さなサービスも契約する時代
小さなサービスも契約する時代小さなサービスも契約する時代
小さなサービスも契約する時代
Ryo Mitoma
 
twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧
twMVC
 
Docker入門 - 基礎編 いまから始めるDocker管理
Docker入門 - 基礎編 いまから始めるDocker管理Docker入門 - 基礎編 いまから始めるDocker管理
Docker入門 - 基礎編 いまから始めるDocker管理
Masahito Zembutsu
 
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
Simplilearn
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
Affan Syed
 
Kubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったことKubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったこと
h-otter
 
YoctoでLTSディストリを作るには
YoctoでLTSディストリを作るにはYoctoでLTSディストリを作るには
YoctoでLTSディストリを作るには
wata2ki
 

What's hot (20)

MariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システムMariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システム
 
RHEL on Azure、初めの一歩
RHEL on Azure、初めの一歩RHEL on Azure、初めの一歩
RHEL on Azure、初めの一歩
 
VictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - PreviewVictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - Preview
 
BLS署名の実装とその応用
BLS署名の実装とその応用BLS署名の実装とその応用
BLS署名の実装とその応用
 
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
강성훈, 실버바인 대기열 서버 설계 리뷰, NDC2019
 
趣味と仕事の違い、現場で求められるアプリケーションの可観測性
趣味と仕事の違い、現場で求められるアプリケーションの可観測性趣味と仕事の違い、現場で求められるアプリケーションの可観測性
趣味と仕事の違い、現場で求められるアプリケーションの可観測性
 
Tortoise gitで日本語ファイル名を使うときのgitの選択について
Tortoise gitで日本語ファイル名を使うときのgitの選択についてTortoise gitで日本語ファイル名を使うときのgitの選択について
Tortoise gitで日本語ファイル名を使うときのgitの選択について
 
レシピの作り方入門
レシピの作り方入門レシピの作り方入門
レシピの作り方入門
 
何となく勉強した気分になれるパーサ入門
何となく勉強した気分になれるパーサ入門何となく勉強した気分になれるパーサ入門
何となく勉強した気分になれるパーサ入門
 
Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
 
글쓰는 개발자 모임, 글또
글쓰는 개발자 모임, 글또글쓰는 개발자 모임, 글또
글쓰는 개발자 모임, 글또
 
Azure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
Azure Pipeline Tutorial | Azure DevOps Tutorial | EdurekaAzure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
Azure Pipeline Tutorial | Azure DevOps Tutorial | Edureka
 
HttpClient詳解、或いは非同期の落とし穴について
HttpClient詳解、或いは非同期の落とし穴についてHttpClient詳解、或いは非同期の落とし穴について
HttpClient詳解、或いは非同期の落とし穴について
 
小さなサービスも契約する時代
小さなサービスも契約する時代小さなサービスも契約する時代
小さなサービスも契約する時代
 
twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧
 
Docker入門 - 基礎編 いまから始めるDocker管理
Docker入門 - 基礎編 いまから始めるDocker管理Docker入門 - 基礎編 いまから始めるDocker管理
Docker入門 - 基礎編 いまから始めるDocker管理
 
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker...
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
Kubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったことKubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったこと
 
YoctoでLTSディストリを作るには
YoctoでLTSディストリを作るにはYoctoでLTSディストリを作るには
YoctoでLTSディストリを作るには
 

Similar to Crash Analysis with Reverse Taint

nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable codenullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
n|u - The Open Security Community
 
DEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WPDEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WP
Amr Thabet
 
Breaking av software
Breaking av softwareBreaking av software
Breaking av softwareJoxean Koret
 
Breaking av software
Breaking av softwareBreaking av software
Breaking av software
Thomas Pollet
 
Breaking Antivirus Software
Breaking Antivirus SoftwareBreaking Antivirus Software
Breaking Antivirus Softwarerahmanprojectd
 
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
Akmal Hisyam
 
Malware 101 by saurabh chaudhary
Malware 101 by saurabh chaudharyMalware 101 by saurabh chaudhary
Malware 101 by saurabh chaudhary
Saurav Chaudhary
 
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingShowing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingDan Kaminsky
 
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
Call Graph Agnostic Malware Indexing (EuskalHack 2017)Call Graph Agnostic Malware Indexing (EuskalHack 2017)
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
Joxean Koret
 
What
WhatWhat
Whatanity
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
Andrey Karpov
 
Malware Classification Using Structured Control Flow
Malware Classification Using Structured Control FlowMalware Classification Using Structured Control Flow
Malware Classification Using Structured Control Flow
Silvio Cesare
 
Debugging and optimization of multi-thread OpenMP-programs
Debugging and optimization of multi-thread OpenMP-programsDebugging and optimization of multi-thread OpenMP-programs
Debugging and optimization of multi-thread OpenMP-programs
PVS-Studio
 
A Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow DetectionA Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow Detection
ITIIIndustries
 
Parallel Lint
Parallel LintParallel Lint
Parallel Lint
PVS-Studio
 
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Iosif Itkin
 
44CON 2014 - Breaking AV Software
44CON 2014 - Breaking AV Software44CON 2014 - Breaking AV Software
44CON 2014 - Breaking AV Software
44CON
 
How to find 56 potential vulnerabilities in FreeBSD code in one evening
How to find 56 potential vulnerabilities in FreeBSD code in one eveningHow to find 56 potential vulnerabilities in FreeBSD code in one evening
How to find 56 potential vulnerabilities in FreeBSD code in one evening
PVS-Studio
 
Cyber Defense Forensic Analyst - Real World Hands-on Examples
Cyber Defense Forensic Analyst - Real World Hands-on ExamplesCyber Defense Forensic Analyst - Real World Hands-on Examples
Cyber Defense Forensic Analyst - Real World Hands-on Examples
Sandeep Kumar Seeram
 

Similar to Crash Analysis with Reverse Taint (20)

nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable codenullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
 
DEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WPDEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WP
 
Breaking av software
Breaking av softwareBreaking av software
Breaking av software
 
Breaking av software
Breaking av softwareBreaking av software
Breaking av software
 
Breaking Antivirus Software
Breaking Antivirus SoftwareBreaking Antivirus Software
Breaking Antivirus Software
 
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
Breaking Antivirus Software - Joxean Koret (SYSCAN 2014)
 
Malware 101 by saurabh chaudhary
Malware 101 by saurabh chaudharyMalware 101 by saurabh chaudhary
Malware 101 by saurabh chaudhary
 
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingShowing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
 
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
Call Graph Agnostic Malware Indexing (EuskalHack 2017)Call Graph Agnostic Malware Indexing (EuskalHack 2017)
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
 
What
WhatWhat
What
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
 
Malware Classification Using Structured Control Flow
Malware Classification Using Structured Control FlowMalware Classification Using Structured Control Flow
Malware Classification Using Structured Control Flow
 
Debugging and optimization of multi-thread OpenMP-programs
Debugging and optimization of multi-thread OpenMP-programsDebugging and optimization of multi-thread OpenMP-programs
Debugging and optimization of multi-thread OpenMP-programs
 
A Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow DetectionA Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow Detection
 
Attacking antivirus
Attacking antivirusAttacking antivirus
Attacking antivirus
 
Parallel Lint
Parallel LintParallel Lint
Parallel Lint
 
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
 
44CON 2014 - Breaking AV Software
44CON 2014 - Breaking AV Software44CON 2014 - Breaking AV Software
44CON 2014 - Breaking AV Software
 

Crash Analysis with Reverse Taint

  • 1. Crash Analysis with Reverse Taint Powered by Taintgrind Marek Zmysłowski V. 12122019
  • 2. whoami Security Researcher @ Interested in fuzzing and vulnerability finding Fan of The Matrix and Hacker movies Co-organizer of “H4x0r5 %40 Warsaw” meetings
  • 3. Crash Analysis with Reverse Taint
  • 4. Crash Analysis with Reverse Taint - What is a crash? Crash, or system crash, occurs when a computer program (...) stops functioning properly and exits. * An application typically crashes when it performs an operation that is not allowed by the operating system. The operating system then triggers an exception or signal in the application. Unix applications traditionally responded to the signal by dumping core. * - Why is it important to identify a crash? “performs an operation that is not allowed”. This indicates that a bug exists inside the application. Once the crash is identified and can be reproduced, the bug can be found. *https://en.wikipedia.org/wiki/Crash_(computing)
  • 5. Crash Analysis with Reverse Taint - How to identify a crash? “operating system then triggers an exception or signal” Different operating systems provide different mechanisms to “collect” crashes. Some perform a core dump, storing the process memory (Linux); others attach a debugger so that the user can start a debugging session with the crashed application (Windows). - How do we get crashes? “That, Detective, is the ‘right question.’” “I, Robot” (2004)
  • 6. Hunting in the wild - How can a crash be found? By accident, or by one of the most popular techniques, which will also be mentioned here: fuzzing. The sooner the bug is found in the production process, the lower the cost. - So what is ‘fuzzing’? The idea behind fuzzing is very simple. Let’s take a malformed input and feed it to the application. Maybe it will crash. Of course, how the input is “chosen” and how the crashes are caught is a topic for another presentation(s).
  • 7. Fuzzers The godfather of all is, of course, AFL. However, recent years brought multiple fuzzers used for different purposes. Some of them are different clones of AFL, some of them try to do things differently. Everyone can find something for themselves. Examples: American Fuzzy Lop, HonggFuzz, AFL++, Angora, QSYM, WinAFL
  • 8. Real Example - Fuzzing jhead is used to display and manipulate data contained in the Exif header of JPEG images from digital cameras. By default, jhead displays the more useful camera settings from the file in a user-friendly format. The version used here is 3.03. http://www.sentex.net/~mwandel/jhead/
  • 9. Real Example - Fuzzing
  • 11. Crash vs Bug - What is the difference between Crash and Bug? A crash is the result of incorrectly working code caused by a bug. Sometimes the place of the crash and the place of the bug are “the same”. And sometimes not ... - Is one crash caused by one bug? No.
  • 12. Crash vs Bug Case 1. One bug causes one crash. This is the easiest situation as the identification is straightforward.
  • 13. Crash vs Bug Case 2. One bug can cause a few crashes. This happens quite often, especially with simple buffer overflows where a “size” variable is used. Direct reads or writes that access different memory regions cause different crashes.
  • 14. Crash vs Bug Case 3. A few bugs can cause one crash. This depends on how we identify a crash. The simplest example is a frame processor: for different types of frame, the size parser works incorrectly and may cause different crashes for different paths.
  • 15. Crash vs Bug In an additional experiment we computed a portion of ground truth. We applied all patches to cxxfilt from the version we fuzzed up until the present. We grouped together all inputs that a particular patch caused to now gracefully exit [11], confirming that the patch represented a single conceptual bugfix. We found that all 57,142 crashing inputs deemed “unique” by coverage profiles were addressed by 9 distinct patches. Stack hashes did better, but still over-counted bugs. Instead of the bug mapping to, say 500 AFL coverage-unique crashes in a given trial, it would map to about 46 stack hashes, on average. Stack hashes were also subject to false negatives: roughly 16% of hashes for crashes from one bug were shared by crashes from another bug. In five cases, a distinct bug was found by only one crash, and that crash had a non-unique hash, meaning that evidence of a distinct bug would have been dropped by “de-duplication.” “Evaluating Fuzz Testing” https://arxiv.org/pdf/1808.09700.pdf
  • 16. Crash vs Bug So what is needed?
  • 17. Crash Analysis with Reverse Taint
  • 18. Crash Analysis with Reverse Taint - What is crash analysis and why do we need it? Crash analysis is the process of evaluating the exploitability of a crash and identifying its root cause. If you are fuzzing something, the number of crashes can be huge. Also, the impact and consequences (criticality) that a bug might have depend on the application technology and the system. - So what exactly do we analyze? There are two major things to analyze: the crash and the bug.
  • 19. Analysis - Crash - What type of crash is it? For example: Out-of-bounds read, NULL Pointer Dereference, Buffer Overflow, etc. - Is the crash exploitable? Part of the identification process is to find out whether the crash can be used to achieve something more than just crashing the application: read a piece of data, overwrite memory or execute code. - Critical or exploitable - what is the difference? The exploitability is related only to the bug and the crash itself. The criticality is related to the whole environment. A Safe NULL Pointer Dereference means something different for nuclear power plant software than for a kids' game.
  • 20. Analysis - Bug The second important part of the analysis is to identify the bug. As mentioned before, there can be different relations between a bug and a crash. It is also important to know how the inputs (which bits and bytes) correlate with the bug. This, of course, may later influence the crash and its exploitability. - What different relations are there? User data can control the crash directly (e.g., an offset inside the table is calculated based on user data) or indirectly (e.g., the incorrect branch is taken). For example: NULL Pointer Dereference vs Safe NULL Pointer Dereference
  • 21. Analysis - Bug (Direct vs. Indirect) Indirect: void *pointer = NULL; int index = 0; if (user_data > 100) pointer[index] = 0; Direct: char table[100]; char user_data; table[user_data] = 0;
  • 22. Analysis - Few Interesting Tools crashwalk (https://github.com/bnagy/crashwalk) - cwtriage: It runs crash files with instrumentation and outputs results in various formats. - cwdump: It summarizes crashes in a crashwalk database by major / minor stack hash. Although AFL (for example) already de-dupes crashes, bucketing summarizes those crashes by an order of magnitude or more. Crashes that bucket the same have exactly the same stack contents, so they're likely (not guaranteed) to be the same bug. - cwfind: It is a simple utility to output the filenames of all crashes matching a given hash. I use it in combination with xargs to bulk delete / move crash files.
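The bucketing idea described on this slide can be illustrated with a minimal Python sketch. This is not crashwalk's actual hashing algorithm; the frame depth, the use of SHA-1, and the backtraces below are assumptions made up purely for illustration.

```python
import hashlib
from collections import defaultdict


def stack_hash(frames, depth):
    """Hash the top `depth` frames of a backtrace (innermost frame first)."""
    joined = "|".join(frames[:depth])
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()[:12]


def bucket_crashes(crashes, depth=5):
    """Group crash files whose top `depth` frames are identical."""
    buckets = defaultdict(list)
    for path, frames in crashes.items():
        buckets[stack_hash(frames, depth)].append(path)
    return dict(buckets)


# Hypothetical backtraces, invented for this example only.
crashes = {
    "id_000": ["ProcessExifDir", "process_EXIF", "ReadJpegSections", "main"],
    "id_001": ["ProcessExifDir", "process_EXIF", "ReadJpegSections", "main"],
    "id_002": ["Get16u", "ProcessExifDir", "process_EXIF", "main"],
}

buckets = bucket_crashes(crashes)
for digest, paths in buckets.items():
    print(digest, sorted(paths))
```

Two of the three crash files land in the same bucket because their backtraces match. As the slide notes, an identical stack only makes the same bug likely, not guaranteed.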
  • 23. Analysis - Few Interesting Tools afl-utils (https://github.com/rc0r/afl-utils) - afl-collect: Copies all crash sample files from an afl synchronisation directory (used by multiple afl instances when run in parallel) into a single location providing easy access for further crash analysis. Also executes exploitable on them and removes uninteresting crashes. - afl-minimize: Helps to create a minimized corpus from samples of a parallel fuzzing job.
  • 26. Analysis - Few Interesting Tools afl-analyze It takes an input file, attempts to sequentially flip bytes, and observes the behavior of the tested program. It then color-codes the input based on which sections appear to be critical, and which are not. While not bulletproof, it can often offer quick insights into complex file formats. https://lcamtuf.blogspot.com/2016/02/say-hello-to-afl-analyze.html
  • 27. afl-analyze *Of course it works, it just does not always give expected results.
  • 28. Crash Analysis with Reverse Taint
  • 29. Tainting - What is tainting? The purpose of dynamic taint analysis is to track information flow between sources and sinks. Any program value whose computation depends on data derived from a taint source is considered tainted. Any other value is considered untainted. - What are the types of tainting? ● The direct value is tainted ● Indirect/Control flow ● Address/Pointer relation https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf
  • 30. Tainting - Types Direct Value: Y = X + 2. Indirect/Control Flow: if (X > 2) Y = 5 else Y = 10. Address/Pointer: Y = A[X].
  • 31. Tainting Propagation (Policy) Depending on the application, different rules can be used to propagate the taint. The policy is the set of rules that define how taint on the source operands is propagated to the destination. In standard taint analysis, the destination operand is typically marked as tainted if any of the source operands is tainted, regardless of how the specific semantics of the instruction affect its destination operands.
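A minimal Python sketch of the standard policy described on this slide: the destination is tainted if any source is tainted. The variable names and the three statements are illustrative assumptions and do not come from Taintgrind.

```python
def propagate(tainted, dest, sources):
    """Standard policy: dest becomes tainted if any source is tainted."""
    if any(src in tainted for src in sources):
        tainted.add(dest)
    else:
        tainted.discard(dest)  # dest was overwritten with untainted data


tainted = {"X"}                      # X is derived from a taint source
propagate(tainted, "Y", ["X", "2"])  # Y = X + 2
propagate(tainted, "Z", ["Y", "Y"])  # Z = Y - Y (always 0, yet marked tainted)
propagate(tainted, "W", ["5"])       # W = 5

print(sorted(tainted))
```

The `Z = Y - Y` case shows why ignoring instruction semantics leads to over-tainting: the result never depends on the input, but the policy still marks it.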
  • 32. Tainting - Issues - What is over-tainting? Over-tainting occurs when code or data identified by the analysis as tainted is not in fact influenced by any taint source (false positive). - What is under-tainting? Under-tainting occurs when code or data that is influenced by a taint source is not identified by the analysis as tainted (false negative). Such imprecision can be problematic, especially in systems where the result of the taint analysis is critically important.
  • 34. - What is Valgrind?
  • 35. Valgrind It is an instrumentation framework for building dynamic analysis tools. It comes with a set of tools each of which performs some kind of debugging, profiling, or similar task that helps you improve your programs. Valgrind's architecture is modular, so new tools can be created easily and without disturbing the existing structure. http://valgrind.org
  • 36. Valgrind IR Early versions of Valgrind had an x86-specific, part D&R (disassemble-and-resynthesise), part C&A (copy-and-annotate), assembly-code-like IR in which the units of translation were basic blocks. Since then Valgrind has had an architecture-neutral, D&R, single-static-assignment (SSA) IR that is more similar to what might be used in a compiler. IR blocks are superblocks: single-entry, multiple-exit stretches of code. *http://valgrind.org/docs/valgrind2007.pdf
  • 37. Single-Static-Assignment (SSA) It is a property of an intermediate representation (IR), which requires that each variable is assigned exactly once, and every variable is defined before it is used. Existing variables in the original IR are split into versions, new variables typically indicated by the original name with a subscript in textbooks, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element. *https://en.wikipedia.org/wiki/Static_single_assignment_form
  • 38. - What is Taintgrind?
  • 39. Taintgrind Taintgrind is based on Valgrind's MemCheck and Flayer plugin. Taintgrind borrows the bit-precise shadow memory from MemCheck and only propagates explicit data flow. This means that Taintgrind will not propagate taint in control structures such as if-else, for-loops and while-loops. Taintgrind will also not propagate taint in dereferenced tainted pointers. http://valgrind.org/docs/memcheck2005.pdf
  • 40. Taintgrind - Propagation Rules 1. The direct value is tainted 2. Indirect/Control flow 3. Address/Pointer relation
  • 41. Taintgrind - Propagation Rules - What are the Taintgrind propagation rules? The granularity for memory operations is 1 byte. For register operations it is the size of the operand: even if only one byte is used, the whole register is still tainted. In such cases Taintgrind is over-tainting. However, because Taintgrind handles only the first type (direct values), it is also under-tainting.
  • 42. Taintgrind - Propagation Rules [Diagram: WRITE and READ operations; over-tainting in bit/byte operations]
  • 43. Taintgrind Here is an example of the logs and how the taint is propagated through the file. The job is to find the whole path from the end of the file back to the beginning. One instruction can be tainted by multiple inputs.
  • 44. Taintgrind The original Taintgrind was not suitable for reverse tainting. It was missing a few parts. - What was changed? The “Read” function was not showing the size of the data that was read. The “Load” and “Store” functions were also not presenting the size of the operation.
  • 46. GDB [Screenshot annotations: function where the crash occurred; instruction that caused the crash]
  • 47. GDB The crash occurred on a reference to the address stored in the RAX register.
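  • 49. Taintgrind Command: /work/taint-analysis/valgrind-3.15.0/build/bin/valgrind --tool=taintgrind --file-filter=/work/taint-analysis/CRASH --compact=yes --taint-start=0 --taint-len=1504 /work/taint-analysis/jhead-3.03/jhead /work/taint-analysis/CRASH
      - /work/taint-analysis/valgrind-3.15.0/build/bin/valgrind --tool=taintgrind: Calling the Taintgrind tool.
      - --file-filter=/work/taint-analysis/CRASH: This is the name of the file that needs to be tainted. It must be the FULL path.
      - --compact=yes: Makes the log file smaller.
      - --taint-start=0: Offset inside the file.
      - --taint-len=1504: Taint size.
      - /work/taint-analysis/jhead-3.03/jhead /work/taint-analysis/CRASH: The command to run.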
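The command line on this slide can be assembled programmatically, for example when many crash files have to be processed. The helper below is a hypothetical sketch (its name and parameters are not part of Taintgrind or rtaint); it only uses the options listed on the slide and, as an assumption, taints the whole file when no length is given.

```python
import os
import shlex


def taintgrind_command(valgrind, target, crash_file, taint_len=None, taint_start=0):
    """Build the argument list for one Taintgrind run (hypothetical helper)."""
    crash_file = os.path.abspath(crash_file)  # the file filter needs the full path
    if taint_len is None:
        # Assumption: taint everything from taint_start to the end of the file.
        taint_len = os.path.getsize(crash_file) - taint_start
    return [
        valgrind,
        "--tool=taintgrind",
        f"--file-filter={crash_file}",
        "--compact=yes",
        f"--taint-start={taint_start}",
        f"--taint-len={taint_len}",
        target,
        crash_file,
    ]


cmd = taintgrind_command(
    "/work/taint-analysis/valgrind-3.15.0/build/bin/valgrind",
    "/work/taint-analysis/jhead-3.03/jhead",
    "/work/taint-analysis/CRASH",
    taint_len=1504,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where the paths exist. Building a list rather than a shell string avoids quoting problems with unusual crash file names.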
  • 50. Crash Analysis with Reverse Taint
  • 51. Reverse Tainting the Value An example of how the values are tracked.
  • 52. Reverse Tainting the Value Parts of one tainted variable diagram
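The backward tracking shown on these slides can be sketched in a few lines of Python. The log format here is a deliberately simplified assumption (a list of destination/sources pairs in execution order, with integers standing for byte offsets in the input file); it is not Taintgrind's real log format, and the function is not rtaint's implementation.

```python
def reverse_taint(log, crash_value):
    """Walk the log from the last entry to the first, collecting input offsets."""
    wanted = {crash_value}  # values whose origin is still unresolved
    offsets = set()
    for dest, sources in reversed(log):
        if dest not in wanted:
            continue
        wanted.discard(dest)  # the most recent definition has been found
        for src in sources:
            if isinstance(src, int):
                offsets.add(src)  # reached a byte of the input file
            else:
                wanted.add(src)   # keep following the data flow backwards
    return sorted(offsets)


# Hypothetical, simplified log: (destination, [sources]) in execution order.
log = [
    ("t1", [4]),
    ("t2", [5]),
    ("t3", ["t1", "t2"]),
    ("t4", [9]),
    ("rax", ["t3"]),
]

print(reverse_taint(log, "rax"))
```

Starting from the value involved in the crash, only the entries that actually feed it are visited, so the unrelated byte at offset 9 is not reported.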
  • 54. rtaint - -f: This is the name of the log file created by Taintgrind. It can be in the compact version. - -g: The script can also produce a file in dot format that is used to generate a graph. - -s: This is the name of the file with the slice. Later, this can be used to display which operations were tainted with the values. - -k: This is the path of the directory where the Kaitai Struct files will be stored.
  • 55. Reverse Tainting the Value These are the indexes from the file that are causing the crash.
  • 57. Reverse Tainting the Value - With or without the file size? What is the probability that files of different sizes with the same Kaitai Struct will have a different root cause? - What is the relation between the AFL unique-crash algorithm and the tainted input? It is an open question...
  • 58. Reverse Tainting the Value - Results 413 total crashes found by 4 instances (1 master and 3 slaves) - master - 44 crashes - slave1 - 116 crashes - slave2 - 124 crashes - slave3 - 129 crashes 349 crashes were reproduced under Taintgrind. 177 crashes had a unique Kaitai Struct.
  • 59. Slicing It is the computation of the set of program statements, the program slice, that may affect the values at some point of interest, referred to as a slicing criterion.* https://en.wikipedia.org/wiki/Program_slicing
  • 60. Graph Taintgrind and rtaint make it possible to create a dot graph that can be converted with the graphviz package.
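As an illustration of what such a dot file contains, the sketch below emits Graphviz DOT text from a simplified dependency list. The input format and node naming are assumptions for this example; they do not reproduce the file that rtaint generates.

```python
def to_dot(log, name="taint"):
    """Render (destination, [sources]) pairs as a Graphviz digraph."""
    lines = [f"digraph {name} {{"]
    for dest, sources in log:
        for src in sources:
            # Integers stand for byte offsets in the input file.
            label = f"input[{src}]" if isinstance(src, int) else src
            lines.append(f'    "{label}" -> "{dest}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical dependency list, invented for this example.
log = [
    ("t1", [4]),
    ("t2", [5]),
    ("t3", ["t1", "t2"]),
    ("rax", ["t3"]),
]

dot = to_dot(log)
print(dot)
```

Saved to a file such as `taint.dot`, the output can be rendered with the Graphviz command `dot -Tpng taint.dot -o taint.png`.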
  • 62. Crash Analysis with Reverse Taint Powered by Taintgrind
  • 63. Powered by ... Moflow - https://github.com/Cisco-Talos/moflow/tree/master/BAP-0.7-moflow Binary Ninja - https://blog.trailofbits.com/2019/08/29/reverse-taint-analysis-using-binary-ninja/ Triton - https://triton.quarkslab.com/ BARF - https://github.com/programa-stic/barf-project libdft - http://www.cs.columbia.edu/~vpk/research/libdft/ Commercial Free TETRANE - https://www.tetrane.com/
  • 64. What Next? Updates - Address: The tainting starts from the last line inside the file. This is useful when there is a crash, but there is no way to start from an arbitrary instruction if the application doesn’t crash. - Scripts: An IDA Pro/Ghidra/Binary Ninja script for highlighting the tainted instructions. This will help to easily identify the data flow. - Speed: The way it is currently written makes it slow. Optimization or a change of language (thinking about Rust) is required.
  • 65. What Next? Issues Currently Taintgrind doesn't work on ARM processors. This is caused by Valgrind itself, which is missing some of the ARM conversions. The bug has already been reported.
  • 66. Summary The solution is based on Valgrind/Taintgrind. This means that it supports all the systems supported by Valgrind itself (+) But it also suffers from Valgrind's issues (-) The process of creating the taint log is time-consuming (-) rtaint can be used in most cases, making the analysis “faster” and automated. It is easy to incorporate into other tools. (+) Python may not be the best choice for rtaint. Too slow? (-) It requires more testing on real-life applications. I’m happy to receive any feedback :) (+)
  • 67. References and Interesting Docs https://github.com/wmkhoo/taintgrind http://valgrind.org/ http://valgrind.org/docs/memcheck2005.pdf https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf https://www2.cs.arizona.edu/~debray/Publications/bit-level-taint.pdf http://bitblaze.cs.berkeley.edu/papers/dta%2B%2B-ndss11.pdf http://shell-storm.org/blog/Taint-analysis-and-pattern-matching-with-Pin/ https://www.blackhat.com/docs/eu-15/materials/eu-15-Kim-Triaging-Crashes-With-Backward-Taint-Analysis-For-ARM-Architecture.pdf