Catch and Release: A New Look at Detecting and Mitigating highly obfuscated Exploit Kits
1. CATCH AND RELEASE: A NEW LOOK AT DETECTING AND MITIGATING HIGHLY OBFUSCATED EXPLOIT KITS
BY MOHAMED SAHER AND AHMED GARHY
2. AGENDA
Our Intent
Rethinking Evasions
Domain of the Problem
Current Problem
Problem with Current Solutions
Solution #1 First Method
Solution #2 Second Method
4. OUR INTENT
Is this function malicious?
function Translate(objects, offset, size) {
    var length = 4;
    for (var i = 0; i < size; i++) {
        var r = rc.substr(0, length);
        if (offset > 0) {
            r = r.substr(offset) + r.substr(0, offset);
        }
        objects[i] = r.substr(0, r.length);
    }
}
Without understanding the context in which a function is used, it is
very difficult to determine whether it is malicious
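To see what the function computes in isolation, the sketch below runs it against a hypothetical global `rc`. The slide never defines `rc`, so the value used here is an assumption made purely for illustration. With that input the function only fills an array with a rotated four-character prefix, which by itself says nothing about intent.

```javascript
// Hypothetical value: the slide leaves `rc` undefined.
var rc = "abcdef";

function Translate(objects, offset, size) {
  var length = 4;
  for (var i = 0; i < size; i++) {
    var r = rc.substr(0, length);                  // first four characters
    if (offset > 0) {
      r = r.substr(offset) + r.substr(0, offset);  // rotate left by `offset`
    }
    objects[i] = r.substr(0, r.length);            // store a copy of the result
  }
}

var filled = [];
Translate(filled, 1, 3);
console.log(filled); // three copies of "bcda"
```

Whether this rotation prepares a harmless lookup table or part of a payload depends entirely on what `rc` holds and who consumes `objects`.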
7. OUR INTENT
What about this script?
<script>
    var a = '%25%33%43%69%66%72%61%6d%65 ...';
    var b = unescape(unescape(a));
    var spray = new Function(unescape(b));
</script>
An “expert’s eye” can probably tell that it looks suspicious.
The two examples are in fact equivalent.
Our intent is to enable an attack using the first example,
without depending on obfuscation like the second example,
and to propose a superior method for detecting both
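As a small analyst-side illustration, the sketch below peels the layers of percent-encoding from the visible prefix of the string in the script above. Only the prefix shown on the slide is used; the elided remainder is not guessed. The helper name `peelPercentEncoding` and the `maxLayers` limit are assumptions introduced for this example.

```javascript
// Repeatedly apply unescape() until the text stops changing,
// counting how many encoding layers were removed.
function peelPercentEncoding(input, maxLayers = 10) {
  let current = input;
  let layers = 0;
  while (layers < maxLayers) {
    const next = unescape(current);
    if (next === current) break; // fixed point: nothing left to decode
    current = next;
    layers++;
  }
  return { decoded: current, layers };
}

// Only the prefix visible on the slide; the rest is elided in the source.
const visiblePrefix = '%25%33%43%69%66%72%61%6d%65';
const peeled = peelPercentEncoding(visiblePrefix);
console.log(peeled); // { decoded: '<iframe', layers: 2 }
```

The prefix is double-encoded: the first pass yields `%3Ciframe` and the second yields `<iframe`, which is why the script needs nested `unescape` calls.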
12. RETHINKING EVASIONS
Designing a new architecture
Use a message-oriented architecture (MOA) to split the attack into
disparate, self-contained messages, which we refer to as “units of
work”
This is a variation of the “script splitting” technique, except that a
message exists within a local scope and is destroyed after it
serves its purpose
Does not require DOM manipulation to hide “magic strings”
Avoids the “magic redirect IFRAME” that can be a trigger for some
analyzers
17. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
No artifact exists that can be parsed or scanned for patterns,
characteristics, and definitions
An alternative to loading JavaScript in “clear text”
Load one message at a time, forcing each message to be
analyzed independently (remember “units of work”)
WebSockets are a perfect candidate for both MOA and
bypassing HTTP from a web environment
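The detection consequence of analyzing each message independently can be shown with a deliberately simplified sketch. The signature list, the scanner, and the three message fragments are all hypothetical. The point is only that a pattern scanner working on one unit at a time can miss what a scan of the reassembled content would catch.

```javascript
// Hypothetical signature list for a naive pattern scanner.
const SIGNATURES = ['<iframe', 'unescape('];

function matchesSignature(text) {
  return SIGNATURES.some((sig) => text.includes(sig));
}

// One logical string delivered as three independent messages (illustrative).
const unitsOfWork = ['<if', 'ra', 'me src="about:blank">'];

// Scanning each message on its own finds nothing...
const perMessageHits = unitsOfWork.map(matchesSignature);

// ...while scanning the reassembled content matches a signature.
const wholeArtifactHit = matchesSignature(unitsOfWork.join(''));

console.log(perMessageHits, wholeArtifactHit);
```

An analyzer that never sees a whole artifact has to keep its own cross-message state to recover this view.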
20. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Two components are involved: client and server
Client: Listen, Invoke
Server: State, Send
23. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Two components involved, client and server
For each accepted connection from a client, the server
maintains a state machine
Messages are essentially commands and do not depend on each
other (remember “units of work”)
The client evaluates a message, invokes it, and destroys it
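The division of labor described above can be modeled in memory, without any networking, as a rough sketch. The class names, the `sink` parameter, and the harmless message bodies are assumptions made for illustration; the slides do not show the authors' implementation.

```javascript
// Server side: one state machine per accepted connection.
class ConnectionState {
  constructor(script) {
    this.script = script;   // ordered list of message sources
    this.position = 0;      // all progress is tracked on the server
  }
  nextMessage() {
    return this.position < this.script.length
      ? this.script[this.position++]
      : null;
  }
}

class MessageServer {
  constructor(script) {
    this.script = script;
    this.connections = new Map();
  }
  accept(clientId) {
    this.connections.set(clientId, new ConnectionState(this.script));
  }
  send(clientId) {
    return this.connections.get(clientId).nextMessage();
  }
}

// Client side: listen, invoke, destroy. Nothing survives between messages.
function onMessage(source, sink) {
  let unit = new Function('sink', source); // evaluate the message
  unit(sink);                              // invoke it
  unit = null;                             // drop the only reference
}

// Harmless placeholder messages.
const server = new MessageServer(['sink.push("first")', 'sink.push("second")']);
server.accept('a');
server.accept('b');

const seenByA = [];
const seenByB = [];
onMessage(server.send('a'), seenByA);
onMessage(server.send('a'), seenByA);
onMessage(server.send('b'), seenByB);
console.log(seenByA, seenByB);
```

Each connection advances independently, and the client holds no record of where it is in the sequence, which is what leaves an analyzer on the client side with so little to inspect.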
27. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Limit control flow and function call hierarchy
The only control flow on the client is listening for and
invoking a message
The server does not guarantee message order and may send
NOP messages as part of an attack to trick certain analyzers
“Monkey patch” functions that are dynamically evaluated in
messages to trick certain analyzers
32. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Limit control flow and function call hierarchy
Getting creative in transport format
WebSockets behave much like simple TCP pipes once the connection is
established, so data can be represented on the wire in an
application-specific way
No longer restricted to sending JavaScript in clear text
Create custom binary format
Send message in binary on the wire
01001000011001010110110001101100011011110010000001001000011
00001011011010110001001110101011100100110011100100001
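The bit string above can be decoded only by guessing a format. The sketch below assumes 8-bit ASCII characters, most significant bit first; that assumption is not stated on the slide, which is exactly the point being made. The helper name `decodeAsAscii` is introduced here for illustration.

```javascript
// The two wrapped lines from the slide, joined back into one bit string.
const bits =
  '01001000011001010110110001101100011011110010000001001000011' +
  '00001011011010110001001110101011100100110011100100001';

// Assumption: 8 bits per character, ASCII, most significant bit first.
function decodeAsAscii(bitString) {
  let text = '';
  for (let i = 0; i + 8 <= bitString.length; i += 8) {
    text += String.fromCharCode(parseInt(bitString.slice(i, i + 8), 2));
  }
  return text;
}

const decodedText = decodeAsAscii(bits);
console.log(decodedText); // "Hello Hamburg!"
```

With a different group size or bit order the same 112 bits would yield something else entirely.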
34. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Limit control flow and function call hierarchy
Getting creative in transport format
WebSockets behave much like simple TCP pipes once the connection is established,
so data can be represented on the wire in an application-specific way
No longer restricted to sending JavaScript in clear text
Create custom binary format
Send message in binary on the wire
Simply looking at a binary message won't give hints about what its contents
are – is it an audio file, an image, even text?
To even begin to understand a binary message, its format specification
needs to be known beforehand; otherwise it is a very challenging problem
in its own right
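A short illustration of why the format specification matters: the same four bytes read as text, as a big-endian integer, and as a little-endian integer give three unrelated answers. The byte values are arbitrary and chosen only for this example.

```javascript
// Four arbitrary bytes with no accompanying format specification.
const rawBytes = new Uint8Array([0x48, 0x65, 0x6c, 0x6c]);
const view = new DataView(rawBytes.buffer);

const asText = String.fromCharCode(...rawBytes);   // read as ASCII text
const asBigEndian = view.getUint32(0, false);      // read as big-endian uint32
const asLittleEndian = view.getUint32(0, true);    // read as little-endian uint32

console.log(asText, asBigEndian.toString(16), asLittleEndian.toString(16));
```

Nothing in the bytes themselves says which reading is the intended one.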
37. RETHINKING EVASIONS
Designing a new architecture
Avoiding HTTP
Avoiding client side state
Limit control flow and function call hierarchy
Getting creative in transport format
Confusing the Context
Remember this function?
function Translate(objects, offset, size) {
    var length = 4;
    for (var i = 0; i < size; i++) {
        var r = rc.substr(0, length);
        if (offset > 0) {
            r = r.substr(offset) + r.substr(0, offset);
        }
        objects[i] = r.substr(0, r.length);
    }
}
Now that this function arrives through our binary format, we ask the question again: how do you
determine whether it is malicious?
42. DOMAIN OF THE PROBLEM
How can we define a malicious website?
How can we detect a malicious website?
How can we detect obfuscation?
How can we identify obfuscation used for malicious purposes?
How can we categorize what is malicious and what is not?
45. CURRENT PROBLEM
Exploit delivery relies on JavaScript at some point
JavaScript obfuscation keeps growing in
complexity
Current solutions lag well behind the technology
49. PROBLEMS WITH CURRENT
SOLUTIONS
Rely heavily on the presence of certain function calls
(fromCharCode, eval, unescape, etc.) that are not proof of
malice in themselves and have plenty of legitimate use cases
DOM and CSS selectors
Client side proxies for client-server interaction
Client side template engines
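A minimal sketch of the false-positive problem: a scanner that flags scripts by function name alone also flags ordinary code. The name list mirrors the functions mentioned above; the scanner itself and the benign snippet are hypothetical.

```javascript
const RISKY_NAMES = ['fromCharCode', 'eval', 'unescape'];

// Naive check: does the source call any function from the list?
function flaggedNames(source) {
  return RISKY_NAMES.filter((name) =>
    new RegExp('\\b' + name + '\\s*\\(').test(source)
  );
}

// Benign snippet: builds the letters A to Z for a UI control.
const benignSource = `
  var letters = [];
  for (var i = 0; i < 26; i++) {
    letters.push(String.fromCharCode(65 + i));
  }
`;

const flags = flaggedNames(benignSource);
console.log(flags); // [ 'fromCharCode' ]
```

The snippet is flagged even though it does nothing suspicious, which is why name-based indicators alone make a weak basis for a verdict.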
51. PROBLEMS WITH CURRENT
SOLUTIONS
Rely heavily on the presence of certain function calls
(fromCharCode, eval, unescape, etc.) that are not proof of
malice in themselves and have plenty of legitimate use cases
Limited sets of characteristics
Probabilistic decisions are directly proportional to the
characteristics extracted
59. DYNAMIC ANALYSIS
AdHoc Forwarding
Create a middle layer between the browser and the JS
engine
Analyze the control flow graph (CFG) of the scripts being executed
Analyze the call hierarchy and the order of function calls
Analyze certain combinations of functions used, including
known high-risk ones
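One way to picture the combination analysis is a check for an ordered chain of calls inside a recorded trace. The trace, the chain, and the helper `containsInOrder` are hypothetical; the slides do not specify which combinations the authors look for.

```javascript
// Returns true if `pattern` occurs in `trace` as an ordered subsequence
// (other calls may appear in between).
function containsInOrder(trace, pattern) {
  let next = 0;
  for (const call of trace) {
    if (call === pattern[next]) next++;
    if (next === pattern.length) return true;
  }
  return pattern.length === 0;
}

// Hypothetical call trace captured by the middle layer.
const recordedTrace = ['getElementById', 'unescape', 'fromCharCode', 'unescape', 'eval'];

// Hypothetical high-risk combination: decode, rebuild a string, execute it.
const RISKY_CHAIN = ['unescape', 'fromCharCode', 'eval'];

const chainFound = containsInOrder(recordedTrace, RISKY_CHAIN);
console.log(chainFound); // true
```

Looking at the order of calls rather than their mere presence separates this trace from one that only happens to use `fromCharCode`.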
64. DYNAMIC ANALYSIS
AdHoc Forwarding
Browser Automation
Attach to IE process
Use shdocvw.dll to automate COM callbacks
Capture events while they trigger and manipulate them
Analyze in the same manner as AdHoc Forwarding
67. DYNAMIC ANALYSIS
AdHoc Forwarding
Browser Automation
Browser In-Memory Injection
Inject JS in DOM to monitor events
Use a JS debugger (Firebug or another)
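A rough sketch of what injected monitoring code might look like: wrap a built-in so every call is logged before the original runs. The `instrument` helper and the `callLog` array are assumptions for illustration, and a real monitor would have to cover many more entry points.

```javascript
const callLog = [];

// Replace target[name] with a logging wrapper; returns a function
// that restores the original.
function instrument(target, name) {
  const original = target[name];
  target[name] = function (...args) {
    callLog.push({ name, args });
    return original.apply(this, args);
  };
  return () => { target[name] = original; };
}

const restore = instrument(globalThis, 'unescape');
const monitoredResult = unescape('%3Ciframe'); // behaves as before, but is logged
restore();

console.log(monitoredResult, callLog);
```

The wrapped call returns the same value as the original, so the monitored script sees no difference in behavior.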
74. STATIC ANALYSIS (METHOD 1)
Analyze the scripts and categorize them based on certain
criteria
Web page encoding
Detecting the language used and extracting features
Checking the WHOIS record for the web page
Determine probabilistically which category it belongs to
76. SHANNON’S ENTROPY
Formula: $H(X) = -\sum_{i} p(x_i) \log_2 p(x_i)$
We compute the Shannon entropy of the file only as a
secondary signal, not as the main criterion for deciding
whether it is malicious
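A minimal sketch of the entropy calculation over the characters of a script, in bits per character. Treating each character as one symbol is an assumption; the slides do not say whether the authors work on bytes or characters.

```javascript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(text) {
  const counts = new Map();
  for (const ch of text) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  const total = [...text].length;
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / total;        // relative frequency of this symbol
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

console.log(shannonEntropy('aaaa')); // 0: a single repeated symbol
console.log(shannonEntropy('aabb')); // 1: two equally likely symbols
console.log(shannonEntropy('abcd')); // 2: four equally likely symbols
```

Heavily encoded or packed scripts tend to score higher than hand-written code, but so do minified libraries, which fits the deck's choice to treat entropy as a side signal only.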
78. NAÏVE BAYESIAN
A machine-learning technique that can be used to predict to
which category a particular data case belongs
Given the above formula: an event A is independent of
event B if the conditional probability is the same as the
marginal probability, that is, P(A | B) = P(A)
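For reference, the standard relations behind a naive Bayes classifier can be written as follows. The symbols $C$ (a category) and $f_1, \dots, f_n$ (extracted features) are introduced here for illustration and do not come from the slides.

```latex
% Bayes' theorem for a category C given features f_1, ..., f_n
\[
P(C \mid f_1, \dots, f_n) =
  \frac{P(C)\, P(f_1, \dots, f_n \mid C)}{P(f_1, \dots, f_n)}
\]

% Naive assumption: features are conditionally independent given C
\[
P(f_1, \dots, f_n \mid C) = \prod_{i=1}^{n} P(f_i \mid C)
\]

% Resulting decision rule: pick the most probable category
\[
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(f_i \mid C)
\]
```

The denominator is the same for every category, so it can be dropped when only the most probable category is needed.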
80. LAPLACIAN SMOOTHING
To avoid a zero in any factor of the joint probability, we
use the add-one smoothing technique.
Given an observation x = (x1, …, xd) from a multinomial
distribution with N trials and parameter vector
θ = (θ1, …, θd), a "smoothed" version of the data gives the
estimator
$\hat{\theta}_i = \frac{x_i + \alpha}{N + \alpha d}, \quad i = 1, \dots, d$
where α > 0 is the smoothing parameter (α = 0 corresponds
to no smoothing, α = 1 to add-one smoothing)
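The two ideas fit together as in the following sketch of a tiny multinomial naive Bayes classifier with additive smoothing. The labels, the toy training data, and the choice of function names as features are all assumptions for illustration; the slides list page encoding, language, and WHOIS data as criteria and do not show the authors' feature set.

```javascript
// Train: count tokens per class and remember the vocabulary size.
function train(samples, alpha = 1) {
  const vocabulary = new Set();
  const classes = {};
  for (const { label, tokens } of samples) {
    if (!classes[label]) classes[label] = { docs: 0, total: 0, counts: {} };
    const c = classes[label];
    c.docs++;
    for (const token of tokens) {
      vocabulary.add(token);
      c.counts[token] = (c.counts[token] || 0) + 1;
      c.total++;
    }
  }
  return { classes, vocabSize: vocabulary.size, docs: samples.length, alpha };
}

// Classify: log prior plus the sum of smoothed log likelihoods.
function classify(model, tokens) {
  let best = null;
  let bestScore = -Infinity;
  for (const [label, c] of Object.entries(model.classes)) {
    let score = Math.log(c.docs / model.docs);
    for (const token of tokens) {
      const count = c.counts[token] || 0;
      // (x_i + alpha) / (N + alpha * d): never zero when alpha > 0
      score += Math.log(
        (count + model.alpha) / (c.total + model.alpha * model.vocabSize)
      );
    }
    if (score > bestScore) {
      bestScore = score;
      best = label;
    }
  }
  return best;
}

// Hypothetical toy training set.
const model = train([
  { label: 'benign', tokens: ['getElementById', 'addEventListener'] },
  { label: 'benign', tokens: ['querySelector', 'addEventListener'] },
  { label: 'suspicious', tokens: ['unescape', 'eval', 'fromCharCode'] },
  { label: 'suspicious', tokens: ['eval', 'unescape'] },
]);

console.log(classify(model, ['eval', 'unescape']));
console.log(classify(model, ['addEventListener', 'eval']));
```

Without smoothing, a single token unseen in one class would drive that class's probability to zero regardless of the other evidence; with α = 1 the second query still resolves on the balance of both tokens.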
83. STATIC ANALYSIS (METHOD 2)
How is JS executed/handled?
1. The code is scanned for all function declarations. Each
declaration is executed by creating a function object, and
a named reference to that function is created so that the
function can be called from within a statement.
2. The statements are evaluated and executed in the order
in which they appear on the page after it has fully loaded.
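The two steps can be observed directly. In the sketch below, a function declaration is callable before the line where it appears, while a function assigned in a statement is not available until that statement runs. All names are illustrative.

```javascript
// Step 1 in action: the declaration below has already been processed,
// so the call works even though it appears first in the source.
const earlyResult = declaredLater();

function declaredLater() {
  return 'callable before its line is reached';
}

// Step 2 in action: the assignment is an ordinary statement, so the
// variable exists but still holds undefined at this point.
let earlyError = null;
try {
  assignedLater();
} catch (e) {
  earlyError = e.name;
}

var assignedLater = function () {
  return 'only callable after the assignment runs';
};

console.log(earlyResult, earlyError); // the second value is "TypeError"
```

For a static analyzer this means the textual order of a script is not the order in which its functions become available.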
90. STATIC ANALYSIS (METHOD 2)
Semantic analysis to focus on “what does this mean”
An optimizer-compiler for JS that focuses on structure
rather than on extracted function invocations
91. OPTIMIZER-COMPILER
The following describes the architecture of any ordinary
compiler, and of the current compiler as well
Lexer → (Tokens) → Parser → (AST) → Translator → (IR) → Optimizer
101. OPTIMIZER-COMPILER
In this phase, after the AST has been generated and
converted into an IR, the optimizer applies standard
optimization techniques to the JS input
Optimizer
Hidden Classes
Type Inference
Inline Caches
Function Synthesis
Inline Expansion
Loop Invariant Code Motion
Constant Folding
Copy Propagation
Common Sub-Expression Elimination
Dead Code Elimination
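To illustrate why such passes help analysis, here is a minimal sketch of constant folding over a tiny expression tree. The node shapes follow ESTree naming (`Literal`, `BinaryExpression`, `Identifier`) as an assumption; the slides do not describe the authors' IR, and only the `+` operator on literals is handled.

```javascript
// Fold `+` over literal operands, bottom-up; leave everything else untouched.
function fold(node) {
  if (node.type !== 'BinaryExpression') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (node.operator === '+' && left.type === 'Literal' && right.type === 'Literal') {
    return { type: 'Literal', value: left.value + right.value };
  }
  return { ...node, left, right };
}

const lit = (value) => ({ type: 'Literal', value });
const plus = (left, right) => ({ type: 'BinaryExpression', operator: '+', left, right });

// The expression ('e' + 'v') + 'al', a common way to hide a name from string matching.
const splitName = plus(plus(lit('e'), lit('v')), lit('al'));
const folded = fold(splitName);

console.log(folded); // { type: 'Literal', value: 'eval' }
```

After folding, the hidden name is a single literal that later passes or a simple matcher can see; expressions involving identifiers are left in place for passes such as copy propagation to resolve.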