LINEAR TIME SHELLCODE DETECTION USING STATE
MACHINES AND OPERAND ANALYSIS ON THE RUNTIME
Abhishek Singh
Aditya Joshi
File-less Attack Detection/Process Investigator Team
INTRODUCTION
Shellcode is a small piece of code, mostly written in
assembly, used as a stage 1 payload when exploiting
software vulnerabilities.
Shellcode is also used as a bootstrap/loader to load
the stage 2 payload into memory.
Packed documents and malware also use embedded
shellcode for targeted attacks.
Identifying shellcode is an important way of detecting
new malware families and unclassified code injection
attacks.
Shellcode Analysis Workflow
Input Bytes -> Layer 1 (Page Selection) -> Layer 2 (Analyzer + State Machine) -> Aggregation -> Analysis Result
Layer 1 (Page Selection)
Input byte stream spanning multiple pages.
Single-pass O(n) page selection algorithm.
• Detect fs/gs references.
• Number of in-segment calls (near calls).
• Number of indirect calls (call register).
• Number of loops (backward relative jumps).
Layer 2 (Analyzer + State Machine)
• Operand expression analysis (offset analysis for specific data structures).
• Stack instructions (track stack push/pop/writes/reads).
• Call processing (parameter checking + return value checks).
• Loop processing (identify hashing and decryptor loops).
• Other exploit indicators (evaluating current IP and memory probing).
State tracked: register state, stack state, function/address accessed.
Aggregation
Aggregate all detection signals into the detection result.
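The Layer 1 counters can be illustrated with a rough single-pass byte scan. This is only a sketch under stated assumptions, not the presenters' implementation: it matches raw opcode bytes without instruction-aligned decoding (so it will produce false positives on data), and the names `scan_page`, `select_pages`, the 4096-byte page size, and the `min_signals` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumed page size


@dataclass
class PageFeatures:
    seg_refs: int = 0        # fs/gs segment-override prefixes (0x64, 0x65)
    near_calls: int = 0      # E8 rel32
    indirect_calls: int = 0  # FF /2 with a register operand (FF D0..D7)
    back_jumps: int = 0      # short jmp/jcc/loop with a negative rel8


def scan_page(page: bytes) -> PageFeatures:
    """Count candidate opcode patterns in one pass over the page bytes."""
    f = PageFeatures()
    n = len(page)
    for i, b in enumerate(page):
        nxt = page[i + 1] if i + 1 < n else None
        if b in (0x64, 0x65):
            f.seg_refs += 1
        elif b == 0xE8 and i + 4 < n:
            f.near_calls += 1
        elif b == 0xFF and nxt is not None and 0xD0 <= nxt <= 0xD7:
            f.indirect_calls += 1
        elif (b in (0xEB, 0xE2) or 0x70 <= b <= 0x7F) and nxt is not None and nxt >= 0x80:
            f.back_jumps += 1
    return f


def select_pages(stream: bytes, min_signals: int = 2) -> List[int]:
    """Return indices of pages showing at least `min_signals` distinct feature types."""
    selected = []
    for off in range(0, len(stream), PAGE_SIZE):
        f = scan_page(stream[off:off + PAGE_SIZE])
        counts = (f.seg_refs, f.near_calls, f.indirect_calls, f.back_jumps)
        if sum(1 for c in counts if c > 0) >= min_signals:
            selected.append(off // PAGE_SIZE)
    return selected


# mov rax, gs:[0x60] ; call rax ; loop -2, padded to one page, then a zero page
sample = bytes.fromhex("65488b042560000000ffd0e2fe").ljust(PAGE_SIZE, b"\x00")
print(select_pages(sample + bytes(PAGE_SIZE)))
```

Each byte is visited once, so the cost stays linear in the input size; only the selected pages would be handed to the more expensive Layer 2 analysis.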
Operand Analysis for Offset Access in Instruction Window
Module List Walk Pattern
• Deref +0x30 for the self reference
• Deref +0x60 for the PEB
• Deref +0x18 for Ldr
• Deref +0x10, +0x20, +0x30 for InLoadOrderModuleList, InMemoryOrderModuleList, InInitializationOrderModuleList
• Deref +0x30 for DllBase
• Deref +0x58 from the entry base
PE Export Walk Pattern
• +0x3c e_lfanew offset
• +0x88 export directory offset
• NumberOfNames and address offsets
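One way to picture offset analysis over an instruction window is as an ordered match of memory-operand displacements against the offsets listed above. The sketch below assumes the displacements have already been extracted by a disassembler; `matches_in_order`, the pattern tables, and the window size of 16 are hypothetical names and values chosen for illustration.

```python
# Each pattern step is a set of acceptable displacements (offsets taken from the slide).
MODULE_LIST_WALK = [{0x60}, {0x18}, {0x10, 0x20, 0x30}, {0x30}]
PE_EXPORT_WALK = [{0x3C}, {0x88}]


def matches_in_order(displacements, pattern, window=16):
    """True if every pattern step is matched, in order, inside some window
    of `window` consecutive displacements."""
    for start in range(len(displacements)):
        step = 0
        for d in displacements[start:start + window]:
            if d in pattern[step]:
                step += 1
                if step == len(pattern):
                    return True
    return False


# Displacements as they might appear in a PEB -> Ldr -> module list -> DllBase walk.
observed = [0x60, 0x18, 0x20, 0x08, 0x30]
print(matches_in_order(observed, MODULE_LIST_WALK))
print(matches_in_order(observed, PE_EXPORT_WALK))
```

Matching on displacements rather than byte patterns is what lets this kind of check survive register renaming and instruction reordering, although a real analyzer would also need to confirm that each dereference uses the result of the previous one.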
CALL PROCESSING
Register and stack values are tracked through the
state machine.
Detection is based on parameter type, parameter
count, and known constant usage.
[Figure: annotated call site showing the first, 5th, and 6th parameters, the constants used during parameter initialization, and known values]
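Parameter checking at a call site might look like the following sketch. It assumes the Windows x64 convention (first four arguments in rcx, rdx, r8, r9; the 5th and 6th at [rsp+0x20] and [rsp+0x28] at the call instruction) and that the state machine has already resolved some registers and stack slots to constants. The pattern table is hypothetical; the slide does not name specific APIs. The values 0x3000 (MEM_COMMIT | MEM_RESERVE) and 0x40 (PAGE_EXECUTE_READWRITE) are used only as a familiar example of known constants.

```python
ARG_REGS = ("rcx", "rdx", "r8", "r9")  # Windows x64 register arguments


def get_param(index, regs, stack):
    """Return the tracked value of the 1-based parameter, or None if unknown.
    `stack` maps rsp-relative offsets at the call site to tracked values."""
    if index <= 4:
        return regs.get(ARG_REGS[index - 1])
    return stack.get(0x20 + 8 * (index - 5))


# Hypothetical table: pattern name -> {parameter index: expected known constant}
KNOWN_CALL_PATTERNS = {
    "rwx_allocation": {3: 0x3000, 4: 0x40},
}


def match_call(regs, stack):
    """List every known pattern whose constants all appear in the tracked state."""
    hits = []
    for name, expected in KNOWN_CALL_PATTERNS.items():
        if all(get_param(i, regs, stack) == v for i, v in expected.items()):
            hits.append(name)
    return hits


regs = {"rcx": 0, "rdx": 0x1000, "r8": 0x3000, "r9": 0x40}
stack = {0x20: 1, 0x28: 2}
print(match_call(regs, stack))
```

Because the check works on tracked values rather than on a resolved import, it can still fire when the call target is reached indirectly through a register.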
STATE MACHINE WORKFLOW
For each instruction, the virtual processor will:
• Evaluate expressions.
• Process data moves.
• Process bit shifts.
• Process arithmetic.
• Process masks.
Virtual processor state:
• Register state
• Stack state
• Access to known data structures
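A toy version of such a virtual processor is sketched below. It assumes instructions arrive already decoded as `(op, operand1, operand2)` tuples and models only a handful of operations; unknown values are represented as `None` and propagate through arithmetic. The class and method names are hypothetical, and this is not meant to reflect the actual analyzer's design.

```python
MASK64 = (1 << 64) - 1


class VirtualProcessor:
    """Tracks constant register and stack values across simple instructions."""

    def __init__(self):
        self.regs = {}   # register name -> int, or None when unknown
        self.stack = []  # tracked pushed values, top of stack last

    def value(self, operand):
        # Immediates are ints; anything else is treated as a register name.
        if isinstance(operand, int):
            return operand
        return self.regs.get(operand)

    def step(self, op, a1=None, a2=None):
        if op == "push":
            self.stack.append(self.value(a1))
            return
        if op == "pop":
            self.regs[a1] = self.stack.pop() if self.stack else None
            return
        lhs, rhs = self.regs.get(a1), self.value(a2)
        if op == "mov":                      # data move
            self.regs[a1] = rhs
        elif op == "xor" and a1 == a2:       # zeroing idiom, result known
            self.regs[a1] = 0
        elif lhs is None or rhs is None:     # unknown input -> unknown output
            self.regs[a1] = None
        elif op == "add":                    # arithmetic
            self.regs[a1] = (lhs + rhs) & MASK64
        elif op == "sub":
            self.regs[a1] = (lhs - rhs) & MASK64
        elif op == "shl":                    # bit shifts
            self.regs[a1] = (lhs << (rhs & 63)) & MASK64
        elif op == "shr":
            self.regs[a1] = lhs >> (rhs & 63)
        elif op == "and":                    # masks
            self.regs[a1] = lhs & rhs
        elif op == "or":
            self.regs[a1] = lhs | rhs
        elif op == "xor":
            self.regs[a1] = lhs ^ rhs
        else:                                # unmodeled instruction
            self.regs[a1] = None

    def run(self, program):
        for ins in program:
            self.step(*ins)
        return self


vp = VirtualProcessor().run([
    ("xor", "rax", "rax"),
    ("mov", "rcx", 0x30),
    ("shl", "rcx", 1),
    ("and", "rcx", 0xF0),
    ("push", "rcx"),
    ("pop", "rdx"),
])
print(hex(vp.regs["rdx"]))
```

The point of evaluating expressions this way is that an offset such as 0x60 can be recognized even when it is never present as a literal in the byte stream.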
Demo: Shellcode detection using the State Machine and Operand Expression Analyzer
STAGE TWO
SCENARIO
Offense
• Reverse engineer the app to find an RCE
• Plant a rogue messaging peer in Azure
• Send a sequence of valid and invalid
messages to the vulnerable app to hijack
the instruction pointer
• Overwrite the return address with
cross-platform shellcode to get the vulnerable
app to connect back to the rogue peer
Defense
• Fix bug at source
• Monitor for unusual network activity
• Monitor for unusual process/task
creations
• Detect foreign code running in the
vulnerable app
[Diagram: vulnerable app running on a GCP Ubuntu VM and on an AWS Windows VM; rogue peer running on an Azure Windows VM]
• Cross platform peer messaging
application with RCE
• Application sends telemetry to Azure
endpoint
THE TARGET
OFFENSE: REVERSE ENGINEER APP
Linux Windows
OFFENSE: OPERATING SYSTEM DETECTION
Linux: FS value for ELF
• FS pointer
Windows: FS value for PE
OFFENSE: DETECTION EVASION
• Encoded Shellcode: NOT, XOR, insertion, or custom encoders
• Anti-Debugging Techniques: Detect whether the target application is being debugged
• Custom Prolog JMP hopping: Jump over hooks by supplying a custom prolog
• Hooking Detection: Read the first byte of the API for jumps/calls that indicate hooked
functions
• Polymorphism: Modify the shellcode to vary its byte patterns while achieving
the same outcome
DETECT: RUNTIME MEMORY FORENSICS
Windows
• ReadProcessMemory of candidate
segments in the target process
• Page selection algorithm
• Detect references to fs/gs
• Call and loop analysis
• Analysis and Outcome
• Call Processing
• Return value – rax
• Arguments – rcx, rdx, r8, r9
• Stack processing
• Decoder/Hashing loops
• Get current IP/Memory probing
• Network connection information
Linux
• process_vm_readv (kernel 3.2+) of
candidate mappings in the target process
• Page selection algorithm
• Fixed syscall table. Detect syscall usage.
• Call and loop analysis
• Analysis and Outcome
• Call Processing
• Return value/Syscall – rax
• Arguments – rdi, rsi, rdx, r10, r8, r9
• Stack processing
• Decoder/Hashing loops
• Get current IP/Memory probing
• Network connection information
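For the process_vm_readv path, candidate mappings could be chosen by parsing the text of `/proc/<pid>/maps` before any memory is read. The sketch below is an assumption about what "candidate" might mean (executable and either anonymous or writable); the slide does not state the selection criteria, and `candidate_mappings` is a hypothetical helper. It only parses the maps text and does not read process memory.

```python
def candidate_mappings(maps_text):
    """Return (start, end, perms, path) for mappings that look like
    candidates for injected code: executable and anonymous or writable."""
    out = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue
        start_s, end_s = parts[0].split("-")
        perms = parts[1]
        path = parts[5].strip() if len(parts) > 5 else ""
        executable = "x" in perms
        writable = "w" in perms
        anonymous = path == ""
        if executable and (anonymous or writable):
            out.append((int(start_s, 16), int(end_s, 16), perms, path))
    return out


SAMPLE_MAPS = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/app
7f0000000000-7f0000001000 rwxp 00000000 00:00 0
7ffd1000-7ffd2000 rw-p 00000000 00:00 0 [stack]
"""

for start, end, perms, path in candidate_mappings(SAMPLE_MAPS):
    print(hex(start), hex(end), perms, path or "<anonymous>")
```

On a live system the input would come from reading `/proc/<pid>/maps` for the target process, and each returned range would then be read and passed to the page selection step.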

BlueHat v18 || Linear time shellcode detection using state machines and operand analysis on the runtime
