We present CryFind, an automatic static tool that identifies cryptographic algorithms in a binary executable. Our main strategy is string matching: we search for cryptographic constants and API names. To widen the search range and improve the hit rate, the tool matches strings under different encodings and XOR'ed with different keys, and it incorporates techniques to extract strings built on the stack. As a result, the tool is more effective and efficient at detection than a wide range of state-of-the-art static analysis tools.
9. Encoding
Encoding                                   Value
Big Endian                                 0xfaceb00c
Little Endian                              0x0cb0cefa
Big Endian → Negative → Big Endian         0x05314ff4
Little Endian → Negative → Big Endian      0xf34f3106
Big Endian → Negative → Little Endian      0xf44f3105
Little Endian → Negative → Little Endian   0x06314ff3
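The six encodings in the table can be derived mechanically from one 32-bit constant. The sketch below is an illustration only: the function name `encoding_variants` and the dictionary keys are hypothetical, and "Negative" is assumed to mean two's-complement negation modulo 2^32, which is consistent with the values in the table.

```python
import struct

MASK32 = 0xFFFFFFFF  # assumption: constants are 32-bit words


def swap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def neg32(value: int) -> int:
    """Two's-complement negation modulo 2**32."""
    return (-value) & MASK32


def encoding_variants(value: int) -> dict:
    """Return the six encodings of a 32-bit constant (hypothetical key names)."""
    little = swap32(value)
    return {
        "big": value,
        "little": little,
        "big_neg_big": neg32(value),
        "little_neg_big": neg32(little),
        "big_neg_little": swap32(neg32(value)),
        "little_neg_little": swap32(neg32(little)),
    }


variants = encoding_variants(0xFACEB00C)
for name, v in variants.items():
    print(f"{name:18} 0x{v:08x}")
```

Each value can then be turned into a search pattern with `struct.pack(">I", v)`.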
10. Size
Size      Bytes  Value
FULLWORD  ALL    '\x01#Eg\x89\xab\xcd\xef\xfe\xdc\xba\x98vT2\x10'
DWORD     4      '\x01#Eg', '\x89\xab\xcd\xef', '\xfe\xdc\xba\x98', 'vT2\x10'
QWORD     8      '\x01#Eg\x89\xab\xcd\xef', '\xfe\xdc\xba\x98vT2\x10'
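The size variants amount to cutting the full constant into fixed-width chunks. A minimal sketch, assuming the constant length is a multiple of the chunk size; `split_constant` is a hypothetical helper name, not part of CryFind:

```python
def split_constant(data: bytes, size: int) -> list:
    """Split a constant into consecutive chunks of `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# The 16-byte example constant from the table above.
constant = bytes.fromhex("0123456789abcdeffedcba9876543210")

fullword = [constant]                  # FULLWORD: the whole constant
dwords = split_constant(constant, 4)   # DWORD: 4-byte chunks
qwords = split_constant(constant, 8)   # QWORD: 8-byte chunks
print(dwords)
print(qwords)
```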
16. Grouping the constants
Constants of one signature:
index  value
0      0x11111111
1      0x22222222
2      0x33333333
3      0x44444444
Matches found in the binary:
address  0x21 0x30 0x35 0x42 0x58 0x63 0x7f 0xa0
index    0    0    1    2    3    2    1    3
Q: Find the smallest interval that includes all constants
A: Binary search on the interval size
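The binary search over the interval size can be sketched as follows. This is one possible reading of the slide, not CryFind's actual code: the function names are hypothetical, and matches are assumed to be given as (address, constant index) pairs. Feasibility is monotone (if a span of width w contains every index, so does any wider span), which is what makes binary search applicable.

```python
from collections import Counter


def window_with_all(hits, num_constants, span):
    """Return (start, end) of a window no wider than `span` that contains
    every constant index, or None. `hits` must be sorted by address."""
    counts = Counter()
    left = 0
    for addr, idx in hits:
        counts[idx] += 1
        # Shrink from the left until the window fits into `span`.
        while addr - hits[left][0] > span:
            old = hits[left][1]
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
            left += 1
        if len(counts) == num_constants:
            return hits[left][0], addr
    return None


def smallest_interval(hits, num_constants):
    """Binary search on the interval size; returns (start, end) or None."""
    hits = sorted(hits)
    if not hits:
        return None
    lo, hi = 0, hits[-1][0] - hits[0][0]
    if window_with_all(hits, num_constants, hi) is None:
        return None  # some constant never occurs
    while lo < hi:
        mid = (lo + hi) // 2
        if window_with_all(hits, num_constants, mid) is not None:
            hi = mid
        else:
            lo = mid + 1
    return window_with_all(hits, num_constants, lo)


addresses = [0x21, 0x30, 0x35, 0x42, 0x58, 0x63, 0x7F, 0xA0]
indices = [0, 0, 1, 2, 3, 2, 1, 3]
start, end = smallest_interval(list(zip(addresses, indices)), 4)
print(hex(start), hex(end))  # 0x30 0x58
```

For the addresses on the slide, the smallest interval runs from 0x30 to 0x58 and covers indices 0, 1, 2 and 3.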
17. Search
1. Using Python's in operator
• O(nk + m) time
2. Using the AC (Aho-Corasick) algorithm
• O(n + m) time
• https://github.com/abusix/ahocorapy
3. Using YARA rules
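The first option, searching with Python's `in` operator, can be sketched with the standard library alone. The function name `search_constants` and the signature dictionary are hypothetical and only illustrate the idea: every pattern is checked separately against the whole binary.

```python
def search_constants(binary: bytes, signatures: dict) -> dict:
    """Map each signature name to the offsets where its bytes occur."""
    hits = {}
    for name, pattern in signatures.items():
        if pattern in binary:  # membership test, one scan per pattern
            offsets = []
            pos = binary.find(pattern)
            while pos != -1:
                offsets.append(pos)
                pos = binary.find(pattern, pos + 1)
            hits[name] = offsets
    return hits


# Toy input: two occurrences of the bytes 9e 37 79 b9.
binary = b"\x00\x00" + bytes.fromhex("9e3779b9") + b"\xff" + bytes.fromhex("9e3779b9")
signatures = {
    "tea_delta": bytes.fromhex("9e3779b9"),  # hypothetical signature name
    "missing": b"\xde\xad",
}
print(search_constants(binary, signatures))  # {'tea_delta': [2, 7]}
```

Because each pattern triggers its own pass over the binary, the cost grows with the number of patterns, which is the motivation for the Aho-Corasick and YARA options.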
19. Search with YARA
rule cry_134 {
    meta:
        id = 133
        name = "TEAN [32 rounds]"
        length = 8
    strings:
        $c_0_fullword_big = { 2037efc6b979379e }
        $c_0_fullword_little = { 9e3779b9c6ef3720 }
        $c_0_dword_big = { 2037efc6 }
        ...
        $c_0_qword_little = { 9e3779b9c6ef3720 }
        $c_0_qword_bnb = { dfc810394686c862 }
        ...
    condition:
        (any of ($c_0_fullword_*)) or
        ((any of ($c_0_dword_*)) and (any of ($c_1_dword_*))) or
        (any of ($c_0_qword_*))
}
String identifiers follow the pattern $c_<Index>_<Size>_<Encoding>, for example $c_0_fullword_big.
20. Bypass XOR Encryption
• Bypass XOR encryption with a key of length 1 (can be extended to longer keys)
• Brute-forcing all 256 possible keys is too slow
• Instead, search in the XOR difference array
Signature: 3, 5, 7 → XOR differences: 3 ⊕ 5 = 6, 5 ⊕ 7 = 2
Binary: 3 ⊕ 2, 5 ⊕ 2, 7 ⊕ 2, 4 → XOR differences: 6, 2, 1
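The difference-array idea works because a single-byte key cancels when adjacent bytes are XOR'ed: (a ⊕ k) ⊕ (b ⊕ k) = a ⊕ b. A minimal sketch under that assumption follows; the function names are hypothetical, and the signature must be at least two bytes long for the difference array to be non-empty.

```python
def xor_diff(data: bytes) -> bytes:
    """XOR of each byte with its successor; one byte shorter than `data`."""
    return bytes(a ^ b for a, b in zip(data, data[1:]))


def find_xored(signature: bytes, binary: bytes) -> list:
    """Return (offset, key) pairs where `signature` occurs in `binary`
    XOR'ed with a single-byte key (key 0 is a plain match)."""
    if len(signature) < 2:
        raise ValueError("signature must be at least 2 bytes long")
    sig_diff = xor_diff(signature)
    bin_diff = xor_diff(binary)
    results = []
    pos = bin_diff.find(sig_diff)
    while pos != -1:
        key = binary[pos] ^ signature[0]  # recover the key from the first byte
        results.append((pos, key))
        pos = bin_diff.find(sig_diff, pos + 1)
    return results


# Signature 3, 5, 7 hidden in the binary with key 2, as in the slide's example.
signature = bytes([3, 5, 7])
binary = bytes([9, 3 ^ 2, 5 ^ 2, 7 ^ 2, 4])
print(find_xored(signature, binary))  # [(1, 2)]
```

One pass over the difference array replaces 256 separate searches, and the key is recovered from the first matching byte.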
21. StackStrings
• Use the radare2 reverse-engineering framework for emulation
• Pull out all functions
• Build a spanning tree on each function's CFG to avoid cycles
• Emulate each path and dump the stack
• Search for cryptographic constants in the stack dump
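The spanning-tree step can be illustrated without radare2. The sketch below assumes the CFG is already available as a dictionary mapping each basic block to its successors (a hypothetical representation, not radare2's output format) and returns the root-to-leaf paths of a spanning tree. Every reachable block lies on some path and no path revisits a block, so emulation along each path terminates.

```python
def spanning_tree_paths(cfg: dict, entry) -> list:
    """Build a spanning tree of the CFG from `entry` and return its
    root-to-leaf paths. Edges to already-visited blocks are dropped,
    which removes every cycle."""
    visited = {entry}
    children = {entry: []}
    stack = [entry]
    while stack:
        node = stack.pop()
        for succ in cfg.get(node, []):
            if succ not in visited:
                visited.add(succ)
                children[node].append(succ)
                children[succ] = []
                stack.append(succ)

    paths = []

    def walk(node, prefix):
        path = prefix + [node]
        if not children[node]:
            paths.append(path)  # leaf of the tree: one path to emulate
        for child in children[node]:
            walk(child, path)

    walk(entry, [])
    return paths


# Toy CFG with a loop between B and D.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["B"]}
for path in spanning_tree_paths(cfg, "A"):
    print(" -> ".join(path))
```

Each returned path would then be emulated and the resulting stack contents searched for constants; that part depends on the emulator and is not shown here.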