Malware Analysis:
Java Bytecode
May 2012
Brian Baskin
@bbaskin
Update: Back Story
• Intrusion resulting in major monetary loss
• System with keylogger and unknown trojan
• Java drive-by identified:
• 52b989e6-783fc81c.jar
– MD5: ee18509d07bf591c73bd30091080e034
Update: Java IDX results
IDX file: JAR52b989e6-783fc81c2.idx (IDX File Version 6.03)
[*] Section 2 (Download History) found:
URL: http://173.224.71.132:8080/content/Qai.jar
IP: 173.224.71.132
<null>: HTTP/1.1 200 OK
content-length: 14869
last-modified: Thu, 15 Mar 2012 14:39:44 GMT
content-type: application/java-archive
date: Thu, 15 Mar 2012 18:55:12 GMT
server: nginx
deploy-request-content-type: application/x-java-archive
[*] Section 3 (Jar Manifest) found:
[*] Section 4 (Code Signer) found:
[*] Found: Data block. Length: 4
Data: Hex: 00000000
[*] Found: Data block. Length: 3
Data: 0 Hex: 300d0a
This “Section 4” data appears to be a pattern indicative of a BlackHole download.
First:
• Java Sucks
File details
• Java JAR with five included files
<Insert intrigue>
• But, first…
• WTF?
<Shrug and go back to work>
• Uncompress with internal Windows zip and let ‘er rip…
• Now, let’s take a look at one in WinHex
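The same first look can be scripted rather than done by hand. This is a minimal sketch, assuming the JAR is already loaded as bytes; the helper name `inspect_jar` is illustrative and not part of the original analysis. It hashes the archive, lists its entries, and flags the ones that start with the Java class magic.

```python
import hashlib
import io
import zipfile

CLASS_MAGIC = b"\xCA\xFE\xBA\xBE"  # first four bytes of every compiled class


def inspect_jar(jar_bytes):
    """Return (md5_hex, [(entry_name, size, is_class_file), ...])."""
    md5_hex = hashlib.md5(jar_bytes).hexdigest()
    entries = []
    # A JAR is a ZIP archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for info in jar.infolist():
            data = jar.read(info.filename)
            entries.append((info.filename, len(data), data.startswith(CLASS_MAGIC)))
    return md5_hex, entries
```

Reading the sample with `open(path, "rb").read()` and passing the result in only parses the archive; nothing inside it is executed.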
Yup… That’s Compiled Java
Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 CA FE BA BE 00 00 00 31 00 35 07 00 02 01 00 03 Êþº¾ 1 5
00000010 6D 5F 63 07 00 04 01 00 10 6A 61 76 61 2F 6C 61 m_c java/la
00000020 6E 67 2F 4F 62 6A 65 63 74 01 00 03 6D 5F 67 01 ng/Object m_g
00000030 00 12 4C 6A 61 76 61 2F 6C 61 6E 67 2F 4F 62 6A Ljava/lang/Obj
00000040 65 63 74 3B 01 00 03 6D 5F 65 01 00 13 5B 4C 6A ect; m_e [Lj
00000050 61 76 61 2F 6C 61 6E 67 2F 4F 62 6A 65 63 74 3B ava/lang/Object;
00000060 01 00 03 6D 5F 68 01 00 12 4C 6A 61 76 61 2F 6C m_h Ljava/l
00000070 61 6E 67 2F 53 74 72 69 6E 67 3B 01 00 0D 43 6F ang/String; Co
00000080 6E 73 74 61 6E 74 56 61 6C 75 65 08 00 0D 01 00 nstantValue
00000090 11 56 47 37 52 45 2D 53 57 54 34 45 2D 52 55 49 VG7RE-SWT4E-RUI
000000A0 4F 53 01 00 03 6D 5F 62 01 00 11 4C 6A 61 76 61 OS m_b Ljava
000000B0 2F 6C 61 6E 67 2F 43 6C 61 73 73 3B 01 00 03 6D /lang/Class; m
000000C0 5F 64 08 00 12 01 00 1E 47 59 37 38 54 47 44 45 _d GY78TGDE
000000D0 53 38 39 46 56 59 53 50 44 46 4A 50 39 55 56 46 S89FVYSPDFJP9UVF
000000E0 39 53 30 44 4A 47 01 00 03 6D 5F 61 01 00 15 4C 9S0DJG m_a L
000000F0 6A 61 76 61 2F 75 74 69 6C 2F 4D 61 70 24 45 6E java/util/Map$En
00000100 74 72 79 3B 01 00 08 5A 4B 4D 35 2E 34 2E 33 01 try; ZKM5.4.3
00000110 00 12 5B 4C 6A 61 76 61 2F 6C 61 6E 67 2F 43 6C [Ljava/lang/Cl
00000120 61 73 73 3B 01 00 08 3C 63 6C 69 6E 69 74 3E 01 ass; <clinit>
00000130 00 03 28 29 56 01 00 04 43 6F 64 65 09 00 01 00 ()V Code
00000140 1B 0C 00 05 00 06 0A 00 03 00 1D 0C 00 1E 00 1F
What do these mean?
• CAFEBABE = Magic value
• 0000 = Minor file version; 0031 = 0x31 (49), major file version (J2SE 5.0)
• 0035 = Constant pool count (53), then the constant pool: a huge pool of string values…
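The header fields can be read out programmatically. A minimal sketch, assuming only the first ten bytes of the hex dump; the function name `parse_class_header` is illustrative. All fields are big-endian: a 4-byte magic, then 2-byte minor version, major version, and constant pool count.

```python
import struct


def parse_class_header(data):
    """Decode the fixed 10-byte prefix of a Java class file."""
    magic, minor, major, cp_count = struct.unpack(">IHHH", data[:10])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return {"minor": minor, "major": major, "constant_pool_count": cp_count}


# First ten bytes of the WinHex dump: CA FE BA BE 00 00 00 31 00 35
header = bytes.fromhex("CAFEBABE 0000 0031 0035")
print(parse_class_header(header))
# {'minor': 0, 'major': 49, 'constant_pool_count': 53}
```

Major version 49 corresponds to J2SE 5.0, which matches the reading of 0x31 on the slide.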
Decompile?
• JD-GUI (Java Decompiler) - http://java.decompiler.free.fr/
• Because: decompilers > disassemblers
• Awesome, free tool to turn Java bytecode back into readable Java source
JD-GUI results:
public class m_a extends Expression
{
  public String m_i = z[2];
  public String m_c = z[3];
  private String m_h = z[4] + z[5].concat(z[1]);
  public String m_d = z[0];
  protected String m_e = z[6];
  private static final String[] z = {
    z(z("")), z(z("8\023")), z(z("")), z(z("")),
    z(z("\030\tX")), z(z("+\017")), z(z(" \023\0302=\"\005")) };
But, then…
private static char[] z(String paramString)
{
  // Byte code:
  //   0: aload_0
  //   1: invokevirtual 105 java/lang/String:toCharArray ()[C
  //   4: dup
  //   5: arraylength
  //   6: iconst_2
  //   7: if_icmpge +12 -> 19
  //   10: dup
  //   11: iconst_0
  //   12: dup2
  …
WTF?
• JD-GUI didn’t know how to parse the bytes… so it disassembled them.
• OK, fine.
• But, not 100% correctly
Some of this is wrong…
// 14: iconst_5
// 15: irem
// 16: tableswitch default:+52 -> 68, 0:+32->48, 1:+37->53, 2:+42->58, 3:+47->63
// 49: bipush 167
// 51: nop
// 52: ldc2_w 4157
// 55: goto +15 -> 70
// 58: bipush 64
// 60: goto +10 -> 70
// 63: bipush 76
// 65: goto +5 -> 70
So, let’s go to the hex editor
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
00000000 2A 59 BE 5F 03 3C A7 00 46 59 1B 5C 34 1B 08 70 *Y¾_ < FY 4 p
00000016 AA 00 00 00 00 00 00 34 00 00 00 00 00 00 00 03 ª 4
00000032 00 00 00 20 00 00 00 25 00 00 00 2A 00 00 00 2F % * /
00000048 10 4F A7 00 14 10 60 A7 00 0F 10 36 A7 00 0A 10 O ` 6
00000064 5C A7 00 05 10 5C 82 92 55 84 01 01 5F 5A 1B A3  ‚’U„ _Z £
00000080 FF BA BB 00 36 5A 5F B7 00 6C B6 00 6F B0 00 00 ÿº» 6Z_· l¶ o
00000096 00 00 00 01 00 5B 00 00 00 02 00 5C [  2
And consult the Java bible…
http://docs.oracle.com/javase/specs
This is better…
http://en.wikipedia.org/wiki/Java_bytecode_instruction_listings

Mnemonic     Opcode (hex)  Other bytes  Stack [before]→[after]     Description
aaload       32                         arrayref, index → value    load onto the stack a reference from an array
aastore      53                         arrayref, index, value →   store into a reference in an array
aconst_null  01                         → null                     push a null reference onto the stack
aload        19            1: index     → objectref                load a reference onto the stack from a local variable #index
And start filling in mnemonics
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2A 59 BE 5F 03 3C A7 00 46 59 1B 5C 34 1B 08 70
AA 00 00 00 00 00 00 34 00 00 00 00 00 00 00 03
00 00 00 20 00 00 00 25 00 00 00 2A 00 00 00 2F
10 4F A7 00 14 10 60 A7 00 0F 10 36 A7 00 0A 10
5C A7 00 05 10 5C 82 92 55 84 01 01 5F 5A 1B A3
FF BA BB 00 36 5A 5F B7 00 6C B6 00 6F B0 00 00
00 00 00 01 00 5B 00 00 00 02 00 5C
2A aload_0
59 dup
BE arraylength
5F swap
03 iconst_0
3C istore_1
A7 00 46 goto +70
So far, this looks similar to JD-GUI output…
Some tricky ones…
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2A 59 BE 5F 03 3C A7 00 46 59 1B 5C 34 1B 08 70
AA 00 00 00 00 00 00 34 00 00 00 00 00 00 00 03
00 00 00 20 00 00 00 25 00 00 00 2A 00 00 00 2F
10 4F A7 00 14 10 60 A7 00 0F 10 36 A7 00 0A 10
5C A7 00 05 10 5C 82 92 55 84 01 01 5F 5A 1B A3
FF BA BB 00 36 5A 5F B7 00 6C B6 00 6F B0 00 00
00 00 00 01 00 5B 00 00 00 02 00 5C
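tableswitch (case) statement (0xAA):
00 00 00 = padding (aligns the operands to a 4-byte boundary)
00 00 00 34 = Default, JMP +52 (0x34)
00 00 00 00 = low case value (0)
00 00 00 03 = high case value (3), so four branches (0-3)
00 00 00 20 JMP +32    00 00 00 25 JMP +37
00 00 00 2A JMP +42    00 00 00 2F JMP +47
All jump offsets are relative to the tableswitch opcode at offset 16.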
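The layout above can be checked against the raw bytes. A minimal sketch, assuming the first 48 bytes of the method body from the hex dump; `parse_tableswitch` is an illustrative name. After the opcode come zero to three padding bytes, then three big-endian 32-bit values (default offset, low case, high case), then one 32-bit jump offset per case.

```python
import struct

# Offsets 0..47 of the method body, copied from the hex dump.
CODE = bytes.fromhex(
    "2A 59 BE 5F 03 3C A7 00 46 59 1B 5C 34 1B 08 70 "
    "AA 00 00 00 00 00 00 34 00 00 00 00 00 00 00 03 "
    "00 00 00 20 00 00 00 25 00 00 00 2A 00 00 00 2F "
)


def parse_tableswitch(code, pc):
    """Decode the tableswitch at offset pc; return (default_target, {case: target})."""
    if code[pc] != 0xAA:
        raise ValueError("no tableswitch opcode at this offset")
    # Operands begin at the next offset that is a multiple of four.
    operands = (pc + 4) & ~3
    default, low, high = struct.unpack_from(">iii", code, operands)
    count = high - low + 1
    offsets = struct.unpack_from(">%di" % count, code, operands + 12)
    # Each offset is relative to the address of the opcode itself.
    targets = {low + i: pc + off for i, off in enumerate(offsets)}
    return pc + default, targets


print(parse_tableswitch(CODE, 16))
# (68, {0: 48, 1: 53, 2: 58, 3: 63})
```

The computed targets agree with the `-> 48`, `-> 53`, `-> 58`, `-> 63` and `-> 68` annotations in the disassembly.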
13: iload_1
14: iconst_5
15: irem
16: tableswitch default: JMP +52 -> 68, 0: JMP +32->48, 1: JMP +37->53, 2: JMP +42->58, 3: JMP +47->63
48: bipush 79
50: goto +20 -> 70
53: bipush 96
55: goto +15 -> 70
58: bipush 54
60: goto +10 -> 70
63: bipush 92
65: goto +5 -> 70
68: bipush 92
70: ixor
71: i2c
72: castore
73: iinc 1 1
76: swap
77: dup_x1
78: iload_1
79: if_icmpgt -70 -> 9
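Filling in mnemonics by hand is mechanical enough to script. The sketch below is a deliberately tiny disassembler, assuming only the bytes at offsets 48 to 81 of the hex dump; its opcode table covers just the instructions in that stretch and is not a general JVM disassembler. The names `OPCODES` and `disassemble` are illustrative.

```python
import struct

# Offsets 48..81 of the method body, copied from the hex dump.
BODY = bytes.fromhex(
    "10 4F A7 00 14 10 60 A7 00 0F 10 36 A7 00 0A 10 "
    "5C A7 00 05 10 5C 82 92 55 84 01 01 5F 5A 1B A3 "
    "FF BA"
)
BASE = 48  # offset of BODY[0] within the method

# opcode -> (mnemonic, operand kind); only what this stretch needs
OPCODES = {
    0x10: ("bipush", "byte"),
    0x84: ("iinc", "iinc"),
    0xA7: ("goto", "branch"),
    0xA3: ("if_icmpgt", "branch"),
    0x82: ("ixor", None),
    0x92: ("i2c", None),
    0x55: ("castore", None),
    0x5F: ("swap", None),
    0x5A: ("dup_x1", None),
    0x1B: ("iload_1", None),
}


def disassemble(code, base=0):
    """Yield (offset, text) per instruction, offsets relative to the method start."""
    pc = 0
    while pc < len(code):
        name, kind = OPCODES[code[pc]]
        addr = base + pc
        if kind == "byte":      # one signed byte operand
            (value,) = struct.unpack_from(">b", code, pc + 1)
            text, size = "%s %d" % (name, value), 2
        elif kind == "iinc":    # local variable index, signed increment
            index, const = struct.unpack_from(">Bb", code, pc + 1)
            text, size = "%s %d %d" % (name, index, const), 3
        elif kind == "branch":  # signed 16-bit offset from this opcode
            (off,) = struct.unpack_from(">h", code, pc + 1)
            text, size = "%s %+d -> %d" % (name, off, addr + off), 3
        else:
            text, size = name, 1
        yield addr, text
        pc += size


listing = list(disassemble(BODY, BASE))
for addr, text in listing:
    print("%d: %s" % (addr, text))

# The five bipush operands are the XOR key, in tableswitch order.
key = [int(text.split()[1]) for _, text in listing if text.startswith("bipush")]
print(key)  # [79, 96, 54, 92, 92]
```

Starting the walk at offset 48, the first tableswitch target, is what keeps the instruction boundaries aligned; starting one byte later would read the operand 0x4F as an opcode.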
Direct translation to Python
def decode(encoded):
    # XOR key bytes: the bipush operand of each tableswitch branch
    key0 = 79 # 0x4F
    key1 = 96 # 0x60
    key2 = 54 # 0x36
    key3 = 92 # 0x5C
    keydef = 92 # 0x5C
    newstr = ""
    for i in range(len(encoded)):
        pos = i % 5
        if pos == 0: newstr += chr(ord(encoded[i]) ^ key0)
        elif pos == 1: newstr += chr(ord(encoded[i]) ^ key1)
        elif pos == 2: newstr += chr(ord(encoded[i]) ^ key2)
        elif pos == 3: newstr += chr(ord(encoded[i]) ^ key3)
        else: newstr += chr(ord(encoded[i]) ^ keydef)
    return newstr

# Octal escapes stand for the unprintable characters in the encoded strings
codes = ["8\023", "\030\tX", "+\017", " \023\0302=\"\005"]
for code in codes: print(decode(code))
## All THAT just for a simple five-byte XOR key?!
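Once the key is known, the branch-per-position translation collapses to a repeating key. A minimal sketch, assuming the encoded strings carried octal escapes for their unprintable characters (written out here as `\023`, `\030`, `\017`, `\005`); the names `KEY` and `xor_decode` are illustrative.

```python
from itertools import cycle

# The five XOR key bytes, in tableswitch order (cases 0-3, then default).
KEY = (0x4F, 0x60, 0x36, 0x5C, 0x5C)


def xor_decode(encoded, key=KEY):
    """XOR each character with the repeating key."""
    return "".join(chr(ord(c) ^ k) for c, k in zip(encoded, cycle(key)))


samples = ["8\023", "\030\tX", "+\017", " \023\0302=\"\005"]
for s in samples:
    print(repr(s), "->", xor_decode(s))
# '8\x13' -> ws
# '\x18\tX' -> Win
# '+\x0f' -> do
# ' \x13\x182="\x05' -> os.name
```

XOR is its own inverse, so calling `xor_decode` on a decoded string returns the encoded form again.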
Results
Encoded                 Decoded
"8\023"                 ws
"\030\tX"               Win
"+\017"                 do (Windows)
" \023\0302=\"\005"     os.name
!{/ >!-zse >jv;q        regsvr32 -s "%s"
9⌂>2f:qf'%#z!!          java.io.tmpdir
ct                      .d
!|                      ll (.dll)
cu                      .e
5u                      xe (.exe)
FLASH, a-ah, King of the Impossible
• Same concept applies to all JIT runtimes
– e.g. Flash ActionScript
• CVE-2012-0779
– Sourced from Contagio
– Contains custom DoSWF encryption
– Adobe SWF Investigator to disassemble
– …
– Profit!
Update: AndroChef
• AndroChef: Commercial (shareware) Java Decompiler
• http://www.neshkov.com/ac_decompiler.html
• Decompiles sample just fine
– But where’s the fun in that?
Update: AndroChef - Code
private static String z(char[] var0) {
    for(int var1 = 0; var10000 > var1; ++var1) {
        char var10004 = var10001[var1];
        byte var10005;
        switch(var1 % 5) {
        case 0:
            var10005 = 16;
            break;
        case 1:
            var10005 = 61;
            break;
        case 2:
            var10005 = 64;
            break;
        case 3:
            var10005 = 76;
            break;
        default:
            var10005 = 62;
        }
        var10001[var1] = (char)(var10004 ^ var10005);
    }
    return (new String(var10001)).intern();
}
