
Malware analysis, threat intelligence and reverse engineering


Published on

In this presentation, I introduce the concepts of malware analysis, threat intelligence and reverse engineering. Experience or knowledge is not required.

Feel free to send me feedback via Twitter (@bartblaze) or email.

Blog post:


Mind the disclaimer.

Published in: Education


  1. 1. Malware Analysis Threat Intelligence Reverse Engineering Bart Parys
  2. 2. Introduction ● 8+ years' career in information security ● For the last 4 years, increasingly involved in malware research & analysis ● Maintain a personal blog ( ● Twitter: @bartblaze ● Email: ● Please do reach out!
  3. 3. What we will see today ● Short introductions for each section malware analysis, threat intelligence, reverse engineering ● Combining all three together while taking a deep(er) dive ● Hands-on exercises of course! And also… ● Feel free to interrupt me at any point during the course ● Contact me online or offline at any given point 3
  4. 4. Preparation: for this course Verify that… ● You have already downloaded the Virtual Machine provided ● You have not installed VirtualBox guest additions ● The VM is in a clean state - it cannot be already infected with malware ● Take a snapshot of the clean state No VM available? Ask if you can partner up. In case of any other issues or questions… Shout :-) 4
  5. 5. Preparation: for your own lab at home ● Always isolate your virtual environment ○ Use NAT ○ Use a VPN on the host, if you allow communication ● Snapshot capability ○ VirtualBox (free) ○ VMware Workstation (not free) ● Never install VirtualBox guest additions or VMware tools ○ Why? They leave traces in the VM that make it easier for malware to identify ● If possible, have a non-Windows machine as host ● Keep your host machine, antivirus and virtualization software updated!
  6. 6. Malware Analysis 6
  7. 7. Malware Analysis: The basics... Malware is any form of malicious software. (The plural is also 'malware'.) Which types of malware do you know? Can you give me an example? ● Virus: needs user interaction & infects other applications; for example a file infector ● Worm: self-replicates; for example to shares or removable media ● Trojan or trojan horse: disguised as an innocuous application ● Backdoor: allows for persistent access; for example RATs ● Rootkit: any application that allows for privileged and persistent access ● Spyware: any application that spies on the user; for example a keylogger ● Ransomware: holds the user's files, browser or whole machine hostage in return for a ransom ● PUP/PUA/Adware: modifies browser settings and/or installs unwanted applications
  8. 8. A word on ransomware Source: (2017, Microsoft) 8
  9. 9. How did it start? 1989: AIDS trojan, first case of ransomware 2005: GPcode (PGPCoder) 2009/2010: WinLock 2012: ACCDFISA, Urausy, Reveton 2013: CryptoLocker 2014: CTB-Locker (Critroni), TorrentLocker, CryptoWall 2015: Mobile ransomware (on Android), such as Fusob 2016: Locky, ‘open-source’ ransomware such as Eda2 and Hidden Tear 2017: WannaCry (May), NotPetya (June), BadRabbit (October)
  10. 10. CryptoLocker ● Introduced the end of rogueware (fake antivirus) ● Innovative … ● Inspirational … ● … And very annoying :-) Many assumed that any form of cryptographic ransomware (“cryptoware”) was CryptoLocker, but CryptoLocker was just one ransomware variant. It has been dead since 2014.
  11. 11. How does one get malware? The bad way 11 ● Phishing or spear-phishing ● Exploit kits ● Drive-by download ● USB drive or other removable media ● Network (shares, SMB) ● Manual installation (RDP, VNC, TeamViewer, …) ● Watering hole (Strategic Web Compromise) ● Other malware that downloads and/or installs ‘companions’
  12. 12. How does one get malware? For analysis purposes ● Malware Sample Sources for Researchers ● List of Malware Sources ● Get a job in this field that applies most to you. Keywords for jobs: Malware analyst/researcher, threat intelligence (analyst), reverser/reverse engineer - and any of these but add ‘cyber’ in front - yes really ● Ask other researchers :-) 12
  13. 13. Analysing malware: static vs dynamic 13 ● Static: do not run the malware, look at static properties ○ Can you think of tools, or what could be considered static properties? ● Dynamic: run the malware, and examine onwards ○ Can you think of tools, or what could only be discovered by running the malware? Why not both?
  14. 14. Static malware analysis: primer First off, consider the type of file. Is it a(n)… ● Executable? EXE, COM, SCR, PIF, DLL ○ Strings, compile time, imports, sections, … ● Image? PNG, BMP, JPG, GIF ○ Steganography, hidden content, creator/creation date, ... ● Office file? DOC/DOCX, XLS/XLSX, RTF ○ Creator/creation date, embedded content, filename, … ● Adobe file? PDF, EPS, SWF/FWS ○ Creator/creation date, embedded content, filename, … ● Archive? ZIP, RAR, 7z, ISO ○ Creation date, contents, …
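
Since file extensions can be faked, one way to answer "what type of file is this?" is to look at the first bytes (the "magic"). The sketch below is only an illustration: `guess_file_type` and the labels are hypothetical names, the signature list is deliberately incomplete, and ISO is left out because its marker does not sit at the start of the file.

```python
# Hypothetical helper: guess a file's broad type from its leading bytes.
# The signature list is a small, incomplete selection for illustration.
MAGIC_SIGNATURES = [
    (b"MZ", "Windows executable (EXE/DLL/SCR)"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF8", "GIF image"),
    (b"BM", "BMP image"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (legacy DOC/XLS)"),
    (b"PK\x03\x04", "ZIP archive (also DOCX/XLSX)"),
    (b"{\\rtf", "RTF document"),
    (b"%PDF", "PDF document"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"7z\xbc\xaf\x27\x1c", "7z archive"),
]


def guess_file_type(data: bytes) -> str:
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"
```

A match only tells you the container format; a DOCX and a plain ZIP share the same magic, so the contents still need to be examined.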
  15. 15. Static malware analysis: tools It is important to have a proper toolbox, or toolset ● Executable? ○ ExeinfoPE, Detect it Easy (DIE), PEViewer (RogueKillerPE) ● Office document? ○ Oletools, oledump, OfficeMalScanner, QuickSand ● Adobe document? ○ Pdfid, pdf-parser, PDF Stream Dumper Additionally: strings2, FLOSS, and… calculate the hash! (MD5, SHA1, SHA256) 15
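
As an illustration of "calculate the hash", here is a minimal sketch using Python's standard `hashlib`. The function name `hash_stream` is an assumption made for this example; any hashing tool gives the same values.

```python
import hashlib


def hash_stream(stream, chunk_size=65536):
    """Compute MD5, SHA1 and SHA256 of a binary file object in one pass."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    # Read in chunks so large samples do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for digest in digests.values():
            digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


# Example use (the path is a placeholder):
# with open("sample.bin", "rb") as handle:
#     print(hash_stream(handle))
```

The hashes can then be used to look up the sample in public sources without uploading the file itself.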
  16. 16. Lab 1: static analysis ● On the desktop, you can find a folder named ‘Labs’ ● Examine the files inside the LAB1 folder Instructions ● Use HxD, ExeinfoPE and PEViewer to look at the files ● Go over at least these tabs in PEViewer: ○ Dashboard, Indicators, Hex/Strings, PE Sections/Overlay, PE Imports/Exports/TLS, PE Debug, PE Resources, Version Info/Digital Signature ● Run FLOSS and/or strings2 over the files, and identify any strings of interest 16
  17. 17. Lab 1: addendum PE - what’s in a file? ● PE stands for Portable Executable; the format is defined in the PE/COFF (Portable Executable and Common Object File Format) specification ● Windows only! x86 and x64 ● Executables, object code, DLLs ● For extensive reading: “PE Format” ibrary/windows/desktop/ms68054 7%28v=vs.85%29.aspx (Microsoft) Layout of a PE file: ● DOS header - MZ - 0x4D5A ● PE header - PE - 0x5045 ● Sections - code, data, imports & original entrypoint (OEP) ● Resources - icons ● Overlay - appended data
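
The two signatures in the layout above can be checked by hand. The sketch below assumes the standard PE layout, where the 4-byte field at offset 0x3C of the DOS header (`e_lfanew`) holds the file offset of the PE signature; `find_pe_header` is a hypothetical name, and tools such as PEViewer do far more than this.

```python
import struct


def find_pe_header(data: bytes) -> int:
    """Return the file offset of the 'PE\\0\\0' signature, or raise ValueError."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature (0x4D 0x5A)")
    if len(data) < 0x40:
        raise ValueError("truncated DOS header")
    # e_lfanew: little-endian 32-bit offset to the PE header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature (0x50 0x45)")
    return e_lfanew
```

Opening the same file in HxD and jumping to the returned offset should show the bytes `50 45 00 00`.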
  18. 18. Dynamic malware analysis: primer You have two different ways of doing dynamic analysis: ● Do it yourself: run the malware in a VM ○ Manual dynamic analysis ● Use a sandbox: let a sandbox take care of the malware ○ Automatic dynamic analysis What are some of the pros and cons of running the malware yourself on the one hand, and letting a machine take care of it on the other?
  19. 19. VirusTotal 19 ● Scans with over 60 engines (antivirus/machine learning) ● Supports, in theory, all file types ● Extensive file details ● Limited VT sandbox, Tencent’s sandbox ● Useful for a second opinion ● All uploads are public
  20. 20. Online sandboxes - part I 20 ● / ● ● ● (can automatically extract malware configs)
  21. 21. Online sandboxes - part II 21 ● (documents only, no executables) ● (documents only, no executables) ● (executables only)
  22. 22. Quick concept - packed malware ● Traditionally used for… ○ shrinking the file, in size ● Now used for ‘obfuscating’ the file, and its strings, imports, … 22 DOS header - MZ - 0x4D5A PE header - PE 0x5045 Unpacker code & entrypoint (stub) Packed sections
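
A common heuristic for spotting packed or compressed sections, not mentioned on the slide and added here as an assumption, is byte entropy: packed data tends to look close to random. The sketch below computes Shannon entropy in bits per byte; `shannon_entropy` is a hypothetical helper name, and any threshold you pick on top of it is a judgment call rather than a rule.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (uniform) up to 8.0 (random-like)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Plain code and text usually score noticeably lower than compressed or encrypted data, so a section with entropy near 8 is worth a closer look.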
  23. 23. Lab 2: dynamic analysis ● On the desktop, you can find a folder named ‘Labs’ ● Examine the files inside the LAB2 folder Instructions ● Execute the file. Check out some of the buttons. What functionality does this file have, at least? ● Open Process Hacker and examine the strings of the process. Click the process Properties > Memory tab > Strings button ● Can you find any peculiar string(s), that static analysis or strings does not reveal? 23
  24. 24. Additional tools Fakenet ● Create a “fake network” ● Tricks the malware into thinking there’s connection ● Serves back files correspondingly CaptureBAT ● x86 only (32-bit Windows) ● Create a log file to analyse registry, file changes and more ● Creates a copy of deleted files 24
  25. 25. Lab 3: static + dynamic analysis ● On the desktop, you can find a folder named ‘Labs’ ● Examine the files inside the LAB3 folder Instructions ● Statically analyse the file. What can you discover already? ● Start Fakenet by double-clicking the icon, and CaptureBAT by opening command prompt, navigate to the directory, and start it with the following command: ○ CaptureBAT.exe -c -l lab3.log (-c will capture events, -l will write to a log file) ● Open Process Hacker, and execute the file. Let it run for a minute, and check process memory strings. What can you discover with CaptureBAT and Fakenet? 25
  26. 26. Recap: malware analysis 26 ● Malware can assume many forms ● It does not discriminate, as you have malware for most modern operating systems ● Some malware can exist cross-platform (think of a malicious macro in Word, for example) ● Static vs dynamic analysis, and combined ● Know which tools are at your disposal, but also… ● Know how to perform analysis manually!
  27. 27. Threat Intelligence 27
  28. 28. Threat Intelligence: The basics... 28 ● “ [...] the process of understanding the threats to an organization based on available data points.” Threat Intelligence: What It Is, and How to Use It Effectively, SANS 2016 ● It entails several parts: ○ Tactical ○ Strategic ○ Operational ● Be able to see the bigger picture!
  29. 29. But first… how did we get here? ● Mandiant’s APT1 report, released in February 2013 dfs/mandiant-apt1-report.pdf (PDF) ● Directly implicated a PLA unit ● Significant impact on both attackers and defenders Cyber-espionage is very real and can occur at any point, anytime and anywhere. 29
  30. 30. Cyber Kill Chain vs Diamond Model Cyber Kill Chain: ● Published in 2009 by Lockheed Martin ● Based on a series of events, from an attacker’s perspective, but defender-centric ● Disrupt or deny a chain and you may gain the upper hand - resulting in minimal financial losses or compromise 30 Diamond Model: ● Published in 2013 by Active Response ● More attacker-centric ● Tactics, techniques and procedures, also known as TTPs ● As a defender, it may enable a better overall response to a threat - resulting in minimal financial losses or compromise
  31. 31. Cyber Kill Chain Source: https://www.loc m/content/dam/lockheed/data/corporate/documents/LM-White-Paper-Intel-Driven-Defense.pdf (Lockheed Martin)
  32. 32. Diamond Model Source: http://www.activerespo oads/2013/07/diamond.pdf (Active Response)
  33. 33. What else is there? 33 Apply or map Mitre’s ATT&CK matrix to an attacker: Source:
  34. 34. Pyramid of pain in Threat Intelligence (TI) 34 Source: (2013, David Bianco)
  35. 35. Exercise: investigate a real case 35 CrunchyRoll hack delivers malware: malware.html Together, we will go through the stages of this attack, and apply or map this to (part of) the diamond model, and/or the (cyber) kill chain.
  36. 36. Threat Intelligence: tactical 36 What does an organisation need to defend itself? ● Applied to real-time events ● More temporal. Why? ○ Threat actors can ‘burn’ TTPs ○ They can re-tool ● Think of the pyramid of pain!
  37. 37. Threat Intelligence: strategic 37 How does an organisation defend itself? ● Be able to respond correctly in case of an incident ● What is needed to protect the organisation ● Forms more of an overall picture, more so for management ○ But again: pyramid of pain ● Often analyses long-term trends
  38. 38. Threat Intelligence: operational Operational and technical are usually glued together ● Deals with the details of an attack or intrusion ● Provides guidance and technically-focused intelligence ● Often, indicators are also provided ○ What’s that pyramid again?
  39. 39. Finding intelligence ● Use threat intelligence feeds ○ For example: AlienVault OTX ○ And act on them ● Enable automated alerts ○ For example: Google Alerts ● Twitter - a rich data source, sometimes ● Security blogs ○ Vendors or individuals ● Resource: “Awesome Threat Intelligence” ○ 39
  40. 40. What’s in our toolbox? External intelligence 40 ● ● ● ●
  41. 41. What’s in our toolbox? Internal intelligence ● Logs from your software and hardware ○ Antivirus, firewall, anti-spam, event logs, … ○ SIEM: Security Information and Event Management ■ Includes log management, compliance, analysis, specific correlation and aggregation of data, … all in a dashboard ● Previous attacks, successful or not ○ Attacker methodology (think of the matrices!) ● People: experience, insight - create, structure & maintain a team
  42. 42. How to leverage Indicators of Compromise ● As seen before, indicators come in many different forms ○ Can you name some? ● STIX, TAXII, CybOX: helpful for standardising data, and transforming it into intelligence ○ Read: How STIX, TAXII and CybOX Can Help With Standardizing Threat Information / SecurityIntelligence, 2015 ● Leverage rules or rulesets, most commonly: ○ OpenIOC ○ Yara
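
To make "indicators come in many different forms" concrete, here is a rough sketch that sorts raw indicator strings into a few common types. `classify_indicator` and its categories are assumptions made for this example; real feeds and formats such as STIX carry much richer typing and context.

```python
import ipaddress
import re

# Hex digest lengths for the hash types mentioned earlier in the course.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def classify_indicator(value: str) -> str:
    """Return a coarse type for an indicator string (hypothetical categories)."""
    value = value.strip()
    if re.fullmatch(r"[0-9a-fA-F]+", value) and len(value) in HASH_LENGTHS:
        return HASH_LENGTHS[len(value)]
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if re.fullmatch(r"https?://\S+", value, re.IGNORECASE):
        return "url"
    if re.fullmatch(r"(?:[a-z0-9-]+\.)+[a-z]{2,}", value, re.IGNORECASE):
        return "domain"
    return "unknown"
```

Hashes, IP addresses and domains sit low on the pyramid of pain, which is a reminder that this kind of indicator is easy to collect but also easy for an attacker to change.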
  43. 43. OpenIOC ● Created by Mandiant ● We will, however, not leverage OpenIOC in this course ● Mostly used is its IOC Editor ● To learn more OpenIOC Series: Investigating with Indicators of Compromise (IOCs) 2013/12/openioc-series-investigating-indicator s-compromise-iocs.html Mandiant, 2013 43
  44. 44. Yara - part I ● YARA: Another Recursive Acronym, or Yet Another Ridiculous Acronym ● “The pattern matching Swiss knife” ● Can identify and classify files, not only malware ● Repository on Github ● Public list of Yara rules
  45. 45. Yara - part II 45
  46. 46. Yara - part III 46 ● You can scan a file, folder or process with Yara ● The commandline is as follows: ○ For a file: yara32.exe rules_file file ○ For a folder, recursively: yara32.exe rules_file folder -r Exercise: Let’s write some Yara rules together!
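
As a starting point for the exercise, the sketch below generates the text of a very simple Yara rule from a list of strings (for example, strings found with FLOSS). The helper `build_yara_rule`, the rule name and the `any of them` condition are illustrative assumptions, not the rules used in the course; the output can be saved as a rules file and passed to the yara command line shown above.

```python
def build_yara_rule(name, strings):
    """Return the text of a basic Yara rule matching any of the given strings."""
    if not strings:
        raise ValueError("a rule needs at least one string")
    lines = ["rule " + name, "{", "    strings:"]
    for index, text in enumerate(strings):
        # Escape backslashes and double quotes for a Yara text string.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        # 'ascii wide' matches both single-byte and UTF-16 encoded strings.
        lines.append('        $s%d = "%s" ascii wide' % (index, escaped))
    lines += ["    condition:", "        any of them", "}"]
    return "\n".join(lines)


# Hypothetical example input:
# print(build_yara_rule("Lab_Example", ["evil.example", "secret_mutex"]))
```

A condition of `any of them` is loose on purpose; tightening it (for example, requiring several strings at once) is usually the first step towards fewer false positives.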
  47. 47. Threat Intelligence: pitfalls Threat Intelligence is not… ● A silver bullet! It often takes more than just Indicators of Compromise (IOCs) ● Easy. You need to cover all angles, vectors, possibilities, … ● Difficult. Sometimes, it’s easier… TTPs & sharing come a long way What about attribution? 47
  48. 48. Roll the attribution dice 48 It’s North-Korea… Theoretically: any country with capability
  49. 49. Attribution: scenario 49 Imagine a scenario, where a threat actor or cyber criminal attacks an individual, an organisation, or even a country. ● What do we establish first? What is or isn’t actionable? ● What are some of the possible issues? ● What are some identifiables? (Think TTP!) ● What’s next?
  50. 50. Lab 4: Yara Write efficient Yara rules for the file and its droppers, if any, in Lab 4. These can be either or both for the file on disk, or in memory. Instructions ● Run FLOSS or Strings2 on the file. See if you can find something specific, interesting and/or suspicious. Write a Yara rule, and test! ● Run the file, and observe the behaviour. Fakenet, CaptureBAT & Process Hacker are your friends! 50
  51. 51. Addendum: time Time is a very important aspect in threat intelligence ● Five minutes can make a lot of difference ○ To the attacker: execute the operation ○ To the defender: identify an operation (intrusion) ● Time is in everything: do you have a plan ready, that can kick out an attacker as fast as possible? ● What about attackers that stay on the network? ● What about attackers that perform rapid lateral movement? 51
  52. 52. Recap: threat intelligence ● Ideally, you leverage the full threat intelligence cycle ● Know what you don’t know ● Trust, but validate/verify ● There are no silver bullets ● Attribution: use, but don’t over-use ● Always maintain a healthy sense of paranoia ● Be aware, and wary, of time 52 Strategic Tactical Operational
  53. 53. Reverse Engineering 53
  54. 54. Reverse Engineering: The basics... ● “Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction.” Reverse engineering and design recovery: A taxonomy (1990, Chikofsky & Cross) ● You can compare it with a bicycle… It is assembled at the factory, but, for whatever reason, you may wish to disassemble it ● Basically, you attempt to understand the software or hardware ● Re-assembling or re-engineering parts may be prohibited. Read the license agreement (EULA)
  55. 55. Different layers ● Each of these layers has multiple languages ● Can you think of any examples for each? ● The CPU can only execute machine language ● We will be focusing on assembly, however. More specifically: x86
  56. 56. Registers ● General: EAX, EBX, ECX, EDX ● Segment: CS, DS, ES/FS/GS, SS ● Pointer and index: ESP, EBP, EDI, ESI What else is there? ● EIP: instruction pointer - points to the next instruction ● EFLAGS: flags - these hold the state of the CPU, where each bit is a flag: 0 or 1. Examples are CF, SF, ZF, PF, TF (and many more…) These are 32-bit registers (the segment registers are 16-bit)!
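
To illustrate "each bit is a flag", the sketch below pulls the five flags named above out of an EFLAGS value. The bit positions follow the x86 definition (CF bit 0, PF bit 2, ZF bit 6, SF bit 7, TF bit 8); `decode_eflags` is a hypothetical helper name.

```python
# Bit positions of a few EFLAGS flags on x86.
EFLAGS_BITS = {"CF": 0, "PF": 2, "ZF": 6, "SF": 7, "TF": 8}


def decode_eflags(value: int) -> dict:
    """Return each named flag as 0 or 1 for the given EFLAGS value."""
    return {name: (value >> bit) & 1 for name, bit in EFLAGS_BITS.items()}
```

For example, an EFLAGS value of 0x246 as shown in a debugger decodes to ZF and PF set, with CF, SF and TF clear.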
  57. 57. Instructions ● Moving data: MOV, MOVZX, MOVSX, LEA ● Arithmetic (math): ADD, SUB, INC, DEC ● Logic: OR/AND/XOR, SHR/SHL, SAL/SAR, ROR/ROL ● Control flow: JMP, CMP/TEST, CALL/RET, JZ/JNZ/JB/JG (...and some more) What else is there? ● Manipulating the “stack”: PUSH, POP (PUSHAD, POPAD) ● NOP: No OPeration - encoded as 0x90, and does nothing
  58. 58. Examples (format: instruction destination, source) ● mov eax, ebx: copy the value of ebx into eax ● inc eax: increment the value of eax by 1 ● xor eax, eax: clear the eax register (set it to 0)
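
The three example instructions can be traced with a toy interpreter. This is a teaching sketch only: `run` is a hypothetical function that understands just `mov`, `inc` and `xor` on four registers, ignores EFLAGS, and is in no way a real x86 emulator.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def run(program, registers=None):
    """Execute a list of 'mnemonic dst, src' lines and return the registers."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "edx": 0}
    regs.update(registers or {})

    def value(operand):
        # An operand is either a register name or an immediate number.
        return regs[operand] if operand in regs else int(operand, 0) & MASK32

    for line in program:
        mnemonic, _, rest = line.strip().partition(" ")
        operands = [op.strip() for op in rest.split(",") if op.strip()]
        if mnemonic == "mov":
            dst, src = operands
            regs[dst] = value(src)
        elif mnemonic == "inc":
            (dst,) = operands
            regs[dst] = (regs[dst] + 1) & MASK32  # wraps around at 2**32
        elif mnemonic == "xor":
            dst, src = operands
            regs[dst] = regs[dst] ^ value(src)
        else:
            raise ValueError("unsupported instruction: " + line)
    return regs
```

Running `run(["mov eax, ebx", "inc eax"], {"ebx": 41})` leaves 42 in eax, and `xor eax, eax` always leaves 0 because every bit is XORed with itself.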
  59. 59. Stack, heap and memory: visualised (diagram of a 32-bit process address space, listed from the top of the diagram to the bottom) ● ~error catching - 0x00000000 ● Stack - grows UP ● Heap - grows DOWN ● Program image (base image of binary) - 0x00400000 ● DLLs - loaded libraries/images ● TEB - Thread Environment Block ● PEB - Process Environment Block ● ~shared user page - 0x7fffffff ● Kernel land (no user access) - 0xffffffff
  60. 60. Stack ● Stores function parameters, return addresses and local variables, and is used for program control flow. A stack usually has a limited size and a fixed location in memory where it begins ● Last In First Out or LIFO ○ The order in which elements come off a stack ● Push and Pop ○ PUSH: Adds an element to the stack ○ POP: Removes the most recently added element (sometimes referred to as pull) ● The stack grows towards lower addresses (drawn as 'up' in the memory diagram, where address 0 is at the top) ● Used for short-term storage only ● Stack overflow: occurs when the stack is full and does not have enough space to accept the next element - the stack pointer thus exceeds its boundary
  61. 61. Heap ● Used for dynamic memory during program execution ● The heap grows towards higher addresses (drawn as 'down' in the memory diagram) ● Typically has more memory available to it than the stack ● The heap requires pointers to access it ● Programmer has to explicitly allocate and deallocate (free) the memory ○ If this is not done properly, you may experience a memory leak ● In C, heap memory is typically allocated with malloc and released with free
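
The LIFO behaviour and the direction of growth can be seen in a small model of a 32-bit stack. `Stack32` and its default start address are hypothetical; the point is only that PUSH lowers ESP by 4 bytes and POP raises it again.

```python
class Stack32:
    """Toy model of a 32-bit stack: PUSH lowers ESP by 4, POP raises it by 4."""

    def __init__(self, top=0x0012FFFC):  # hypothetical starting address
        self.esp = top
        self.memory = {}

    def push(self, value):
        self.esp -= 4  # the stack grows towards lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        if self.esp not in self.memory:
            raise IndexError("pop from empty stack")
        value = self.memory.pop(self.esp)
        self.esp += 4
        return value
```

Pushing 1 and then 2 and popping twice returns 2 first and 1 second, which is exactly the Last In First Out order described above.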
  62. 62. Exercises ● Name a typical return instruction ● Do we know of any 8-bit, 16-bit, 32-bit or 64-bit registers for general purposes? Name two (2) in total ● Name a typical return register ● Which register/s is/are typically used for loops? 62
  63. 63. Debuggers, disassemblers and decompilers ● Debuggers: debug a binary, this means running the sample! ○ Examples: OllyDBG, Immunity, x64dbg ● Disassemblers: disassemble or tear apart a binary. Static! ○ IDA Pro Free, Radare ○ Some disassemblers can also be used as a debugger ● Decompilers: decompile a binary, this means reverting back to its source code (more or less) ○ ILSpy, dnSpy ○ Some decompilers can also be used as a debugger 63
  64. 64. Compression and obfuscation ● Compression is, as mentioned, for reducing the file size, but also for ‘obfuscating’ data. Examples include but are not limited to: ○ UPX, Themida, ASPack, MPRESS, VMProtect, … ○ But also: Winzip, Winrar, 7z, … ● Obfuscation is actually masquerading sections of code (and text). There is a plethora of methods, including, but not limited to: ○ Fake C2 servers planted as strings ○ Base64 (remember?), XOR, RC4, AES, … ○ Garbage instructions or junk code ○ But also: obfuscators! For example, for .NET: SmartAssembly, Confuser, ... 64
  65. 65. Lab 5: IDA, x64dbg and dnSpy We will examine the files in LAB5 together. Instructions ● Open LAB5-1 in IDA, visit the strings tab, and find the ‘Hello World’ string, and its accompanying code block. What does it do? ● Open LAB5-2 in x64dbg (x32dbg), and step through. The file is packed with UPX, and we will need to unpack it manually! ● Open LAB5-3 in dnSpy, and go through the code, and figure out what it is doing from only reading the code. 65
  66. 66. Lab 5 addendum: UPX ● In LAB5, one of the files we investigated, turned out to be packed with (default) UPX ● We can also leverage the same packer, UPX, to unpack the file ● Try it yourself! UPX is located in the Tools folder > upx394w ○ upx -d file -o unpacked_file 66
  67. 67. XOR encryption ● A form of encryption (sometimes referred to as encoding) ● Exclusive OR ● Returns true only if exactly one of the operands is true ● Truth table (operand, operand, result) ○ 0, 0, False (0) ○ 0, 1, True (1) ○ 1, 0, True (1) ○ 1, 1, False (0)
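
Applied byte by byte with a key, the truth table above becomes a simple cipher. The sketch below uses a repeating key; `xor_bytes` is a hypothetical name. Because XOR is its own inverse, the same function both encodes and decodes.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
```

Calling `xor_bytes` twice with the same key returns the original data, which is why malware authors like it: it is cheap, and a single routine handles both directions.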
  68. 68. Lab 6: XOR We will examine the file in LAB6. Instructions ● Open the command prompt, and navigate to the Tools folder > XOR ● Use either unxor, xorsearch or xorstrings to find the secret message in the file ● All tools provide a help file/manual with the “-h” parameter ○ Example: -h 68
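
Conceptually, searching for a hidden message with a single-byte XOR key comes down to trying all keys and looking for a known keyword. The sketch below shows that idea only; `brute_force_xor` is a hypothetical function and is not how unxor, xorsearch or xorstrings are actually implemented.

```python
def brute_force_xor(data: bytes, known: bytes):
    """Try every single-byte key; return (key, decoded) pairs containing 'known'."""
    hits = []
    for key in range(1, 256):  # key 0 would leave the data unchanged
        decoded = bytes(b ^ key for b in data)
        if known in decoded:
            hits.append((key, decoded))
    return hits
```

The approach depends on guessing a keyword that is likely to appear in the plaintext, such as "http" or "This program"; multi-byte keys need a more involved search.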
  69. 69. Addendum: CyberChef ● “The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis” ● Can perform a ton of operations, such as encoding/decoding, extracting, brute-forcing, … ○ For example: XOR decoding + bruteforcing! ● Utilise online: ● Send your ideas, collaborate: 69
  70. 70. Lab 7: reversing time Investigate, analyse (and write Yara rules for) the files in LAB7. Hints for the first LAB7 file (LAB7-1): ● Are you iMPRESSed with my packer? ● I smell a RAT… Hints for the second LAB7 file (LAB7-2): ● Windows? Never heard of it. ● Who will become the next xorrior? ● Mirror mirror on the wall, who has the largest botnet of them all? 70
  71. 71. Recap: reverse engineering ● Reverse engineering, or reversing, takes time to learn - this is perfectly normal ● A good method is to write a small piece of software, or file, and consequently analyse it (for example, hello world) ● ‘Crackmes’ often provide a great experience ● Every engineer, including a reverse engineer, uses Google (or MSDN) from time to time ● Familiarise yourself with the language, and languages 71
  72. 72. One last lab… 72
  73. 73. LAB 8 73 ● Lab 8 is for the fearless… So all of you! ● It includes a real campaign from an APT actor ○ What’s an APT again? ● Analyse the full chain, this means: ○ Context ○ Purpose ○ Full analysis (pre, during, post) ○ Write a (short) report on what the malware does
  74. 74. LAB 8 - hints ● Examine the file in LAB8. What kind of file is this? ○ It appears to be an email, so rename the file to LAB8.msg ● Investigate where it is sent from (or by whom) and where to ○ Additionally check the date, subject & content of the email ● Save the attachment, and unzip it. What kind of file do you get? ○ It appears to be a Word document! Use static and dynamic analysis to investigate what happens ○ Don’t forget to get and set your tools ready before starting dynamic analysis! ● What files does it drop? Any outgoing connections? Can we write a Yara rule to detect any part(s) of the attack? How is the attack performed? ○ The attack is mostly done in PowerShell, and appears to use shellcode ● Can we make an effort to do attribution at the end? ○ Qatar has been having political issues, and is in a diplomatic crisis. Possibly, one of its neighbouring countries? 74
  75. 75. This slide intentionally left blank. 75
  76. 76. Where and how to learn more Books: ● Malware Analyst's Cookbook and DVD ● Practical Malware Analysis ● Practical Reverse Engineering 76 Online: ● Reverse Engineering for Beginners ● Github: awesome X ○ ware-analysis ○ rsing ○ hreat-intelligence And also: Psychology of Intelligence Analysis (CIA) -and-monographs/psychology-of-intelligence-analysis/
  77. 77. Thank you. Questions? 77 Twitter: @bartblaze Email:
  78. 78. 78 Disclaimer and license You are free to: Share — copy and redistribute the material in any medium or format The licensor cannot revoke these freedoms as long as you follow the license terms. Under the following terms: Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. NonCommercial — You may not use the material for commercial purposes. NoDerivatives — If you remix, transform, or build upon the material, you may not distribute the modified material. No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. The human readable version can be found at: /4.0/ The full license can be found at: /4.0/legalcode All rights reserved © 2018 Bart Parys