The document discusses challenges in malware analysis and describes a system for automated static malware analysis. It outlines techniques for capturing malware, disassembling malware binaries, resolving obfuscated API calls, and handling obfuscation techniques used by malware authors to evade analysis. The system aims to unpack malware, recover the original program structure, and produce a higher-level representation to facilitate understanding the malware's purpose and functionality.
Beginner level presentation on Malware Identification as part of the Malware Reverse Engineering course. Learn what malware is, how it functions, how it can be detected, identified and isolated for reverse engineering. For more information about malware detection and removal visit https://www.intertel.co.za
2. MOTIVATION
• The malware landscape is diverse and constantly evolving
• Large botnets
• Diverse propagation vectors, exploits, C&C
• Capabilities: backdoors, keylogging, rootkits
• Logic bombs, time bombs
• Diverse targets: desktops, mobile platforms, SCADA systems (Stuxnet)
• Malware is no longer the work of script kiddies; it is a real business. Recent events indicate that it can be a powerful weapon in cyber warfare.
• Manual reverse engineering is close to impossible
• Automated techniques are needed to extract system logic, interactions, and side effects, derive intent, and devise mitigation strategies.
www.intertel.co.za
3. OUTLINE
• Review of the workflow of binary program analysis
• Review of the challenges in binary program analysis:
• Obfuscation Techniques
• Techniques for reverse engineering stripped binaries:
• Systematic deobfuscation
• Examples of obfuscation: Conficker, Hydraq (Google attack), Stuxnet, …
4. CAPTURING MALWARE
• Honeynets: capture malware that scans the Internet for vulnerable targets
• Mining SPAM for attachments
• Mining SPAM for malicious URLs, and capturing drive-by downloads
• AV heuristics
6. DYNAMIC VS STATIC MALWARE ANALYSIS
• Dynamic Analysis
• Techniques that profile actions of binary at runtime
• More popular
• CWSandbox, TTAnalyze, multipath exploration
• Only provides a partial "effects-oriented profile" of malware potential
• Static Analysis
• Can provide complementary insights
• Potential for more comprehensive assessment
7. MALWARE EVASIONS AND OBFUSCATIONS
• To defeat signature-based detection schemes
• Polymorphism, metamorphism: started appearing in viruses of the 1990s, primarily to defeat AV tools
• To defeat dynamic malware analysis
• Anti-debugging, anti-tracing, anti-memory dumping
• VMM detection, emulator detection
• To defeat static malware analysis
• Encryption (packing)
• API and control-flow obfuscations
• Anti-disassembly
• The main purpose of obfuscation is to slow down the security community
8. MY PERSONAL PHILOSOPHY
• Push the limits of static analysis as much as possible.
• Rebuild the binary in its original form prior to obfuscation.
• Recover a higher-level description of the malware binary that makes deriving the purpose of the malware attainable: I want to stare at C code as opposed to staring at assembly code
9. MALWARE REVERSE ENGINEERING SYSTEM GOALS
• Desiderata for a static analysis framework
• Unpack most contemporary malware
• Handle most if not all packers
• Deobfuscate API references
• Automate identification of capabilities
• Provide feedback on unpacking success
• Simplify and annotate call graphs to illustrate interactions between key logical blocks
• Enable decompilation of assembly code into a higher-level language
• Identify key logical blocks (crypto, for instance)
10. REVERSE ENGINEERING PHASES
Unpacking phase: the image of a running malware sample is often considered damaged. There is no known OEP, imported APIs are invoked dynamically, the original import table is destroyed, and section names and r/w/e permissions are arbitrary.
Disassembly phase:
- Identification of code and data segments
- Relies on the unpacker to capture all code and data segments.
Decompilation phase:
- Reconstruction of the code segment into a C-like higher-level representation
- Relies on the disassembler to recognize function boundaries, targets of call sites, imports, and the OEP.
Program understanding phase:
- Relies on the decompiler to produce readable C code by recognizing the compiler, calling conventions, stack frame manipulation, function prologs and epilogs, and user-defined data structures
12. THE EUREKA FRAMEWORK
• Novel unpacking technique based on coarse-grained execution tracing
• Heuristic-based and statistics-based unpacking
• Implements several techniques to handle obfuscated API references
• Multiple metrics to evaluate unpack success
• Annotated call graphs provide a bird's-eye view of system interaction
13. THE EUREKA WORKFLOW
[Figure: Eureka workflow diagram. Boxes: packed binary; trace malware syscalls in VM; syscall trace; heuristic-based offline analysis; favorable execution point; Eureka's unpacker; unpacked binary; disassembly with IDA Pro (packed .ASM and unpacked .ASM); statistics-based evaluator; unpack evaluation.]
Raw unpacked executable:
• Unknown OEP
• No debug information
• Unresolved library calls
• Snapshot of data segment
• Unreachable code
• Loss of structures
14. COARSE-GRAINED EXECUTION MONITORING
• Generalized unpacking principle
• Execute the binary until it has sufficiently revealed itself
• Dump the process execution image for static analysis
• Monitoring execution progress
• Eureka employs a Windows driver that hooks the SSDT (System Service Dispatch Table)
• Callback invoked on each NTDLL system call
• Filtering based on the malware process pid
15. HEURISTIC-BASED UNPACKING
• How do you determine when to dump?
• Heuristic #1: Dump as late as possible. NtTerminateProcess
• Heuristic #2: Dump when the program generates errors. NtRaiseHardError
• Heuristic #3: Dump when the program forks a child process. NtCreateProcess
• Issues
• Weak adversarial model, too simple to evade…
• Doesn't work well for packed non-malware programs
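The three heuristics can be sketched as a scan over a recorded syscall trace. This is only an illustration under stated assumptions, not Eureka's implementation: the trace format (a list of `(pid, syscall_name)` tuples), the pids, and the function name `pick_dump_point` are hypothetical.

```python
# Hypothetical trace format: (pid, syscall_name) tuples in execution order.
DUMP_TRIGGERS = ("NtTerminateProcess", "NtRaiseHardError", "NtCreateProcess")


def pick_dump_point(trace, malware_pid):
    """Return the index of the first trigger syscall issued by malware_pid, or None."""
    for index, (pid, name) in enumerate(trace):
        if pid != malware_pid:
            continue  # filter on the malware process pid
        if name in DUMP_TRIGGERS:
            return index
    return None


# Made-up trace for illustration only.
trace = [
    (4321, "NtOpenFile"),
    (1000, "NtTerminateProcess"),      # unrelated process, ignored
    (4321, "NtAllocateVirtualMemory"),
    (4321, "NtCreateProcess"),         # heuristic #3 fires here
    (4321, "NtTerminateProcess"),
]

print(pick_dump_point(trace, 4321))  # 3
```

A sample that never terminates, raises an error, or spawns a child yields `None` here, which is one way to see why the slide calls the adversarial model weak.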
16. STATISTICS-BASED UNPACKING
• Observations
• Statistical properties of a packed executable differ from those of an unpacked executable
• As malware executes, its code-to-data ratio increases
• Complications
• Code and data sections are interleaved in PE executables
• Data directories (import tables) look similar to data but are often found in code sections
• Properties of data sections vary with packers
17. STATISTICS-BASED UNPACKING (2)
• Our approach
• Model statistical properties of unpacked code
• Estimating unpacked code
• N-gram analysis to look for frequent instructions
• We use bi-grams (2-grams) because x86 opcodes are 1 or 2 bytes
• Extract subroutine code from 9 benign executables
• FF 15 (call), FF 75 (push), E8 _ _ _ FF (call), E8 _ _ _ 00 (call)
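A minimal sketch of the bigram idea: count how often the byte patterns listed above occur in a buffer. The function names, the sample bytes, and the reading of `E8 _ _ _ FF` / `E8 _ _ _ 00` as "E8 followed by a 4-byte offset whose last byte is FF or 00" are assumptions made for this example; the slides do not specify how the counts are turned into a decision.

```python
from collections import Counter


def bigram_counts(data: bytes) -> Counter:
    """Count every overlapping 2-byte sequence in data."""
    return Counter(data[i:i + 2] for i in range(len(data) - 1))


def code_like_score(data: bytes) -> int:
    """Count occurrences of the call/push byte patterns named on the slide."""
    counts = bigram_counts(data)
    score = counts[b"\xff\x15"] + counts[b"\xff\x75"]
    # E8 followed by a 4-byte relative offset whose last byte is 00 or FF
    for i in range(len(data) - 4):
        if data[i] == 0xE8 and data[i + 4] in (0x00, 0xFF):
            score += 1
    return score


# Hand-made bytes resembling compiled x86 code (illustrative only).
sample = (
    b"\xff\x15\x00\x10\x40\x00"   # call dword ptr [0x401000]
    b"\xe8\x10\x00\x00\x00"       # call with a small forward offset
    b"\xff\x75\x08"               # push dword ptr [ebp+8]
    b"\xe8\xf0\xff\xff\xff"       # call with a small backward offset
)

print(code_like_score(sample))            # 4
print(code_like_score(bytes(range(256))))  # 0
```

Comparing the score of a memory dump against the score of the original packed file gives a rough signal of whether more recognizable code is now present.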
21. SYSTEMATIC APPROACH TO DEOBFUSCATION: UNPACKING
• Automatic unpacking involves running the malware and capturing its memory image.
• Monitoring the execution of the malware is an intrusive process and is often detected by anti-tracing and anti-debugging techniques embedded in the malware.
• Our multi-strategy approach consists of minimal monitoring and capturing the process image at key events:
• ExitProcess
• Byte bigram monitoring: call and push instructions, for instance
• Number of seconds elapsed
• Run the malware without monitoring, then suspend its execution and perform memory inspection
• In practice, we always manage to get a dump (memory snapshot) of the running process, but it has no OEP and no import table
22. PHASE 2: DISASSEMBLY
• The disassembler reads the PE data structure in order to:
1. Determine the different sections of the file, separate code from data, and identify resource information such as import tables
- The disassembler relies on the PE data structure (which could be corrupt)
- The disassembler translates into code any address referenced from a known code location
2. Translate code segments into assembly language
- The disassembler relies on the hardware instruction set documentation
3. Interpret data according to identified types
- Data referenced by code can be of any type: integer, string, struct, etc.
Integer:
0x0040F45C dword_40F45C dd 0E06D7363h, 1, 2 dup(0) ; DATA XREF: 408C98
String:
0x0040F45C unk_40F45C db 63h ; c ; DATA XREF: sub_408C98
0x0040F45D db 73h ; s
0x0040F45E db 6Dh ; m
0x0040F45F db 0E0h ; a
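The two listings above show the same four bytes typed in two different ways. A small sketch with Python's standard `struct` module makes the ambiguity concrete; the variable names are illustrative.

```python
import struct

# The four bytes stored at 0x0040F45C in the listing above.
raw = bytes([0x63, 0x73, 0x6D, 0xE0])

# Interpretation 1: one little-endian 32-bit integer (the "dd" view).
as_dword = struct.unpack("<I", raw)[0]

# Interpretation 2: four individual bytes shown as characters (the "db" view).
as_chars = raw.decode("latin-1")

print(hex(as_dword))   # 0xe06d7363
print(as_chars[:3])    # csm
```

Nothing in the bytes themselves says which view is right; the disassembler has to infer the type from how the code uses the address.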
23. IDA PRO DISASSEMBLER
• http://www.hex-rays.com/idapro/
• It supports a variety of executable formats for different processors and operating systems. It can also be used as a debugger for Windows PE, Mac OS X, and Linux ELF executables.
• IDA performs a large degree of automatic code analysis, leveraging cross-references between code sections, knowledge of the parameters of API calls, limited dataflow analysis, and recognition of standard libraries.
• Hashes of known statically linked libraries are compared to hashes of identified subroutines in the code
• Provides scripting languages to interact with the system and improve the analysis.
• Supports plug-ins: the IDA decompiler is the most impressive plug-in.
24. PE EXECUTION
1. Read the Portable Executable (PE) file data structure and map the file into memory
2. Load import modules
3. Start execution at entry point
4. Runtime unpacking
5. Jump to OEP
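As a sketch of the first step, the entry point recorded in a PE header can be read with a few `struct` calls. The offsets follow the published PE/COFF layout (`e_lfanew` at 0x3C, `AddressOfEntryPoint` 16 bytes into the optional header); the helper `make_stub` and its values are hypothetical and build only a minimal header for demonstration, not a loadable executable.

```python
import struct


def entry_point_rva(image: bytes) -> int:
    """Return AddressOfEntryPoint from a PE image's optional header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional_header = pe_offset + 4 + 20   # skip PE signature and COFF file header
    (rva,) = struct.unpack_from("<I", image, optional_header + 16)
    return rva


def make_stub(rva: int) -> bytes:
    """Build a tiny synthetic header for the demo (hypothetical, not loadable)."""
    image = bytearray(0x80 + 4 + 20 + 20)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)
    image[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<I", image, 0x80 + 4 + 20 + 16, rva)
    return bytes(image)


print(hex(entry_point_rva(make_stub(0x1000))))  # 0x1000
```

For a packed sample this header value points at the unpacking stub, not at the OEP, which is why the original entry point has to be recovered separately.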
25. PHASE 3: FIXING THE DISASSEMBLED
CODE
• Unpacked & disassembled code does not have an OEP.
• Import tables are rebuilt dynamically and there are no static references
to dynamically loaded libraries
• Header information is not reliable
• Data is not typed
27. CHALLENGES IN BINARY CODE DISASSEMBLY
• Disassembly is not an exact science: On CISC platforms
with variable-width instructions, or in the presence of self-
modifying code, it is possible for a single program to have
two or more reasonable disassemblies. Determining which
instructions would actually be encountered during a run of
the program reduces to the halting problem, which is proven
undecidable.
• Bad disassembly because of variable length instructions
• Jumps into middle of instructions
• No reachability analysis: Unreachable code can hide data.
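The "jump into the middle of an instruction" problem can be shown with a deliberately tiny decoder. The sketch below is an assumption-laden illustration: it understands only five real x86 encodings (EB rel8, B8 imm32, 31 C0, C3, 90), and the byte sequence is a hand-made example, not taken from any malware sample.

```python
import struct

def decode(code, off):
    """Decode one instruction at off; returns (text, length). Toy subset only."""
    op = code[off]
    if op == 0xEB:  # jmp short rel8
        rel = struct.unpack_from("b", code, off + 1)[0]
        return f"jmp short {off + 2 + rel:#x}", 2
    if op == 0xB8:  # mov eax, imm32
        imm = struct.unpack_from("<I", code, off + 1)[0]
        return f"mov eax, {imm:#x}", 5
    if op == 0x31 and code[off + 1] == 0xC0:
        return "xor eax, eax", 2
    if op == 0xC3:
        return "ret", 1
    if op == 0x90:
        return "nop", 1
    raise ValueError(f"opcode {op:#x} is outside the toy subset")

def sweep(code, start):
    """Decode sequentially from start until a ret or the end of the buffer."""
    out, off = [], start
    while off < len(code):
        text, length = decode(code, off)
        out.append((off, text))
        off += length
        if text == "ret":
            break
    return out

# jmp +1 skips the B8 byte, so the bytes that really execute start at offset 3.
code = bytes.fromhex("eb01b831c0c390")

linear = sweep(code, 0)      # what a naive linear sweep reports
followed = sweep(code, 3)    # what executes after following the jump

for listing in (linear, followed):
    print(listing)
```

The linear sweep swallows `xor eax, eax; ret` as the immediate operand of a `mov`, while following the jump target reveals them. Both listings are valid decodings of the same bytes.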
29. API RESOLUTION
• User-level malware programs require system calls to perform malicious
actions
• Use the Win32 API to access user-level libraries
• Obfuscations impede malware analysis using IDA Pro or OllyDbg
• Packers use non-standard linking and loading of DLLs
• Obfuscated API resolution
31. HANDLING THUNKS
• Identify subroutines with a JMP instruction only
• Treat any calls to these subs as an API call
IsDebuggerPresent
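The thunk rule above can be sketched as follows. This is a minimal illustration, assuming 32-bit x86 where a thunk is a lone `jmp dword ptr [addr]` (encoded FF 25 addr32); the subroutine names, addresses, and the `import_slots` table are hypothetical sample data.

```python
import struct

def thunk_slot(body):
    """Return the pointer slot address if body is a lone `jmp dword ptr [addr]`."""
    # FF 25 <addr32> is the 32-bit x86 encoding of an indirect jump through memory.
    if len(body) == 6 and body[0] == 0xFF and body[1] == 0x25:
        return struct.unpack_from("<I", body, 2)[0]
    return None

def resolve_thunks(subroutines, import_slots):
    """Map each thunk subroutine to the API its pointer slot refers to."""
    resolved = {}
    for name, body in subroutines.items():
        slot = thunk_slot(body)
        if slot is not None and slot in import_slots:
            resolved[name] = import_slots[slot]
    return resolved

# Hypothetical sample data: one thunk and one ordinary function.
subroutines = {
    "sub_401000": bytes.fromhex("ff2500a04000"),  # jmp dword ptr [0x40A000]
    "sub_401010": bytes.fromhex("5589e55dc3"),    # push ebp; mov ebp, esp; pop ebp; ret
}
import_slots = {0x40A000: "IsDebuggerPresent"}

api_names = resolve_thunks(subroutines, import_slots)
print(api_names)
```

Any call to `sub_401000` can then be relabeled as a call to the resolved API.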
32. LEVERAGING STANDARD API ADDRESS LOADING
==================================================
Function Name : ADSICloseDSObject
Address : 0x76e30826
Relative Address : 0x00020826
Ordinal : 142 (0x8e)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICloseSearchHandle
Address : 0x76e3050a
Relative Address : 0x0002050a
Ordinal : 143 (0x8f)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICreateDSObject
Address : 0x76e30447
Relative Address : 0x00020447
Ordinal : 144 (0x90)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
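An export dump like the one above can be turned into a lookup table for call targets. The sketch below uses only the three entries shown; the function names `build_rva_map` and `resolve_call_target` are illustrative, and the base address is derived from the listed address and relative address rather than read from a real process.

```python
# (name, absolute address, relative address, ordinal) as listed in the dump above.
EXPORTS = [
    ("ADSICloseDSObject", 0x76E30826, 0x00020826, 142),
    ("ADSICloseSearchHandle", 0x76E3050A, 0x0002050A, 143),
    ("ADSICreateDSObject", 0x76E30447, 0x00020447, 144),
]

# Load base implied by the dump: absolute address minus relative address.
DLL_BASE = EXPORTS[0][1] - EXPORTS[0][2]

def build_rva_map(exports):
    """Index exports by relative address so the table survives rebasing."""
    return {rva: name for name, _, rva, _ in exports}

def resolve_call_target(target, base, rva_map):
    """Return the export name for an absolute call target, or None."""
    return rva_map.get(target - base)

rva_map = build_rva_map(EXPORTS)
print(hex(DLL_BASE))
print(resolve_call_target(0x76E3050A, DLL_BASE, rva_map))
```

Indexing by relative address is a design choice: if the DLL is loaded at a different base, only the `base` argument changes.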
33. USING DATAFLOW ANALYSIS
• Identify register-based indirect calls
[Figure: use-def chain from the indirect call (use) back to the register load (def), resolving to GetEnvironmentStringsW]
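A use-def lookup for a register-based indirect call can be sketched as a backward scan. This is a simplified illustration that assumes straight-line code with no branches between the def and the use; the instruction records, addresses, and the `import_slots` table are hypothetical.

```python
from collections import namedtuple

# op1 is the first operand (destination for writes, target for calls).
Insn = namedtuple("Insn", "addr op op1 op2")

# Mnemonics that overwrite their first operand (small illustrative subset).
WRITES_OP1 = {"mov", "lea", "pop", "xor", "add", "sub"}

def reaching_def(insns, call_index):
    """Find the last instruction before the call that writes the call register."""
    reg = insns[call_index].op1
    for insn in reversed(insns[:call_index]):
        if insn.op in WRITES_OP1 and insn.op1 == reg:
            return insn
    return None

def resolve_indirect_call(insns, call_index, import_slots):
    """Resolve `call reg` when reg was loaded from a known import pointer."""
    definition = reaching_def(insns, call_index)
    if definition is None or definition.op != "mov":
        return None
    return import_slots.get(definition.op2)

insns = [
    Insn(0x401000, "mov", "esi", "dword_40A0F0"),  # def of esi
    Insn(0x401006, "push", "ebx", None),
    Insn(0x401007, "xor", "eax", "eax"),
    Insn(0x401009, "call", "esi", None),           # use of esi
]
import_slots = {"dword_40A0F0": "GetEnvironmentStringsW"}

api = resolve_indirect_call(insns, 3, import_slots)
print(api)
```

A real analysis would work over the control-flow graph and merge definitions at join points; the linear scan here only shows the def/use relationship.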
34. HANDLING DYNAMIC POINTER UPDATES
• Identify register-based indirect calls
[Figure annotations on the use-def chain:]
– dword_41e304 has no static value to look up the API
– A def to dword_41e308 is found
– Look for a probable earlier call to GetProcAddress
– Call to GetProcAddress
35. STANDARD API ADDRESS LOADING IS NOT ENOUGH
==================================================
Function Name : ADSICloseDSObject
Address : 0x76e30826
Relative Address : 0x00020826
Ordinal : 142 (0x8e)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICloseSearchHandle
Address : 0x76e3050a
Relative Address : 0x0002050a
Ordinal : 143 (0x8f)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICreateDSObject
Address : 0x76e30447
Relative Address : 0x00020447
Ordinal : 144 (0x90)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
There are many indirect ways to load
and call a Windows API:
• access to the list of loaded DLLs
• access to a loaded DLL and use of
GetModuleHandle() + offset
• …
36. CONSEQUENCE OF FAILURE TO IDENTIFY APIS
...
.text:004011A7 push offset unk_40A2DC ; arg 1
.text:004011AC xor ebx, ebx
.text:004011AE call dword ptr unk_40A0E4 .data:0040A0E4 00000000
.text:004011B4 mov edi, eax
.text:004011B6 cmp edi, ebx
.text:004011B8 jz short loc_401211
.text:004011BA push esi
.text:004011BB mov esi, dword ptr unk_40A0E8
.text:004011C1 push offset unk_40A2C4 ; arg 2
.text:004011C6 push edi ; arg 1
.text:004011C7 call esi ; unk_40A0E8 .data:0040A0E8 00000000
.text:004011C9 push offset unk_40A2AC
.text:004011CE push edi
.text:004011CF mov dword_433480, eax
...
...
.text:00401132 lea eax, [ebp+var_4]
.text:00401135 push eax
.text:00401136 push ebx
.text:00401137 push 0
.text:00401139 mov [ebp+var_4], esi
.text:0040113C call dword_433480
.text:00401142 test eax, eax
…
Callouts on the listing:
• Name of a library
• Load library call (LoadLibrary)
• Name of the library function
• Name of the library
• API call to get the address of the loaded library function (GetProcAddress)
• Library function call
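The LoadLibrary / GetProcAddress pattern in the listing can be followed with a very small abstract interpreter that binds a global pointer to the API it ends up holding. The sketch below is a rough illustration under several assumptions: the trace is a simplified transcription of the listing, the two import pointers are taken to be already resolved, and the string contents ("example.dll", "ExampleProc") are hypothetical because the slide does not show them.

```python
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}

def bind_dynamic_imports(trace, known_apis, strings):
    """Track LoadLibrary/GetProcAddress results and bind globals to API names."""
    regs, stack, bindings = {}, [], {}
    for op, a, b in trace:
        if op == "push":
            stack.append(regs.get(a, a))
        elif op == "mov" and a in REGISTERS:
            regs[a] = regs.get(b, b)
        elif op == "mov":
            # Store to a global: remember it if the value is a resolved procedure.
            value = regs.get(b)
            if isinstance(value, tuple) and value[0] == "proc":
                bindings[a] = value[1]
        elif op == "call":
            api = known_apis.get(regs.get(a, a))
            if api == "LoadLibraryA":
                regs["eax"] = ("module", strings.get(stack.pop()))
            elif api == "GetProcAddress":
                module = stack.pop()
                func = strings.get(stack.pop())
                lib = module[1] if isinstance(module, tuple) else "?"
                regs["eax"] = ("proc", f"{lib}!{func}")
            else:
                regs["eax"] = None
    return bindings

trace = [
    ("push", "unk_40A2DC", None),    # name of a library
    ("call", "unk_40A0E4", None),    # LoadLibrary
    ("mov", "edi", "eax"),
    ("mov", "esi", "unk_40A0E8"),
    ("push", "unk_40A2C4", None),    # name of the library function
    ("push", "edi", None),           # module handle
    ("call", "esi", None),           # GetProcAddress
    ("mov", "dword_433480", "eax"),  # pointer later used by `call dword_433480`
]
known_apis = {"unk_40A0E4": "LoadLibraryA", "unk_40A0E8": "GetProcAddress"}
strings = {"unk_40A2DC": "example.dll", "unk_40A2C4": "ExampleProc"}  # hypothetical

bindings = bind_dynamic_imports(trace, known_apis, strings)
print(bindings)
```

With the binding in hand, the later `call dword_433480` can be labeled with a library and function name instead of an opaque pointer.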
37. FAILURE TO PERFORM CONTROL FLOW ANALYSIS
• CreateThread
.text:009A3A4C push eax
.text:009A3A4D xor eax, eax
.text:009A3A4F push eax
.text:009A3A50 push eax
.text:009A3A51 push offset dword_9A3939
.text:009A3A56 push eax
.text:009A3A57 push eax
.text:009A3A58 call [ebx]
.data:009A3939 xxxxxxx
• Starting Services
• Thread synchronization
• Critical sections
• Callback functions
Location of the start address of a thread
Call to CreateThread
38. ADVANCED API RESOLUTION
• There are many ways in which a library or API can be invoked.
• There are many ways an API call can be obfuscated
• But there is one invariant associated with each API and library: its
signature
• i.e., the number of arguments, the types of the arguments, and the type of the return value, if any.
39. ADVANCED API RESOLUTION: TYPE INFERENCE FOR BINARY PROGRAM ANALYSIS
• Use type inference as a single solution to solve three fundamental problems:
• Identifying API and function calls (call and jump targets)
• Building a precise CFG
• Recovering user-defined types for proper decompilation
• For Windows Executable files:
• Integers: object handles, addresses, IP address, ports, etc
• Strings: file names, service names, etc
• Structures: sockaddr
struct sockaddr_in {
short sin_family;
u_short sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
40. TYPE PROPAGATION AND MATCHING
• Type propagation using dataflow analysis
• Propagation of return values and arguments of functions
sub_403649 proc near
...
.text:00403649 arg_0 = dword ptr 8
.text:00403649 arg_4 = dword ptr 0Ch
.text:00403649 arg_8 = dword ptr 10h
.text:00403649 arg_C = dword ptr 14h
...
.text:00403668 xor ebx, ebx
...
.text:00403704 mov esi, ds:dword_40A06C
.text:0040370A push ebx ; arg 7 type(f,7) = type (ebx)
.text:0040370B mov edi, 80h
.text:00403710 push edi ; arg 6 type(f,6) = type (edi)
.text:00403711 push 4 ; arg 5 type(f,5) = union(int,char)
.text:00403713 push ebx ; arg 4 type(f,4) = type (ebx)
.text:00403714 push 2 ; arg 3 type(f,3) = union(int,char)
.text:00403716 push 2 ; arg 2 type(f,2) = union(int,char)
.text:00403718 push [ebp+arg_4] ; arg 1 type(f,1) = type([ebp+arg_4])
.text:0040371B mov [ebp+var_18], ebx
.text:0040371E call esi ; dword_40A06C type(ret(f)) = type(eax)
...
There is only one API that has
7 arguments such that the seventh,
fourth, and first can be
pointers and all others are not.
HANDLE WINAPI CreateFile(
__in LPCTSTR lpFileName,
__in DWORD dwDesiredAccess,
__in DWORD dwShareMode,
__in LPSECURITY_ATTRIBUTES lpSecurityAttributes,
__in DWORD dwCreationDisposition,
__in DWORD dwFlagsAndAttributes,
__in HANDLE hTemplateFile
);
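Signature matching against the call site above can be sketched as a filter over a table of known prototypes. The table below is a tiny hand-picked subset of Win32 APIs, not a complete database, and the compatibility rule (a nonzero constant cannot be a pointer or handle; zero and unknown values match anything) is a simplifying assumption made for this illustration.

```python
P, I = "ptr", "int"  # handles are treated as pointer-like in this sketch

# Small illustrative subset of Win32 prototypes, reduced to pointer/non-pointer.
SIGNATURES = {
    "CreateFileA":        [P, I, I, P, I, I, P],
    "CreateRemoteThread": [P, P, I, P, P, I, P],
    "SetWindowPos":       [P, P, I, I, I, I, I],
    "FormatMessageA":     [I, P, I, I, P, I, P],
    "ReadFile":           [P, P, I, P, P],
}

def compatible(observed, kind):
    """Check one observed argument against one parameter kind."""
    if observed == "unknown":
        return True
    _, value = observed          # ("const", value)
    if value == 0:
        return True              # could be NULL or the integer 0
    return kind == I             # nonzero constant: assume not a pointer

def match_signatures(observed_args, signatures):
    """Return the APIs whose arity and parameter kinds fit the call site."""
    return sorted(
        name for name, kinds in signatures.items()
        if len(kinds) == len(observed_args)
        and all(compatible(o, k) for o, k in zip(observed_args, kinds))
    )

# Arguments 1..7 as inferred from the pushes before `call esi` (ebx is zero).
observed = [
    "unknown",        # [ebp+arg_4]
    ("const", 2),
    ("const", 2),
    ("const", 0),     # ebx
    ("const", 4),
    ("const", 0x80),  # edi
    ("const", 0),     # ebx
]

candidates = match_signatures(observed, SIGNATURES)
print(candidates)
```

In this toy table only one prototype survives; a full database would also use the propagated types of the return value and of later uses to narrow the candidates.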
41. ADVANTAGES OF TYPE INFERENCE ANALYSIS
• Programmers' data structures and types are going to be based
on known data structures and types provided by the libraries
• Identifying API calls and type information helps capture
the semantics of the program execution better
• Not restricted to Windows, but requires knowledge of the
libraries and their documentation
• Can deal with some of the widely used obfuscation techniques
• Import table obfuscation
• Code rewrite: code rewrite preserves the types!
42. PHASE 3: REBUILDING THE UNPACKED EXECUTABLE
• From a damaged dumped image of a running malware to a PE
executable:
• Knowing all APIs allows us to identify the OEP.
• Semantic approach: ExitProcess, CreateMutex, GetCommandLine,
GetModuleHandle, etc. are close to the OEP. There are about 20 APIs that are
often called at the beginning of the execution of the code.
• Structural approach: find sources of call graphs in the binary
• Rebuilding an import table with all references to identified APIs
• The disassembly of the reconstructed PE is often of better quality
than the disassembly of the dumped process image
• The new PE code bypasses the unpacking routine embedded in the packed
code
• The new PE contains the original code
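The two OEP heuristics above can be combined in a short sketch: take the sources of the call graph (structural approach) and rank them by how many typical startup APIs they call (semantic approach). The call graph and the short `STARTUP_APIS` set are hypothetical sample data; the slide mentions about 20 such APIs without listing them all.

```python
# Illustrative subset of APIs that tend to be called near the entry point.
STARTUP_APIS = {"ExitProcess", "CreateMutexA", "GetCommandLineA", "GetModuleHandleA"}

def call_graph_sources(call_graph):
    """Functions that are never called by another function in the graph."""
    called = set().union(*call_graph.values())
    return sorted(f for f in call_graph if f not in called)

def rank_oep_candidates(call_graph):
    """Order call-graph sources by the number of startup APIs they call."""
    scored = [
        (len(call_graph[f] & STARTUP_APIS), f)
        for f in call_graph_sources(call_graph)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [f for _, f in scored]

# Hypothetical call graph: function -> set of callees (functions and APIs).
call_graph = {
    "sub_401000": {"GetCommandLineA", "GetModuleHandleA", "sub_401200", "ExitProcess"},
    "sub_401200": {"CreateFileA"},
    "sub_402000": {"sub_401200"},
}

ranked = rank_oep_candidates(call_graph)
print(ranked)
```

The top-ranked source is only a candidate for the OEP; functions reached solely through callbacks or thread start addresses also appear as sources and need to be ruled out separately.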
45. PHASE 4: DECOMPILATION
• Identifies local variables
• Identifies arguments: registers, stack, or any combination
• Identifies global variables
• Identifies calling conventions
• Identifies common idioms and compiler features
• Eliminates the use of registers as intermediate variables
• Identifies control structures
47. MALWARE OBFUSCATION EFFECT ON DECOMPILATION
• While packing is the most used obfuscation
technique, it is often combined with other
advanced forms of obfuscation that make
decompilation often impossible:
• Call obfuscation in general and API
obfuscation in particular
• Binary Rewrite to create semantically
equivalent code with vastly different
structure
• Chunking or “code spaghettisation”
• …
50. SYSTEMATIC APPROACH TO CODE DEOBFUSCATION: BINARY REWRITING
• Dechunking: The control flow of Conficker's P2P module has been significantly
obfuscated to hinder its disassembly and decompilation. Specifically, the contents of code
blocks from each subroutine have been extracted and relocated throughout different portions
of the executable. These different blocks (or chunks) are then referenced through
unconditional and conditional jump instructions. In effect, the logical control flow of the P2P
module has been obscured (spaghetti-code) to a degree that the module cannot be
decompiled into coherent C-like code, which typically drives more in-depth and accurate
code interpretation. Move all blocks to a contiguous memory block.
• Normalize x86 instructions: push followed by a pop is
a mov
• Normalize calling convention: cdecl, fastcall, stdcall,
instead of user-defined.
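Two of the rewriting steps above, dechunking and push/pop normalization, can be sketched on a symbolic instruction list. This is a simplified illustration: it only merges chunks linked by unconditional jumps, leaves conditional branches untouched, and uses hypothetical chunk labels and instructions rather than real Conficker code.

```python
def dechunk(chunks, entry):
    """Concatenate chunks linked by unconditional jumps into one block."""
    ordered, seen, label = [], set(), entry
    while label is not None:
        seen.add(label)
        body = chunks[label]
        last = body[-1] if body else None
        if last and last[0] == "jmp" and last[1] in chunks and last[1] not in seen:
            ordered.extend(body[:-1])   # drop the linking jump
            label = last[1]
        else:
            ordered.extend(body)        # keep loops and exits as they are
            label = None
    return ordered

def normalize_push_pop(insns):
    """Rewrite `push X` immediately followed by `pop Y` as `mov Y, X`."""
    out = []
    for insn in insns:
        if out and insn[0] == "pop" and out[-1][0] == "push":
            source = out.pop()[1]
            out.append(("mov", insn[1], source))
        else:
            out.append(insn)
    return out

# Hypothetical function scattered over three chunks.
chunks = {
    "entry":   [("push", "eax"), ("jmp", "chunk_b")],
    "chunk_b": [("pop", "ebx"), ("jmp", "chunk_a")],
    "chunk_a": [("add", "ebx", "1"), ("ret",)],
}

contiguous = dechunk(chunks, "entry")
normalized = normalize_push_pop(contiguous)
print(normalized)
```

Running the normalization after dechunking matters here: the `push` and `pop` only become adjacent once the chunks are laid out contiguously.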
51. CONFICKER AND HYDRAQ DECHUNKING
• Identify all chunks in a function and rewrite the function
• Applied to all Conficker C P2P Protocol subroutines
• Unlike the Conficker P2P logic, Hydraq did not exhibit the same
level of obfuscation. It did, however, share some obfuscation
features with Conficker. The functions of the Hydraq binary have
been subjected to chunking, which renders decompilation
difficult. We applied our transformations to automatically
generate the C-like code for each subroutine and build a
complete CFG of the binary. The IDA disassembler identified 185
subroutines in the binary prior to our analysis. After running the
dechunking transformation, only 141 subroutines remained and
were decompiled.
52. PURPOSE OF CODE OBFUSCATION
• While packing is often used to reduce the size of
binaries and to create polymorphic malware
samples, the more advanced obfuscation
techniques are designed to slow down reverse
engineering efforts and to prevent:
• the identification of API calls: identify the basic building blocks of the malware
• the control-flow reconstruction of the malware: follow and reconstruct the logic flow
• static analysis: determine the full functionality, triggers, hidden logic, time bombs, etc.
• timely reverse engineering and mitigation of the threat
53. WHY CODE OBFUSCATION IS NOT EASY
• Malware authors can design binary code that is extremely
difficult to analyze. Using advanced programming languages
knowledge, it is possible to create such code.
• Malware authors do not always feel the need to obfuscate
their code. They can easily defeat signature-based detection and
overwhelm analysts and tools with large numbers of samples.
• Malware code should be able to run in a reliable manner.
Obfuscation should not compromise this important
requirement and should maintain the reliability of the initial
code. This requires a proof or guarantee of some sort.
• Malware deobfuscation is therefore more attainable than you
might think. Systematic obfuscation informs systematic
deobfuscation.
54. OUR APPROACH
• Because obfuscation is introduced in a rather systematic way, there is a
hope that it can be dealt with in an automated way.
• Systematically identifying an obfuscation step and undoing its effect.
• Focus on generic approaches as opposed to packer/obfuscator
specifics
• Focus on metrics that allow us to assess the effect and success of our
deobfuscation strategies
55. EXAMPLE: STATIC ANALYSIS OF CONFICKER
• Conficker appeared on November 20th, 2008
• Infected millions of machines worldwide
• Millions of machines are still infected despite extensive news coverage
of the threat
• Four versions have been released: A, B, B++, and C
• It is a sophisticated piece of malicious code created by professionals who
have extensive knowledge about networking, cryptography, system and
network programming, and security
• Managed to defeat the security community's efforts to stop its progression by
using strong crypto, code obfuscation, aggressive propagation strategies,
and constant monitoring of the security community's actions
• Dynamic analysis provided a limited understanding of the threat:
• Identification of what appears to be a P2P protocol
• Identification of ports opened by the malware
• Deobfuscation and static analysis were the only techniques that were able
to uncover the full capability of the malware.
56. EXAMPLE: STATIC ANALYSIS OF CONFICKER
Static Analysis of Conficker Code:
– Domain generation algorithm: provided a list of daily domains to be blocked
– Quarter of the Internet scanned: Understand what part of the Internet was
targeted for scanning and what infections were due to USB ports and mobile
devices
– List of disabled security products: detection
– Ukrainian keyboard avoidance: Geo-location database poisoning
– Use of MD6 and related crypto algorithms: Attribution
– DNS APIs patching to disable list of websites (including SRI!): detection
– Distribute a number of modified versions of the binary
– TCP and UDP ports based on the IP address of the infected machine:
detection
57. DEOBFUSCATION OF THE CONFICKER C P2P
• Heavily obfuscated protocol code
• 88 APIs obfuscated
• Use of chunking leads to poor decompilation
• Benefits of the deobfuscation
– P2P Protocol description: protocol understanding and P2P
structure
– Peer selection algorithm: proved the peer poisoning approach
useless
– Possibility to hot patch code without DGA updates: proved C&C
domains obsolete
The P2P protocol was not just a mechanism for distributing PE executable
files but also digitally signed sets of x86 instructions that are executed in a
separate thread and take as argument the IP address of the sender. This
would provide a hot patch mechanism for all data manipulated by Conficker:
list of peers, encryption/decryption keys, the Conficker code itself, etc.
58. STUXNET: KEEPING IT “RELATIVELY” SIMPLE
• Stuxnet does not use advanced binary obfuscation techniques.
• The analysis of the code is challenging nevertheless
• Stuxnet Code Characteristics:
• Use of C++
• Use of C++ exception handling
• Use of C++ classes
• Use of simple data encoding (encryption)
• Use of C structures for all data passed to the main subroutines:
• Over 40 user-defined structures
• Not recognized by disassemblers and decompilers
59. PHASE 5: PROGRAM UNDERSTANDING
• Need to identify higher-level concepts from the deobfuscated code
• Need to interpret the code into a higher-level malware objective
• Need to identify particular features, for example crypto:
• Functions that use crypto-related opcode, loops, etc
• Known constants in crypto algorithms
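Searching for known constants is the simplest of these crypto-identification heuristics. The sketch below scans a byte buffer for a few well-known 32-bit constants and for the first bytes of the AES S-box; the table is a small illustrative selection, and the sample buffer is synthetic.

```python
import struct

# A few widely documented constants (little-endian dwords in x86 binaries).
KNOWN_CONSTANTS = {
    0x67452301: "MD5/SHA-1 initial state",
    0xEFCDAB89: "MD5/SHA-1 initial state",
    0xC3D2E1F0: "SHA-1 initial state",
    0x6A09E667: "SHA-256 initial state",
    0x9E3779B9: "TEA/XTEA delta",
}
AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5])

def find_crypto_constants(data):
    """Return sorted (offset, label) pairs for every known constant found."""
    hits = []
    for value, label in KNOWN_CONSTANTS.items():
        needle = struct.pack("<I", value)
        pos = data.find(needle)
        while pos != -1:
            hits.append((pos, label))
            pos = data.find(needle, pos + 1)
    pos = data.find(AES_SBOX_PREFIX)
    if pos != -1:
        hits.append((pos, "AES S-box"))
    return sorted(hits)

# Synthetic buffer: padding, one MD5/SHA-1 constant, padding, start of the S-box.
sample = b"\x90" * 4 + struct.pack("<I", 0x67452301) + b"\x00" * 3 + AES_SBOX_PREFIX

hits = find_crypto_constants(sample)
print(hits)
```

A hit only flags a region for closer inspection: constants can be split across instructions or computed at run time, so an empty result does not show that no crypto is present.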
61. CONCLUSIONS
• It is always desirable to recover from the malware a
description that is as close as possible to the original
code produced by the authors.
• It is often possible to do that in practice
• It is often the only way to really determine the full
capability of the malware
• The benefits are important when it comes to high-
profile targets
• Easily integrated into common analysis tools:
disassemblers (IDA) and decompilers.