Libor Mořkovský - Recognizing Malware

This document discusses techniques for recognizing malware files. It describes how each file is represented as a feature vector that captures static and dynamic attributes. A distance function is used to measure the similarity between vectors and identify nearest neighbors for classification. Files are classified using an instance-based classifier and optimized with techniques like a VP-tree and distance-bounded search. Classifications are deployed in a system that collects file fingerprints from users and shares threats and updates between components. A rule generator also aims to detect malware variants by learning rules based on conditions in files.

Computer virus
bacterial cell
based on work by Anderson Brito

Computer virus
Inserting code into files is never “good”.
executable file
entry point

Malware
How do you recognize a thief?
image courtesy of
Looking Glass Studios
Twentieth Century Fox
Paramount Pictures

Malware
• completely different behaviors are considered “bad”
• we need a judge to decide who crossed the line

Malware | Many faces
• unlike real thieves, malware can be duplicated
• not only duplicated, but also modified
• all this is done by machines
• too much work to judge each one manually

Finding similar files
[Figure: scatter plot with axes MDS1 and MDS2; legend “class”: CLEAN, MALWARE, QUERY, UNKNOWN]

Finding similar files
• need a file representation
• need a distance function

Finding similar files | File vector
• each executable file is represented by a feature vector
• the PE format is complex, so we keep exactly one version of the extractor code (C++)
• the vector comprises static and dynamic features; the exact content is proprietary
Database record
• One record = constant vector of over 100 attributes
• the “file fingerprint”
• Each attribute has a data type and semantic
Attribute             Data Type      Semantic
sha256                32 byte array  CHECKSUM
pe_sect_cnt           uint16_t       VALUE
pe_sect_rawoff_entry  uint32_t       OFFSET
• The complete contents of the vector are kept secret
• static and dynamic features of PE executables
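
A minimal sketch of how such a record could be modelled, written in Python purely for illustration (the talk's extractor is C++). Only the three attributes from the table come from the slides; the `Attribute` class, the `SCHEMA` list, the `size_bytes` field and the `validate` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Semantic(Enum):
    CHECKSUM = "CHECKSUM"
    VALUE = "VALUE"
    OFFSET = "OFFSET"


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str       # kept as a label only, e.g. "uint16_t"
    semantic: Semantic
    size_bytes: int      # assumed width, derived from the data type


# Only these three attributes appear on the slide; the real vector has over 100.
SCHEMA = [
    Attribute("sha256", "32 byte array", Semantic.CHECKSUM, 32),
    Attribute("pe_sect_cnt", "uint16_t", Semantic.VALUE, 2),
    Attribute("pe_sect_rawoff_entry", "uint32_t", Semantic.OFFSET, 4),
]


def validate(record: dict) -> bool:
    """Check that every schema attribute is present and fits its declared width."""
    for attr in SCHEMA:
        if attr.name not in record:
            return False
        value = record[attr.name]
        if attr.semantic is Semantic.CHECKSUM:
            if not (isinstance(value, bytes) and len(value) == attr.size_bytes):
                return False
        elif not (isinstance(value, int) and 0 <= value < 2 ** (8 * attr.size_bytes)):
            return False
    return True
```

Keeping the data type and the semantic as separate fields mirrors the slide: two attributes can share a type such as `uint32_t` and still be compared differently because their semantics differ.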

Finding similar files | Distance
• sum of partial distances
• each distance operator assigned manually
• weights assigned manually to equalize contribution
Nearest neighbor query
• Compound distance function
• Data type and semantic determine the partial distance function
Data Type      Semantic  Partial distance function
32 byte array  CHECKSUM  RETURN_ZERO
uint16_t       VALUE     EQUAL_RET32
uint32_t       OFFSET    LOG
• Each partial distance function = one kernel function
• Over 100 kernels for every NN query
• Intermediate results kept in the “Scratchpad”
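
The compound distance described above can be sketched as a weighted sum of per-attribute partial distances. The function names follow the slide's table, but their bodies are guesses from the names alone: the slides do not say what `EQUAL_RET32` or `LOG` compute. The weights are placeholders too.

```python
import math


def return_zero(a, b):
    # A checksum identifies a file but carries no similarity information.
    return 0.0


def equal_ret32(a, b):
    # Assumed reading of the name: 0 when equal, a fixed penalty of 32 otherwise.
    return 0.0 if a == b else 32.0


def log_distance(a, b):
    # Assumed: compare on a log scale so that large offsets do not dominate.
    return abs(math.log2(1 + a) - math.log2(1 + b))


# (data type, semantic) -> partial distance function, as in the slide's table.
PARTIAL = {
    ("32 byte array", "CHECKSUM"): return_zero,
    ("uint16_t", "VALUE"): equal_ret32,
    ("uint32_t", "OFFSET"): log_distance,
}

# attribute -> (data type, semantic, weight); the weights are invented.
SCHEMA = {
    "sha256": ("32 byte array", "CHECKSUM", 1.0),
    "pe_sect_cnt": ("uint16_t", "VALUE", 1.0),
    "pe_sect_rawoff_entry": ("uint32_t", "OFFSET", 4.0),
}


def distance(x, y):
    """Weighted sum of the partial distances over all attributes."""
    total = 0.0
    for name, (dtype, semantic, weight) in SCHEMA.items():
        total += weight * PARTIAL[(dtype, semantic)](x[name], y[name])
    return total
```

Because the total is a plain sum, each attribute's contribution can be inspected on its own, which is one way manually assigned weights could be tuned to "equalize contribution".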

Finding similar files | Data
• ~60 M data points
• sparse and well separated (in many cases)

Finding similar files | Implementation
• we started with GPUs
• their high memory throughput allows “naive” implementation and rapid prototyping
• column-oriented database
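
A rough sketch of what a "naive" nearest-neighbor scan over a column-oriented layout could look like, in plain Python rather than GPU code. The column contents, the two kernel functions and the `nearest` helper are invented for illustration; only the ideas of one kernel per attribute and a scratchpad for intermediate results come from the slides.

```python
# Column-oriented layout: one list per attribute; row i describes the i-th file.
COLUMNS = {
    "pe_sect_cnt": [4, 4, 7, 5],
    "pe_sect_rawoff_entry": [1024, 1040, 90000, 512],
}

# One kernel per column, applied to every value in that column (made-up bodies).
KERNELS = {
    "pe_sect_cnt": lambda value, q: 0.0 if value == q else 1.0,
    "pe_sect_rawoff_entry": lambda value, q: abs(value - q) / 1024.0,
}


def nearest(query):
    """Brute-force scan: run each kernel over its column and sum per row."""
    n_rows = len(next(iter(COLUMNS.values())))
    scratchpad = [0.0] * n_rows            # running distance for every row
    for name, column in COLUMNS.items():
        kernel = KERNELS[name]
        q = query[name]
        for row, value in enumerate(column):
            scratchpad[row] += kernel(value, q)
    best = min(range(n_rows), key=scratchpad.__getitem__)
    return best, scratchpad[best]
```

Every row is visited for every column, so the cost is proportional to rows times attributes. The inner loop reads one column sequentially, which is the access pattern that benefits from high memory throughput.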

Classification | Requirements
• find easily what is responsible for a mistake – transparency
• fix the problem quickly – tractability

Classification | Algorithm
Instance based classifier.
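
The slide names only the family of algorithm. As one common instance of it, a k-nearest-neighbor majority vote could look like the sketch below; the choice of k-NN, the value of `k` and the `classify` signature are assumptions, not details from the talk.

```python
from collections import Counter


def classify(query, labeled, distance, k=3):
    """Label `query` by majority vote among its k nearest labeled instances.

    `labeled` is a list of (vector, label) pairs. The neighbors are returned
    as well, so a verdict can be traced back to the instances that caused it.
    """
    neighbors = sorted(labeled, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    label, _count = votes.most_common(1)[0]
    return label, neighbors
```

Returning the neighbors fits the two stated requirements: a wrong verdict points at specific stored instances (transparency), and relabeling or removing those instances changes the outcome without retraining a model (tractability).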

Classification | Optimizations
• scaling and HW problems with GPUs
• we invested in algorithmic optimizations: VP-tree, distance bounded search
• hand optimized distance function (assembly)
• CPU version is ~100x faster
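
For readers unfamiliar with the data structure, here is a small textbook-style vantage-point tree with a search that accepts an initial distance bound. It is a generic sketch, not the implementation from the talk: `VPNode`, `build`, `nearest` and the `bound` parameter are illustrative names, and reading "distance bounded search" as "ignore anything farther than a given bound" is an assumption. The pruning is only valid if the distance function satisfies the triangle inequality.

```python
import random


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point, self.radius = point, radius
        self.inside, self.outside = inside, outside


def build(points, distance, rng=None):
    """Recursively split points by their distance to a random vantage point."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = sorted(distance(vantage, p) for p in points)
    radius = dists[len(dists) // 2]                      # median distance
    inside = [p for p in points if distance(vantage, p) <= radius]
    outside = [p for p in points if distance(vantage, p) > radius]
    return VPNode(vantage, radius,
                  build(inside, distance, rng), build(outside, distance, rng))


def nearest(node, query, distance, bound=float("inf")):
    """Return (distance, point) of the nearest point closer than `bound`, else None."""
    best = [bound, None]

    def visit(n):
        if n is None:
            return
        d = distance(query, n.point)
        if d < best[0]:
            best[0], best[1] = d, n.point
        # Triangle inequality: a subtree can only contain a better match if
        # the ball of radius best[0] around the query overlaps its region.
        if d <= n.radius:
            visit(n.inside)
            if d + best[0] > n.radius:
                visit(n.outside)
        else:
            visit(n.outside)
            if d - best[0] <= n.radius:
                visit(n.inside)

    visit(node)
    return None if best[1] is None else (best[0], best[1])
```

A tight `bound` shrinks the search ball from the start, so more subtrees are skipped. With sparse, well separated data, most queries would then touch only a small part of the tree.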

Classification | Deployment
[Diagram: data exchanged between the components Medusa, Scavenger, Avast users and FileRep]
Flow labels as they appear in the diagram:
→ File SHA and user id →
← File prevalence ←
← File classification ←
→ File fingerprint →
← Generic detections ←
↑ File classifications and Evo-gen detections
→ Threats →
Set updates ↓

Rule generator
• detect more variants in the wild
• (our) rule is a conjunction of several conditions
• known as Win32:Evo-Gen
• completely different optimization problem than classification - still uses the GPU
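
To make "a conjunction of several conditions" concrete, a rule can be sketched as a list of attribute comparisons that must all hold. The attribute names are the ones shown earlier in the slides; the thresholds, the `matches` and `coverage` helpers and the idea of scoring a rule by malware hits versus clean hits are assumptions made for illustration.

```python
import operator

# A rule is a list of (attribute, comparison, constant) conditions.
# The constants below are invented; real rules are not shown in the slides.
RULE = [
    ("pe_sect_cnt", "==", 4),
    ("pe_sect_rawoff_entry", ">=", 1024),
    ("pe_sect_rawoff_entry", "<", 4096),
]

OPS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}


def matches(rule, record):
    """True only when the record satisfies every condition (logical AND)."""
    return all(OPS[op](record[attr], value) for attr, op, value in rule)


def coverage(rule, malware, clean):
    """Count malware records the rule detects and clean records it wrongly hits."""
    hits = sum(matches(rule, r) for r in malware)
    false_positives = sum(matches(rule, r) for r in clean)
    return hits, false_positives
```

Under this reading, generating a rule means searching for conditions that maximize the first number while keeping the second at zero. That is a search over rules rather than over neighbors, which would explain why the slide calls it a different optimization problem from classification.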




Similar to Libor Mořkovský - Recognizing Malware (20)

DEF CON 27 - CHRISTOPHER ROBERTS - firmware slap

Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware

Eusecwest

Watchtowers of the Internet - Source Boston 2012

Building next gen malware behavioural analysis environment

Software Analytics: Data Analytics for Software Engineering and Security

Adversarial machine learning for av software

Patching Windows Executables with the Backdoor Factory | DerbyCon 2013

The Hacking Games - Operation System Vulnerabilities Meetup 29112022

Defending Your "Gold"

You suck at Memory Analysis

ShaREing Is Caring

Malware collection and analysis

CONFidence 2017: Hiding in plain sight (Adam Burt)

Intro2 malwareanalysisshort

BinaryPig - Scalable Malware Analytics in Hadoop

Sans london april sans at night - tearing apart a fileless malware sample

0box Analyzer--Afterdark Runtime Forensics for Automated Malware Analysis and...

Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...

Discovering Vulnerabilities For Fun and Profit

Libor Mořkovský - Recognizing Malware

1. Recognizing malware Libor Mořkovský

2. Computer virus
(figure: bacterial cell, based on work by Anderson Brito)

3. Computer virus
(figure labels: executable file, entry point)

4. Computer virus
Inserting code into files is never “good”.
(figure labels: executable file, entry point)

5. Malware
How do you recognize a thief?
(image courtesy of Looking Glass Studios)

6. Malware
How do you recognize a thief?
(images courtesy of Looking Glass Studios, Twentieth Century Fox)

7. Malware
How do you recognize a thief?
(images courtesy of Looking Glass Studios, Twentieth Century Fox, Paramount Pictures)

8. Malware
• completely different behaviors are considered “bad”
• we need a judge to decide who crossed the line

9. Malware | Many faces
• unlike real thieves, malware can be duplicated
• not only duplicated, but also modified
• all this is done by machines
• too much work to judge each one manually

11. Finding similar files
• need a file representation
• need a distance function

12. Finding similar files | File vector
• each executable file is represented by a feature vector
• the PE format is complex, so we keep exactly one version of the extractor code (C++)
• the vector comprises static and dynamic features; the exact content is proprietary

Database record
• One record = constant vector of over 100 attributes
• the “file fingerprint”
• Each attribute has a data type and semantic

Attribute               Data Type       Semantic
sha256                  32 byte array   CHECKSUM
pe_sect_cnt             uint16_t        VALUE
pe_sect_rawoff_entry    uint32_t        OFFSET

• The complete contents of the vector are kept secret
• static and dynamic features of PE executables
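
A record of this kind could be modelled as a fixed schema of typed attributes. The following is only a minimal Python sketch: the three attributes and their semantics are the ones listed on the slide, while the names `Semantic`, `Attribute`, `SCHEMA`, and `make_record` are hypothetical, and the real vector (over 100 attributes, produced by the C++ extractor) is proprietary.

```python
from dataclasses import dataclass
from enum import Enum


class Semantic(Enum):
    CHECKSUM = "CHECKSUM"
    VALUE = "VALUE"
    OFFSET = "OFFSET"


@dataclass(frozen=True)
class Attribute:
    name: str
    dtype: str          # data type as written on the slide
    semantic: Semantic


# Only the three attributes shown on the slide; the full list is not public.
SCHEMA = (
    Attribute("sha256", "32 byte array", Semantic.CHECKSUM),
    Attribute("pe_sect_cnt", "uint16_t", Semantic.VALUE),
    Attribute("pe_sect_rawoff_entry", "uint32_t", Semantic.OFFSET),
)


def make_record(**values):
    """Build a constant-layout record; every schema attribute is required."""
    missing = [a.name for a in SCHEMA if a.name not in values]
    if missing:
        raise ValueError(f"missing attributes: {missing}")
    # The record is a tuple ordered exactly like SCHEMA.
    return tuple(values[a.name] for a in SCHEMA)
```

Keeping the layout constant means every record can be compared attribute by attribute, which is what the distance function on the next slide relies on.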

13. Finding similar files | Distance
• sum of partial distances
• each distance operator assigned manually
• weights assigned manually to equalize contribution

Nearest neighbor query
• Compound distance function
• Data type and semantic determine the partial distance function

Data Type       Semantic    Partial distance function
32 byte array   CHECKSUM    RETURN_ZERO
uint16_t        VALUE       EQUAL_RET32
uint32_t        OFFSET      LOG

• Each partial distance function = one kernel function
• Over 100 kernels for every NN query
• Intermediate results kept in the “Scratchpad”
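
The compound distance described here, a weighted sum of per-attribute partial distances, might look roughly like the Python sketch below. The slide gives only the names RETURN_ZERO, EQUAL_RET32, and LOG; the behavior of `equal_ret32` and `log_distance`, the weights, and all Python identifiers are assumptions made for illustration, not the actual kernels.

```python
import math


def return_zero(a, b):
    # RETURN_ZERO: the attribute never contributes to the distance.
    return 0.0


def equal_ret32(a, b):
    # ASSUMPTION: 0 when the values match, a constant 32 otherwise.
    return 0.0 if a == b else 32.0


def log_distance(a, b):
    # ASSUMPTION: absolute difference of the two values on a log2 scale.
    return abs(math.log2(1 + a) - math.log2(1 + b))


# (attribute name, partial distance function, weight).
# The weights are invented; the slide only says they were tuned by hand.
KERNELS = [
    ("sha256", return_zero, 1.0),
    ("pe_sect_cnt", equal_ret32, 0.5),
    ("pe_sect_rawoff_entry", log_distance, 2.0),
]


def compound_distance(x, y):
    """Weighted sum of partial distances over all attributes."""
    return sum(w * f(x[name], y[name]) for name, f, w in KERNELS)
```

For two hypothetical files that differ in section count and have entry-section offsets 1023 and 2047, the sum is 0.5 * 32 + 2.0 * 1 = 18.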

14. Finding similar files | Data
• ~60 M data points
• sparse and well separated (in many cases)

15. Finding similar files | Implementation
• we started with GPUs
• their high memory throughput allows “naive” implementation and rapid prototyping
• column-oriented database

16. Classification | Requirements
• find easily what is responsible for a mistake (transparency)
• fix the problem quickly (tractability)

17. Classification | Algorithm Instance based classifier.
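
An instance-based classifier labels a query by looking at already-labelled examples close to it. The sketch below is one simple variant (nearest labelled neighbor within a distance bound), written under the assumption that this matches the spirit of the talk; the function name, the `max_dist` parameter, and the return shape are hypothetical. Returning the neighbor itself reflects the transparency requirement: a wrong verdict can be traced to the stored instance that caused it.

```python
def classify(query, labelled, distance, max_dist):
    """Return (label, neighbor, dist) for the closest labelled instance.

    labelled: iterable of (vector, label) pairs, e.g. label in
              {"CLEAN", "MALWARE"}.
    Returns ("UNKNOWN", None, None) when nothing lies within max_dist.
    """
    best = None
    for vector, label in labelled:
        d = distance(query, vector)
        if d <= max_dist and (best is None or d < best[2]):
            best = (label, vector, d)
    return best if best is not None else ("UNKNOWN", None, None)
```

Fixing a mistake then amounts to relabelling or removing the offending instance, with no retraining step.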

18. Classification | Optimizations
• scaling and HW problems with GPUs
• we invested in algorithmic optimizations: VP-tree, distance-bounded search
• hand-optimized distance function (assembly)
• CPU version is ~100x faster
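
To illustrate the two algorithmic optimizations named here, the following Python sketch builds a vantage-point tree and runs a nearest-neighbor search that ignores everything farther than a given bound. It is a textbook-style sketch, not the production code: all names are hypothetical, and it assumes the distance obeys the triangle inequality, which the slides do not state for the real compound distance.

```python
import random


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point        # vantage point
        self.radius = radius      # median distance from the vantage point
        self.inside = inside      # subtree: distance to vantage < radius
        self.outside = outside    # subtree: distance to vantage >= radius


def build_vp_tree(points, distance, rng=None):
    if rng is None:
        rng = random.Random(0)
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [(distance(vantage, p), p) for p in points]
    radius = sorted(d for d, _ in dists)[len(dists) // 2]
    inside = [p for d, p in dists if d < radius]
    outside = [p for d, p in dists if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, distance, rng),
                  build_vp_tree(outside, distance, rng))


def nearest_within(root, query, distance, bound):
    """Return (dist, point) of the nearest point with dist <= bound, or None."""
    best = None

    def visit(node):
        nonlocal best, bound
        if node is None:
            return
        d = distance(query, node.point)
        if d <= bound:
            best, bound = (d, node.point), d   # shrink the search bound
        # Triangle inequality: a subtree is skipped when it cannot
        # contain any point within `bound` of the query.
        if d - bound < node.radius:
            visit(node.inside)
        if d + bound >= node.radius:
            visit(node.outside)

    visit(root)
    return best
```

On sparse, well-separated data a tight bound lets the search discard most subtrees early, which is presumably why the two techniques are mentioned together.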

19. Classification | Deployment
(diagram; components: Medusa, Scavenger, Avast users, FileRep)
→ File SHA and user id →
← File prevalence ←
← File classification ←
→ File fingerprint →
← Generic detections ←
↑ File classifications and Evo-gen detections
→ Threats →
Set updates ↓

22. Rule generator
• detect more variants in the wild
• (our) rule is a conjunction of several conditions
• known as Win32:Evo-Gen
• completely different optimization problem than classification; still uses the GPU
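
A rule that is a conjunction of conditions fires only when every condition holds for a file. The Python sketch below shows just that evaluation step; the condition builders, attribute values, and the example rule are hypothetical, and the way rules are actually learned (the GPU optimization mentioned above) is not covered.

```python
def attr_equals(name, value):
    """Condition: the attribute is present and equals `value`."""
    return lambda features: features.get(name) == value


def attr_in_range(name, lo, hi):
    """Condition: the attribute is present and lies in [lo, hi]."""
    return lambda features: name in features and lo <= features[name] <= hi


def rule_matches(rule, features):
    """A rule is a list of conditions; it matches when all of them hold."""
    return all(condition(features) for condition in rule)


# Hypothetical rule over the attributes shown earlier in the talk.
example_rule = [
    attr_equals("pe_sect_cnt", 3),
    attr_in_range("pe_sect_rawoff_entry", 0x400, 0x1000),
]
```

Because each condition narrows the match, adding conditions trades coverage of variants for fewer false positives.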

24. Q&A
