This is the extended offline version of
an overview of the Portable Executable format and its malformations
presented at Hashdays in Lucerne on 3 November 2012
direct download link: http://corkami.googlecode.com/files/ange_albertini_hashdays_2012.zip
This document describes john-devkit, an experiment to generate optimized C code for hash cracking algorithms in John the Ripper. It aims to separate algorithms, optimizations, and device-specific code to improve performance and scalability. Early results show speed improvements for some formats over John the Ripper's default implementation. The document discusses optimizations like interleaving, vectorization, and early reject that can be applied to any algorithm without effort. It also describes the intermediate language and optimizations specific to password cracking used by john-devkit to generate optimized output code.
This document discusses scalable network transfers and how libcurl addresses scalability issues. It begins with an overview of traditional blocking socket approaches that do not scale well with many connections. It then covers event-based and asynchronous approaches using libraries like libevent that allow handling thousands of simultaneous connections efficiently. Libcurl supports these approaches through its multi-socket API, allowing applications to take advantage of libcurl's many protocols while using scalable event models. The document provides code examples of traditional, libcurl multi, and libcurl multi-socket styles.
This document discusses dynamic analysis of PHP web applications. It begins by explaining what dynamic analysis is and its benefits and limitations. It then surveys the current state of tools for PHP dynamic analysis, including code instrumentation tools, patches and extensions for PHP interpreters, and external profiling tools. A key focus is on developing a PHP extension for dynamic analysis, as it allows full control and transparency. The document outlines the capabilities of a PHP extension, such as handling function entry and exit, working with opcodes, and hooking dynamically evaluated strings. It introduces PVT, a new PHP dynamic analysis tool implemented as a PHP extension, covering its features and providing statistics on its performance. It concludes with plans for further improving PVT and references.
This is the fourteenth (and last for now) set of slides from a Perl programming course that I held some years ago.
I want to share it with everyone looking for intransitive Perl-knowledge.
A table of contents for all presentations can be found at i-can.eu.
The source code for the examples and the presentations in ODP format are on https://github.com/kberov/PerlProgrammingCourse
RIPS - static code analyzer for vulnerabilities in PHP - Sorina Chirilă
RIPS is a PHP static source code analyzer based on PIXY that detects vulnerabilities like SQL injection and cross-site scripting. It works by splitting code into tokens and tracing whether user-supplied data reaches sensitive sinks like vulnerable functions. RIPS has a simple web interface; the case studies demonstrate detection by preparing a local web site and running the analysis on it. Future work includes improving support for object-oriented code and dynamic runtime analysis.
This document discusses oddities that can be found in Portable Executable (PE) file formats. It describes how static elements like the DOS header, image base, entry point, and sections can be manipulated in non-standard ways. It also explains how dynamic loading behaviors are affected by these oddities, such as through the thread local storage, relocations, and different parsing of headers and data directories between on-disk and in-memory formats. The document promotes further understanding of PE files through the use of proof of concepts provided on related websites.
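As a small illustration of why the DOS header matters here, the following sketch follows the `e_lfanew` field to wherever the PE signature actually sits instead of assuming a fixed offset. The helper names `parse_pe_headers` and `build_sample` are hypothetical, and the sample bytes are a synthetic header fragment, not a loadable executable.

```python
import struct


def parse_pe_headers(data: bytes) -> dict:
    """Follow e_lfanew from the DOS header to the PE signature and COFF header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is a 32-bit little-endian file offset stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, sections, _timestamp, _symbols_ptr, _symbols_count,
     optional_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    return {
        "e_lfanew": e_lfanew,
        "machine": machine,
        "number_of_sections": sections,
        "size_of_optional_header": optional_size,
        "characteristics": characteristics,
    }


def build_sample(e_lfanew: int = 0x40) -> bytes:
    """Build a synthetic header fragment (hypothetical test data, not a real PE)."""
    dos = bytearray(e_lfanew)          # zero-filled DOS header and stub area
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, e_lfanew)
    coff = struct.pack("<HHIIIHH", 0x14C, 1, 0, 0, 0, 0xE0, 0x0102)
    return bytes(dos) + b"PE\0\0" + coff


if __name__ == "__main__":
    # The same parser copes with the PE header placed at an unusual offset.
    print(parse_pe_headers(build_sample()))
    print(parse_pe_headers(build_sample(0x100)))
```

A parser that hard-codes the PE header position would fail on the second sample, which is the kind of assumption the proof-of-concept files are meant to challenge.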
C++ is a very widely used programming language. These slides cover the fundamentals of C++ programming and give more detail on object-oriented programming concepts. Each object-oriented concept is explained in detail and in an accessible way, so that it is easy for everyone to understand.
Sergi Álvarez + Roi Martín - radare2: From forensics to bindiffing [RootedCON... - RootedCON
Radare was originally created as a forensics tool but now also supports bindiffing binaries. It can perform multiple search methods on files including regular expressions, strings, and hexpairs. Signatures and magic templates allow parsing unknown file formats. Scripting is supported through Vala bindings. Filesystems can be mounted and partitions analyzed. Bindiffing helps analyze differences between binaries through function and basic block matching and fingerprints. A work-in-progress graphical interface called ragui is also being built.
DEF CON 23 - Ryan o'neil - advances in linux forensics with ecfs - Felipe Prado
ECFS is a process forensics tool that takes high-resolution "snapshots" of processes in the form of custom core dump files. These ECFS files contain more detailed information than traditional core dumps, including fully reconstructed symbol tables and additional metadata. This allows ECFS to precisely detect process infections like injected code or hijacked functions. The document demonstrates how ECFS can be used to analyze malware and detect advanced techniques used by real-world rootkits to evade analysis.
Ning's "Your Own Social Network" application is 160,000 lines of PHP that powers hundreds of thousands of social networks, each different than the others. This talk discusses the static and dynamic analysis techniques that we use at Ning to understand and optimize our platform, including the PHP tokenizer, regular expressions, the vld and xdebug extensions, and the PHP DTrace provider.
José Miguel Esparza - Obfuscation and (non-)detection of malicious PDF files ... - RootedCON
The document discusses techniques for obfuscating malicious PDF files to avoid detection. It begins with an introduction to the PDF format and its object types. It then covers many obfuscation techniques like avoiding characteristic strings, splitting up JavaScript code, encoding strings and names, using uncommon filters, and introducing malformed formatting. The document also analyzes how these techniques can help files evade antivirus detection and complicate analysis by tools. It highlights the peepdf tool for its Python-based interactive PDF analysis capabilities. In its conclusions, it finds that nested PDFs, compressed objects, new filters, encryption, and avoiding characteristic strings are very effective at evading detection.
The document analyzes and classifies 123 PlugX malware samples into 7 groups based on their configurations and relationships to known targeted attacks. The largest group is "Starter" with 24 samples, followed by "*Sys" with 20 samples. Various techniques are used to associate samples, including matching registry values, domains, IP addresses, debug strings, and network ranges. Many samples in the "*Sys" and "WS" groups are found to share the same owner, domains, or network.
The talk is timed to the upcoming release of PHP7 and is intended to review the state of the language and to give a slap to those who still hesitate to make use of the available features.
This document discusses creating user-mode debuggers for Windows. It outlines the key components needed, including an OS support API, PE disassembler, symbol handler, process/thread/module enumerator, and handling of debug events. It then provides details on several of these components, explaining how to access system information, disassemble PE files, retrieve symbols, and monitor debug events from the target process. The goal is to describe the main building blocks for developing a custom debugger on Windows.
The document discusses programming PIC microcontrollers using C and assembly languages. It states that assembly language produces smaller programs but is more tedious than C. C programs require a cross-compiler like MPLAB to convert C code into machine language for the target PIC microcontroller. MPLAB includes header files containing function declarations to program ports, timers, and other microcontroller features. The document provides examples of using basic programming constructs like if/else statements, loops, and library functions with a PIC microcontroller.
This is the thirteenth set of slides from a Perl programming course that I held some years ago.
I want to share it with everyone looking for intransitive Perl-knowledge.
A table of contents for all presentations can be found at i-can.eu.
The source code for the examples and the presentations in ODP format are on https://github.com/kberov/PerlProgrammingCourse
Winnti is malware used by a Chinese threat actor for cybercrime and cyber espionage since 2009. The behavior of Winnti components is well described in a past analysis report by Novetta, but there are now many more variants that behave differently. I will share my RE findings not explained in public reports, including:
- Winnti worker component supporting SMTP protocol,
- Winnti as a loader for other malware families,
- rootkit driver making covert channels by hooking NDIS TCPIP protocol handlers and
- hack tools using the same API hash calculation as Winnti components.
The configuration data of Winnti is important for threat intelligence because it includes campaign IDs that indicate the actor's target organizations or countries. Moreover, as Kaspersky pointed out in the blog, inline 64-bit kernel drivers are sometimes signed with stolen certificates. The certificates are also useful to identify already-compromised targets. I checked about 170 Winnti samples to extract the configurations and certificates. Based on this work, I will show that Winnti targets not only the game and pharmaceutical industries, but also the chemical, e-commerce, electronics and telecommunications industries.
Embedded C program and programming structure for beginners - Kamesh Mtec
Embedded C programming is used to program microcontrollers that are found in many electronic devices. It involves writing code in the C language to control the functioning of embedded systems. Some key aspects of embedded C include using data types like char, int and float to store values in memory, keywords to perform specific tasks, and special function registers to access peripherals like ports and timers. The structure of an embedded C program typically involves comments, preprocessor directives, functions, variables and statements to read inputs, perform operations and output results.
This document summarizes how to encrypt a ZIP file as a valid PNG file using AES in CBC mode. It describes:
1. The first block of the ZIP file and the target PNG header/chunk that it should encrypt to.
2. Decrypting the desired first ciphertext block and XORing the result with the first plaintext block to obtain the IV.
3. Using the key and crafted IV, the ZIP file can be encrypted such that it encrypts to a valid PNG file, taking advantage of flexible file formats and CBC mode properties.
Lua is a lightweight scripting language embedded in many applications like Wireshark and Redis. It is small but powerful, with features like closures, coroutines, and metatables. Lua is embedded via its C API and allows for extending applications with modules written in Lua. Popular modules include LuaSocket and LuaSQL. Lua sees widespread use due to its small size, speed, portability, and ability to extend large C/C++ applications with scripting.
Demonstrates a chosen-ciphertext attack that becomes possible when cryptographic constructs are not used correctly. Detailed steps are given. The slides show how to attack unauthenticated symmetric encryption in OFB mode.
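One well-known form of such an attack is sketched below; the slides may follow different steps. OFB turns a block cipher into a keystream that depends only on the key and IV, so a decryption oracle that accepts any ciphertext other than the target will hand back the keystream when given all zero bytes. The block function here is a SHA-256 based stand-in rather than a real cipher (OFB only needs the forward direction), and the victim setup, message, and function names are hypothetical.

```python
import hashlib
import os

BLOCK = 16


def _block(key: bytes, data: bytes) -> bytes:
    # Stand-in for a block cipher's forward direction (an assumption, not AES).
    return hashlib.sha256(key + data).digest()[:BLOCK]


def ofb_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """OFB encryption and decryption are the same operation: XOR with the keystream."""
    out, state = bytearray(), iv
    for i in range(0, len(data), BLOCK):
        state = _block(key, state)          # next keystream block
        out.extend(a ^ b for a, b in zip(data[i:i + BLOCK], state))
    return bytes(out)


def make_victim():
    """Hypothetical victim: reuses key and IV, and has no integrity check."""
    key, iv = os.urandom(16), os.urandom(BLOCK)
    secret = b"transfer 100 EUR to account 42"
    target = ofb_xor(key, iv, secret)

    def decryption_oracle(ciphertext: bytes) -> bytes:
        if ciphertext == target:
            raise PermissionError("refusing to decrypt the target itself")
        return ofb_xor(key, iv, ciphertext)

    return target, decryption_oracle, secret


def recover_plaintext(target: bytes, oracle) -> bytes:
    # Decrypting zero bytes returns the raw keystream.
    keystream = oracle(bytes(len(target)))
    return bytes(c ^ k for c, k in zip(target, keystream))


if __name__ == "__main__":
    target, oracle, _secret = make_victim()
    print(recover_plaintext(target, oracle))
```

Authenticating the ciphertext (for example with a MAC that is verified before decryption) would make the oracle reject the forged input.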
This document discusses symbolic execution and the Triton framework. It begins with an introduction to symbolic execution and why it is useful for tasks like static and dynamic analysis. Triton is then introduced as a dynamic binary analysis framework that uses symbolic execution. Key components of Triton like the symbolic execution engine and AST representations are described. Finally, SymGDB is presented as a way to combine Triton with GDB to simplify symbolic execution debugging workflows. Examples analyzing crackme programs are provided to demonstrate SymGDB.
HKG15-211: Advanced Toolchain Usage Part 4
---------------------------------------------------
Speaker: Ryan Arnold, Maxim Kuvyrkov, Will Newton, Yvan Roux
Date: February 10, 2015
---------------------------------------------------
★ Session Summary ★
This session is a continuation of the Advanced Toolchain Usage Part 1 & 2 presentations given at LCU14. Parts 3 and 4 will cover a variety of topics, such as: Linker tips and tricks, adding symbol versioning interfaces to a system library, debugging the dynamic linker, debugging applications that use malloc, gcc attributes, manually constructing a backtrace on arm & Aarch64, how to add lightweight debugging to your program, how to use a signal handler appropriately, and TLS Models on Aarch64 and when to use them.
--------------------------------------------------
★ Resources ★
Pathable: https://hkg15.pathable.com/meetings/250792
Video: https://www.youtube.com/watch?v=9AcklY0Cc7U
Etherpad: http://pad.linaro.org/p/hkg15-211
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2015 - #HKG15
February 9-13th, 2015
Regal Airport Hotel Hong Kong Airport
---------------------------------------------------
http://www.linaro.org
Fighting API Compatibility On Fluentd Using "Black Magic" - SATOSHI TAGOMORI
The document discusses Fluentd's changes to its plugin API between versions 0.12 and 0.14. In 0.14, the API was overhauled to separate entry points from implementations and introduce a plugin base class to control data and control flow. A compatibility layer was added to allow most 0.12 plugins to work unmodified in 0.14 by handling calls to overridden methods. However, plugins that override certain methods like #emit may cause errors due to changes in how buffering works.
The document discusses the key elements needed for an embedded Linux system: the toolchain, bootloader, kernel, and userspace. It covers selecting an appropriate toolchain, including the compiler, debugger and C library. It also describes bootloaders, their role in initializing hardware and loading the kernel, and examples like U-Boot. The bootloader passes information about the hardware configuration to the kernel.
Nikhil Anish is seeking a challenging position that encourages learning and creativity. He has over 5 years of work experience in business development roles at Standard Chartered Bank and Axis Bank, where he was responsible for meeting sales targets and managing client relationships. Nikhil has a PGDM in Marketing and Finance from IMS Noida and a B.Com from R.S. College. He is proficient in MS Office and programming languages like C# and C++.
This document contains the table of contents for an issue of PoC||GTFO, a journal for sharing technical content in unconventional ways. It lists over 60 articles across various topics including hardware hacking, firmware reverse engineering, embedded exploitation, and unusual file formats. The sections are numbered and titled with references to hacking, unconventional thinking, and sharing knowledge in new ways.
1. File formats are complex with many stakeholders who interpret specifications differently, leading to divergent implementations over time.
2. Specifications are often incomplete, unclear, non-free, or do not reflect reality, making it difficult to determine what a valid file is.
3. Relying on specifications alone is not sufficient - one must also analyze sample files and code to understand how file formats work in practice.
Presented at Troopers 2016.
When Infosec and Digipres share interests...
TL;DR
- Attack surface with file formats is too big.
- Specs are useless (just a nice ‘guide’), not representing reality.
- We can’t deprecate formats because we can’t preserve them and we can’t define how they really work.
- We need good open libraries to simplify the landscape, and a corpus that expresses the reality of file formats, which gives us real “documentation”.
- Then we can preserve and deprecate older formats, which reduces the attack surface.
- From then on, we can focus on making the present more secure.
- We don't need new formats: reality will diverge from the specs anyway - we need 'alive' (up to date, traceable) specs.
This document discusses binary file formats and creating visual documentation. It notes that specifications are imperfect and there are security consequences. Formats have diverse properties like headers, signatures, offsets. Visual docs should be self-contained, for a defined audience, and remove unnecessary details. The goal is creating useful documentation based on reality. Questions are welcome.
The document provides a step-by-step guide to writing a basic "Hello World" PDF file. It explains the overall PDF file structure and key elements like the file body, cross-reference table, trailer, and objects. Objects are used to define things like the catalog, pages, and a single page. The guide demonstrates creating three objects - one for the catalog that refers to a pages object, which in turn refers to a page object defining a single page.
This document summarizes Ange Albertini's talk on "Funky file Formats". The talk discusses how files can take on multiple formats by exploiting ambiguities and tolerance in file specifications. Examples are given of files that are valid images, archives, documents, and encrypted files simultaneously. The talk also covers steganography techniques like hiding files within other file formats by manipulating metadata or unused portions of file specifications. Overall, the talk illustrates the concept of "format polymorphism" where single files can masquerade as multiple file types to evade detection or trigger different parser behaviors.
Hunting and Exploiting Bugs in Kernel Drivers - DefCamp 2012DefCamp
This document provides an introduction to exploiting vulnerabilities in Windows kernel drivers for privilege escalation. It discusses the differences between user mode and kernel mode, how drivers communicate with user programs through I/O requests, techniques for analyzing and fuzzing drivers, potential privilege escalation methods like overwriting function pointers and token stealing, and how to set up a kernel debugging environment. The overall goal is to find bugs in kernel drivers that could allow gaining kernel-level code execution and full system access.
This document discusses Linux support for the Belgian electronic identity (eID) card. It provides an overview of the key components involved, including the JavaCard applet on the card, PC/SC for communicating with readers, CCID for USB protocols, PKCS#11 for cryptographic functions, and Firefox/Chromium plugins. It also covers options for using the eID card for SSH, PDF signing, and mutual TLS or via an eid-applet for HTTP servers. Support is provided for several major Linux distributions and versions via packaged middleware.
This document summarizes a talk about abusing file format parsers to cause different parsing behaviors, known as "schizophrenia". It describes techniques used across various formats like ZIP, BMP, PDF, GIF and PE files that can result in files being parsed or interpreted differently depending on factors like the parsing order, which part of a program does the parsing, or which specifications are followed. The goal is to fool parsers without causing failures by leveraging ambiguity and flexibility in file specifications.
Patching Windows Executables with the Backdoor Factory | DerbyCon 2013midnite_runr
Patching Windows Executives with the Backdoor Factory is a presentation about binary patching techniques. It discusses the history of patching, how key generators and Metasploit patch binaries, and how the author learned to manually patch binaries. The presentation then introduces the Backdoor Factory tool, which can automatically patch Windows binaries by injecting shellcode into code caves. It demonstrates patching via code cave insertion, single cave jumps, and cave jumping. Mitigations like self-validation and antivirus are discussed.
The document provides an introduction to Linux kernel modules. It discusses that kernel modules extend the capabilities of the Linux kernel by executing code as part of the kernel. It then describes the anatomy of a kernel module, including initialization and cleanup functions. The document demonstrates a simple "hello world" kernel module example and how to build, load and unload kernel modules. It also introduces the idea of character device drivers as a more advanced kernel module example.
Lightweight Virtualization with Linux Containers and Docker | YaC 2013dotCloud
This document provides an overview of lightweight virtualization using Linux containers and Docker. It begins by explaining the problems of deploying applications across different environments and targets, and how containers can help solve this issue similarly to how shipping containers standardized cargo transportation. It then discusses what Linux containers are, how they provide isolation using namespaces and cgroups. It introduces Docker and how it builds on containers to further simplify deployment by allowing images to be easily built, shared, and run anywhere through standard formats and tools.
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Docker, Inc.
Docker provides a standardized way to build, ship, and run Linux containers. It uses Linux kernel features like namespaces and cgroups to isolate containers and make them lightweight. Docker allows building container images using Dockerfiles and sharing them via public or private registries. Images can be pulled and run anywhere. Docker aims to make containers easy to use and commoditize the container technology provided by Linux containers (LXC).
This presentations introduces some common protocols used in electronics, and how to sniff/speak them. Then a bit about USB, and some interesting hacks with these things.
Then a bit about openwrt and router hacking.
hashdays 2011: Ange Albertini - Such a weird processor - messing with x86 opc...Area41
Whether it's for malware analysis, vulnerability research or emulation, having a correct disassembly of a binary is the essential thing you need when you analyze code. Unfortunately, many people are not aware that there are a lot of opcodes that are rarely used in normal files, but valid for execution, but also several common opcodes have rarely seen behaviours, which could lead to wrong conclusions after an improper analysis.
For this research, I decided to go back to the basics and study assembly from scratch, covering all opcodes, whether they're obsolete or brand new, common or undocumented. This helped me to find bugs in all the disassemblers I tried, including the most famous ones. This presentation introduces the funniest aspects of the x86 CPUs, that I discovered in the process, including unexpected or rarely known opcodes and undocumented behavior of common opcodes.
The talk will also cover opcodes that are used in armored code (malware/commercial protectors) that are likely to break tools (disassemblers, analyzers, emulators, tracers,...), and introduce some useful tools and documents that were created in the process of the research.
Bio: Ange Albertini is a reverse-engineering and assembly language enthusiast for around 20 years, and malware analyst for 6 years. He has a technical blog, where he shares experimental sources files, and some infographics that are useful in his daily work.
Reverse Engineering 101
Michael Pavle on April 28, 2023
Learn the fundamental tools and skills to take a look under the hood of your favourite programs; we'll be covering compilers, assembly language, and software used to disassemble and analyze executables.
https://github.com/utmgdsc/GDSC_Reversing_Workshop
This document summarizes a presentation on reverse engineering OS X drivers. It discusses the structure of the OS X kernel, drivers, and kernel extensions. It outlines some of the challenges in reverse engineering OS X drivers, such as parsing C++ code and dependencies, and describes approaches to address these challenges like processing relocation information and parsing DWARF files to build a kernel type library in IDA.
Finding Xori: Malware Analysis Triage with Automated DisassemblyPriyanka Aash
"In a world of high volume malware and limited researchers we need a dramatic improvement in our ability to process and analyze new and old malware at scale. Unfortunately what is currently available to the community is incredibly cost prohibitive or does not rise to the challenge. As malware authors and distributors share code and prepackaged tool kits, the corporate sponsored research community is dominated by solutions aimed at profit as opposed to augmenting capabilities available to the broader community. With that in mind, we are introducing our library for malware disassembly called Xori as an open source project. Xori is focused on helping reverse engineers analyze binaries, optimizing for time and effort spent per sample.
Xori is an automation-ready disassembly and static analysis library that consumes shellcode or PE binaries and provides triage analysis data. This Rust library emulates the stack, register states, and reference tables to identify suspicious functionality for manual analysis. Xori extracts structured data from binaries to use in machine learning and data science pipelines.
We will go over the pain-points of conventional open source disassemblers that Xori solves, examples of identifying suspicious functionality, and some of the interesting things we've done with the library. We invite everyone in the community to use it, help contribute and make it an increasingly valuable tool for researchers alike."
Linux kernel tracing superpowers in the cloudAndrea Righi
The Linux 4.x series introduced a new powerful engine of programmable tracing (BPF) that allows to actually look inside the kernel at runtime. This talk will show you how to exploit this engine in order to debug problems or identify performance bottlenecks in a complex environment like a cloud. This talk will cover the latest Linux superpowers that allow to see what is happening “under the hood” of the Linux kernel at runtime. I will explain how to exploit these “superpowers” to measure and trace complex events at runtime in a cloud environment. For example, we will see how we can measure latency distribution of filesystem I/O, details of storage device operations, like individual block I/O request timeouts, or TCP buffer allocations, investigating stack traces of certain events, identify memory leaks, performance bottlenecks and a whole lot more.
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleJérôme Petazzoni
What's Docker, why does it matter, how does it use Linux Containers, why should you use it, and how? You'll find answers to those questions (and a bit more) in this presentation, given February 20th 2014 at the Large Scale Production Engineering Meet-Up at Yahoo, in Sunnyvale.
A Kernel of Truth: Intrusion Detection and Attestation with eBPFoholiab
"Attestation is hard" is something you might hear from security researchers tracking nation states and APTs, but it's actually pretty true for most network-connected systems!
Modern deployment methodologies mean that disparate teams create workloads for shared worker-hosts (ranging from Jenkins to Kubernetes and all the other orchestrators and CI tools in-between), meaning that at any given moment your hosts could be running any one of a number of services, connecting to who-knows-what on the internet.
So when your network-based intrusion detection system (IDS) opaquely declares that one of these machines has made an "anomalous" network connection, how do you even determine if it's business as usual? Sure you can log on to the host to try and figure it out, but (in case you hadn't noticed) computers are pretty fast these days, and once the connection is closed it might as well not have happened... Assuming it wasn't actually a reverse shell...
At Yelp we turned to the Linux kernel to tell us whodunit! Utilizing the Linux kernel's eBPF subsystem - an in-kernel VM with syscall hooking capabilities - we're able to aggregate metadata about the calling process tree for any internet-bound TCP connection by filtering IPs and ports in-kernel and enriching with process tree information in userland. The result is "pidtree-bcc": a supplementary IDS. Now whenever there's an alert for a suspicious connection, we just search for it in our SIEM (spoiler alert: it's nearly always an engineer doing something "innovative")! And the cherry on top? It's stupid fast with negligible overhead, creating a much higher signal-to-noise ratio than the kernels firehose-like audit subsystems.
This talk will look at how you can tune the signal-to-noise ratio of your IDS by making it reflect your business logic and common usage patterns, get more work done by reducing MTTR for false positives, use eBPF and the kernel to do all the hard work for you, accidentally load test your new IDS by not filtering all RFC-1918 addresses, and abuse Docker to get to production ASAP!
As well as looking at some of the technologies that the kernel puts at your disposal, this talk will also tell pidtree-bcc's road from hackathon project to production system and how focus on demonstrating business value early on allowed the organization to give us buy-in to build and deploy a brand new project from scratch.
DevSecCon London 2019: A Kernel of Truth: Intrusion Detection and Attestation...DevSecCon
Matt Carroll
Infrastructure Security Engineer at Yelp
"Attestation is hard" is something you might hear from security researchers tracking nation states and APTs, but it's actually pretty true for most network-connected systems!
Modern deployment methodologies mean that disparate teams create workloads for shared worker-hosts (ranging from Jenkins to Kubernetes and all the other orchestrators and CI tools in-between), meaning that at any given moment your hosts could be running any one of a number of services, connecting to who-knows-what on the internet.
So when your network-based intrusion detection system (IDS) opaquely declares that one of these machines has made an "anomalous" network connection, how do you even determine if it's business as usual? Sure you can log on to the host to try and figure it out, but (in case you hadn't noticed) computers are pretty fast these days, and once the connection is closed it might as well not have happened... Assuming it wasn't actually a reverse shell...
At Yelp we turned to the Linux kernel to tell us whodunit! Utilizing the Linux kernel's eBPF subsystem - an in-kernel VM with syscall hooking capabilities - we're able to aggregate metadata about the calling process tree for any internet-bound TCP connection by filtering IPs and ports in-kernel and enriching with process tree information in userland. The result is "pidtree-bcc": a supplementary IDS. Now whenever there's an alert for a suspicious connection, we just search for it in our SIEM (spoiler alert: it's nearly always an engineer doing something "innovative")! And the cherry on top? It's stupid fast with negligible overhead, creating a much higher signal-to-noise ratio than the kernels firehose-like audit subsystems.
This talk will look at how you can tune the signal-to-noise ratio of your IDS by making it reflect your business logic and common usage patterns, get more work done by reducing MTTR for false positives, use eBPF and the kernel to do all the hard work for you, accidentally load test your new IDS by not filtering all RFC-1918 addresses, and abuse Docker to get to production ASAP!
As well as looking at some of the technologies that the kernel puts at your disposal, this talk will also tell pidtree-bcc's road from hackathon project to production system and how focus on demonstrating business value early on allowed the organization to give us buy-in to build and deploy a brand new project from scratch.
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...Yandex
Lightweight virtualization", also called "OS-level virtualization", is not new. On Linux it evolved from VServer to OpenVZ, and, more recently, to Linux Containers (LXC). It is not Linux-specific; on FreeBSD it's called "Jails", while on Solaris it’s "Zones". Some of those have been available for a decade and are widely used to provide VPS (Virtual Private Servers), cheaper alternatives to virtual machines or physical servers. But containers have other purposes and are increasingly popular as the core components of public and private Platform-as-a-Service (PAAS), among others.
Just like a virtual machine, a Linux Container can run (almost) anywhere. But containers have many advantages over VMs: they are lightweight and easier to manage. After operating a large-scale PAAS for a few years, dotCloud realized that with those advantages, containers could become the perfect format for software delivery, since that is how dotCloud delivers from their build system to their hosts. To make it happen everywhere, dotCloud open-sourced Docker, the next generation of the containers engine powering its PAAS. Docker has been extremely successful so far, being adopted by many projects in various fields: PAAS, of course, but also continuous integration, testing, and more.
Bsdtw17: george neville neil: realities of dtrace on free-bsdScott Tsai
This document summarizes a talk on the history and current state of DTrace, a dynamic tracing framework originally developed for Solaris and later ported to FreeBSD and MacOS. It discusses how DTrace has been used for performance analysis, distributed systems tracing, and teaching operating systems. Recent improvements include machine-readable output, new providers, and performance tuning. Future work includes the OpenDTrace cross-platform project and improving the D programming language used to write probes.
Compiler design notes phases of compilerovidlivi91
The document discusses compiling code and makefiles. It explains that compiling code translates code written in high-level languages into machine-readable machine code. There are typically four steps to compiling C code: preprocessing, compiling, assembling, and linking. Makefiles automate the process of compiling and linking code by defining rules and dependencies between files. They allow recompiling only what is necessary when files change.
The document appears to be a block of random letters with no discernible meaning or purpose. It consists of a series of letters without any punctuation, formatting, or other signs of structure that would indicate it is meant to convey any information. The document does not provide any essential information that could be summarized.
Similar to Binary art - Byte-ing the PE that fails you (extended offline version) (20)
"Technical challenges"? More like horrors!
Let's explore first the technical debt of old file formats,
with the evolution of the "MP3" format.
Then we go through more recent forms of file format abuses and tools:
polyglots, polymocks, and crypto-polyglots.
Last, an overview of recent collisions and other forms of art with MD5.
They say that with file formats, "specs are enough".
Should we laugh, cry or run away screaming?
Presented at Digital Preservation Coalition's CyberSec & DigiPres event.
The document discusses different archive formats and their relationships. It begins with an introduction to the presenter and then covers zlib, gzip, and zip file formats. Zlib and gzip both wrap deflate compression, but in different ways, so while the compressed data can be transferred between them, the formats are not directly compatible. Zip can use deflate but also other compression methods and a different one for each file. In conclusion, deflate is a common algorithm while the various formats wrap it with different headers and metadata.
This document is a slide presentation about hash collisions and generating polyglot files that have the same hash but different content. It discusses existing attacks on hashes like MD5 and SHA1 that allow two files to be generated with the same hash. It then explains how collisions can be generated for ZIP and TAR.GZ files by manipulating the ZIP file format in a way that maintains compatibility with ZIP parsers but results in different files with the same hash. Examples of colliding file pairs are shown with identical prefixes and suffixes and differing collision blocks in the middle.
You are *not* an idiot ~ or maybe we're all idiots.
Keynote at NorthSec 2021.
Talking about school, failure, success, diploma, impostor syndrom, manipulators, burn out, suicide, and how to deal with them.
The talk delivery was more personal, the slides are kept generic.
The recording is available @ https://youtu.be/Iu70J49bPlE?t=20869 (starts at 5:47:49)
The document discusses the author's experience with malware and file formats over 13 years, noting how specifications are often outdated and incomplete which can lead to misunderstandings. It advocates for better tools to analyze, document, and validate file formats to improve understanding of their current usage and behaviors. The author has created several open source projects focused on file format analysis and validation.
Demystifying hash collisions.
Pass the Salt, 1st July 2019.
video @ https://passthesalt.ubicast.tv/videos/kill-md5-demystifying-hash-collisions/
Hack.Lu, 22 October 2019.
video @ https://www.youtube.com/watch?v=JXazRQ0APpI
Beyond your studies ~ You studied X at Y. now what?
HackPra, July 2018
A student's life ago, the author somehow managed to graduate.
On the way, he made a lot of mistakes -- and he still does.
A few people since called him 'successful', but LOL, if only they knew....
And now, the author will do another (big!) mistake:
instead of hiding in shame as he probably should,
he'll share his mistakes with anyone bored enough to attend,
in the hope that he's the last person to ever look that dumb to commit such mistakes.
If you're a genius and you know what to do in life, please skip this. Seriously.
If, like the author at the time, you wonder WTF is going on with graduation, professional work and life, then hopefully you learn a few things. Maybe.
Btw the author is 42 (WTF - old!).
Maybe that will help to provide a few answers.
This document provides an introduction and overview of Inkscape, an open-source vector graphics editor. It discusses Inkscape's features such as its use of Scalable Vector Graphics (SVG), tools for drawing objects and manipulating nodes, layers, transformations, and more. The document also includes tutorials for tasks like tracing an image, creating a poster, and converting code snippets to SVG. Throughout, it emphasizes that Inkscape is non-destructive and files remain editable, while also noting some limitations like unsupported gradients along paths.
The document discusses the author's perspectives on file formats after over 30 years of experience working with computers and digital preservation. The author believes specifications are imperfect and do not fully define what constitutes a valid file, as implementations can interpret specifications differently and become outdated. The author has experimented with creating extreme files that push the boundaries of specifications in order to understand formats better and find potential issues.
Game developers are able to create better video games than what the limitations of computers allow by understanding how things truly work at a detailed level. They discovered tricks to get around limitations, such as updating colors rapidly to display more than the limited palette or changing sounds quickly to generate new voices. Understanding the underlying systems allows developers to creatively solve problems like drawing huge animated monsters that surpass the small allowed object sizes. This knowledge of how things really function provides advantages beyond initial restrictions.
This document discusses potential leaks that can occur from PDF documents, specifically from text, images, and drawings embedded in the pages. Even if text is invisible, images are not displayed, or drawings are covered, this information can still be extracted from the PDF. Importing or copying parts of a PDF does not necessarily limit the content, as the full document is often brought in and only a "limiting view" is applied. The only fully reliable way to prevent leaks is to convert the PDF pages to individual image files. In general, the PDF format has many issues preventing leaks and poses a large attack surface due to embedded metadata.
video https://www.youtube.com/watch?v=vg7LPcFUxg8
audio / HD video download http://media.ccc.de/browse/congress/2014/31c3_-_5997_-_en_-_saal_6_-_201412282030_-_preserving_arcade_games_-_ange_albertini.html
complete animated presentation + extras (~1Gb):
https://archive.org/details/arcade31c3
more infos @ https://code.google.com/p/corkami/wiki/Arcade
5. Windows executables and more
● since 1993, used in almost every executable
● 32bits, 64bits, .Net
● DLL, drivers, ActiveX...
● also used as data container
● icons, strings, dialogs, bitmaps...
omnipresent in Windows
also EFI boot, CE phones, Xbox,...
(but not covered here)
12. sins & punishments
● official documentation limited and unclear
● just describes standard PEs
● not good enough for security
● crashes (OS, security tools)
● obstacle for 3rd party developments
● hinders automation, classification
● PE or not?
● corrupted, or malware?
● fails best tools → prevents even manual analysis
20. from bottom up
● analyzing what's in the wild
● waiting for malware/corruption to experiment?
● generate complete binaries from scratch
● manually
● no framework/compiler limitation
● concise PoCs
→ better coverage
I share knowledge and PoCs, with sources
29. Header
MZ
DOS header since IBM PC-DOS 1.0 (1981)
PE (or NE/LE/LX/...)
'modern' headers since Windows NT 3.1 (1993)
30. Header
DOS header
(DOS stub) 16 bits
(Rich header) compilation info
'PE headers'
31. DOS Stub
● obsolete 16b code
● prints msg & exits
● still present on all standard PEs
● even 64b binaries
PoC: compiled
32. 'Rich' header
● compiler information
● officially undocumented
● pitiful xor32 encryption
● completely documented by Daniel Pistelli
http://ntcore.com/files/richsign.htm
PoC: compiled
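The xor32 scheme can be sketched in a few lines of Python. The layout assumed here comes from public reverse-engineering notes and is not an official format: a 'Rich' marker followed by the 32-bit key, preceded by a key-XORed block that starts with 'DanS', three padding dwords, then (comp id, count) pairs, with the product id in the high 16 bits of the comp id. `decode_rich_header` and the sample bytes are hypothetical.

```python
import struct

def decode_rich_header(data: bytes):
    """Return (key, [(product_id, build, count), ...]) or None if not found."""
    end = data.find(b"Rich")
    if end < 0 or end + 8 > len(data):
        return None
    # the dword right after 'Rich' is the XOR key
    (key,) = struct.unpack_from("<I", data, end + 4)
    # the encoded block starts with 'DanS' XORed with the key
    marker = struct.pack("<I", struct.unpack("<I", b"DanS")[0] ^ key)
    start = data.rfind(marker, 0, end)
    if start < 0:
        return None
    # assumed layout: 'DanS' + three padding dwords, then 8-byte entries
    body = data[start + 16:end]
    entries = []
    for i in range(0, len(body) - 7, 8):
        comp_id, count = struct.unpack_from("<II", body, i)
        comp_id ^= key
        count ^= key
        entries.append((comp_id >> 16, comp_id & 0xFFFF, count))
    return key, entries

# hypothetical sample: one entry (product 0xAB, build 0x1234, used 5 times)
key = 0x12345678
plain = b"DanS" + b"\0" * 12 + struct.pack("<II", (0x00AB << 16) | 0x1234, 5)
encoded = b"".join(
    struct.pack("<I", struct.unpack_from("<I", plain, i)[0] ^ key)
    for i in range(0, len(plain), 4)
)
sample = b"MZ" + b"\0" * 0x7E + encoded + b"Rich" + struct.pack("<I", key)
print(decode_rich_header(sample))
```

The key sits in clear text right after the marker, so no guessing is needed to decode the block.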
34. DOS header
● obsolete stuff
● only used if started in DOS mode
● ignored otherwise
● tells where the PE header is
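As a minimal sketch of that last point, the pointer to the PE header is the 32-bit little-endian `e_lfanew` field at offset 0x3C of the DOS header. The helper name `find_pe_header` and the sample buffer are hypothetical; a real loader performs more checks than this.

```python
import struct

def find_pe_header(data: bytes) -> int:
    """Follow e_lfanew and return the file offset of the PE signature."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    if len(data) < 0x40:
        raise ValueError("file too short to hold e_lfanew")
    # e_lfanew: 32-bit little-endian field at offset 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("e_lfanew does not point at a PE signature")
    return e_lfanew

# hypothetical buffer: DOS header with e_lfanew = 0x40, then the signature
sample = b"MZ" + b"\0" * 0x3A + struct.pack("<I", 0x40) + b"PE\0\0"
print(hex(find_pe_header(sample)))  # 0x40
```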
36. 'PE Headers'
'NT Headers'
PE00
File header declares the rest
Optional header absent in .obj
Section table
mapping layout
37. File header
● how many sections?
● is there an Optional Header?
● 32b or 64b, DLL or EXE...
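A short sketch of reading those answers from the 20-byte File header that follows the 'PE\0\0' signature. The field order follows the public COFF layout; `parse_file_header` and the sample values are hypothetical.

```python
import struct
from collections import namedtuple

FileHeader = namedtuple(
    "FileHeader",
    "machine number_of_sections time_date_stamp pointer_to_symbol_table "
    "number_of_symbols size_of_optional_header characteristics",
)

IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_AMD64 = 0x8664
IMAGE_FILE_DLL = 0x2000

def parse_file_header(data: bytes, pe_offset: int) -> FileHeader:
    """Parse the File header located right after the 4-byte PE signature."""
    return FileHeader(*struct.unpack_from("<HHIIIHH", data, pe_offset + 4))

# hypothetical sample: i386, 3 sections, standard 32-bit Optional header size
sample = b"PE\0\0" + struct.pack("<HHIIIHH", 0x14C, 3, 0, 0, 0, 0xE0, 0x0102)
hdr = parse_file_header(sample, 0)
print(hdr.number_of_sections)                 # how many sections?
print(hdr.size_of_optional_header != 0)       # is there an Optional header?
print(hdr.machine == IMAGE_FILE_MACHINE_I386) # 32b or 64b
print(bool(hdr.characteristics & IMAGE_FILE_DLL))  # DLL flag set?
```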
39. NumberOfSections values
● 0: Corkami :p
● 1: packer
● 3-6: standard
● code, data, (un)initialized data, imports, resources...
● 16: free basic FTW :D
● what for ?
41. Optional header
● geometry properties
● alignments, base, size
● tells where code starts
● 32/64b, driver/standard/console
● much non-critical information
● data directory
43. Sections
● defines the mapping:
● which part of the file goes where
● what for? (writeable, executable...)
46. Data Directory
● (RVA, Size) DataDirectory[NumberOfRvaAndSizes]
● each of the first 16 entries has a specific use
→ often called 'Data Directories'
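A minimal sketch of reading that array from an Optional header buffer. It assumes the usual offsets (NumberOfRvaAndSizes at 0x5C for PE32, 0x6C for PE32+) and clamps the count to 16, which is a simplification chosen here rather than a statement about any particular loader. `read_data_directories` and the sample are hypothetical.

```python
import struct

def read_data_directories(opt: bytes):
    """Return a list of (rva, size) pairs from an Optional header buffer."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    if magic == 0x10B:      # PE32
        count_off = 0x5C
    elif magic == 0x20B:    # PE32+
        count_off = 0x6C
    else:
        raise ValueError("unknown Optional header magic")
    (count,) = struct.unpack_from("<I", opt, count_off)
    count = min(count, 16)  # assumption: ignore anything past 16 entries
    return [
        struct.unpack_from("<II", opt, count_off + 4 + 8 * i)
        for i in range(count)
    ]

# hypothetical PE32 Optional header with 2 entries: no exports, imports at 0x2000
sample = (
    struct.pack("<H", 0x10B).ljust(0x5C, b"\0")
    + struct.pack("<I", 2)
    + struct.pack("<4I", 0, 0, 0x2000, 0x28)
)
print(read_data_directories(sample))
```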
48. PE DLL
...
call [API] API: …
… ret
Imports Exports
49. Exports
● 3 pointers to 3 lists
● defining in parallel (name, address, ordinal)
● a function can have several names
51. Imports
● a null-terminated list of descriptors
● typically one per imported DLL
● each descriptor specifies
● DLL's name
● 2 null-terminated lists of pointers
– API names and future API addresses
● ImportsAddressTable highlights the address table
● for write access
53. Relocations
● PEs have standard ImageBases
● EXE: 0x400000, DLL: 0x10000000
→ conflicts between DLLs
→ different ImageBase given by the loader
● absolute addresses need relocation
● most addresses of the header are relative
● immediate values in code, TLS callbacks
● adds (NewImageBase - OldImageBase)
55. Resources
● icons, dialogs, version information, ...
● requires only 3 API calls to be used
→ used everywhere
● folder & file structure
● 3 levels in standard
57. Thread Local Storage
● Callbacks executed on thread start and stop
● before EntryPoint
● after ExitProcess
64. SizeOfOptionalHeader
● sizeof(OptionalHeader)
● that would be 0xe0 (32b)/0xf0 (64b)
● much naive software fails if different
● offset(SectionTable) – offset(OptionalHeader)
● can be:
● bigger
– bigger than file (→ virtual table, xp)
● smaller or null (→ overlapping OptionalHeader)
● null (no section at all)
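The relation offset(SectionTable) = offset(OptionalHeader) + SizeOfOptionalHeader can be sketched as follows. The Optional header is taken to start 0x18 bytes after the PE signature (4-byte signature plus 20-byte File header); `section_table_offset` and the sample are hypothetical.

```python
import struct

def section_table_offset(data: bytes, pe_offset: int) -> int:
    """Compute where the section table starts, trusting SizeOfOptionalHeader."""
    # SizeOfOptionalHeader is the 16-bit field at offset 16 of the File header
    (size_of_optional_header,) = struct.unpack_from("<H", data, pe_offset + 4 + 16)
    return pe_offset + 0x18 + size_of_optional_header

def make_sample(size_of_optional_header: int) -> bytes:
    # hypothetical file: PE signature at 0x40, i386, 1 section
    file_header = struct.pack("<HHIIIHH", 0x14C, 1, 0, 0, 0,
                              size_of_optional_header, 0x0102)
    return b"\0" * 0x40 + b"PE\0\0" + file_header

print(hex(section_table_offset(make_sample(0xE0), 0x40)))  # 0x138
print(hex(section_table_offset(make_sample(0), 0x40)))     # 0x58
```

With a null value the section table lands at the very start of the Optional header, which is the overlapping case listed above. A parser that hardcodes 0xe0 or 0xf0 instead of reading the field will look in the wrong place.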
65. Section-less PE
● standard mode:
● 0x200 ≤ FileAlignment ≤ SectionAlignment
● 0x1000 ≤ SectionAlignment
● 'drivers' mode:
● 1 ≤ FileAlignment == SectionAlignment ≤ 0x800
→ virtual == physical
● whole file mapped as is
● sections are meaningless
● can be none, can be many (bogus or not)
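The two alignment modes can be written as a small classifier. It encodes only the inequalities listed above, reading the bounds as hexadecimal (an assumption), and nothing else a real loader may check. The name `alignment_mode` is hypothetical.

```python
def alignment_mode(file_alignment: int, section_alignment: int) -> str:
    """Classify a (FileAlignment, SectionAlignment) pair per the rules above."""
    if 0x200 <= file_alignment <= section_alignment and 0x1000 <= section_alignment:
        return "standard"
    if 1 <= file_alignment == section_alignment <= 0x800:
        # virtual == physical: the whole file is mapped as is
        return "low"
    return "invalid"

print(alignment_mode(0x200, 0x1000))  # standard
print(alignment_mode(4, 4))           # low
print(alignment_mode(0x200, 0x800))   # invalid
```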
67. TinyPE
classic example of hand-made malformation
● PE header in Dos header
● truncated OptionalHeader
● doesn't require a section
● 64b & driver compatible
● 92 bytes
● XP only (no more truncated OptionalHeader)
● extra padding is required since Vista
→ smallest universal PE: 268 bytes
69. Dual 'folded' headers
DD only used after mapping
http://www.reversinglabs.com/advisory/pecoff.php
1. move down header
2. fake DD overlaps start of section (hex art FTW)
3. section area contains real values
● loading process:
1. header and sections are parsed
2. file is mapped
3. DD overwritten with real value
● imports are resolved, etc...
80. EntryPoint change via static DLLs
static DLLs are called before EntryPoint call
● DllMain gets thread context via lpvReserved
● which already contains the future EntryPoint
→ any static DLL can freely change the EntryPoint
documented by Skywing (http://www.nynaeve.net/?p=127),
but not widely known
82. Win32VersionValue
● officially reserved
● 'should be null'
● actually used to override version info in the PEB
● simple dynamic anti-emu
● used in malware
85. Characteristics
● IMAGE_FILE_32BIT_MACHINE
● true for 64b
● not required !!
● IMAGE_FILE_DLL
● not required in DLLs
– exports still useable
– no DllMain call!
● invalid EP → not an EXE
● no FILE_DLL → apparently not a DLL
→ can't be debugged
88. Imports descriptor tricks
● INT bogus or absent
● only DllName and IAT required
● descriptor just skipped if no thunk
● DLL name ignored
– can be null or VERY big
● parsing shouldn't abort too early
● isTerminator = (IAT == 0 || DllName == 0)
● terminator can be virtual or outside file
● first descriptor too
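A sketch of a descriptor walk that follows the terminator rule above instead of waiting for an all-zero entry. Each descriptor is taken as five dwords (INT, TimeDateStamp, ForwarderChain, Name, IAT). Bytes past the end of the buffer are read as zeros to imitate a virtual terminator, a simplification chosen here. `walk_import_descriptors` and the sample table are hypothetical.

```python
import struct

DESCRIPTOR_SIZE = 20  # five 32-bit fields

def walk_import_descriptors(data: bytes, offset: int):
    """Return (name_rva, int_rva, iat_rva) for each descriptor until a terminator."""
    found = []
    while True:
        # missing bytes are treated as zeros (terminator outside the file)
        raw = data[offset:offset + DESCRIPTOR_SIZE].ljust(DESCRIPTOR_SIZE, b"\0")
        int_rva, _stamp, _forwarder, name_rva, iat_rva = struct.unpack("<5I", raw)
        if iat_rva == 0 or name_rva == 0:
            break  # isTerminator = (IAT == 0 || DllName == 0)
        found.append((name_rva, int_rva, iat_rva))
        offset += DESCRIPTOR_SIZE
    return found

# hypothetical table: first descriptor has no INT, and no terminator is on file
table = (
    struct.pack("<5I", 0, 0, 0, 0x2000, 0x2010)
    + struct.pack("<5I", 0x3000, 0, 0, 0x2020, 0x2030)
)
print(walk_import_descriptors(table, 0))
```

The first descriptor is accepted even though its INT is null, since only DllName and IAT are required.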
90. Collapsed imports
advanced imports malformation
● extension-less DLL name
● IAT in descriptor
● pseudo-valid INT that is ignored
● name and hint/names in terminator
● valid because last dword is null
92. Exceptions directory
● 64 bits Structured Exception Handler
● usually with a lot of extra compiler code
● used by W32.Deelae for infection
● Peter Ferrie, Virus Bulletin September 2011
● update-able manually, on the fly
● no need to go through APIs
98. Relocation types (in theory)
HIGHLOW
● standard ImageBase delta
ABSOLUTE
● do nothing
● just for alignment padding
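A minimal sketch of these two types applied to a mapped image. It assumes the usual block layout: a page RVA and a block size, then 16-bit entries with the type in the top 4 bits and the page offset in the low 12. Only HIGHLOW (3) and ABSOLUTE (0) are handled. `apply_relocations` and the sample data are hypothetical.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0
IMAGE_REL_BASED_HIGHLOW = 3

def apply_relocations(image: bytearray, reloc: bytes, delta: int) -> None:
    """Add delta (NewImageBase - OldImageBase) at every HIGHLOW target."""
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed block, stop instead of looping forever
        end = min(pos + block_size, len(reloc))
        for entry_off in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_off)
            rel_type, offset = entry >> 12, entry & 0xFFF
            if rel_type == IMAGE_REL_BASED_ABSOLUTE:
                continue  # padding entry, nothing to do
            if rel_type != IMAGE_REL_BASED_HIGHLOW:
                raise NotImplementedError(f"relocation type {rel_type}")
            target = page_rva + offset
            (value,) = struct.unpack_from("<I", image, target)
            struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
        pos += block_size

# hypothetical image: one absolute address at RVA 0x1010, built for base 0x400000
image = bytearray(0x2000)
struct.pack_into("<I", image, 0x1010, 0x00401234)
reloc = struct.pack("<II", 0x1000, 12) + struct.pack("<HH", (3 << 12) | 0x010, 0)
apply_relocations(image, reloc, 0x10000000 - 0x400000)
print(hex(struct.unpack_from("<I", image, 0x1010)[0]))  # 0x10001234
```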
99. Relocation types in practice
● type 6 and 7 are entirely skipped
● type 8 is forbidden
● type 4 (HIGHADJ) requires a parameter
● that is actually not taken into account (bug)
● type 2 (LOW) doesn't do anything
● because ImageBases are 64 KB aligned
● types MIPS and IA64 are present on all archs
● at last, some cleanup in Windows 8!
101. relocations' archeology
● HIGHADJ was there all along
● MIPS was recognized but rejected by Win95
● NT3.1 introduces MIPS – available in all archs.
● LOW was rejected by Win95/WinME
● while it does nothing on other versions
● Windows 2000 had an extra relocation type,
also with a parameter
Bonus:
Win95 relocations use 2 copies of the exact same code.
code optimization FTW!
103. messing with relocations
● 4 relocation types actually do nothing
● All relocations can be applied on a bogus address
● HighAdj's parameter used as a trick
● Relocations can alter relocations
● one block can alter the next
● Relocations can decrypt data
● set a kernel ImageBase
● default ImageBase is known
● No static analysis possible
● but highly suspicious :D
107. Code in the header
● header is executable
● packers put some data or jumps there
● many unused fields
● many less important fields
● Peter Ferrie
http://pferrie.host22.com/misc/pehdr.htm
→ real code in the header
111. .Net
Loading process:
1. PE loader
• requires only imports (DD[1]) at this stage
2. MSCoree.dll called
3. .Net Loader
● requires CLR (DD[14]) and relocations (DD[5])
● forgets to check NumberOfRvaAndSizes :(
– works with NumberOfRvaAndSizes = 2
breaks IDA and Reflector, but already seen in the wild
113. non-null PE
● LoadLibraryEx with LOAD_LIBRARY_AS_DATAFILE
● data file PE only needs MZ, e_lfanew, 'PE\0\0'
● 'PE' at the end of the file
● pad enough so that e_lfanew doesn't contain 00s
a non-null PE can be created and loaded
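The byte layout described above can be sketched in Python. This only builds the layout and checks that it is null-free; it does not claim the result loads. Assumptions: e_lfanew sits at offset 0x3C as in the standard DOS header, 0x01010101 is used as the smallest dword without a zero byte (hence a file of roughly 16 MB), the filler byte is arbitrary, and the signature's trailing zero bytes are simply not stored.

```python
import struct

E_LFANEW_OFFSET = 0x3C   # standard position of e_lfanew in the DOS header
PE_OFFSET = 0x01010101   # smallest dword value with no zero byte
PAD = b"\x90"            # arbitrary non-null filler (assumption)

def build_non_null_layout():
    """Return bytes with MZ, a null-free e_lfanew, and 'PE' as the last two bytes."""
    data = bytearray(PAD * PE_OFFSET)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, E_LFANEW_OFFSET, PE_OFFSET)
    # Only 'PE' is stored; the two zero bytes of the signature are
    # assumed to come from whatever lies past the end of the file.
    data += b"PE"
    return bytes(data)

blob = build_non_null_layout()
print(len(blob), b"\x00" in blob)
```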
117. subsystems
● no fundamental differences
● low alignments for drivers
● incompatible imports: NTOSKRNL ↔ KERNEL32
● console ↔ gui : IsConsoleAttached
→ a PE with low alignments and no imports
can work in all 3 subsystems
119. a 'naked' PE with code
● low alignments → no section
● no imports → resolve APIs manually
● TLS only → no EntryPoint
no EntryPoint, no section, no imports,
but executed code
126. TLS AddressOfIndex
● pointer to dword
● overwritten with 0, 1... on nth TLS loading
● easy dynamic trick
call <garbage> on file → call $+5 in memory
● handled before imports under XP, not in W7
same working PE, different loading process
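The "call <garbage> → call $+5" trick can be simulated without a loader. A minimal sketch, assuming AddressOfIndex points at the 4-byte operand of an E8 (call rel32) instruction and that the index written is 0; the helper name and the byte values are illustrative, and file offsets stand in for virtual addresses.

```python
import struct

def simulate_tls_index_write(image, address_of_index, index=0):
    """Mimic the loader storing the TLS index as a dword at AddressOfIndex."""
    patched = bytearray(image)
    struct.pack_into("<I", patched, address_of_index, index)
    return bytes(patched)

# On disk: E8 (call rel32) followed by a garbage displacement, then a nop.
on_disk = b"\xE8\xDE\xC0\xAD\x0B\x90"
# AddressOfIndex is assumed to point at the call's operand (offset 1 here).
in_memory = simulate_tls_index_write(on_disk, 1, index=0)
print(in_memory.hex())
```

With a zero displacement the call targets the instruction that follows it, which is what call $+5 means for a 5-byte instruction.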
128. Manifest
● XML resource
● can fail loading
● can crash the OS ! (KB921337)
● Tricky to classify
● ignored if wrong type
Minimum Manifest
<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'/>
129. DllMain/TLS corruption
● DllMain and TLS only require ESI to be correct
● Even ESP can be bogus
● easy anti-emulator
● TLS can terminate with exception
● no error reported
● EntryPoint executed normally
131. a Quine PE
● prints its source
● totally useless – absolutely fun :D
● fills DOS header with ASCII chars
● ASM source between DOS and PE headers
● type-able manually
● types itself in new window when executed
133. a binary polyglot
● add %PDF within 400h bytes
→ your PE is also a PDF (→ Acrobat)
● add PK\x03\x04 anywhere
→ your PE is also a ZIP (→ PKZip)
● throw a Java .CLASS in the ZIP
→ your PE is also a JAR (→ Java)
● add <HTML> somewhere
→ your PE is also an HTML page (→ Mosaic)
● Bonus: Python, JavaScript
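A rough Python sketch of the marker rules listed above. It only spots the signatures (MZ at the start, %PDF within the first 400h bytes, PK\x03\x04 anywhere, an HTML tag somewhere) and does not validate any of the formats; the function name and the sample bytes are made up for illustration.

```python
def polyglot_signatures(data):
    """Return the set of format markers found, following the rules listed above."""
    found = set()
    if data[:2] == b"MZ":
        found.add("PE/MZ")
    if b"%PDF" in data[:0x400]:
        found.add("PDF")
    if b"PK\x03\x04" in data:
        found.add("ZIP")
    if b"<html>" in data.lower():
        found.add("HTML")
    return found

# Hypothetical sample: not a working file, just the four markers in place.
sample = b"MZ" + b"\x90" * 0x3E + b"%PDF-1.4\n" + b"<HTML></HTML>" + b"PK\x03\x04"
print(sorted(polyglot_signatures(sample)))
```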
136. Conclusion
● the Windows executable format is complex
● mostly covered, but many little traps
● new discoveries every day :(
http://pe101.corkami.com
http://pe.corkami.com
137. Questions?
Thanks to
Fabian Sauter, Peter Ferrie, وليد عصر
Bernhard Treutwein, Costin Ionescu, Deroko, Ivanlef0u, Kris Kaspersky, Moritz Kroll, Thomas Siebert,
Tomislav Peričin, Kris McConkey, Lyr1k, Gunther, Sergey Bratus, frank2, Ero Carrera, Jindřich Kubec, Lord
Noteworthy, Mohab Ali, Ashutosh Mehra, Gynvael Coldwind, Nicolas Ruff, Aurélien Lebrun, Daniel
Plohmann, Gorka Ramírez, 최진영, Adam Błaszczyk, 板橋一正, Gil Dabah, Juriaan Bremer, Bruce Dang,
Mateusz Jurczyk, Markus Hinderhofer, Sebastian Biallas, Igor Skochinsky, Ильфак Гильфанов, Alex
Ionescu, Alexander Sotirov, Cathal Mullaney
141. older formats
● 32-bit Windows still supports old EXE and COM
● lower profile formats, evade detection
● an EXE can patch itself back to PE
● can use 'ZM' signature
● only works on disk :(
● a symbols-only COM file can drop a PE
● using Yosuke Hasegawa's http://utf-8.jp/public/sas/
144. file archeology
● bitmap fonts (.FON) are stored in NE format
● created in 1985 for Windows 1.0
● vgasys.fon still present in Windows 8
● file unchanged since 1991 (Windows 3.11)
● font copyrighted in 1984
● Properties show copyright name
→ Windows 8 still (partially) parses
a 16-bit executable format from 1985
146. Drunk opcode
● Lock:Prefetch
● can't be executed
● bogus behavior under W7 x64
● does not trigger an exception either
● modified by the OS (wrongly 'repaired')
● yet still wrong after patching!
infinite loop of silent errors
148. this is the end...
my only friend, the end...