Rspamd is a spam filtering system that is:
- Written in C for performance and uses an event-driven model to process messages asynchronously for scalability.
- Capable of detecting spam through a variety of filtering methods like policies, DNS lists, headers, text patterns, and machine learning.
- Integrates with mail transfer agents using plugins to modify or reject messages based on spam detection.
The document discusses the use of symbols and rules in Rspamd for defining spam detection logic. Symbols represent metadata like rule names and scores, while rules define spam detection expressions using regular expressions and logic. Symbols can be grouped and rules can reference symbols to define dependencies. Composites allow combining multiple rules and introducing logic to remove symbols and weights conditionally. Practical examples show how to define simple regex rules, complex rules combining multiple checks, and composites modifying rule results.
This talk introduces and discusses a novel, mostly unpublished technique to successfully attack websites that are applied with state-of-the-art XSS protection. This attack labeled Mutation-XSS (mXSS) is capable of bypassing high-end filter systems by utilizing the browser and its unknown capabilities - every single f***** one of them. We analyzed the type and number of high-profile websites and applications that are affected by this kind of attack. Several live demos during the presentation will share these impressions and help understanding, what mXSS is, why mXSS is possible and why it is of importance for defenders as well as professional attackers to understand and examine mXSS even further. The talk wraps up several years of research on this field, shows the abhorrent findings, discusses the consequences and delivers a step-by-step guide on how to protect against this kind of mayhem - with a strong focus on feasibility and scalability.
MMUG18 - MySQL Failover and OrchestratorSimon J Mudd
Description of recovery and failover with of MySQL and specifically how orchestrator is designed to make this simpler.
A presentation I gave at the Madrid MySQL Users Group on 17/05/2017.
Walking the Bifrost: An Operator's Guide to Heimdal & Kerberos on macOSCody Thomas
Kerberos on macOS with and without Active Directory (AD). Where are attacks possible in Kerberos and how does the LKDC (Local Key Distribution Center) come into play.
Presented at Objective By The Sea (OBTS) 3.0 in Maui, Hawaii March 2020
This presentation takes a case-study based approach to design patterns. A purposefully simplified example of expression trees is used to explain how different design patterns can be used in practice. Examples are in C#.
We use it every day and we rely on it. But what are the roots of cryptography? How were, for example, the ancient Greeks able to protect information from their enemies? In this talk we will go through 5500 years of developing encryption technologies and look at how these work.
From the Un-Distinguished Lecture Series (http://ws.cs.ubc.ca/~udls/). The talk was given Mar. 23, 2007
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
In this tutorial, we will try all the important instruments out. We will provide a test environment and a few typical problems which could be hardly solved without Performance Schema. You will not only learn how to collect and use this information but have experience with it.
Tutorial at Percona Live Austin 2019
Symmetric Key Encryption Algorithms can be categorized as stream ciphers or block ciphers. Block ciphers like the Data Encryption Standard (DES) operate on fixed-length blocks of bits, while stream ciphers process messages bit-by-bit. DES is an example of a block cipher that encrypts 64-bit blocks using a 56-bit key. International Data Encryption Algorithm (IDEA) is another block cipher that uses a 128-bit key and 64-bit blocks, employing addition and multiplication instead of XOR like DES. IDEA consists of 8 encryption rounds followed by an output transformation to generate the ciphertext from the plaintext and key.
The document discusses the use of symbols and rules in Rspamd for defining spam detection logic. Symbols represent metadata like rule names and scores, while rules define spam detection expressions using regular expressions and logic. Symbols can be grouped and rules can reference symbols to define dependencies. Composites allow combining multiple rules and introducing logic to remove symbols and weights conditionally. Practical examples show how to define simple regex rules, complex rules combining multiple checks, and composites modifying rule results.
This talk introduces and discusses a novel, mostly unpublished technique to successfully attack websites that are applied with state-of-the-art XSS protection. This attack labeled Mutation-XSS (mXSS) is capable of bypassing high-end filter systems by utilizing the browser and its unknown capabilities - every single f***** one of them. We analyzed the type and number of high-profile websites and applications that are affected by this kind of attack. Several live demos during the presentation will share these impressions and help understanding, what mXSS is, why mXSS is possible and why it is of importance for defenders as well as professional attackers to understand and examine mXSS even further. The talk wraps up several years of research on this field, shows the abhorrent findings, discusses the consequences and delivers a step-by-step guide on how to protect against this kind of mayhem - with a strong focus on feasibility and scalability.
MMUG18 - MySQL Failover and OrchestratorSimon J Mudd
Description of recovery and failover with of MySQL and specifically how orchestrator is designed to make this simpler.
A presentation I gave at the Madrid MySQL Users Group on 17/05/2017.
Walking the Bifrost: An Operator's Guide to Heimdal & Kerberos on macOSCody Thomas
Kerberos on macOS with and without Active Directory (AD). Where are attacks possible in Kerberos and how does the LKDC (Local Key Distribution Center) come into play.
Presented at Objective By The Sea (OBTS) 3.0 in Maui, Hawaii March 2020
This presentation takes a case-study based approach to design patterns. A purposefully simplified example of expression trees is used to explain how different design patterns can be used in practice. Examples are in C#.
We use it every day and we rely on it. But what are the roots of cryptography? How were, for example, the ancient Greeks able to protect information from their enemies? In this talk we will go through 5500 years of developing encryption technologies and look at how these work.
From the Un-Distinguished Lecture Series (http://ws.cs.ubc.ca/~udls/). The talk was given Mar. 23, 2007
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow precisely tuning what to instrument. More than 100 consumers store collected data.
In this tutorial, we will try all the important instruments out. We will provide a test environment and a few typical problems which could be hardly solved without Performance Schema. You will not only learn how to collect and use this information but have experience with it.
Tutorial at Percona Live Austin 2019
Symmetric Key Encryption Algorithms can be categorized as stream ciphers or block ciphers. Block ciphers like the Data Encryption Standard (DES) operate on fixed-length blocks of bits, while stream ciphers process messages bit-by-bit. DES is an example of a block cipher that encrypts 64-bit blocks using a 56-bit key. International Data Encryption Algorithm (IDEA) is another block cipher that uses a 128-bit key and 64-bit blocks, employing addition and multiplication instead of XOR like DES. IDEA consists of 8 encryption rounds followed by an output transformation to generate the ciphertext from the plaintext and key.
This document discusses using MongoDB and Mongoid with Ruby on Rails. It covers why MongoDB was chosen, how to set up Mongoid, different types of relationships and queries, and testing. Embedded and referenced relationships are described. Versioning, indexing and other features like callbacks are demonstrated. Hosting options like Heroku, MongoHQ and MongoLab are also mentioned.
Point field types in Solr. Evolution of the Range Queries.Andrey Mikryukov
The document discusses the evolution of range filters in Solr, from the initial naive implementation to optimizations using tries and then point fields. It explains how range filters were initially implemented by breaking them into individual term queries, and how tries were later used at index and query time to improve performance by exploiting common prefixes. The document then introduces point fields, which use a Bkd-tree to index multi-dimensional points in a way that adapts to the data distribution, offering better performance than tries.
This document discusses the architecture for testing new versions and rules of RSPAMD, an email spam filtering software. It proposes using a proxy server to encrypt and load balance traffic between stable and testing clusters. The proxy would immediately return results from the stable cluster and compare results from multiple testing clusters using scripts. This allows testing new versions and rules on live traffic while maintaining a stable filtering environment.
Thermopylae Sciences & Technology chose to customize MongoDB's spatial indexing capabilities to better support their needs for indexing multi-dimensional and geospatial data. They developed a custom R-tree spatial index that leverages existing MongoDB data structures and provides improved performance over MongoDB's existing geohash-based approach. Their custom index supports complex queries on multidimensional geometric shapes and scales to large geospatial datasets through potential sharding and distribution techniques. They have contributed their work back to the MongoDB open source project and collaborate with MongoDB to further integrate their contributions.
PHP remains the most popular server-side language on the Web and the most favoured language for Web attacks. The security vulnerabilities and attack techniques become more sophisticated though. For example, the vulnerability types PHP Object Instantiation and Phar Deserialization are comparatively unknown to traditional types like XSS and SQLi. In this technical talk, we look at a couple of critical security bugs found in popular open source PHP applications, such as WordPress, WooCommerce and Shopware. We will focus on fundamental design flaws and new state-of-the-art exploitation techniques that are used by attackers to compromise web servers through these issues which can occur in any other application as well.
This document provides an overview and instructions for installing and configuring ProxySQL. It discusses:
1. What ProxySQL is and its functions like load balancing and query caching
2. How to install ProxySQL on CentOS and configure the /etc/proxysql.cnf file
3. How to set up the ProxySQL schema to define servers, users, variables and other settings needed for operation
4. How to test ProxySQL functions like server status changes and benchmark performance
Presented at Percona Live Amsterdam 2016, this is an in-depth look at MariaDB Server right up to MariaDB Server 10.1. Learn the differences. See what's already in MySQL. And so on.
The document discusses PostgreSQL backup and recovery options including:
- pg_dump and pg_dumpall for creating database and cluster backups respectively.
- pg_restore for restoring backups in various formats.
- Point-in-time recovery (PITR) which allows restoring the database to a previous state by restoring a base backup and replaying write-ahead log (WAL) segments up to a specific point in time.
- The process for enabling and performing PITR including configuring WAL archiving, taking base backups, and restoring from backups while replaying WAL segments.
Cryptography its history application and beyondkinleay
1) The document discusses the history of cryptography from ancient times to the 19th century.
2) It describes early encryption methods used by Greeks, Spartans, and Julius Caesar that involved transposition ciphers and substitution ciphers.
3) A major advance was the polyalphabetic cipher invented by Leon Battista Alberti in the 15th century, which used multiple cipher alphabets and was considered "unbreakable" at the time.
Alfresco Web Scripts have become an important part of any Alfresco developer's tool kit and in this session we will take a deep dive into how Web Scripts can be used to provide public APIs for Alfresco extensions. After briefly reviewing the anatomy of a Web Script and discussing Alfresco's approach to Service development, we will work through an example that extends Alfresco with a simple service and creates a REST API using Web Scripts.
Redis is an in-memory data structure store that can be used as a database, cache, or message broker. It supports string, hash, list, set and sorted set data types and allows for atomic operations and transactions. While data resides in memory, Redis can optionally persist data to disk for durability. It is useful for caching, real-time analytics, queues and more due to its speed, flexibility and support for pub/sub messaging.
Benjamin Delpy is a security researcher from France known for creating the tool mimikatz. Mimikatz can retrieve credentials like hashes and keys from the LSASS process memory. It supports techniques like pass-the-hash, over-pass-the-hash, and credential dumping from memory dumps. Delpy gives presentations to teach people about Windows authentication and how mimikatz works.
MySQL replication has evolved a lot in 5.6 ,5.7 and 8.0. This presentation focus on the changes made in parallel replication. It covers MySQL 8.0. It was presented at Mydbops database meetup on 04-08-2016 in Bangalore.
Built in physical and logical replication in postgresql-Firat GulecFIRAT GULEC
What is Replication?
Why do we need Replication?
How many replication layers do we have?
Understanding milestones of built-in Database Physical Replication.
What is the purpose of replication? and How to rescue system in case of failover?
What is Streaming Replication and what is its advantages? Async vs Sync, Hot standby etc.
How to configurate Master and Standby Servers? And What is the most important parameters? Example of topoloji.
What is Cascading Replication and how to configurate it? Live Demo on Terminal.
What is Logical Replication coming with PostgreSQL 10? And What is its advantages?
Logical Replication vs Physical Replication
Limitations of Logical Replication
Quorum Commit for Sync Replication etc.
What is coming up with PostgreSQL 11 about replication?
10 Questions quiz and giving some gifts to participants according to their success.
PSConfEU - Offensive Active Directory (With PowerShell!)Will Schroeder
This talk covers PowerShell for offensive Active Directory operations with PowerView. It was given on April 21, 2016 at the PowerShell Conference EU 2016.
The document presents an overview of the International Data Encryption Algorithm (IDEA). IDEA is a symmetric-key block cipher that uses a 128-bit key to encrypt 64-bit blocks of plaintext into ciphertext. It was developed in 1990 as an alternative to the Data Encryption Standard. The algorithm operates by performing eight rounds of mathematical operations like multiplication, addition, and XOR on the plaintext and keys. It also describes the key generation process, the sequence of operations in each round of encryption and decryption, and applications of the IDEA algorithm like encryption of financial data, email, and multimedia content.
This document discusses using Hyperscan to match regular expressions in Rspamd. It describes how Hyperscan can match many regular expressions against the same data faster than joining expressions with pipes in PCRE. However, Hyperscan has slower compile times and does not support some PCRE features. The document proposes a "lazy hyperscan compile" approach where Rspamd initially uses PCRE alone and later switches to the pre-compiled Hyperscan database to get the speed benefits without the initial compile overhead. It shows how this approach improves Rspamd performance for both small and large messages.
These slides are from ruBSD conf held in Moscow, Russia on 14 Dec 2013. In my presentation, I've described the new SAT solver I've developed for FreeBSD package management system and the important consequences of this change.
This document discusses using MongoDB and Mongoid with Ruby on Rails. It covers why MongoDB was chosen, how to set up Mongoid, different types of relationships and queries, and testing. Embedded and referenced relationships are described. Versioning, indexing and other features like callbacks are demonstrated. Hosting options like Heroku, MongoHQ and MongoLab are also mentioned.
Point field types in Solr. Evolution of the Range Queries.Andrey Mikryukov
The document discusses the evolution of range filters in Solr, from the initial naive implementation to optimizations using tries and then point fields. It explains how range filters were initially implemented by breaking them into individual term queries, and how tries were later used at index and query time to improve performance by exploiting common prefixes. The document then introduces point fields, which use a Bkd-tree to index multi-dimensional points in a way that adapts to the data distribution, offering better performance than tries.
This document discusses the architecture for testing new versions and rules of RSPAMD, an email spam filtering software. It proposes using a proxy server to encrypt and load balance traffic between stable and testing clusters. The proxy would immediately return results from the stable cluster and compare results from multiple testing clusters using scripts. This allows testing new versions and rules on live traffic while maintaining a stable filtering environment.
Thermopylae Sciences & Technology chose to customize MongoDB's spatial indexing capabilities to better support their needs for indexing multi-dimensional and geospatial data. They developed a custom R-tree spatial index that leverages existing MongoDB data structures and provides improved performance over MongoDB's existing geohash-based approach. Their custom index supports complex queries on multidimensional geometric shapes and scales to large geospatial datasets through potential sharding and distribution techniques. They have contributed their work back to the MongoDB open source project and collaborate with MongoDB to further integrate their contributions.
PHP remains the most popular server-side language on the Web and the most favoured language for Web attacks. The security vulnerabilities and attack techniques become more sophisticated though. For example, the vulnerability types PHP Object Instantiation and Phar Deserialization are comparatively unknown to traditional types like XSS and SQLi. In this technical talk, we look at a couple of critical security bugs found in popular open source PHP applications, such as WordPress, WooCommerce and Shopware. We will focus on fundamental design flaws and new state-of-the-art exploitation techniques that are used by attackers to compromise web servers through these issues which can occur in any other application as well.
This document provides an overview and instructions for installing and configuring ProxySQL. It discusses:
1. What ProxySQL is and its functions like load balancing and query caching
2. How to install ProxySQL on CentOS and configure the /etc/proxysql.cnf file
3. How to set up the ProxySQL schema to define servers, users, variables and other settings needed for operation
4. How to test ProxySQL functions like server status changes and benchmark performance
Presented at Percona Live Amsterdam 2016, this is an in-depth look at MariaDB Server right up to MariaDB Server 10.1. Learn the differences. See what's already in MySQL. And so on.
The document discusses PostgreSQL backup and recovery options including:
- pg_dump and pg_dumpall for creating database and cluster backups respectively.
- pg_restore for restoring backups in various formats.
- Point-in-time recovery (PITR) which allows restoring the database to a previous state by restoring a base backup and replaying write-ahead log (WAL) segments up to a specific point in time.
- The process for enabling and performing PITR including configuring WAL archiving, taking base backups, and restoring from backups while replaying WAL segments.
Cryptography its history application and beyondkinleay
1) The document discusses the history of cryptography from ancient times to the 19th century.
2) It describes early encryption methods used by Greeks, Spartans, and Julius Caesar that involved transposition ciphers and substitution ciphers.
3) A major advance was the polyalphabetic cipher invented by Leon Battista Alberti in the 15th century, which used multiple cipher alphabets and was considered "unbreakable" at the time.
Alfresco Web Scripts have become an important part of any Alfresco developer's tool kit and in this session we will take a deep dive into how Web Scripts can be used to provide public APIs for Alfresco extensions. After briefly reviewing the anatomy of a Web Script and discussing Alfresco's approach to Service development, we will work through an example that extends Alfresco with a simple service and creates a REST API using Web Scripts.
Redis is an in-memory data structure store that can be used as a database, cache, or message broker. It supports string, hash, list, set and sorted set data types and allows for atomic operations and transactions. While data resides in memory, Redis can optionally persist data to disk for durability. It is useful for caching, real-time analytics, queues and more due to its speed, flexibility and support for pub/sub messaging.
Benjamin Delpy is a security researcher from France known for creating the tool mimikatz. Mimikatz can retrieve credentials like hashes and keys from the LSASS process memory. It supports techniques like pass-the-hash, over-pass-the-hash, and credential dumping from memory dumps. Delpy gives presentations to teach people about Windows authentication and how mimikatz works.
MySQL replication has evolved a lot in 5.6 ,5.7 and 8.0. This presentation focus on the changes made in parallel replication. It covers MySQL 8.0. It was presented at Mydbops database meetup on 04-08-2016 in Bangalore.
Built in physical and logical replication in postgresql-Firat GulecFIRAT GULEC
What is Replication?
Why do we need Replication?
How many replication layers do we have?
Understanding milestones of built-in Database Physical Replication.
What is the purpose of replication? and How to rescue system in case of failover?
What is Streaming Replication and what is its advantages? Async vs Sync, Hot standby etc.
How to configurate Master and Standby Servers? And What is the most important parameters? Example of topoloji.
What is Cascading Replication and how to configurate it? Live Demo on Terminal.
What is Logical Replication coming with PostgreSQL 10? And What is its advantages?
Logical Replication vs Physical Replication
Limitations of Logical Replication
Quorum Commit for Sync Replication etc.
What is coming up with PostgreSQL 11 about replication?
10 Questions quiz and giving some gifts to participants according to their success.
PSConfEU - Offensive Active Directory (With PowerShell!)Will Schroeder
This talk covers PowerShell for offensive Active Directory operations with PowerView. It was given on April 21, 2016 at the PowerShell Conference EU 2016.
The document presents an overview of the International Data Encryption Algorithm (IDEA). IDEA is a symmetric-key block cipher that uses a 128-bit key to encrypt 64-bit blocks of plaintext into ciphertext. It was developed in 1990 as an alternative to the Data Encryption Standard. The algorithm operates by performing eight rounds of mathematical operations like multiplication, addition, and XOR on the plaintext and keys. It also describes the key generation process, the sequence of operations in each round of encryption and decryption, and applications of the IDEA algorithm like encryption of financial data, email, and multimedia content.
This document discusses using Hyperscan to match regular expressions in Rspamd. It describes how Hyperscan can match many regular expressions against the same data faster than joining expressions with pipes in PCRE. However, Hyperscan has slower compile times and does not support some PCRE features. The document proposes a "lazy hyperscan compile" approach where Rspamd initially uses PCRE alone and later switches to the pre-compiled Hyperscan database to get the speed benefits without the initial compile overhead. It shows how this approach improves Rspamd performance for both small and large messages.
These slides are from ruBSD conf held in Moscow, Russia on 14 Dec 2013. In my presentation, I've described the new SAT solver I've developed for FreeBSD package management system and the important consequences of this change.
This document discusses optimizing the evaluation of logic expressions through the use of abstract syntax trees (ASTs). It begins by providing examples of different types of logic expressions. It then describes building an AST to represent the expression and reordering branches in the tree to minimize calculations through lazy evaluation. By prioritizing faster operations and cutting unnecessary branches, evaluation time can be reduced compared to a naive or reverse polish notation approach. Further optimizations include handling n-ary operations and dynamically learning expression weights over time. The key idea is that laziness in evaluation through AST reordering can significantly improve performance.
Rspamd is an open-source spam filtering system written in C that uses an event-driven architecture. It processes emails using multiple rules to evaluate scores, supports plugins in Lua, and has a self-contained web interface. Performance is a key design goal through techniques like optimized regular expressions, event-driven network processing, and optimized memory usage. It uses techniques like Bayesian filtering with n-grams, fuzzy hashes, and DNS-based filtering while prioritizing security through input validation, encryption for network traffic, and limiting DNS queries.
This document discusses how encryption performance and costs have improved over time due to new hardware, algorithms, and protocols. It analyzes the performance of AES-128-GCM, AES-256-GCM and ChaCha20-Poly1305 on CPUs from 2011 to 2013, showing significant throughput increases. It also discusses challenges in building secure systems, including backward compatibility issues, complex APIs, insecure defaults, and flaws in CBC mode and TLS protocol design. Finally, it presents opportunities like ChaCha20, DNS security methods, and opportunistic transport encryption to further improve security.
Pkg is FreeBSD's package management system that aims to simplify system management through easy installation and upgrading of binary packages while automatically resolving dependencies and conflicts. It features a new solver that can handle complex upgrade scenarios through sandboxing and concurrent locking. The internal SAT-based solver represents packages and dependencies as a SAT problem to determine the best solution.
Практика совместного использования Lua и C в opensource спам-фильтре Rspamd /...Ontico
В данном докладе я расскажу о том, как Lua помогает расширять функционал Rspamd, позволяя людям без особых знаний С писать эффективные правила фильтрации спама. Также будут рассмотрены особенности внедрения Lua в C код и основные приемы, применяемые при написании API для Lua приложений. Отдельное внимание будет уделено документации к Lua API, которая является одним из необходимых компонентов для opensource приложения.
Кроме этого, отдельная часть доклада посвящена анализу производительности Lua: использованию LuaJIT, сравнению вызовов C функций через FFI с традиционным вызовом, оптимизации строковых операций и таблиц в Lua.
В заключение будут рассмотрены некоторые открытые вопросы: будущее языка, наличие нескольких диалектов, статический анализ Lua стека, а также вопросы безопасности при JIT компиляции.
Postfix is a free and open-source mail transfer agent (MTA) that is commonly used on Linux systems. It handles receiving and delivering email by using several server processes and queues. When receiving mail, Postfix uses smtpd, qmqpd, pickup, and cleanup servers to validate messages and add them to the incoming queue. For delivery, it uses qmgr to route messages from the incoming queue through active delivery agents like smtp, lmtp, local, and virtual to recipients or deferred queue if delivery fails. Postfix prioritizes stability, scalability and security in its flexible and modular design.
This document discusses setting up a Linux-based mail server. It begins with an introduction of the author and an explanation of why email is important. It then describes the basic email transfer process from Mail User Agent to Mail Transfer Agent to mailbox. The main components involved are outlined as the Mail Transfer Agent (MTA), Mail Delivery Agent (MDA), and Mail User Agent (MUA). Common protocols like SMTP, IMAP, and POP3 are also defined. Steps for installing and configuring Postfix as the MTA are provided, including adding domains and reloading the server.
The document discusses various methods for encrypting and authenticating data in PHP, including:
1. Encrypting data with md5() hash functions, the MCrypt package, and file-based authentication.
2. MCrypt supports two-way encryption algorithms like DES and allows encrypting and decrypting data.
3. File-based authentication parses a text file into an array to authenticate users by comparing hashed passwords.
Understanding and Designing Ultra low latency systems | Low Latency | Ultra L...Anand Narayanan
Traditional models of concurrent programming have been around for some time. They work well and have matured quite a bit over the last couple of years. However, every once in a while there low latency and high throughput requirements that cannot be met by traditional models of concurrency and application design. How about handling almost 1 billion operations/second per core of a system. When designing such systems one has to throw away the traditional models of application design and thing different. In this seminar we discuss some of the approaches that can make this possible. This approach is hardware friendly and a re-look at data from a hardware perspective. It requires logical understanding of how modern hardware works. It also requires the knowledge of tools that can help track down a particular stall and possibly the reason behind it. This may provide pointers for a redesign if required. In the balance then, this seminar is about an architecture which is Hardware friendly. It also about very specialized data structures that fully exploits the underlying architecture of the processor, cache and memory.
1. Encryption involves encoding information using an algorithm and cryptographic key to convert plain text into an encrypted cipher text that hides the information from anyone except the intended recipient. Symmetric encryption uses a single shared key for encryption and decryption while asymmetric encryption uses separate public and private keys.
2. Common encryption algorithms and standards discussed include AES, DES, RSA, MD5, and SHA which vary in their key sizes, block sizes, number of rounds, use of public/private keys, and applications for encrypting data and verifying message integrity.
3. Symmetric algorithms like AES and DES are block ciphers that encrypt fixed sized blocks of plain text while asymmetric algorithms like RSA use public/private key pairs
CNIT 50: 6. Command Line Packet Analysis ToolsSam Bowne
This document provides an overview of 6 command line packet analysis tools used for network security monitoring: Tcpdump, Dumpcap, Tshark, Argus, the Argus Ra client, and Argus Racluster. It describes what each tool is used for, basic syntax and examples of using filters to view specific traffic like ICMP, DNS, TCP handshakes. It also covers running these tools from the command line, reading captured packet files, and examining Argus session data files.
Scaling ingest pipelines with high performance computing principles - Rajiv K...SignalFx
By Rajiv Kurian, software engineer at SignalFx.
At SignalFx, we deal with high-volume high-resolution data from our users. This requires a high performance ingest pipeline. Over time we’ve found that we needed to adapt architectural principles from specialized fields such as HPC to get beyond performance plateaus encountered with more generic approaches. Some key examples include:
* Write very simple single threaded code, instead of complex algorithms
* Parallelize by running multiple copies of simple single threaded code, instead of using concurrent algorithms
* Separate the data plane from the control plane, instead of slowing data for control
* Write compact, array-based data structures with minimal indirection, instead of pointer-based data structures and uncontrolled allocation
The overall volume of Internet traffic has been growing in a tremendous rate day-by-day which also contains unwanted malicious traffic. It has been a continuous challenge for the network operators to effectively identify the threats from line rate traffic. Hyperscan is a pattern (in terms of regular expression) matching software ideal for applications such as intrusion prevention/detection system, antivirus, unified threat management, deep packet inspection systems, etc.
Hyperscan works in two phases. At first, the customer patterns are parsed and compiled into databases in terms of bytecode. During runtime, these bytecode are used to search for patterns against blocks/streams data. Hyperscan library runs entirely in software and scales with IA processors to provide the maximum throughput of 293 Gbps.
This document discusses various techniques for auditing source code to discover vulnerabilities, including both automated and manual methods. It describes tools like Cscope, Ctags, and source code analysis tools. It also covers common classes of vulnerabilities that may be found through source code auditing, such as format string vulnerabilities, off-by-one errors, integer conversion issues, uninitialized variable usage, and race conditions in multithreaded code. Effective methodologies discussed include searching for specific vulnerable code patterns, understanding application logic from entry points, and focusing on attacker-controlled input handling.
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...Alexandre Moneger
This presentation shows that code coverage guided fuzzing is possible in the context of network daemon fuzzing.
Some fuzzers are blackbox while others are protocol aware. Even ones which are made protocol aware, fuzzer writers typically model the protocol specification and implement packet awareness logic in the fuzzer. Unfortunately, just because the fuzzer is protocol aware, it does not guarantee that sufficient code paths have been reached.
The presentation deals with specific scenarios where the target protocol is completely unknown (proprietary) and no source code or protocol specs are accessible. The tool developed builds a feedback loop between the client and the server components using the concept of "gate functions". A gate function triggers monitoring. The pintool component tracks the binary code coverage for all the functions untill it reaches an exit gate. By instrumenting such gated functions, the tool is able to measure code coverage during packet processing.
Safely Protect PostgreSQL Passwords - Tell Others to SCRAMJonathan Katz
Jonathan S. Katz gave a talk on safely protecting passwords in PostgreSQL. He discussed:
- The evolution of password management in PostgreSQL, from storing passwords in plain text to using md5 hashes to modern SCRAM authentication.
- How plain text and md5 password storage are insecure as passwords can be intercepted or cracked.
- The SCRAM authentication standard which allows two parties to verify they know a secret without exchanging the secret directly.
- How PostgreSQL implements SCRAM-SHA-256 to generate a secure verifier from the password and authenticate users with random salts and iterations to secure against brute force attacks.
Debugging applications with network security toolsConFoo
This document summarizes network security tools that can be used for debugging applications, including Wireshark for network packet capture and analysis, WebScarab for intercepting and modifying web traffic, and Netcat for network communication. Wireshark is introduced for capturing live network traffic and analyzing protocols. WebScarab is described as a tool for intercepting web traffic in transit to modify it, as well as for fuzzing. Netcat is presented as a utility for listening on ports, connecting to ports, and emulating clients like HTTP. Examples are given for troubleshooting with these tools.
Slides for a college course at City College San Francisco. Based on "The Shellcoder's Handbook: Discovering and Exploiting Security Holes ", by Chris Anley, John Heasman, Felix Lindner, Gerardo Richarte; ASIN: B004P5O38Q.
Instructor: Sam Bowne
Class website: https://samsclass.info/127/127_S17.shtml
This document discusses techniques used to evade detection from enterprise security systems. It covers common security technologies like firewalls, IDS, IPS and how attackers can bypass them. Specific evasion techniques discussed include modifying packet headers, fragmentation, source routing and using tunnels through other compromised systems. The goal is to introduce common concepts but the document is not intended to be comprehensive.
Packet Analysis - Course Technology Computing Conference
Presenter: Lisa Bock - Pennsylvania College of Technology
Most network administrators are well-versed in hardware, applications, operating systems, and network analysis tools. However, many are not trained in analyzing network traffic. Network administrators should be able to identify normal network traffic in order to determine unusual or suspicious activity. Network packet analysis is important in order to troubleshoot congestion issues, create firewall and intrusion detection system rules, and perform incident and threat detection. This hands-on presentation will review fundamental concepts necessary to analyze network traffic, beginning with an overview of network analysis, then a review the TCP/IP protocol suite and LAN operations. Participants will examine packet captures and understand the field values of the protocols and as to what is considered normal behavior, and then examine captures that show exploits, network reconnaissance, and signatures of common network attacks. The program will use Wireshark, a network protocol analyzer for Unix and Windows, to study network packets, look at basic features such as display and capture filters, and examine common protocols such as TCP, HTTP, DNS, and FTP. Time permitting, the presentation will provide suggestions on how to troubleshoot performance problems, conduct a network baseline, and how to follow a TCP or UDP stream and see HTTP artifacts. Participants should have a basic knowledge of computer networking and an interest in the subject.
An introduction to message queues with PHP. We'll focus on RabbitMQ and how to leverage queuing scenarios in your applications. The talk will cover the main concepts of RabbitMQ server and AMQP protocol and show how to use it in PHP. The RabbitMqBundle for Symfony2 will be presented and we'll see how easy you can start to use message queuing in minutes.
Presented at Symfony User Group Belgium: http://www.meetup.com/Symfony-User-Group-Belgium/events/169953362/
Spy hard, challenges of 100G deep packet inspection on x86 platformRedge Technologies
This document discusses challenges and approaches for performing deep packet inspection (DPI) at speeds of 100 gigabits per second and beyond on x86 platforms. It begins by explaining why DPI is needed at such high speeds, for tasks like large-scale intrusion detection. It then examines the performance requirements for scanning payloads at 100Gbps rates. The document reviews different software approaches for payload matching, such as regular expressions, and hardware that can assist, such as Intel's Hyperscan technology. It also provides examples of how Hyperscan can be integrated into real-world intrusion detection and prevention systems.
This document provides an overview of cryptography concepts including:
- A brief history of cryptography from manual ciphers to modern computer-based systems.
- Key terms like plaintext, ciphertext, and keyspace.
- Types of cryptographic methods including symmetric, asymmetric, and hashing algorithms.
- Public key infrastructure components like certificates and certificate authorities.
- Cryptanalysis techniques for attacking cryptosystems such as brute force and social engineering attacks.
- Applications of cryptography including email security, network security protocols like SSL/TLS, and IPSec.
Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, and fault-tolerant database. It originated at Facebook in 2007 to solve their inbox search problem. Some key companies using Cassandra include Twitter, Facebook, Digg, and Rackspace. Cassandra's data model is based on Google's Bigtable and its distribution design is based on Amazon's Dynamo.
Donatas Mažionis, Building low latency web APIsTanya Denisyuk
This document discusses latency and strategies for improving latency when building APIs and services. It describes using percentiles instead of averages to measure latency, and tools like HdrHistogram and Finagle for tracking latencies. It also discusses experiences using Redis, Cassandra and Aerospike for low latency, and strategies for connecting to Cassandra and handling timeouts. Key takeaways are around not relying on averages, challenges of scaling in .NET, choosing the right NoSQL solution, and that blocking is not always bad if it reduces timeouts.
This document provides an overview of tips and best practices for high performance server programming. It discusses fundamentals like avoiding blocking, using efficient algorithms and data structures, and separating I/O from business logic. It also covers more advanced topics like I/O models, scheduling approaches, buffer management, concurrency models, and security considerations. Finally, it recommends tools and resources for profiling, debugging, and learning more about TCP/IP and networking.
6. Rspamd in a nutshell
• Written in plain C
• Uses non-blocking (event-driven) data processing
model
• Oriented on performance
• Plugins and rules are written in Lua
• Supports all major filtering methods
• Packaged for the most of *nix systems
8. Major spam classes
• Fraud and scam:
• “Nigerian” mail
• Phishing (banks, social networks)
• Advertisement:
• Plain text
• Pictures, obfuscated HTML etc
• Spam from the social networks notifications
12. Filtering methods
Basic principles
• Policies (particularly efficient against fraud)
• DNS lists (for IP and URLs)
• SPF, DKIM, DMARC
• Sender’s reputation
• Static content filtering (patterns match)
• Statistics and ML (e.g. Bayesian)
13. Filtering methods
Difficulties
• Policies:
• Require network requests with unpredictable delay
• False positive (e.g. when forwarding)
• Static Content filtering:
• CPU intensive
• Unlikely to be useful long-term
• Statistics:
• Not very precise
• Hard to adopt for multi-user systems
14. Rspamd special features
• Text lemmatisation and normalisation (snowball stemmer)
• Rspamd converts all texts to utf-8
• Support for users settings in JSON
• Possibility to have multiple metrics for a message
• Web interface
• Rspamd can load SpamAssassin rules directly
• OpenSource (BSD 2-clause)
15. Rspamd integration
Using Rmilter
MTA Rmilter Rspamd
Storage
Message and SMTP DATA HTTP request
Redis cache
JSON: action, symbolsDelay, modify or reject
Greylisting, limits Statistics, shared cache
✉✉✉✉
✉ ✉
🚮
Inbox Spam
✉
Reject
✉
16. Rspamd integration
milter
• rmilter + postfix/sendmail:
• supports all rspamd features (e.g. greylisting
action)
• can split inbound and outbound messages
• can whitelist reply to own messages
• can add DKIM signatures to outgoing email
17. Rspamd integration
Other MTA
• exim - supports rspamd internally (from 4.86)
• haraka - supports rspamd internally
• any other MTA - LDA mode when rspamc checks
message on rspamd and calls delivery agent with
altered message using Unix pipes
22. Event driven model
Never blocks*
• Advantages:
✅ Can process network replies asynchronously
✅ Sends all network request simultaneously
✅ Can process many messages in parallel, scales well
• Disadvantages:
📛 Callbacks hell
⛔ Memory consumption can go wild
*almost never
27. Event-driven processing
• Rspamd can send thousands of DNS requests:
time: 5540.8ms real, 2427.4ms virtual, dns req: 120543
• For short messages waiting prevails on the
processing time:
time: 996.140ms real, 22.000ms virtual,
28. Statistics in rspamd
• Uses Markov chains for statistic tokens (using
sparse bi-gramms model)
• Considers metadata in statistics: mime structure,
headers and so on
• Statistics could be stored in redis or sqlite3
• Rspamd supports per-user, per-domain statistics,
multiple classifiers and autolearning
30. Statistics example
Hard case: images spam
0%
25%
50%
75%
100%
Spam Ham
8%5%
92%95%
Correctly recognized Unrecognized
31. Fuzzy hashes
Introduction
• Can find similar or close to similar messages
• Rspamd uses fuzzy logic for text and exact
hashes for attachments
• Storage engine - sqlite3
32. Shingles Algorithm
Quick brown fox jumps over lazy dog
w1 w2 w3
w2 w3 w4
w3 w4 w5
w4 w5 w6
w1 w2 w3
w2 w3 w4
w3 w4 w5
w4 w5 w6
N hashes
h1
h2
h3
h4
h1’
h2’
h3’
h4’
33. Shingles Algorithm
h1 h2 h3
h1’ h2’ h3’
h1’’ h2’’ h3’’
h1’’’’’ h2’’’’ h3’’’’
…
…
…
…
…
min
min
min
min
…
N shinglesN hash pipes
36. Main Principles
• Always consider performance.
• If something could be skipped, skip it
• Use more specific solution if it is faster than
generic one (special FSM, asm fragments)
• If it is possible to do something fast but not 100%
accurate, try faster method
37. Rules Optimisation
Global optimisation
• Stop checks if a message has reached threshold
• Prioritise rules with negative weight
• Prioritise faster and more frequent rules:
• Check weight, hit probability and average time
• Use greedy sorting algorithm
38. Rules Optimisation
Local optimisations
• Build Abstract Syntax Tree for each rule
• Tree is periodically resorted in respect of hit
probability and execution time
• Regular expressions are optimised using PCRE
+JIT and Hyperscan (from 1.1)
• Lua code is optimised by means of LuaJIT
43. Hyperscan in Rspamd
Problem statement
• Need to match many regular expressions for the
same data:
len: 610591, time: 2492.457ms real, 882.251ms virtual
regexp statistics: 4095 pcre regexps scanned, 18 regexps matched, 694M bytes scanned using pcre
• No need to capture patterns
• Need to match both UTF8 and raw data
• Need to support Perl compatible syntax
44. Naive solution
• Join all expressions with ‘|’:
/abc/ + /cde/ => /(?:abc)|(?:cde)/
• Won’t work:
/abc/ + /a/ => /(?:abc)|(?:a)/ -> the second rule won’t match for ‘abc’
46. Rules and Hyperscan
From
Subject
Text Part
HTML Part
Attachment
Header RE class
Header RE class
Text Parts class
Raw Body class
Whole message
m1
m2
m3
m4
m5
m6
MIME message Hyperscan regexps
47. Problems with Hyperscan
• Slow compile time (like 100 times slower than
PCRE+JIT)
• Many PCRE features are unsupported (lookahead,
look-behind)
• Is not packaged for the major OSes (requires
relatively fresh boost)
48. Lazy hyperscan compile
HS compiler
Main
Scanners
Compiled
class
Compiled
class
Compiled
class
Use PCRE only
Compile to disk
49. Lazy hyperscan compile
HS compiler
Main
Scanners
Compiled
class
Compiled
class
Compiled
class
Use PCRE only
All done Broadcast message
50. Lazy hyperscan compile
HS compiler
Main
Scanners
Compiled
class
Compiled
class
Compiled
class
Switch to hyperscan
Loaded from cache
Wait for ‘recompile’ command
51. Lazy hyperscan compile
• Does not affect the starting time
• If nothing changes then scanners load pre-
compiled expressions on startup (using crypto
hashes)
• Recompile command is used to force
recompilation process:
rspamadm recompile
55. Main Principles
• Use secure C programming guidelines
• Perform static analysis for the code including
special clang plugin for own printf format safety
• Deal with malicious messages
• Limit and protect DNS subsystem
• Encrypt and protect all data passed over the
network
56. Examples
• An email can contain thousands of objects to
check (e.g. URLs)
• SPF records could have infinite recursion
• Need limits for network checks and memoization
of the results
57. DNS protection
• Cryptographic DNS id generation
• Socket pool with constant rotation
• Source port randomisation
• Input filtering (+ IDN encoding)
58. Traffic Encryption
• Security -> performance -> simplicity
• TLS is hard in non-blocking IO
• TLS does a lot of unnecessary (in case of
rspamd) operations increasing thus CPU load and
latency
66. Why HTTPCrypt is fast
• Faster key exchange (ECDH)
• Server does not need to sign anything
(needed for replay protection in TLS)
• Encryption is done in-place
• No replay protection for the first message - still
suitable for idempotent scanning
68. Conclusions
• Rspamd is designed to work as fast as possible
• Rspamd has a rich and actively maintained
collection of rules and plugins and can re-use
existing SA rules
• It is possible to encrypt all data transferred over
the network in rspamd ecosystem
69. Open issues
• No Debian maintainer (any volunteers?)
• No dynamic rules updates (planned for 1.2) yet
• Rapid development (hard to keep compatibility)
73. Configuration evolution
1. Grammar parser (lex + yacc)
⛔ Hard to manage
⛔ Hard to extend
2. XML
⛔ Unreadable
⛔ Problems with expressions (A > B)
3. UCL - universal configuration language
✅ Easy to manage (looks like nginx.conf)
✅ Macro support
✅ JSON data model (can be used as JSON parser)
74. UCL building blocks
• Sections
• Arrays
• Variables
• Macros
• Comments
section {
key = “value”;
number = 10K;
}
upstreams = [
“localhost:80”,
“example.com:8080”,
]
static_dir = “${WWWDIR}/“;
filepath = “${CURDIR}/data”;
.include “${CONFDIR}/workers.conf”
.include (glob=true,priority=2) “${CONFDIR}/conf.d/*.conf”
.lua { print(“hey!”); }
key = value; // Single line comment
/* Multiline comment
/* can also be nested */
*/
75. Configuration components
Global options
Workers configuration
Scores (metrics)
Modules configuration
Statistics
• Each component is normally included
to the main configuration
• rspamd.local.conf is used to extend
configuration
• rspamd.override.conf is used to
override values in the configuration
• It is possible to use numeric
multipliers: “k/m/g” or “ms/s/m/h/d” for
time values
76. Lua rules
• Most of rules are defined in Lua configuration
• Two types of Lua rules:
• Regexp rules (look like strings)
• Lua functions (pure Lua code)
77. Lua rules
Some examples
-- Outlook versions that should be excluded from summary rule
local fmo_excl_o3416 = 'X-Mailer=/^Microsoft Outlook, Build 10.0.3416$/H'
local fmo_excl_oe3790 = 'X-Mailer=/^Microsoft Outlook Express 6.00.3790.3959$/H'
-- Summary rule for forged outlook
reconf['FORGED_MUA_OUTLOOK'] = string.format('(%s | %s) & !%s & !%s & !%s',
forged_oe, forged_outlook_dollars, fmo_excl_o3416, fmo_excl_oe3790, vista_msgid)
• Regexp rule
• Lua rule
rspamd_config.R_EMPTY_IMAGE = function(task)
local tp = task:get_text_parts() -- get text parts in a message
for _,p in ipairs(tp) do -- iterate over text parts array using `ipairs`
if p:is_html() then -- if the current part is html part
local hc = p:get_html() -- we get HTML context
local len = p:get_length() -- and part's length
if len < 50 then -- if we have a part that has less than 50 bytes of text
local images = hc:get_images() -- then we check for HTML images
if images then -- if there are images
for _,i in ipairs(images) do -- then iterate over images in the part
if i['height'] + i['width'] >= 400 then -- if we have a large image
return true -- add symbol
end
end
end
end
end
end
end
78. Pure LUA functions
Review
• Are very powerful
• Have access to all information from rspamd via lua
API: https://rspamd.com/doc/lua/
• Are very fast since C <-> LUA interaction is cheap
• Can use zero-copy objects called rspamd{text} to
avoid copying when moving data between C and
LUA
79. Pure LUA functions
• Variables:
• Conditionals:
• Loops:
• Tables:
• Functions:
• Closures:
local ret = false -- Generic variable
local rules = {} -- Empty table
local rspamd_logger = require “rspamd_logger" -- Load rspamd module
if not ret then -- can use ‘not’, ‘and’, ‘or’ here
…
elseif ret ~= 10 then -- note ~= for ‘not equal’ operator
end
for k,m in pairs(opts) do … end -- Iterate over keyed table a[‘key’] = value
for _,i in ipairs(images) do … end -- Iterate over array table a[1] = value
for i=1,10 do … end -- Count from 1 to 10
local options = { [1] = ‘value’, [‘key’] = 1, -- Numbers starts from 1
another_key = function(task) … end, -- Functions can be values
[2] = {} -- Other tables can be values
} -- Can have both numbers and strings as key and anything as values
local function something(task) -- Normal definition
local cb = function(data) -- Functions can be nested
…
end
end
local function gen_closure(option)
local ret = false -- Local variable
return function(task)
task:do_something(ret, option) -- Both ‘ret’ and ‘option’ are accessible here
end
end
rspamd_config.SYMBOL = gen_closure(‘some_option’)
80. Pure LUA functions
Generic recommendations
• Use local whenever possible (otherwise, global
variables are expensive)
• Callbacks, closures and recursion are generally cheap
(when using LuaJIT)
• Do not mix string and number keys in tables, that
makes them hard to iterate
• ipairs and pairs are not equal
• Strings are constant in LUA
81. Regexp rules
Types
• Can work with the following elements:
• Headers: Message-Id=/^something$/H
• Mime parts: /some word/P
• Raw messages: /some pattern/M
• URLs: /example.com/U
• Some new flags are added:
• UTF8 flag: /u
82. Regexp rules
Generic information
• Can be combined using the following operators:
• AND: /something/P && Subject=/some/H
• OR: /something/P || Subject=/some/H
• NOT: !/something/P
• PLUS: /A/P + /B/P + /C/P >= 2
• Priority goes as following: NOT ➡ AND ➡ OR ➡ PLUS
• Braces can change priority: !A AND (B OR C)
83. Regexp rules
Performance considerations
• Avoid message regexps at any cost (use trie
instead)
• Regexp expressions are highly optimised in rspamd
and unnecessary evaluations are not performed
• UTF regexps are more expensive than default ones
(but could be useful sometimes)
• Always use the appropriate type of expression (e.g.
url for links and part for textual content)
84. Trie matching
• Perfect for fast raw message and text pattern
matching
• Scales almost linearly with input size (aho-corasic
algorithm)
• Can handle thousands and hundreds of thousands
of patterns (is a base for all antivirus scanners)
• Highly optimised for 64 bit systems