Presenter: Siddharth Muralee
Abstract: This talk explains why sanitizers are important and how they can be used to increase the efficiency of mainstream fuzzing. I will delve into how they work, with relevant detail about instrumentation and mapping. We will look at the kernel implementation of the Address Sanitizer, with a focus on the NetBSD implementation. At the end of the talk I will discuss some small examples of userland and kernel-land bugs we can detect with the Address Sanitizer, and I will use the opportunity to compare the Address Sanitizer with other tools out there.
The Linux Block Layer - Built for Fast Storage (Kernel TLV)
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes made to the Linux block layer over the last decade or so that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
We need a good amount of basic or in-depth knowledge of Linux fundamentals. This makes one's job easier when resolving issues and supporting projects.
Are you a system admin or database admin? Or do you work on any other technology deployed or implemented on Linux/UNIX machines? Then you should be comfortable with basic Linux concepts and commands. We will cover this section very clearly.
The kernel address sanitizer (KASan) is a dynamic memory error detector for finding out-of-bounds and use-after-free bugs in the Linux kernel. It uses shadow memory to record whether each byte of memory is safe to access, and compile-time instrumentation to check the shadow memory on each memory access. In this presentation Alexander Popov will describe the successful experience of porting KASan to a bare-metal hypervisor: the main steps, the pitfalls, and the ways to make KASan checks much stricter and more versatile.
This presentation was delivered at LinuxCon Japan 2016 by Alexander Popov
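The shadow-memory scheme described above can be sketched in a few lines. This is an illustrative model only, not kernel code: real KASan uses a compile-time shadow offset and richer per-granule poison encodings, so the scale, offset, and dict-based shadow store here are assumptions made for the sketch.

```python
# Illustrative model of KASan-style shadow memory (not kernel code):
# each shadow byte tracks one 8-byte granule of "application" memory.
SHADOW_SCALE = 3          # 2**3 = 8 bytes of memory per shadow byte
SHADOW_OFFSET = 0x1000    # arbitrary offset chosen for this toy model

def shadow_address(addr):
    """Map a memory address to the address of its shadow byte."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET

shadow = {}  # toy shadow store: shadow address -> poisoned flag

def poison(addr, size):
    """Mark [addr, addr + size) as unsafe to access."""
    for a in range(addr, addr + size, 8):
        shadow[shadow_address(a)] = True

def check_access(addr):
    """Return True if an access to addr is safe (shadow not poisoned)."""
    return not shadow.get(shadow_address(addr), False)
```

The instrumented kernel performs the equivalent of `check_access` before every load and store, which is why the checks are cheap: one shift, one add, one shadow read.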
Deep internals of the Purge mechanism in InnoDB (by I Goo Lee)
The document discusses InnoDB's purge mechanism in MySQL. It explains that purge is needed to reclaim disk space used by deleted or updated data and to prevent performance degradation from long history lists. It then describes how purge works for update undo records, maintaining the before images of updated rows in undo pages to support transaction isolation. Purge eventually removes old undo records after transactions commit or rollback.
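The history-list idea behind purge can be sketched as follows. This is a deliberate simplification, not InnoDB code: the deque-based history list, integer transaction IDs, and a single oldest-active-read-view cutoff are assumptions made for illustration.

```python
# Toy sketch of an InnoDB-style purge (assumed simplification): undo
# records hold before-images of updated rows; purge removes a record
# only once no active read view can still need its before-image.
from collections import deque

history_list = deque()  # (trx_id, before_image) pairs, oldest first

def add_undo(trx_id, before_image):
    """Append an undo record when a transaction updates a row."""
    history_list.append((trx_id, before_image))

def purge(oldest_active_view_trx_id):
    """Drop undo records no reader can still see; return count purged."""
    purged = 0
    while history_list and history_list[0][0] < oldest_active_view_trx_id:
        history_list.popleft()
        purged += 1
    return purged
```

The point of the sketch is the ordering constraint: because the list is oldest-first, purge only ever trims from the front, and a long-running read view pins the whole tail behind it, which is exactly the "long history list" degradation the talk describes.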
Linux has become an integral part of embedded systems. This three-part presentation gives a deeper perspective on Linux from a system-programming standpoint. Starting with the basics of Linux, it goes on to advanced aspects such as thread and IPC programming.
The document provides an introduction to performance tuning. It discusses tracing SQL execution to analyze performance issues. Tracing can be done at different levels, and the tkprof utility helps analyze trace files by providing formatted output. Understanding execution plans is also an important part of performance tuning, as it shows the steps and cost of executing a SQL statement.
The document discusses Linux file systems. It begins with an overview of file system architecture, including inodes, dentries, superblocks, and how data is never erased but overwritten. It then covers various local file systems like Ext2, Ext3, Ext4, ReiserFS, and XFS. Next it discusses log-structured and pseudo file systems. It also covers network file systems like NFS and CIFS. Finally it summarizes cluster, distributed, and Hadoop file systems. The document provides a technical overview of Linux file system types, structures, features and capabilities.
This document discusses the different types of tablespaces in InnoDB including the system tablespace (ibdata1), file-per-table tablespaces (.ibd), general tablespaces (.ibd), undo tablespaces (undo_001), and temporary tablespaces (.ibt, ibtmp1). It provides details on the structure and management of space within these tablespaces including pages, extents, segments, the file space header, extent descriptor pages, and the doublewrite buffer.
Memory Mapping Implementation (mmap) in Linux Kernel (Adrian Huang)
Note: When you view the slide deck via a web browser, the screenshots may be blurred. You can download and view them offline (the screenshots are clear).
The document discusses Ext4 journaling and the write barrier feature. It notes that the write barrier forces a flush-to-disk call after writing the journal to ensure consistency. However, this can cause sluggishness when storage is full during OTA updates. Disabling the write barrier allows reordering of cache-to-disk writes, reducing latency and improving performance, though it introduces a small risk of filesystem corruption in the event of a power failure. Tests showed disabling the barrier reduced fsync latency and improved SQLite transactions per second on HDD and EMMC storage.
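The fsync-latency measurement that those tests rely on can be reproduced with a short sketch; the payload size, iteration count, and file path below are arbitrary choices, and absolute numbers will of course depend on the storage and on whether barriers are enabled.

```python
# A minimal sketch for measuring mean fsync latency, the metric the
# write-barrier experiments above compare.
import os
import time
import tempfile

def fsync_latency(path, payload=b"x" * 4096, iterations=10):
    """Write and fsync `iterations` times; return mean latency in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, payload)
            os.fsync(fd)  # with barriers on, this forces a cache flush
        return (time.perf_counter() - start) / iterations
    finally:
        os.close(fd)
```

Running this on the same device with the barrier mount option toggled (e.g. `barrier=0` for ext4) is the simplest way to see the latency gap the document reports.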
Page cache mechanism in Linux kernel.
Virtual File System in Linux Kernel
The document discusses using plProxy and pgBouncer to split a PostgreSQL database horizontally and vertically to improve scalability. It describes how plProxy allows functions to make remote calls to other databases and how pgBouncer can be used for connection pooling. The RUN ON clause of plProxy is also summarized, which allows queries to execute on all partitions or on a specific partition.
Are your Oracle databases highly available? You have deployed Real Application Clusters (RAC), Data Guard, or Failover Clusters and are well protected against server failures? Great – the prerequisites for a highly available environment are given. However, to assure that backend infrastructure failures also remain transparent to the client, an appropriate configuration is a prerequisite.
This lecture will discuss the Oracle technologies that can be used to achieve automatic client failover functionality. What are the advantages, but also the limitations of these technologies?
ClickHouse Defense Against the Dark Arts - Intro to Security and Privacy (Altinity Ltd)
This document discusses building privacy-aware applications using ClickHouse. It presents several multi-tenancy models for ClickHouse including dedicated installations, clusters, databases, and tables. It also covers restricting user access to specific databases, tables, and rows when using shared tables. The document discusses efficient ways to delete tenant data using partitions and column TTLs. Other considerations discussed include securing system tables and logs, encrypting sensitive data, and using constraints on server settings.
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook (The Hive)
This presentation describes the reasons why Facebook decided to build yet another key-value store, the vision and architecture of RocksDB and how it differs from other open source key-value stores. Dhruba describes some of the salient features in RocksDB that are needed for supporting embedded-storage deployments. He explains typical workloads that could be the primary use-cases for RocksDB. He also lays out the roadmap to make RocksDB the key-value store of choice for highly-multi-core processors and RAM-speed storage devices.
Deadlock occurs when two or more processes are waiting for resources held by each other in a circular chain, resulting in none of the processes making progress. There are four conditions required for deadlock: mutual exclusion, hold and wait, no preemption, and circular wait. Deadlock can be addressed through prevention, avoidance, detection, or recovery methods. Prevention aims to eliminate one of the four conditions, while avoidance techniques like the safe state model and Banker's Algorithm guarantee a safe allocation order to avoid circular waits.
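Detecting the circular-wait condition amounts to finding a cycle in a wait-for graph, where an edge points from a process to the process holding the resource it needs. A minimal sketch:

```python
# Deadlock detection via the circular-wait condition: depth-first
# search for a cycle in a wait-for graph (process -> processes it
# waits on).
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wait_for.get(node, []):
            if nxt in on_stack:
                return True          # back edge -> circular wait
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(p) for p in list(wait_for) if p not in visited)
```

A cycle such as P1 → P2 → P1 means neither process can proceed; breaking any one of the four conditions (here, the cycle itself) resolves the deadlock.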
The document discusses various Oracle performance monitoring tools including Oracle Enterprise Manager (OEM), Automatic Workload Repository (AWR), Automatic Database Diagnostic Monitor (ADDM), Active Session History (ASH), and eDB360. It provides overviews of each tool and examples of using AWR, ADDM, ASH and eDB360 for performance analysis through demos. The conclusions recommend OEM as the primary tool and how the other tools like AWR, ADDM and ASH complement it for deeper performance insights.
RocksDB is an embedded key-value store written in C++ and optimized for fast storage environments like flash or RAM. It uses a log-structured merge tree to store data by writing new data sequentially to an in-memory log and memtable, periodically flushing the memtable to disk in sorted SSTables. It reads from the memtable and SSTables, and performs background compaction to merge SSTables and remove overwritten data. RocksDB supports two compaction styles - level style, which stores SSTables in multiple levels sorted by age, and universal style, which stores all SSTables in level 0 sorted by time.
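The write and read paths just described can be sketched with a toy in-memory model. This is illustrative only and is not RocksDB's API: the memtable size limit and the dict-based SSTables are assumptions made to keep the sketch short.

```python
# Toy log-structured merge sketch: writes land in an in-memory
# memtable; when it fills, it is flushed as an immutable sorted
# "SSTable"; reads check the memtable first, then SSTables newest-first.
class ToyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []          # list of flushed tables, newest first
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # flush: freeze the memtable as a sorted, immutable SSTable
            self.sstables.insert(0, dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in self.sstables:  # newest first, so latest value wins
            if key in table:
                return table[key]
        return None
```

The newest-first read order is what makes overwrites work without ever rewriting old files; compaction (below) exists to bound how many tables a read must consult.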
The document discusses compaction in RocksDB, an embedded key-value storage engine. It describes the two compaction styles in RocksDB: level style compaction and universal style compaction. Level style compaction stores data in multiple levels and performs compactions by merging files from lower to higher levels. Universal style compaction keeps all files in level 0 and performs compactions by merging adjacent files in time order. The document provides details on the compaction process and configuration options for both styles.
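The merge step that both compaction styles share can be sketched as follows. This is an assumed simplification: real compaction streams sorted files and tracks sequence numbers, both of which this toy version elides.

```python
# Sketch of the compaction merge step: combine SSTables newest-first so
# the latest version of each key wins, then drop deleted ("tombstoned")
# entries so overwritten data stops consuming space.
TOMBSTONE = object()  # marker written when a key is deleted

def compact(sstables_newest_first):
    """Merge SSTables into one, keeping only the newest value per key."""
    merged = {}
    for table in sstables_newest_first:
        for key, value in table.items():
            merged.setdefault(key, value)  # first seen (newest) wins
    # drop tombstones and emit keys in sorted order
    return {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
```

Level style applies this merge between adjacent levels; universal style applies it to runs of adjacent level-0 files in time order, but the keep-the-newest rule is the same.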
This document describes the functions of various Linux commands, including commands for listing files (ls), creating directories (mkdir) and files (touch, cat), copying files (cp), changing directories (cd), moving files (mv), finding file locations (whereis, which), displaying manual pages (man, info), checking disk usage (df, du), viewing running processes (ps), setting aliases (alias), changing user identity (su, sudo), viewing command history (history), setting the system date and time (date), displaying calendars (cal), and clearing the terminal screen (clear). It provides the syntax and examples for using each command.
This version of "Oracle Real Application Clusters (RAC) 19c & Later – Best Practices" was first presented in Oracle Open World (OOW) London 2020 and includes content from the OOW 2019 version of the deck. The deck has been updated with the latest information regarding ORAchk as well as upgrade tips & tricks.
There is hardly a Senior Java developer who has never heard of sun.misc.Unsafe. Though it has always been a private API intended for JDK internal use only, the popularity of Unsafe has grown too fast, and now it is used in many open-source projects. OK.RU is not an exception: its software also heavily relies on Unsafe APIs.
During this session we'll try to understand what is so attractive about Unsafe. Why do people keep using it regardless of the warnings of removal from future JDK releases? Are there safe alternatives to this private API, or is it absolutely vital? We will review the typical cases in which Java developers prefer to go unsafe and discuss its major benefits and drawbacks. The talk is supported by real examples from OK.RU's experience.
The document provides an introduction to Linux and device drivers. It discusses Linux directory structure, kernel components, kernel modules, character drivers, and registering drivers. Key topics include dynamically loading modules, major and minor numbers, private data, and communicating with hardware via I/O ports and memory mapping.
MySQL Performance Schema in Action: the Complete Tutorial (Sveta Smirnova)
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options which allow you to tune precisely what to instrument. More than 100 consumers store the collected data.
In this tutorial we will try out all the important instruments. We will provide a test environment and a few typical problems that could hardly be solved without Performance Schema. You will not only learn how to collect and use this information, but also gain hands-on experience with it.
Presented at Percona Live Frankfurt, 2018: https://www.percona.com/live/e18/sessions/mysql-performance-schema-in-action-the-complete-tutorial
The document discusses various aspects of file systems and storage management in operating systems. It covers topics like file attributes, operations, structures, access methods, directory structures, file sharing, consistency semantics, and protection. File attributes include the file name, size, type and protection attributes. Common file operations are creating, reading, writing and deleting files. Files can have sequential, direct or indexed access methods. Directory structures can be single-level, two-level, tree-structured or graph-based. File sharing requires consistency models like Unix, session or immutable semantics. Protection controls access via access matrices, access control lists or capability lists.
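The contrast between sequential and direct access methods mentioned above can be sketched with a file of fixed-size records; the record size here is an arbitrary choice for the sketch.

```python
# Sequential vs. direct access on a file of fixed-size records:
# sequential access reads records in order from the start; direct
# access jumps straight to record n via a computed byte offset.
import io

RECORD_SIZE = 8  # arbitrary fixed record size for this sketch

def write_records(f, records):
    """Store each record padded to a fixed width."""
    for r in records:
        f.write(r.ljust(RECORD_SIZE).encode())

def read_sequential(f):
    """Sequential access: scan records from the beginning, in order."""
    f.seek(0)
    out = []
    while chunk := f.read(RECORD_SIZE):
        out.append(chunk.decode().rstrip())
    return out

def read_direct(f, n):
    """Direct access: seek to record n without reading what precedes it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode().rstrip()
```

Indexed access is the third method the document lists: it layers a lookup structure (key → record number) on top of exactly this kind of direct access.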
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
This document summarizes a series of performance issues seen by the author in their work with Oracle Exadata systems. It describes random session hangs occurring across several minutes, with long transaction locks and I/O waits seen. Analysis of AWR reports and blocking trees revealed that many sessions were blocked waiting on I/O, though initial I/O metrics from the OS did not show issues. Further analysis using ASH activity breakdowns and OS tools like sar and vmstat found high apparent CPU usage in ASH that was not reflected in actual low CPU load on the system. This discrepancy was due to the way ASH attributes non-waiting time to CPU. The root cause remained unclear.
ClickHouse tips and tricks. Webinar slides by Robert Hodges, Altinity CEO (Altinity Ltd)
From webinar on December 3, 2019
New users of ClickHouse love the speed but may run into a few surprises when designing applications. Column storage turns classic SQL design precepts on their heads. This talk shares our favorite tricks for building great applications. We'll talk about fact tables and dimensions, materialized views, codecs, arrays, and skip indexes, to name a few of our favorites. We'll show examples of each and also reserve time to handle questions. Join us to take your next step to ClickHouse guruhood!
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
Ever wondered how to use modern OpenGL in a way that radically reduces driver overhead? Then this talk is for you.
John McDonald and Cass Everitt gave this talk at Steam Dev Days in Seattle on Jan 16, 2014.
Inference at the edge is of ever-increasing importance for companies, so it is crucial to be able to make models smaller. Compressing models can be lossless or can result in a loss of accuracy. This presentation provides a survey of compression techniques for deep-learning models. It then describes different AWS IoT/Greengrass architectures for combining on-device inference and GPU inference in a hub model. Additionally, the presentation introduces MXNet, which has a small footprint and is efficient both for inference and for training in distributed settings.
The document discusses data partitioning and distribution across multiple machines in a cluster. It explains that data replication does not scale well, but data partitioning, where each record exists on only one machine, allows write latency to scale with the number of machines in the cluster. Coherence provides a distributed cache that partitions data and offers functions for server-side processing near the data through tools like entry processors.
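The each-record-on-exactly-one-machine idea can be sketched with simple hash placement. The CRC32 hash and the node list below are arbitrary choices for illustration; production systems such as Coherence typically use consistent hashing so that adding a node moves only a fraction of the keys.

```python
# Sketch of hash partitioning: every record is owned by exactly one
# node, chosen by hashing its key, so write load spreads across the
# cluster as nodes are added.
import zlib

def owner(key, nodes):
    """Pick the single node that owns this key (deterministic)."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

def partition(records, nodes):
    """Assign every (key, value) record to exactly one node."""
    placement = {n: [] for n in nodes}
    for key, value in records:
        placement[owner(key, nodes)].append((key, value))
    return placement
```

Server-side processing "near the data" (e.g. an entry processor) then means shipping the function to `owner(key, nodes)` instead of pulling the record to the client.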
We run multiple DataStax Enterprise clusters in Azure each holding 300 TB+ data to deeply understand Office 365 users. In this talk, we will deep dive into some of the key challenges and takeaways faced in running these clusters reliably over a year. To name a few: process crashes, ephemeral SSDs contributing to data loss, slow streaming between nodes, mutation drops, compaction strategy choices, schema updates when nodes are down and backup/restore. We will briefly talk about our contributions back to Cassandra, and our path forward using network attached disks offered via Azure premium storage.
About the Speaker
Anubhav Kale Sr. Software Engineer, Microsoft
Anubhav is a senior software engineer at Microsoft. His team is responsible for building big data platform using Cassandra, Spark and Azure to generate per-user insights of Office 365 users.
Scylla Summit 2018: From SAP to Scylla - Tracking the Fleet at GPS InsightScyllaDB
Originally using SAP Adaptive Server Enterprise (ASE), the GPS Insight team soon found that relational databases simply aren’t a match for high volume machine data. To top it off, SAP ASE’s clustering technology proved cumbersome to manage and operate. In this presentation, you’ll learn about GPS Insight’s hybrid Scylla deployment that runs on-premises and on AWS datacenter. GPS Insight relies on Scylla to capture and analyze GPS data, offloading data from RDBMS to Scylla for hybrid analytics approach.
Wszyscy zostaliśmy oszukani! Automatyczne zarządzanie pamięci rozwiąże wszystkie Wasze problemy, mówili. W zarządzanych środowiskach takich jak CLR JVM nie będzie wycieków pamięci, mówili! Właściwie pamięć jest tania i nie musisz się już nią nigdy więcej martwić. Wszyscy kłamali. Automatyczne zarządzanie pamięcią jest wygodną abstrakcją i bardzo często działa dobrze. Ale jak każda abstrakcja, wcześniej czy później "wycieka" ona. I to najczęściej w najmniej spodziewanym i przyjemnym momencie. W tej sesji spróbuję otworzyć oczy na fakt, że błoga nieświadomość nt. tej abstrakcji może być kosztowna. Pokażę jak może się objawić frywolne traktowanie pamięci i co możemy zyskać pisząc kod zdając sobie sprawę, że pamięć jednak nie jest nieskończona, tania i zawsze jednakowo szybka.
The document discusses optimizing DirectCompute for graphics processing units (GPUs). It introduces DirectCompute as an API for general purpose GPU programming and explains how it maps better than traditional graphics APIs to the architecture of GPUs like AMD's GCN. Key optimization techniques covered include choosing a thread group size that is a multiple of 64 threads to fully utilize the SIMD hardware, and using thread group shared memory to improve data access speeds compared to off-chip memory. Bank conflicts in shared memory access are also discussed as something to avoid for best performance.
Nsd, il tuo compagno di viaggio quando Domino va in crashFabio Pignatti
Come leggere e trarre utili informazioni dall'analisi di un NSD in caso di crash o hang del server Domino. Alcuni casi pratici ed un tool utile in fase di analisi. - Dominopoint Day 2008
The document discusses different strategies for horizontally scaling databases, including simple sharding, hashed sharding, and master-slave architectures. It describes Aerospike's approach of "smart partitioning", which balances data automatically, hides complexity from clients, and provides redundancy and failover. The key advantages are linear scalability, high availability even during maintenance, and the ability to handle catastrophic failures through multi-datacenter replication that can withstand outages and disasters.
The builder pattern is a design pattern used to separate the construction of a complex object from its representation. It allows the same construction process to create different representations. The document describes the builder pattern using the example of building computers. It defines classes for components like CPU, motherboard, drives, and their assembly into computer objects using either a general ComputerBuilder or specialized builders for desktop and laptop computers. The builder pattern participants are the builder, concrete builders, director and product, and their collaboration is described.
Data Privacy with Apache Spark: Defensive and Offensive ApproachesDatabricks
In this talk, we’ll compare different data privacy techniques & protection of personally identifiable information and their effects on statistical usefulness, re-identification risks, data schema, format preservation, read & write performance.
We’ll cover different offense and defense techniques. You’ll learn what k-anonymity and quasi-identifier are. Think of discovering the world of suppression, perturbation, obfuscation, encryption, tokenization, watermarking with elementary code examples, in case no third-party products cannot be used. We’ll see what approaches might be adopted to minimize the risks of data exfiltration.
Nowadays, scaling and auto-scaling have become relatively easy tasks. Everyone knows how to set up auto-scaling environments - Auto-Scaling groups, Swarm, Kubernetes, etc.
But when we try to scale I/O Bound workloads:
- Message queues (Kafka, Rabbit, NATS)
- Distributed databases (Hadoop, Cassandra)
- Storage subsystems (CEPH, GlusterFS, HDFS),
the traditional auto-scaling mechanisms are just not enough.
Heavy calculations must be performed to determine the I/O bottlenecks. Rebalancing the data after a scaling event can take up to hours depending on your data & could, resulting in data loss if not properly designed.
We will deep dive into this type of workload and walk you through code samples you can apply in your own environment.
[2017.03.18] hst binary training part 1Chia-Hao Tsai
The document provides an overview of binary formats and machine code. It discusses the Mach-O binary format used on Mac OS X, including the header, commands, segments, and sections. It also covers x86-64 machine code layout and opcodes. A minimal Mach-O 64 binary is listed as an example, containing a header, commands, and 12 bytes of machine code while consuming only 4K of space.
ClickHouse Materialized Views: The Magic ContinuesAltinity Ltd
Slides for the webinar, presented on February 26, 2020
By Robert Hodges, Altinity CEO
Materialized views are the killer feature of ClickHouse, and the Altinity 2019 webinar on how they work was very popular. Join this updated webinar to learn how to use materialized views to speed up queries hundreds of times. We'll cover basic design, last point queries, using TTLs to drop source data, counting unique values, and other useful tricks. Finally, we'll cover recent improvements that make materialized views more useful than ever.
Beyond php it's not (just) about the codeWim Godden
The document discusses database queries and optimization. It begins with an example of a complex database query and explains how to detect problematic queries using tools like slow query log and pt-query-digest. It then discusses indexing strategies and when to use indexes. The document also describes a case study of a client's jobs search site that was experiencing high database load due to inefficient queries in a loop, and how batching the queries into a single query solved the problem.
Similar to BSidesDelhi 2018: Finding Memory Bugs with the Address Sanitizer (20)
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...alexjohnson7307
Predictive maintenance is a proactive approach that anticipates equipment failures before they happen. At the forefront of this innovative strategy is Artificial Intelligence (AI), which brings unprecedented precision and efficiency. AI in predictive maintenance is transforming industries by reducing downtime, minimizing costs, and enhancing productivity.
Trusted Execution Environment for Decentralized Process MiningLucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Tatiana Kojar
Skybuffer AI, built on the robust SAP Business Technology Platform (SAP BTP), is the latest and most advanced version of our AI development, reaffirming our commitment to delivering top-tier AI solutions. Skybuffer AI harnesses all the innovative capabilities of the SAP BTP in the AI domain, from Conversational AI to cutting-edge Generative AI and Retrieval-Augmented Generation (RAG). It also helps SAP customers safeguard their investments into SAP Conversational AI and ensure a seamless, one-click transition to SAP Business AI.
With Skybuffer AI, various AI models can be integrated into a single communication channel such as Microsoft Teams. This integration empowers business users with insights drawn from SAP backend systems, enterprise documents, and the expansive knowledge of Generative AI. And the best part of it is that it is all managed through our intuitive no-code Action Server interface, requiring no extensive coding knowledge and making the advanced AI accessible to more users.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
A Comprehensive Guide to DeFi Development Services in 2024Intelisync
DeFi represents a paradigm shift in the financial industry. Instead of relying on traditional, centralized institutions like banks, DeFi leverages blockchain technology to create a decentralized network of financial services. This means that financial transactions can occur directly between parties, without intermediaries, using smart contracts on platforms like Ethereum.
In 2024, we are witnessing an explosion of new DeFi projects and protocols, each pushing the boundaries of what’s possible in finance.
In summary, DeFi in 2024 is not just a trend; it’s a revolution that democratizes finance, enhances security and transparency, and fosters continuous innovation. As we proceed through this presentation, we'll explore the various components and services of DeFi in detail, shedding light on how they are transforming the financial landscape.
At Intelisync, we specialize in providing comprehensive DeFi development services tailored to meet the unique needs of our clients. From smart contract development to dApp creation and security audits, we ensure that your DeFi project is built with innovation, security, and scalability in mind. Trust Intelisync to guide you through the intricate landscape of decentralized finance and unlock the full potential of blockchain technology.
Ready to take your DeFi project to the next level? Partner with Intelisync for expert DeFi development services today!
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
2. FINDING MEMORY BUGS WITH THE ADDRESS SANITIZER
BSIDES DELHI
$ WHOAMI
▸Siddharth Muralee (R3x)
▸Third year CSE @ Amrita Vishwa Vidyapeetham
▸CTF player - Team bi0s
▸Core organising team @ InCTF and InCTFj
▸Reverse Engineering/Exploit Development
▸GSoC ’18 with NetBSD
3. AGENDA
▸What are sanitizers?
▸What is the Address Sanitizer (ASan)?
▸How does the Address Sanitizer work?
▸The Kernel Address Sanitizer (KASan)
▸Bug report examples
▸The Address Sanitizer vs other tools
4. WHAT IS A SANITIZER?
▸ A programming tool to detect computer program errors.
▸ Compiler instrumented.
▸ The various sanitizers are:
▸ Address Sanitizer
▸ Memory Sanitizer
▸ Thread Sanitizer
▸ Undefined Behaviour Sanitizer
5. ADDRESS SANITIZER (ASAN)
▸Open-source tool developed by Google.
▸Detects memory corruption bugs - overflows and UAFs.
▸Implemented in Clang, GCC and Xcode.
▸Easy to use: just compile with -fsanitize=address.
▸Run-time library.
▸A pretty amazing fuzzing aid.
6. ADDRESS SANITIZER - PREREQUISITE KNOWLEDGE
▸Shadow memory
▸Stores metadata corresponding to each piece of application data.
▸Each address is mapped to a shadow memory offset.
▸Compiler instrumentation
▸The compiler adds instructions during compilation that check and update the shadow memory at each memory access.
7. ADDRESS SANITIZER WORKING
▸Each memory access is modified using compiler instrumentation to check the shadow state first. Catches unintended memory reads/writes.
8. ADDRESS SANITIZER WORKING
▸The run-time library replaces malloc and free.
▸Memory around chunks (memory blocks allocated using malloc) is poisoned. Catches heap overflow bugs.
▸Freed memory is placed in a quarantine list. Prevents use-after-free bugs from being missed.
▸Parts of the stack are also poisoned, to catch stack overflow bugs.
9. SHADOW MEMORY
▸ ASan maps every 8 bytes of application memory to 1 byte of shadow memory (conceptually, 1 bit per application byte).
▸ Shadow value
▸ non-zero - unaddressable (poisoned)
▸ 0 - addressable
▸ Total shadow region size = application size / 8
▸ Each address must be converted to the corresponding shadow memory address on every access.
12. KERNEL ADDRESS SANITIZER (KASAN)
▸GCC and Clang come with a built-in option -fsanitize=kernel-address.
▸You can build the entire kernel with the Address Sanitizer using the kernel config file.
▸Provides an API which the kernel has to implement.
▸Implemented as a feature in the Linux kernel, OS X and now NetBSD!
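As a concrete illustration, building a kernel with KASan is usually a matter of kernel configuration rather than passing the flag by hand. For Linux the relevant switch is CONFIG_KASAN (the exact set of companion options varies by kernel version), and NetBSD's kernel config files expose a similar knob:

```
# Linux .config fragment (illustrative; companion options vary by version)
CONFIG_KASAN=y
```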
13. IMPLEMENTING KASAN
▸Allocate and populate the shadow buffer during boot.
▸Write interceptors for kernel allocator functions to update shadow memory (multiple allocators).
▸Manage the kernel VA space properly.
▸Implement quarantine lists in page allocation.
▸Write the bug-reporting infrastructure.
14. KASAN WORKING
ptr - points to 0xdeadbeef
shadow_ptr - shadow address
kmem_to_shadow - converts a kernel address to a shadow offset.
15. NETBSD KASAN - VA SPACE MAPPING
16. Fix buffer overflow, detected by kASan.
ifconfig gif0 create
ifconfig gif0 up
[ 50.682919] kASan: Unauthorized Access In 0xffffffff80f22655: Addr 0xffffffff81b997a0 [8 bytes, read]
[ 50.682919] #0 0xffffffff8021ce6a in kasan_memcpy <netbsd>
[ 50.692999] #1 0xffffffff80f22655 in m_copyback_internal <netbsd>
[ 50.692999] #2 0xffffffff80f22e81 in m_copyback <netbsd>
[ 50.692999] #3 0xffffffff8103109a in rt_msg1 <netbsd>
[ 50.692999] #4 0xffffffff8159109a in compat_70_rt_newaddrmsg1 <n
[ 50.692999] #5 0xffffffff81031b0f in rt_newaddrmsg <netbsd>
[ 50.692999] #6 0xffffffff8102c35e in rt_ifa_addlocal <netbsd>
[ 50.692999] #7 0xffffffff80a5287c in in6_update_ifa1 <netbsd>
[ 50.692999] #8 0xffffffff80a54149 in in6_update_ifa <netbsd>
[ 50.692999] #9 0xffffffff80a59176 in in6_ifattach <netbsd>
[ 50.692999] #10 0xffffffff80a56dd4 in in6_if_up <netbsd>
[ 50.692999] #11 0xffffffff80fc5cb8 in if_up_locked <netbsd>
[ 50.703622] #12 0xffffffff80fcc4c1 in ifioctl_common <netbsd>
[ 50.703622] #13 0xffffffff80fde694 in gif_ioctl <netbsd>
[ 50.703622] #14 0xffffffff80fcdb1f in doifioctl <netbsd>
17. ASAN VS OTHER TOOLS

                ASan                       Memcheck (Valgrind)             Mudflap
Technique       Compiler instrumentation   Dynamic binary instrumentation  Compiler instrumentation
Slowdown        2x                         20x                             2x - 20x
Types of bugs   Heap and stack overflows,  UMR, heap overflows and UAFs    Heap overflows and UAFs
                UAF, UAR

UMR - Uninitialised Memory Reads
UAF - Use After Free
UAR - Use After Return
Third-year CSE, Amrita, Kollam. CTF player with team bi0s, mainly focusing on reverse engineering and exploitation. I am also a core member of the organising team for InCTF and InCTF Junior, the CTFs bi0s conducts for college and school students in India.
I did my Google Summer of Code this year with the amazing NetBSD Foundation.
This talk aims at introducing you to the sanitizer family, specifically the Address Sanitizer. We will look at how the Address Sanitizer works, and then at the Kernel Address Sanitizer and how it is implemented.
In the end we will compare the Address Sanitizer with some similar tools.
A sanitizer is a programming tool used to detect program errors.
The technology behind sanitizers is compiler instrumentation.
There are several sanitizers created for various purposes: the Address, Memory, Thread and Undefined Behaviour Sanitizers.
Now let's look at the Address Sanitizer. It is an open-source tool developed by Google, mainly used to detect memory corruption bugs such as stack and heap overflows, use-after-frees, and so on.
The sanitizer comes as a compiler option in Clang, GCC and Xcode.
In userspace it is linked with the executable as a run-time library.
Its features make the Address Sanitizer a very effective fuzzing aid, widely used even in Google's OSS-Fuzz.
Before we look deeper into the Address Sanitizer, there are two concepts that are vital. One of them is shadow memory.
Shadow memory is additional memory used by the Address Sanitizer to keep track of which data is legal to access and which is not. Here we store metadata corresponding to each piece of application data. This metadata is updated as the program runs and is used to determine whether a given access is legal.
Compiler instrumentation is the underlying technology behind the Address Sanitizer: the compiler adds instructions during the compilation process that let the program update and check the shadow memory (metadata) at every necessary point.
Now let's look at how the Address Sanitizer works.
During compilation, each memory access (a memory read or write) is modified to first check whether the address is poisoned.
An address is said to be poisoned when the program is not supposed to access it.
If poisoned memory is accessed, an error is reported.
When the run-time library for ASan is linked with the executable, it replaces the malloc and free functions provided by libc with its own.
This allows it to allocate chunks with the surrounding memory poisoned as redzones. This catches heap overflow bugs, since an overflow out of one chunk now lands in a poisoned redzone instead of silently writing into another chunk.
Chunks that are freed are put into a quarantine list, so a recently freed chunk is not reallocated for a long time unless memory runs short. This is done to make sure use-after-free bugs are actually seen: if the freed chunk were handed out by malloc again right away, we would miss them.
Those are the main features the Address Sanitizer provides.
Now let's take a closer look at the shadow memory to find out how the mapping works.
The Address Sanitizer maps every 8 bytes of application memory to 1 byte of shadow memory (conceptually, 1 bit per application byte). This is a significant amount of memory: we require 1/8th of the total application memory to manage the shadow region.
On each check or update, the address needs to be converted to the corresponding shadow memory address, so the conversion process has to be very cheap; otherwise we waste a lot of time there.
The equation to get the shadow address is to take the address, divide it by 8 (>> 3), and add a predetermined constant called the shadow offset.
In the diagram, the address 0xffff is mapped to 0x3ffff, within the range 0x00 - 0x2000000 that determines the bounds of the shadow memory. The shadow region is marked in light blue; the region marked in red is the shadow address range of the shadow memory itself, which is never used.
Hence every address in the entire range is mapped to a shadow region through a very simple computation.
The Kernel Address Sanitizer is enabled with the built-in option -fsanitize=kernel-address. The entire kernel can be built with the Address Sanitizer using the kernel config file. Sadly, implementing the Kernel Address Sanitizer is not very easy: the compiler can only provide a set of APIs which the kernel has to implement.
The Kernel Address Sanitizer has been implemented in the Linux kernel and Mac OS X, and now it is also in NetBSD.
There is a lot the OS must implement to get the API working properly. Since running the Address Sanitizer as a shared library isn't viable in the kernel, we need to modify the OS kernel itself.
Several things need to be taken care of:
Making sure that the shadow buffer is allocated and populated during the boot process. This is vital since the allocator functions are also used during boot.
The memory allocators, like slab and slob in Linux, need to be modified so that they update the shadow memory on allocation and freeing of pages.
The kernel virtual address space also needs to be managed so that no other allocations happen in the memory region corresponding to the shadow memory. We will take a look at the kernel VA space with respect to NetBSD in a short while.
We also need to implement the quarantine list, for the same reason as earlier. In the kernel, the allocator hands out chunks that are in essence pages, and these pages have to go through the same procedure.
Finally, we have to write the bug-reporting infrastructure, since the bugs are reported by the kernel itself, either in the kernel log or as a kernel crash.
Now let's consider an example in kernel space and see what happens.
Assume the kernel code has a pointer to the address 0xdeadbeef. The address is dereferenced at some point and we write something to it. Since the kernel is compiled with KASan, the compiler has added calls to functions such as __asan_store().
__asan_store checks whether the address is poisoned: it first passes the address to the kmem_to_shadow function, which converts it to the shadow address as we saw earlier, and then the check_memory_poisoned function checks whether the memory has been poisoned. If it has, we call kasan_report, report the bug and exit; otherwise execution continues.
Similar functions exist for kmalloc and the other allocator implementations.
This is the virtual address space of the NetBSD kernel for the amd64 architecture. You can see that the kernel has several sections, with pointers holding the address of each section.
The shadow buffer takes up around 128 TB of space in the 64-bit kernel, hence we placed it in a hole that is not used for any other purpose.
Some regions also need slightly different handling from others: the kernel memory, the userland, the page tables and the module map are the areas we need to take care of.
This is an example bug report of a bug we found in the NetBSD kernel with KASan. As you can see, the output is slightly different and is printed in the kernel log.
Note kasan_memcpy in the trace instead of memcpy.