This document describes Logram, a log parsing approach that uses n-gram dictionaries to distinguish static and dynamic tokens in logs. It works in two steps: 1) generating n-gram dictionaries from sample logs to calculate token frequencies, and 2) parsing new logs by looking up n-grams in the dictionaries to identify static and dynamic parts of templates. Logram achieves accurate, efficient, stable, and scalable log parsing compared to other methods. It can generate templates from a small sample of logs and parse new logs linearly without losing accuracy.
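The two steps above can be sketched in a few lines. This is a minimal illustration of the n-gram-dictionary idea, not Logram's actual implementation: the tokenizer, the window handling, and the frequency threshold (here a hypothetical cutoff of 2) are all simplifying assumptions.

```python
from collections import Counter

def build_ngram_dict(lines, n=2):
    """Step 1: count n-gram frequencies over tokenized log lines."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def parse_line(line, counts, n=2, threshold=2):
    """Step 2: mark tokens whose n-grams are all rare as dynamic (<*>)."""
    tokens = line.split()
    template = []
    for i, tok in enumerate(tokens):
        # every n-gram window that contains token i
        grams = [tuple(tokens[j:j + n])
                 for j in range(max(0, i - n + 1),
                                min(i + 1, len(tokens) - n + 1))]
        if grams and all(counts[g] < threshold for g in grams):
            template.append("<*>")  # dynamic token
        else:
            template.append(tok)    # static token
    return " ".join(template)
```

Frequent n-grams ("Connected to") mark static text, while n-grams containing a variable value (an IP address, a PID) occur rarely, which is what lets the parser separate the two without templates being given up front.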
Experiences building a distributed shared log on RADOS - Noah Watkins, Ceph Community
This document summarizes Noah Watkins' presentation on building a distributed shared log using Ceph. The key points are:
1) Noah discusses how shared logs are challenging to scale due to the need to funnel all writes through a total ordering engine. This bottlenecks performance.
2) CORFU is introduced as a shared log design that decouples I/O from ordering by striping the log across flash devices and using a sequencer to assign positions.
3) Noah then explains how the components of CORFU can be mapped onto Ceph, using RADOS object classes, librados, and striping policies to implement the shared log without requiring custom hardware interfaces.
4) ZLog is presented as the resulting open-source implementation of this shared-log design on top of Ceph.
This document summarizes logging in Android systems. It discusses logging from Java programs using android.util.Log and System.out/err. For native programs, it describes using the liblog library. It provides an overview of Android's logging system architecture and how to read logs with logcat. It also includes tips on dumping stack traces, character encoding, using logwrapper, and logging from the init process.
2013.02.02 JiandSon Technical Seminar - Debugging Tips Using Xcode (OSXDEV) - JiandSon
This document provides debugging tips for Xcode including using breakpoint actions, exception breakpoints, symbolic breakpoints, and static analysis. It compares ARC and non-ARC debugging and discusses diagnostic tools like memory management and logging. Finally, it introduces DTrace as a dynamic tracing facility for macOS and iOS.
Linux kernel tracing superpowers in the cloudAndrea Righi
The Linux 4.x series introduced a powerful new programmable tracing engine (BPF) that lets you look inside the kernel at runtime. This talk shows how to exploit this engine to debug problems and identify performance bottlenecks in a complex environment like a cloud. It covers the latest Linux superpowers for seeing what is happening “under the hood” of the Linux kernel at runtime, and explains how to use them to measure and trace complex events in a cloud environment. For example, we will see how to measure the latency distribution of filesystem I/O, inspect details of storage device operations such as individual block I/O request timeouts or TCP buffer allocations, investigate the stack traces of certain events, identify memory leaks and performance bottlenecks, and a whole lot more.
This document discusses logging in Android systems. It describes how to log from Java programs using android.util.Log and from native programs using liblog. It provides an overview of Android's logging system including the log device files and how logs can be read with logcat or adb. It also provides some tips for logging like dumping stack traces, dealing with character encoding, using logwrappers, and logging from init processes.
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa... - Insight Technology, Inc.
SQLite4 was a project started at the beginning of 2012 and designed to provide a follow-on to SQLite3 without the constraints of backwards compatibility. SQLite4 was built around a Log Structured Merge (LSM) storage engine that is transactional, stores all content in a single file on disk, and is faster than LevelDB. Other innovations include the use of decimal floating-point arithmetic and a single storage engine namespace used for all tables and indexes. Expectations were initially high. However, development stopped about 2.5 years later, after finding that the design of SQLite4 would never be competitive with SQLite3. This talk overviews the technological ideas tried in SQLite4 and discusses why they did not work out for the kinds of workloads typically encountered for an embedded database engine.
Logging for Production Systems in The Container Era discusses how to effectively collect and analyze logs and metrics in microservices-based container environments. It introduces Fluentd as a centralized log collection service that supports pluggable input/output, buffering, and aggregation. Fluentd allows collecting logs from containers and routing them to storage systems like Kafka, HDFS and Elasticsearch. It also supports parsing, filtering and enriching log data through plugins.
This document summarizes a three-part challenge involving cracking a MIPS binary, exploiting a Python/XXE vulnerability in a web application, and decrypting messages from a SecureDrop-like system. The MIPS binary is cracked by inverting its password checking algorithm. The web app is exploited via XXE to retrieve files containing an admin URL and view state details. Python code is modified at runtime to decrypt an AES key and access a "secret.key" file. This key reveals a tarball containing a SecureDrop implementation. A buffer overflow in SecDrop's service is used to run shellcode. Timing attacks via the CPU cache are then used to retrieve the private RSA key and decrypt messages stored by the SecureDrop-like system.
Skiron - Experiments in CPU Design in D - Mithun Hunsur
This document discusses Skiron, an experimental CPU design project implemented in the D programming language. It provides an overview of Skiron, which simulates a RISC-inspired instruction set architecture. It describes the idioms and patterns used in D to define the instruction set and encoding in a way that is self-documenting and allows different parts of the software to stay in sync. It also discusses lessons learned, such as issues with delegates, as well as potential improvements to D's metaprogramming capabilities and standard library support for @nogc code. Realizing Skiron in hardware with an FPGA and making it self-hosting are presented as future goals.
Fantastic caches and where to find them - Alexey Tokar
"Magical caches are terrorizing engineers. When engineers are afraid, they debug. Contain this, or it’ll mean refactoring." (c)
The story of how an internal Hibernate cache can consume 99% of 30GiB of your application memory with just the addition of a single line of code. How it was discovered, and the root cause analysis done to prevent it in the future, will be the topic of the talk.
The slides we used at the first meetup hosted at Redis Labs' TLV offices :)
Touches on some of the more notable user-facing functionality in the newest Redis version, as well as interesting internal optimizations with major gains.
#RedisTLV: www.meetup.com/Tel-Aviv-Redis-Meetup/events/227594422/
Three tricks for understanding what's happening inside a .NET Core app running on Linux: perf, LTTng, and LLDB. As an unrelated bonus, the last slides give a brief intro to Google Cloud Platform.
Hierarchical free monads and software design in FP - Alexander Granin
I invented the approach I call "Hierarchical Free Monads". It helps to build applications in Haskell while meeting all the necessary code quality requirements. I have tested this approach in several real-world projects and companies, and it works very well.
This document summarizes a presentation about using Redis for duplicate document detection in a real-time data stream. The key points covered include:
- Redis is used to map external document IDs to internal IDs and cache these mappings to detect duplicates efficiently
- Lua scripting is used to generate IDs and check for duplicates in an atomic way
- Redis data structures like hashes and counters help count documents and store metadata efficiently
- A production deployment involved a single Redis server handling 70M keys and 10GB of RAM, with replication for high availability
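The check-and-assign logic at the core of the bullets above can be sketched as follows. This is an in-memory stand-in, not the talk's actual code: a plain dict plays the role of the Redis hash, an integer plays the role of the counter key, and the key names in the comments (`doc:ids`, `doc:counter`) are hypothetical. In Redis the whole operation would run atomically inside a single Lua script, which is what makes it safe under concurrent producers.

```python
class DedupStore:
    """Maps external document IDs to internal IDs and reports duplicates.
    A dict stands in for the Redis hash; in production this check-and-assign
    runs as one atomic Lua script (roughly HGET + INCR + HSET)."""

    def __init__(self):
        self.ext_to_int = {}   # ~ Redis hash doc:ids (external -> internal)
        self.counter = 0       # ~ Redis counter key doc:counter (INCR)

    def check_or_assign(self, external_id):
        """Return (internal_id, is_duplicate)."""
        if external_id in self.ext_to_int:
            return self.ext_to_int[external_id], True   # seen before
        self.counter += 1                               # INCR doc:counter
        self.ext_to_int[external_id] = self.counter     # HSET doc:ids
        return self.counter, False
```

The atomicity is the point: without the Lua script, two workers could both miss the lookup and assign two internal IDs to the same document.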
Redis - for duplicate detection on a real-time stream - Codemotion
Roberto "frank" Franchini presents Redis at the Codemotion Techmeetup in Turin: a data structure server that can store strings, hashes, lists, sets, sorted sets, bitmaps, and hyperloglogs.
SWUG July 2010 - Windows debugging by Sainath - Dennis Chung
The document provides an overview of basic debugging terms and tools like process, thread, registers, exceptions, memory dumps, and AdPlus. It discusses setting up a debugger, understanding assembly code, using important CPU registers and variables, reading memory types, and examining stacks. The document also asks questions to check understanding of debugging concepts.
Building Your First App with Shawn Mcarthy - MongoDB
This talk will introduce the philosophy and features of MongoDB. We’ll discuss the benefits of the document-based data model that MongoDB offers by walking through how one can build a simple app. We’ll cover inserting, updating, and querying data. This session will jumpstart your knowledge of MongoDB development, providing you with context for the rest of the day's content.
The document discusses transforming Kubernetes logs into metrics using the TICK stack. It begins by describing how syslog logs from journald can be parsed using the go-syslog parser and sent as metrics to InfluxDB via the Telegraf syslog input plugin. It then shows the YAML configuration to deploy Chronograf and InfluxDB for visualization. Finally, it proposes writing a Kapacitor tick script with a UDF to detect and count OOM events from the logs and send as metrics.
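The parse-then-count step in that pipeline can be sketched in plain Python. This is an illustration of the idea, not the deck's actual Telegraf/Kapacitor code: the regex is a simplified RFC 3164-style pattern, and the sample line format is an assumption about what journald-forwarded kernel logs look like.

```python
import re

# Simplified RFC 3164-style syslog pattern (assumption:
# "<pri>Mon dd hh:mm:ss host tag: message")
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d+)>(?P<ts>\w{3} +\d+ [\d:]+) (?P<host>\S+) "
    r"(?P<tag>[^:]+): (?P<msg>.*)$"
)

def count_oom_events(lines):
    """Parse syslog lines and count kernel OOM-killer messages -
    a sketch of what the proposed Kapacitor UDF would emit as a metric."""
    oom = 0
    for line in lines:
        m = SYSLOG_RE.match(line)
        if m and "Out of memory" in m.group("msg"):
            oom += 1
    return oom
```

In the actual pipeline, Telegraf's syslog input does the parsing and the UDF would emit the count to InfluxDB instead of returning it.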
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v... - Felipe Prado
This document provides documentation for the r2c analysis platform and command line interface (CLI). It describes how to install and set up the r2c CLI, create an example analyzer, write analysis code using Python, run the analyzer locally on a test codebase, and publish the analyzer to the r2c platform to run at larger scale. The example analyzer counts the percentage of whitespace in JavaScript files to identify potentially minified code. The document guides the reader through each step of developing and testing an analyzer locally before publishing it for cloud-based analysis.
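The example analyzer's core idea - flag files whose whitespace fraction is suspiciously low - fits in a few lines. This is a sketch of the heuristic, not r2c's code; the 10% cutoff is an illustrative assumption, not the workshop's actual threshold.

```python
def whitespace_ratio(source):
    """Fraction of characters in a source file that are whitespace.
    Minified JavaScript tends to have an unusually low ratio."""
    if not source:
        return 0.0
    ws = sum(1 for c in source if c.isspace())
    return ws / len(source)

def looks_minified(source, threshold=0.1):
    # assumption: 10% cutoff is illustrative, not the analyzer's real value
    return whitespace_ratio(source) < threshold
```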
This document provides an overview of Apache Flink internals. It begins with an introduction and recap of Flink programming concepts. It then discusses how Flink programs are compiled into execution plans and executed in a pipelined fashion, as opposed to being executed eagerly like regular code. The document outlines Flink's architecture including the optimizer, runtime environment, and data storage integrations. It also covers iterative processing and how Flink handles iterations both by unrolling loops and with native iterative datasets.
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide - Linaro
A tour of essential topics for working on the Android Optimizing Compiler, with a special emphasis on helping new engineers integrate and hit the ground running. Learn how to work on intrinsics, instruction simplification, platform specific optimizations, how to submit good patches, write Checker tests, analyse IR, take boot.oat measurements, and debug performance and execution issues with Streamline and GDB.
1. The document discusses various steps and tools for troubleshooting real production problems related to CPU spikes, thread dumps, memory leaks, and garbage collection issues.
2. It provides guidance on using tools like 'top', 'jstack', 'jmap', 'jcmd', Eclipse MAT and HeapHero to analyze thread dumps, capture heap dumps, and diagnose memory leaks.
3. The document also emphasizes the importance of enabling GC logs and capturing the right system metrics like thread states, file descriptors, and GC throughput to detect problems early.
Testing Persistent Storage Performance in Kubernetes with Sherlock - ScyllaDB
Understanding your Kubernetes storage capabilities is important in order to run a proper cluster in production. In this session I will demonstrate how to use Sherlock, an open source platform written to test persistent NVMe/TCP storage in Kubernetes, either via synthetic workloads or via a variety of databases, all easily done and summarized to give you an estimate of the IOPS, latency, and throughput your storage can provide to the Kubernetes cluster.
go-git is a 100% Go library used to interact with git repositories. Although it already supports most of the functionality, it still lags a bit in performance compared with the git CLI and some other libraries. I'll explain some of the problems we face when dealing with git repos, along with some examples of performance improvements made to the library.
Customize and Secure the Runtime and Dependencies of Your Procedural Language... - VMware Tanzu
Customize and Secure the Runtime and Dependencies of Your Procedural Languages Using PL/Container
Greenplum Summit at PostgresConf US 2018
Hubert Zhang and Jack Wu
RestFS is an experimental project to develop an open-source distributed filesystem for large environments. It is designed to scale from a single server to thousands of nodes, delivering a highly available storage system with special features for high I/O performance and network optimizations that work better in WAN environments.
The document provides an agenda for a presentation on getting expertise with MongoDB design patterns. It includes sections on MongoDB recap, how MongoDB works, the _id field, query execution order, indexes, replication, sharding, and introduces the presenters.
Eric Lafortune - Fighting application size with ProGuard and beyond - GuardSquare
The document discusses various techniques for reducing Android application size, including compressing resources and assets, trimming unused resources and assets, splitting APK files, shrinking libraries, shrinking the application bytecode, and splitting dex files. It provides examples of using tools like ProGuard, DexGuard, and the Android Gradle plugin to apply these techniques at build time in order to reduce the overall size of the packaged Android application.
This document discusses using Application Performance Management (APM) tools to detect performance regressions in web applications. It presents a case study where performance regressions were injected into test systems and then evaluated whether commercial and open source APM tools could detect the issues. The study found that APM tools can successfully detect some performance regressions, but they have limitations like producing large reports that require manual exploration and lacking actionable suggestions for fixes. The document concludes that APM tools show promise as a way to deploy performance regression detection research into practice.
This document summarizes the findings of a study analyzing the copy and paste behavior of over 20,000 Eclipse IDE users over 20 months. The study found that IDE users copy and paste code differently than regular computer users. Specifically, IDE users were found to copy and paste more within files, engage in more consecutive "relay" copy and paste operations, and copy and paste less between different files than regular users. The study also found a large number of copy and paste operations occurred between different editor types, indicating clone detection tools need to detect clones across programming languages.
The document proposes using MapReduce as a general framework to support research in mining software repositories (MSR). It describes how MapReduce can provide efficiency, scalability, adaptability and flexibility for common MSR tasks like analyzing large code repositories. A case study of applying MapReduce to the J-REX MSR tool shows significant reductions in running time for large datasets. Minimal programming effort was required and MapReduce could run on various computing environments.
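A typical MSR task of the kind the framework targets - e.g. counting how often each file changes across a repository's history - maps naturally onto the two phases. The sketch below is a toy single-process stand-in for a Hadoop job, with a hypothetical commit record shape (a dict with a "files" list), not the J-REX tool's actual code.

```python
from collections import defaultdict

def map_phase(commits):
    """Map: emit a (file, 1) pair for every file touched by a commit."""
    for commit in commits:
        for path in commit["files"]:
            yield path, 1

def reduce_phase(pairs):
    """Reduce: sum the counts per file to get its change frequency."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)
```

On a real cluster, the map tasks run in parallel over partitions of the repository history and the framework shuffles pairs by key before the reducers run, which is where the scalability reported in the case study comes from.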
This document reports on scaling tools for mining software repositories (MSR) studies using MapReduce. It finds that MapReduce can effectively scale three large MSR studies - a software evolution study, code clone detection, and log analysis - to larger datasets and clusters of up to 28 machines. The main challenges in migrating MSR studies to MapReduce are the locality and granularity of the analysis, locating a suitable cluster, managing large datasets, and handling errors.
This document proposes an approach to assist developers in verifying the deployment of big data analytics applications on Hadoop clouds. The approach involves three main steps: 1) log abstraction reduces the size of logs by grouping similar log lines, 2) log linking provides context by linking logs with the same task IDs, and 3) sequence simplification deals with repeated logs by removing duplicate events. This helps address issues like the large amount of log data and lack of context when verifying applications at scale in cloud environments.
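The log-abstraction step (1) amounts to replacing likely dynamic fields with placeholders so that lines from the same event collapse into one group. The sketch below illustrates the idea with two hypothetical substitution rules (hex IDs and numbers); the paper's actual abstraction rules and placeholder names may differ.

```python
import re
from collections import defaultdict

def abstract_line(line):
    """Replace likely dynamic fields with placeholders so that lines
    produced by the same log statement share one template."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<ID>", line)   # hex identifiers
    line = re.sub(r"\d+", "<NUM>", line)             # numeric values
    return line

def group_logs(lines):
    """Step 1 of the approach: group similar log lines under one template,
    shrinking the volume a developer has to inspect."""
    groups = defaultdict(list)
    for line in lines:
        groups[abstract_line(line)].append(line)
    return groups
```

Step 2 (linking) would then join groups that share a task ID, and step 3 would collapse repeated event sequences within each linked trace.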
This document discusses an approach for detecting performance anti-patterns in applications developed using Object-Relational Mapping (ORM). It presents a framework that can detect and rank performance anti-patterns based on their expected impact. As an example, it describes how the framework can detect an excessive data anti-pattern where ORM configurations eagerly retrieve data from the database that is never used. Repeated measurements are used to quantify the actual performance impact of anti-patterns by fixing the issues. The framework was evaluated on several open-source systems where it identified hundreds of potential excessive data anti-patterns.
The document describes a study on understanding log lines using development knowledge from source code. The researchers examined real-life inquiries about log lines from user mailing lists and logs of three large software systems. They found that experts are crucial in resolving log inquiries, with 8 out of 11 resolved inquiries addressed by experts. The researchers propose attaching development knowledge like source code, code comments, and issue reports to logs to help practitioners understand log messages without relying on expert assistance. An example demonstrates how different types of development knowledge can help explain the meaning, cause, impact and solution for the log message "fetch failure".
Our approach uses regression models on clustered performance counters to automatically detect performance regressions. It reduces counters, clusters remaining counters, selects target counters showing most significant differences between versions, and builds regression models to predict counters in the new version. When applied to real systems, our approach picks a small number of target counters and can accurately detect performance regressions, outperforming traditional approaches.
8. Logzip: Extracting Hidden Structures via Iterative Clustering for Log Compression
Workflow of Logzip
Ref: Liu, Jinyang, et al. "Logzip: Extracting Hidden Structures via Iterative Clustering for Log Compression." 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2019.
Optimized only when logs are large!
9. Logs are typically stored in small blocks
[Figure: time/size-based log rolling splits a log file into small log blocks (e.g., 16KB, 60KB, 64KB, 128KB, 256KB, 384KB up to ~1024KB), and each block is compressed individually.]
10. Logzip does not perform well on small log blocks.
The compression ratios of Logzip are 4% to 98% (median: 63%) of the compression ratios achieved without it.
• Not enough data to accurately extract templates
• Not enough repetitiveness
• Preprocessing largely impacts speed (up to 42s to compress a 128KB log block)
• Inter-file repetitiveness is not used
11. Initial investigation on log data
We observe 4 types of repetitiveness in the non-content part of our selected log data:
• T1: Identical tokens: tokens with the same information (e.g., the year component).
• T2: Similar numeric tokens: long & numeric tokens (e.g., timestamps).
• T3: Repetitive tokens: a few tokens that repeat a lot (e.g., log level).
• T4: Tokens with a common prefix string: tokens that start with the same information (e.g., module).
We design one preprocessing heuristic for each type:
• H1: Extract identical tokens: extract the identical token and its number of occurrences.
• H2: Delta encoding for numbers: save the delta between the current token and its prior token (first token preserved).
• H3: Build a dictionary for repetitive tokens: build a dictionary for each repetitive token and replace tokens with their indexes.
• H4: Extract common prefix string: save the prefix string and store the remaining part of each token.
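The four heuristics can be sketched in Python as follows (an illustrative re-implementation based on the descriptions above, not the authors' code; the function names are our own):

```python
import os
from collections import OrderedDict

def extract_identical(tokens):
    # H1: if all tokens in a column are identical, store the token once
    # together with its number of occurrences.
    if len(set(tokens)) == 1:
        return tokens[0], len(tokens)
    return None

def delta_encode(tokens):
    # H2: for long numeric tokens (e.g., timestamps), save the delta
    # between the current token and its prior token (first preserved).
    nums = [int(t) for t in tokens]
    return [nums[0]] + [b - a for a, b in zip(nums, nums[1:])]

def dictionary_encode(tokens):
    # H3: build a dictionary for a repetitive token column and replace
    # tokens with their indexes.
    index = OrderedDict()
    for t in tokens:
        index.setdefault(t, len(index))
    return list(index), [index[t] for t in tokens]

def extract_common_prefix(tokens):
    # H4: save the shared prefix string once and store only the
    # remaining part of each token.
    prefix = os.path.commonprefix(tokens)
    return prefix, [t[len(prefix):] for t in tokens]
```

For example, delta_encode(["1591000100", "1591000103"]) stores the first timestamp plus the delta 3, which compresses far better than near-identical repeated numbers.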
12. Design of our preprocessing approach: LogBlock
We do not perform extra information reduction steps on the log content part, due to compression performance concerns.
13. An example of preprocessing heuristics
LogBlock’s preprocessing example
14. LogBlock improves the compression ratio on small log blocks
Our approach improves the compression ratio by a median of 5%, 9%, 15% and 21% on 16KB, 32KB, 64KB, and 128KB blocks in comparison to compression without any preprocessing. Our approach is also 31.0 to 50.1 times faster than Logzip in preprocessing and compressing small-sized log blocks.
16. Log Parsing
The logging statement logInfo("Found block $blockId locally") generates the log line:
17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_20 locally
which contains:
Timestamp: 17/06/09 20:11:11; Level: INFO
Logger: storage.BlockManager
Static template: Found block <*> locally
Dynamic variable(s): rdd_42_20
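The header extraction in this example can be sketched with a regular expression (a minimal illustration; the pattern is our assumption based on the Spark log format shown, not part of Logram):

```python
import re

# Assumed header layout: "<date> <time> <LEVEL> <logger>: <content>"
HEADER = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>\S+) (?P<logger>[^:\s]+): (?P<content>.*)$"
)

line = "17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_20 locally"
m = HEADER.match(line)
fields = m.groupdict()
# fields["timestamp"] == "17/06/09 20:11:11", fields["level"] == "INFO",
# fields["logger"] == "storage.BlockManager",
# fields["content"] == "Found block rdd_42_20 locally"
```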
17. Automated log parsing suffers from low efficiency
Efficiency is an important concern for log parsing.
18. The main idea of Logram
Raw log (unstructured):
Found block rdd_42_20 locally
Found block rdd_42_22 locally
Found block rdd_42_23 locally
Found block rdd_42_24 locally
19. The main idea of Logram
Raw log (unstructured):
Found block rdd_42_20 locally
Found block rdd_42_22 locally
Found block rdd_42_23 locally
Found block rdd_42_24 locally
Static tokens: "Found", "block", "locally"; dynamic tokens: "rdd_42_20", "rdd_42_22", "rdd_42_23", "rdd_42_24".
20. The main idea of Logram
Raw log (unstructured):
Found block rdd_42_20 locally
Found block rdd_42_22 locally
Found block rdd_42_23 locally
Found block rdd_42_24 locally
The goal of log parsing is to identify whether a token is a static token or a dynamic token.
Each static token has a higher number of appearances: the token "Found" appears 4 times.
Each dynamic token has a lower number of appearances: the token "rdd_42_20" appears only once.
21. The main idea of Logram
Raw log (unstructured):
Found block rdd_42_20 locally
Found block rdd_42_22 locally
Found block rdd_42_23 locally
Found block rdd_42_24 locally
Each static token has a higher number of appearances (the token "Found" appears 4 times), while each dynamic token has a lower number of appearances (the token "rdd_42_20" appears only once).
We use the number of appearances to distinguish static and dynamic tokens.
22. The main idea of Logram
Raw log (unstructured):
Expecting attribute name [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]
Failed to get next element [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]
However, a dynamic token may also appear frequently.
23. The main idea of Logram
Raw log (unstructured):
Expecting attribute name [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]
Failed to get next element [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]
A dynamic token may also appear frequently. However, if we consider 3-grams instead of individual tokens, each 3-gram that pairs the dynamic token with its surrounding context appears only once.
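This can be checked with a few lines of Python (an illustrative sketch; whitespace tokenization is our assumption): the dynamic token itself appears in both lines, but the 3-grams pairing it with its preceding static context each appear only once.

```python
from collections import Counter

logs = [
    "Expecting attribute name [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]",
    "Failed to get next element [0x800f080d - CBS_E_MANIFEST_INVALID_ITEM]",
]
tokenized = [line.split() for line in logs]

# Frequency of individual tokens: the dynamic token looks "static".
token_counts = Counter(t for tokens in tokenized for t in tokens)

# Frequency of 3-grams within each line.
trigram_counts = Counter(
    tuple(tokens[i:i + 3])
    for tokens in tokenized
    for i in range(len(tokens) - 2)
)

print(token_counts["[0x800f080d"])                      # 2
print(trigram_counts[("name", "[0x800f080d", "-")])     # 1
print(trigram_counts[("element", "[0x800f080d", "-")])  # 1
```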
24. Step 1: Dictionary Setup for n-grams
17/06/09 20:10:46 INFO rdd.HadoopRDD: Input split: hdfs://hostname/2kSOSP.log:29168+7292
17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_20 locally
17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_22 locally
17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_23 locally
17/06/09 20:11:11 INFO storage.BlockManager: Found block rdd_42_24 locally
25. Step 1: Dictionary Setup for n-grams
Each raw log line is split into a header (e.g., "17/06/09 20:11:11 INFO storage.BlockManager") and a content part:
Input split: hdfs://hostname/2kSOSP.log:29168+7292
Found block rdd_42_20 locally
Found block rdd_42_22 locally
Found block rdd_42_23 locally
Found block rdd_42_24 locally
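Step 1 can be sketched as follows (an illustrative re-implementation; following the slides, n-grams are counted over the concatenated token stream of the contents, so counts may span consecutive lines and depend on the exact sample):

```python
from collections import Counter

# Content parts after the headers are removed (from the slide).
contents = [
    "Input split: hdfs://hostname/2kSOSP.log:29168+7292",
    "Found block rdd_42_20 locally",
    "Found block rdd_42_22 locally",
    "Found block rdd_42_23 locally",
    "Found block rdd_42_24 locally",
]

def build_dictionaries(lines, n_max=3):
    # Count every n-gram (n = 1 .. n_max) over the token stream.
    tokens = [t for line in lines for t in line.split()]
    dicts = {}
    for n in range(1, n_max + 1):
        dicts[n] = Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return dicts

dicts = build_dictionaries(contents)
print(dicts[1][("Found",)])                       # 4
print(dicts[2][("Found", "block")])               # 4
print(dicts[3][("Found", "block", "rdd_42_20")])  # 1
```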
31. Step 2: Parsing logs with n-gram dictionaries
To parse the log "Found block rdd_42_20 locally", look up its 3-grams "Found block rdd_42_20" and "block rdd_42_20 locally" in the 3-gram dictionary:
3-gram (# appearance):
split: hdfs://hostname/2kSOSP.log:29168+7292 Found (1)
hdfs://hostname/2kSOSP.log:29168+7292 Found block (1)
Found block rdd_42_20 (1)
block rdd_42_20 locally (1)
rdd_42_20 locally Found (1)
locally Found block (5)
32. Step 2: Parsing logs with n-gram dictionaries
Looking up the 3-grams of "Found block rdd_42_20 locally" in the 3-gram dictionary: both "Found block rdd_42_20" and "block rdd_42_20 locally" may contain dynamic values, since their appearances are only 1.
33. Step 2: Parsing logs with n-gram dictionaries
Next, look up the 2-grams contained in the candidate 3-grams in the 2-gram dictionary:
2-gram (# appearance):
hdfs://hostname/2kSOSP.log:29168+7292 Found (1)
Found block (4)
block rdd_42_20 (1)
rdd_42_20 locally (1)
locally Found (4)
34. Step 2: Parsing logs with n-gram dictionaries
Looking up the 2-grams in the 2-gram dictionary: the 2-gram "Found block" (4 appearances) contains only static tokens.
35. Step 2: Parsing logs with n-gram dictionaries
The 2-grams "block rdd_42_20" and "rdd_42_20 locally" (1 appearance each) may contain dynamic tokens.
36. Step 2: Parsing logs with n-gram dictionaries
Find the overlapping token of the candidate 2-grams "block rdd_42_20" and "rdd_42_20 locally".
37. Step 2: Parsing logs with n-gram dictionaries
The candidate 2-grams "block rdd_42_20" and "rdd_42_20 locally" overlap at the token "rdd_42_20", which is identified as a dynamic value.
38. Step 2: Parsing logs with n-gram dictionaries
Finding the overlapping token of the candidate 2-grams "block rdd_42_20" and "rdd_42_20 locally" yields the dynamic value "rdd_42_20". Generating the template:
Found block $1 locally
$1=rdd_42_20
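Putting steps 1 and 2 together, the lookup walked through on slides 31 to 38 can be sketched as follows (a simplified illustration of the idea, not the authors' implementation; the threshold of 2 and the overlap handling are our assumptions):

```python
from collections import Counter

contents = [
    "Input split: hdfs://hostname/2kSOSP.log:29168+7292",
    "Found block rdd_42_20 locally",
    "Found block rdd_42_22 locally",
    "Found block rdd_42_23 locally",
    "Found block rdd_42_24 locally",
]
stream = [t for line in contents for t in line.split()]
two_grams = Counter(tuple(stream[i:i + 2]) for i in range(len(stream) - 1))
three_grams = Counter(tuple(stream[i:i + 3]) for i in range(len(stream) - 2))

def parse(tokens, threshold=2):
    # 3-grams appearing fewer than `threshold` times may hold dynamic values.
    suspicious = {
        i for i in range(len(tokens) - 2)
        if three_grams[tuple(tokens[i:i + 3])] < threshold
    }
    # Check the 2-grams inside the suspicious 3-grams; keep the rare ones.
    rare = sorted({
        j for i in suspicious for j in (i, i + 1)
        if two_grams[tuple(tokens[j:j + 2])] < threshold
    })
    # A token shared by two overlapping rare 2-grams is dynamic.
    dynamic = sorted({b for a, b in zip(rare, rare[1:]) if b == a + 1})
    template = " ".join(
        "$%d" % (dynamic.index(i) + 1) if i in dynamic else t
        for i, t in enumerate(tokens)
    )
    return template, [tokens[i] for i in dynamic]

print(parse("Found block rdd_42_20 locally".split()))
# ('Found block $1 locally', ['rdd_42_20'])
```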
40. Average accuracy
[Figure: average parsing accuracy (percentage, %) of Logram compared with Drain, AEL, Lenma, Spell, and IPLoM]
Logram achieves stable parsing results using a dictionary generated from a small portion of log data. Logram achieves near-linear scalability without sacrificing parsing accuracy.
Field workloads continually change as the user base changes (e.g., as more users use the system), as user feature preferences change (e.g., preferences shift from desktop to mobile access), as features are activated or disabled, and as the deployment configuration changes (e.g., new servers are added). Changing field workloads may have a major impact on the performance of the system. Therefore, as the field workloads change, so must the load test workloads.
Shortcoming: query efficiency.
Scenario 2: storing logs that need to be frequently queried.
Sorting breaks the inner-similarity of other components; recording line numbers introduces extra non-repetitive information.
Text replacement needs extra processing steps, which impacts the processing speed.
The goal of log parsing is to extract the static template, the dynamic variables, and the header information from a raw log message into a structured format.
As the size of logs grows rapidly and the need for low-latency log analysis increases, efficiency becomes an important concern for log parsing.
To increase efficiency, we propose Logram, which uses an n-gram model for log parsing.
Here are some raw logs from a software system.
The tokens in the blue boxes are the static parts; the tokens in the red boxes are the dynamic parts.
Just like the example in this slide.
However, a dynamic token can appear frequently in different log events; we will show an example. Hence, depending only on the frequency of individual tokens may not be sufficient.
We use the n-gram model to limit the number of appearances of this kind of dynamic token.
At the beginning, we need to preprocess the raw logs. The header parts of logs often follow a common format within the same software system, so we can directly use a pre-defined regular expression to obtain this part, followed by the content part.
After getting the content of each log message, we split the log message into tokens.
Then, we combine the tokens into 2-grams,
and use the same method for 3-grams.
The value is the corresponding number of appearances.
We transform the log into 3-grams and look up their appearances in the 3-gram dictionary.
Then, we can find the candidate 3-grams that may contain dynamic values.
We transform the candidate 3-grams into 2-grams and look up their appearances in the 2-gram dictionary.
We find the overlapping part of the candidate 2-grams.
For evaluation, we evaluate Logram with 16 datasets from LogPai on 4 aspects.