Instead of randomly injecting faults (as Chaos Monkey does), what if we could order our experiments so that the minimum number of experiments gives the maximum yield? We present a solution, with results, to the problem of experiment selection, using Lineage Driven Fault Injection to reduce the search space of faults.
Lineage Driven Fault Injection (LDFI) is a state-of-the-art technique for chaos engineering experiment selection. Since its inception, LDFI has used a SAT solver under the hood, which presents solutions to the decision problem (which faults to inject) in no particular order. As SREs, we would like to first perform the experiments that reveal the bugs customers are most likely to hit. In this talk, we present new improvements to LDFI that order the experiment suggestions.
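The idea of ordering fault suggestions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the talk: the function names (`minimal_hitting_sets`, `order_experiments`), the `failure_likelihood` input, and the ranking rule (product of per-service likelihoods, treating failures as independent) are all hypothetical, and a brute-force enumeration stands in for the SAT solver.

```python
from itertools import combinations


def minimal_hitting_sets(paths):
    """Enumerate minimal sets of services that intersect every successful path.

    Each path is the set of services one successful request depended on.
    Failing every service in a hitting set breaks all known paths, so each
    hitting set is a candidate fault-injection experiment.
    Brute force is used here in place of a SAT solver (assumption).
    """
    services = sorted(set().union(*paths))
    found = []
    for size in range(1, len(services) + 1):
        for combo in combinations(services, size):
            candidate = set(combo)
            if any(prev <= candidate for prev in found):
                continue  # a smaller hitting set is already contained in this one
            if all(candidate & path for path in paths):
                found.append(candidate)
    return found


def order_experiments(paths, failure_likelihood):
    """Rank candidate experiments, most likely failure combination first.

    The score is the product of per-service likelihoods, which assumes
    independent failures (a simplifying assumption for this sketch).
    """
    def score(hypothesis):
        p = 1.0
        for service in hypothesis:
            p *= failure_likelihood[service]
        return p

    return sorted(minimal_hitting_sets(paths),
                  key=lambda h: (-score(h), sorted(h)))


# Two redundant ways to serve a request: via the cache or via the database.
paths = [{"gateway", "cache"}, {"gateway", "db"}]
likelihood = {"gateway": 0.01, "cache": 0.3, "db": 0.2}
ranked = order_experiments(paths, likelihood)
# ranked == [{"cache", "db"}, {"gateway"}]
```

With these made-up likelihoods, failing the cache and the database together is suggested before failing the gateway, because that combination is judged more likely to occur in production.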
In the first half of the talk, we show that LDFI is a technique that can be widely used within an enterprise. We present the motivation for ordering the chaos experiments, along with some of the prioritization we used while conducting the experiments. We also highlight how ordering is a general-purpose technique that we can use to encode the peculiarities of a heterogeneous microservices architecture. LDFI can work in an enterprise by harnessing the observability infrastructure to model the redundancy of the system.
Next, we present experiments conducted within our organization using ordered LDFI, along with some preliminary results. We show examples of services where we discovered bugs, and how carefully controlling the order of experiments allowed LDFI to avoid running unnecessary experiments. We also present an example of an application where we declared the service shippable under the crash-stop model. Finally, we present a comparison with Chaos Monkey and show how LDFI found the known bugs in a given application using orders of magnitude fewer experiments than a random fault injection tool like Chaos Monkey.
To close, we discuss how we plan to take LDFI forward. We discuss open problems and possible solutions for scalarizing probabilities of failure, latency injection, integration with service mesh technologies like Envoy for fine-grained fault injection, and fault injection for stateful systems.
Key takeaways: 1) How LDFI can be integrated in the enterprise by harnessing the observability infrastructure. 2) The limitations of LDFI with respect to unordered solutions, and why ordering matters for chaos engineering experiments. 3) Preliminary results of prioritized LDFI and a future direction for the community.
[Defcon Russia #29] Alexey Tyurin - Spring autobinding | DefconRussia
Spring MVC has a cool feature: autobinding. But if you use it incorrectly, "subtle" vulnerabilities can appear, sometimes with serious impact. We will look at a couple of examples and dig into the details of how autobinding bugs arise. Writeup [ENG]: http://agrrrdog.blogspot.ru/2017/03/autobinding-vulns-and-spring-mvc.html
(automatic) Testing: from business to university and back | David Rodenas
This talk covers the fundamentals of testing and a little of the history of how the professional community developed what we now know as testing. It also asks: why should I care about testing? Why is it important to test? What is important to test? What is not important to test? How should testing be done?
There are some examples in Plunker that walk through each step, and many surprises.
This talk also compares what people learn in Computer Science and Engineering degrees with what people actually do in testing. It gives some tips for catching up with the current state of the art and some starting points for changing the syllabus to produce better engineers.
This talk is good for beginners, teachers, and bosses, but also for seasoned techies who just want to shed light on some of the ideas they might have been hatching.
Spoiler alert: testing will save you development time and make you a good professional.
EdSketch: Execution-Driven Sketching for Java | Lisa Hua
Sketching is a relatively recent approach to program synthesis, which has shown much promise. The key idea in sketching is to allow users to write partial programs that have “holes” and provide test harnesses or reference implementations, and let synthesis tools create program fragments that fill the holes such that the resulting complete program has the desired functionality. Traditional solutions to the sketching problem perform a translation to SAT and employ CEGIS. While effective for a range of programs, when applied to real applications, such translation-based approaches have a key limitation: they require either translating all relevant libraries that are invoked directly or indirectly by the given sketch – which can lead to impractical SAT problems – or creating models of those libraries – which can require much manual effort.
This paper introduces execution-driven sketching, a novel approach for synthesis of Java programs using a backtracking search that is commonly employed in software model checkers. The key novelty of our work is to introduce effective pruning strategies to efficiently explore the actual program behaviors in the presence of libraries and to provide a practical solution to sketching small parts of real-world applications, which may use complex constructs of modern languages, such as reflection or native calls. Our tool EdSketch embodies our approach in two forms: a stateful search based on the Java PathFinder model checker; and a stateless search based on re-execution inspired by the VeriSoft model checker. Experimental results show that EdSketch’s performance compares well with the well-known SAT-based Sketch system for a range of small but complex programs, and moreover, that EdSketch can complete some sketches that require handling complex constructs.
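To make the idea of a stateless, re-execution-based search over holes more concrete, here is a minimal Python sketch. It is not EdSketch (which targets Java) and does not reproduce its pruning strategies; the names `NeedChoice`, `solve`, `harness`, and `choose` are hypothetical. It only illustrates the general pattern: run the test harness on the actual program, bind a hole the first time execution reaches it, and backtrack as soon as a test fails.

```python
import operator


class NeedChoice(Exception):
    """Raised when execution reaches a hole that has no value yet."""

    def __init__(self, hole, candidates):
        super().__init__(hole)
        self.hole = hole
        self.candidates = candidates


def solve(harness):
    """Depth-first search over hole assignments by re-running the harness.

    Returns the first assignment under which the harness passes, or None.
    Holes are bound lazily, so holes that are never reached before a test
    fails are never enumerated.
    """
    stack = [{}]
    while stack:
        assignment = stack.pop()

        def choose(hole, candidates):
            if hole not in assignment:
                raise NeedChoice(hole, candidates)
            return assignment[hole]

        try:
            harness(choose)
            return assignment  # every test passed
        except NeedChoice as need:
            # Branch: schedule one re-execution per candidate value.
            for value in reversed(need.candidates):
                stack.append({**assignment, need.hole: value})
        except AssertionError:
            continue  # a test failed: prune this assignment and backtrack
    return None


def harness(choose):
    """A partial program with one hole (the comparison) plus its tests."""

    def largest(a, b):
        compare = choose("cmp", [operator.lt, operator.gt])
        return a if compare(a, b) else b

    assert largest(3, 1) == 3
    assert largest(2, 5) == 5


solution = solve(harness)
# solution == {"cmp": operator.gt}
```

Because the harness runs as ordinary code, any library it calls is simply executed rather than translated or modeled, which is the practical advantage the abstract attributes to the execution-driven approach.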
Kernel Recipes 2018 - A year of fixing Coverity issues all over the Linux ker... | Anne Nicolas
Coverity is a static analyzer that scans the kernel code and reports issues that can hide coding mistakes and vulnerabilities. Currently, it reports around 5,000 outstanding defects in the Linux kernel. I’m dedicated to fixing those defects, and this talk is a status report on the work I have been doing over the course of a year. Lessons learned, as well as the most common types of issues reported, will also be presented.
[Defcon Russia #29] Boris Savkov - Bare-metal programming using the example of Raspber... | DefconRussia
The speaker will show how to use bare-metal programming to get a Raspberry Pi working with GPIO, memory, and Ethernet, and will explain who might need this and why.
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J... | Anne Nicolas
The Coccinelle C-program matching and transformation tool was first released in 2008 to facilitate specification and automation in the evolution of Linux kernel code. The novel contribution of Coccinelle is to allow software developers to write code manipulation rules in terms of the code structure itself, via a generalization of the patch syntax. Over the years, Coccinelle has been extensively used in Linux kernel development, resulting in over 6000 commits to the Linux kernel, and has found its place as part of the Linux kernel development process. This talk will review the history of Coccinelle and its impact on the Linux kernel. It will also briefly present two newer tools, prequel and spinfer, that have built on the Coccinelle infrastructure.
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri... | corehard_by
This is an update of the Clang-based implementation of Herb Sutter’s Lifetime safety profile for the C++ Core Guidelines, available online at cppx.godbolt.org.
TDD, BDD, ATDD are all methodologies that enable incremental design that is suitable for Agile environments. It seems that every day a new xDD methodology is born with the promise to be better than what came before. Should you use behaviour-driven tests or plain old unit tests? Which methodology is better? And how exactly would it benefit the development life cycle?
In this session, Dror will help to sort out the various methodologies – explaining where they came from, the tools they use, and discussing how and when to use each one. Here we will once and for all answer the question as to whether or not there’s one “DD” to rule them all.
These slides contain an introduction to Symbolic execution and an introduction to KLEE.
I made this for a small demo/intro for my research group's meeting.
JavaFest. Viktor Polishchuk. Legacy: how to win the race | FestGroup
Do you have an ancient project? Does everyone call it "Legacy" and call you a "loser"? Maybe they even laugh at you.
Let's look at the situation from a different angle. All (all, Karl!) successful projects sooner or later turn into Legacy projects.
I will treat Legacy not just as a phenomenon but as an opportunity to always stay on trend, to earn a reputation as a super-specialist (even if you know only two frameworks), to build a career, and to do what you want rather than what you are asked. All right, all right, I lied about the two frameworks, but everything else is the plain truth. I will show what you can create with the right approach to Legacy code.
The point is that Legacy is not sad/dull/unfashionable; it is simple/cool/fun if you approach the task wisely!
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair | Claire Le Goues
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
University of Virginia
cs4414: Operating Systems
Rust Expressions and Higher-Order Procedures
How to Share a Processor
Non-Preemptive and Preemptive Multitasking
Kernel Timer Interrupt
Recent years have seen the emergence of several static analysis techniques for reasoning about programs. This talk presents several major classes of techniques and tools that implement these techniques. Part of the presentation will be a demonstration of the tools.
Dr. Subash Shankar is an Associate Professor in the Computer Science department at Hunter College, CUNY. Prior to joining CUNY, he received a PhD from the University of Minnesota and was a postdoctoral fellow in the model checking group at Carnegie Mellon University. Dr. Shankar also has over 10 years of industrial experience, mostly in the areas of formal methods and tools for analyzing hardware and software systems.
In VLSI design, Design for Testability (DFT) is an approach that aims to make digital circuits easier to test during the manufacturing and debugging process. DFT in VLSI design involves incorporating additional circuitry and design features such as scan chains, built-in self-test (BIST) circuits, and boundary scan cells into the chip design to facilitate testing. Design for testability in VLSI design is essential to ensure that the fabricated chips are free from any kind of manufacturing defects. It also reduces the overall test time and thereby the cost of testing, and debugging. By incorporating DFT techniques into the chip design, it becomes easier to test the structural correctness of the chip, leading to higher-quality products and faster time-to-market.
(SAC2020 SVT-2) Constrained Detecting Arrays for Fault Localization in Combin... | Hao Jin
Authors:
Hao Jin, Osaka University
Ce Shi, Shanghai Lixin University of Accounting and Finance
Tatsuhiro Tsuchiya, Osaka University
Abstract:
Detecting Arrays (DAs) are mathematical objects that enable fault localization in combinatorial interaction testing. Each row of a DA serves as a test case, whereas a whole DA is treated as a test suite. In real-world testing problems, it is often the case that some constraints exist among test parameters. In this paper, we show that it may be impossible to construct a DA using only constraint-satisfying test cases. The reason for this is that a set of some faulty interactions may always mask the effect of other faulty interactions in the presence of constraints. Based on this observation, we propose the notion of Constrained Detecting Arrays (CDAs) to adapt DAs to practical situations. The definition of CDAs requires that all rows of a CDA must satisfy the constraints and the same fault localization capability as the DA must hold except for such inherently undetectable faults. We then propose a computational method for constructing CDAs. Experimental results obtained by using a program that implements the method show that the method was able to produce CDAs within a reasonable time for practical problem instances.
TMPA-2017: Distributed Analysis of the BMC Kind: Making It Fit the Tornado Su... | Iosif Itkin
TMPA-2017: Tools and Methods of Program Analysis
3-4 March, 2017, Hotel Holiday Inn Moscow Vinogradovo, Moscow
Distributed Analysis of the BMC Kind: Making It Fit the Tornado Supercomputer
Azat Abdullin, Daniil Stepanov, St. Petersburg Polytechnic University
Marat Akhin, JetBrains Research
For video follow the link: https://youtu.be/CPlPpwFtN7k
Similar to Madaari : Ordering For The Monkeys (20)
Massively scalable ETL in real world applications: the hard way | J On The Beach
Big Data examples always give the correct answers. However, in the real world, Big Data might be corrupt, contradictory or consist of so many small files it becomes extremely hard to keep track - let alone scale. A solid architecture will help to overcome many of the difficulties.
Floris will talk about a real-world implementation of a massively scalable ETL architecture. Two years ago, at the time of the implementation, Airflow had just become part of Apache and still left much to be desired. However, the requirements from the start were thousands of ETL tasks per day on average, and on occasion this could become hundreds of thousands. The script-based method that was in place was already unable to meet the requirements on a day-to-day basis and needed to be replaced as soon as possible. So this custom framework was rolled out in just 8 weeks of development time.
Traditional Big Data is done on Data you have. You load the data into a repository and perform map reduce or other style calculations on the data. However, certain industries need to perform complex operations on data you might not have. Data you can acquire, Data that can be shared with you, and Data that you can model are all types of data you may not have but may need to integrate instantly into a complex data analysis. Problem is: you may not even know you need this data until deep into the execution stack at runtime. This talk discusses a new functional language paradigm for dealing naturally with data you don’t have and about how to make all data first-class citizens, regardless of whether you have it or you don’t, and we will give a demo of a project written in Scala to deal exactly with this issue.
Acoustic Time Series in Industry 4.0: Improved Reliability and Cyber-Security... | J On The Beach
Industry 4.0, aka the "Fourth Industrial Revolution," refers to the computerization of manufacturing. One important aspect of Industry 4.0 is the ability to monitor the health and reliability of a physical manufacturing plant using low-cost IoT sensors. For example, machine learning models can be trained to predict the physical degradation of a manufacturing system as a function of acoustic measurements obtained from strategically placed microphones; however, the same acoustic measurements can be used to reverse engineer proprietary information about the manufacturing process and/or precisely what is being manufactured at the time of recording. Thus, improved reliability and fault tolerance is achieved at the cost of what appears to be an unprecedented new class of security vulnerabilities related to the acoustic side channel.
As a case study, we report a novel acoustic side channel attack against a commercial DNA synthesizer, a commonly used instrument in fields such as synthetic biology. Using a smart phone-quality microphone placed on or in the near vicinity of a DNA synthesizer, we were able to determine with 88.07% accuracy the sequence of DNA being produced; using a database of biologically relevant known-sequences, we increased the accuracy of our model to 100%. An academic or industrial research project may use the synthetic DNA to engineer an organism with desired traits or functions; however, while the organism is still under development, prior to publication, patent, and/or copyright, the research remains vulnerable to academic intellectual property theft and/or industrial espionage. On the other hand, this attack could also be used for benevolent purposes, for example, to determine whether a suspected criminal or terrorist is engineering a harmful pathogen. Thus, it is essential to recognize both the benefits and risks inherent to the cyber-physical systems that will inevitably control Industry 4.0 manufacturing processes and to take steps to mitigate them whenever possible.
Where is the edge in IoT and how much can you do there? Data collection? Analytics? I’ll show you how to build and deploy an embedded IoT edge platform that can do data collection, analytics, dashboarding and much more. All using Open Source.
As IoT deployments move forward, the need to collect, analyze, and respond to data further out on the edge becomes a critical factor in the success – or failure – of any IoT project. Network bandwidth costs may be dropping, and storage is cheaper than ever, but at IoT scale, these costs can still quickly overrun a project’s budget and ultimately doom it to failure.
The more you centralize your data collection and storage, the higher these costs become. Edge data collection and analysis can dramatically lower these costs, plus decrease the time to react to critical sensor data. With most data platforms, it simply isn’t practical, or even possible, to push collection AND analytics to the edge. In this talk I’ll show how I’ve done exactly this with a combination of open source hardware – Pine64 – and open source software – InfluxDB – to build a practical, efficient and scalable data collection and analysis gateway device for IoT deployments. The edge is where the data is, so the edge is where the data collection and analytics needs to be.
Drinking from the firehose, with virtual streams and virtual actors | J On The Beach
Event Stream Processing is a popular paradigm for building robust and performant systems in many different domains, from IoT to fraud detection to high-frequency trading. Because of the wide range of scenarios and requirements, it is difficult to conceptualize a unified programming model that would be equally applicable to all of them. Another tough challenge is how to build streaming systems with cardinalities of topics ranging from hundreds to billions while delivering good performance and scalability.
In this session, Sergey Bykov will talk about the journey of building Orleans Streams that originated in gaming and monitoring scenarios, and quickly expanded beyond them. He will cover the programming model of virtual streams that emerged as a natural extension of the virtual actor model of Orleans, the architecture of the underlying runtime system, the compromises and hard choices made in the process. Sergey will share the lessons learned from the experience of running the system in production, and future ideas and opportunities that remain to be explored.
Over the last twenty years, there has been a paradigm shift in software development: from meticulously planned release cycles to an experimental way of working in which lead times are becoming shorter and shorter.
How can Java ever keep up with this trend when we have Docker containers that are several hundred megabytes in size, with warm-up times of ten minutes or longer? In this talk, I'll demonstrate how we can use Quarkus so that we can create super small, super fast Java containers! This will give us better possibilities for scaling up and down - which can be a game-changer, especially in a serverless environment. It will also provide the shortest possible lead times, as well as a much better use of cloud performance with the added bonus of lower costs.
When Cloud Native meets the Financial Sector | J On The Beach
We live in our own bubble of microservices and endlessly horizontal scaling infrastructure, but there is still critical infrastructure that runs the world of financial systems depending on Windows boxes, FTP servers, and single-threaded protocols. This talk is about how to glue these two worlds together, what works for us and what doesn't.
The advancement of technology in the last decade or so has allowed astronomy to see exponential growth in data volumes. ESA's space telescope Euclid will gather high-resolution images of a third of the sky, with ~850GB of data downloaded daily for 6 years. By 2032, the ground-based telescope LSST will have generated 500PB of data, and the radio telescope SKA will be producing more data per second than the entire internet worldwide. This talk will address what current techniques exist to handle big data volumes, how the astronomical community will prepare for this big data wave, and what other challenges lie ahead.
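As a rough back-of-the-envelope check on the Euclid figure quoted above, the snippet below totals ~850GB per day over 6 years. The helper name `total_download_gb` is hypothetical, leap days are ignored, and decimal units (1 PB = 1,000,000 GB) are assumed.

```python
def total_download_gb(gb_per_day: float, years: int, days_per_year: int = 365) -> float:
    """Total volume in GB for a constant daily download rate (leap days ignored)."""
    return gb_per_day * days_per_year * years


euclid_gb = total_download_gb(850, 6)   # 850 * 365 * 6 = 1,861,500 GB
euclid_pb = euclid_gb / 1_000_000       # about 1.86 PB in decimal units
print(f"Euclid: about {euclid_pb:.2f} PB over 6 years")
```

Under these assumptions, the whole Euclid mission comes to a little under 2 PB, a small fraction of the 500PB quoted for LSST.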
The world is moving from a model where data sits at rest, waiting for people to make requests of it, to where data is constantly moving and streams of data flow to and from devices with or without human interaction. Decisions need to be made based on these streams of data in real-time, models need to be updated, and intelligence needs to be gathered. In this context, our old-fashioned approach of CRUD REST APIs serving CRUD database calls just doesn't cut it. It's time we moved to a stream-centric view of the world.
The TIPPSS Imperative for IoT - Ensuring Trust, Identity, Privacy, Protection... | J On The Beach
Our increasingly connected world leveraging the Internet of Things (IoT) creates great value in connected healthcare, smart cities, and more. The increasing use of IoT also creates great risk. We will discuss the challenges and risks we need to address as developers in TIPPSS - Trust, Identity, Privacy, Protection, Safety, and Security - for the devices, systems, and solutions we deliver and use. Florence leads IEEE workstreams on clinical IoT and data interoperability with blockchain addressing TIPPSS issues. She is an author of the IEEE articles "Enabling Trust and Security - TIPPSS for IoT" and "Wearables and Medical Interoperability - the Evolving Frontier", of "TIPPSS for Smart Cities" in the 2017 book "Creating, Analysing and Sustaining Smarter Cities: A Systems Perspective", and Editor in Chief of an upcoming book on "Women Securing the Future with TIPPSS for IoT."
Pushing AI to the Client with WebAssembly and Blazor | J On The Beach
Want to run your AI algorithms directly in the browser on the client-side? Now you can with WebAssembly and Blazor. Join us as we write code directly in WebAssembly. Then, we’ll look at Blazor and how you can use it, along with WebAssembly to run your tooling client side in the browser.
Want to run your AI algorithms directly in the browser on the client-side without the need for transpilers or browser plug-ins? Well, now you can with WebAssembly and Blazor. WebAssembly (WASM) is the W3C specification that will be used to provide the next generation of development tools for the web and beyond. Blazor is Microsoft’s experiment that allows ASP.Net developers to create web pages that do much of the scripting work in C# using WASM. Come join us as we learn to write code directly in WebAssembly’s human-readable format. Then, we’ll look at the current state of Blazor and how you can use it, along with WebAssembly to run your tooling client side in the browser.
RAFT protocol is a well-known protocol for consensus in Distributed Systems. Want to learn how consensus is achieved in a system with a large amount of data such as Axon Server’s Event Store? Join this talk to hear about all specifics regarding data replication in highly available Event Store!
Axon is a free and open source Java framework for writing Java applications following DDD, event sourcing, and CQRS principles. While especially useful in a microservices context, Axon provides great value in building structured monoliths that can be broken down into microservices when needed.
Axon Server is a messaging platform specifically built to support distributed Axon applications. One of its key benefits is storing events published by Axon applications. In not so rare cases, the number of these events is over millions, even billions. Availability of Axon Server plays a significant role in the product portfolio. To keep event replication reliable we chose RAFT protocol for consensus implementation of our clustering features.
In short, consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when any majority of their servers is available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result).
Join this talk to learn why we chose RAFT; what were our findings during the design, the implementation, and testing phase; and what does it mean to replicate an event store holding billions of events!
The Six Pitfalls of building a Microservices Architecture (and how to avoid t...J On The Beach
Thinking of moving to Microservices? Watch out! That quest is full of traps, social traps. If you are not able to handle it, you may be blocked by meetings, frustration, endless challenges that will make you miss the monolith. In this talk, I share my experience and mistakes, so you can avoid them.
Creating or migrating to a Microservices architecture might easily become a big mess, not only due to technical challenges but mostly because of human factors: it’s a major change in the software culture of a company. In this talk, I’ll share my past experience as the technical lead of an ambitious Microservices-based product, I’ll go through the parts we struggled with, and give you some advice on how to deal with what I call the Six Pitfalls:
The Common Patterns Phobia
The Book Club Cult
The Never-Decoupled Story
The Buzz Words Syndrome
The Agile Trap
The Conway’s Law Hackers
Complexity in systems should be defeated if it is possible to do. But the default nature of our computer systems are complex and servers are doomed to fail. In this talk, we will go through new approaches in modern architectures to design and evaluate new computer systems.
Interaction Protocols: It's all about good mannersJ On The Beach
Distributed systems collaborate to achieve collective goals via a system of rules. Rules that affords good hygiene, fault tolerance, effective communication and trusted feedback. These rules form protocols which enable the system to achieve its goals.
Distributed and concurrent systems can be considered a social group that collaborates to achieve collective goals. In order to collaborate a system of rules must be applied, that affords good hygiene, fault tolerance, and effective communication to coordinate, share knowledge, and provide feedback in a polite trusted manner. These rules form a number of protocols which enable the group to act as a system which is greater than the sum of the individual components.
In this talk, we will explore the history of protocols and their application when building distributed systems.
A race of two compilers: GraalVM JIT versus HotSpot JIT C2. Which one offers ...J On The Beach
Do you want to check the efficiency of the new, state of the art, GraalVM JIT Compiler in comparison to the old but mostly used JIT C2? Let’s have a side by side comparison from a performance standpoint on the same source code.
The talk reveals how traditional Just In Time Compiler (e.g. JIT C2) from HotSpot/OpenJDK internally manages runtime optimizations for hot methods in comparison to the new, state of the art, GraalVM JIT Compiler on the same source code, emphasizing all of the internals and strategies used by each Compiler to achieve better performance in most common situations (or code patterns). For each optimization, there is Java source code and corresponding generated assembly code in order to prove what really happens under the hood.
Each test is covered by a dedicated benchmark (JMH), timings and conclusions. Main topics of the agenda: - Scalar replacement - Null Checks - Virtual calls - Lock coarsening - Lock elision - Virtual calls - Scalar replacement - Lambdas - Vectorization (few cases)
The tools used during my research study are JITWatch, Java Measurement Harness, and perf. All test scenarios will be launched against the latest official Java release (e.g. version 11).
Leadership is easy when you're a manager, or an expert in a field, or a conference speaker! In a Kanban organisation, though, we "encourage acts of leadership at every level". In this talk, we look at what it means to be a leader in the uncertain, changing and high-learning environment of software development. We learn about the importance of safety in encouraging others to lead and follow, and how to get that safety using both technical and human practices; the necessity of a clear, compelling vision and provision of information on how we're achieving it; and the need to be able to ask awkward and difficult questions... especially the ones without easy answers.
Machine Learning: The Bare Math Behind LibrariesJ On The Beach
During this presentation, we will answer how much you’ll need to invest in a superhero costume to be as popular as Superman. We will generate a unique logo which will stand against the ever popular Batman and create new superhero teams. We shall achieve it using linear regression and neural networks.
Machine learning is one of the hottest buzzwords in technology today as well as one of the most innovative fields in computer science – yet people use libraries as black boxes without basic knowledge of the field. In this session, we will strip them to bare math, so next time you use a machine learning library, you’ll have a deeper understanding of what lies underneath.
During this session, we will first provide a short history of machine learning and an overview of two basic teaching techniques: supervised and unsupervised learning.
We will start by defining what machine learning is and equip you with an intuition of how it works. We will then explain the gradient descent algorithm with the use of simple linear regression to give you an even deeper understanding of this learning method. Then we will project it to supervised neural networks training.
Within unsupervised learning, you will become familiar with Hebb’s learning and learning with concurrency (winner takes all and winner takes most algorithms). We will use Octave for examples in this session; however, you can use your favourite technology to implement presented ideas.
Our aim is to show the mathematical basics of neural networks for those who want to start using machine learning in their day-to-day work or use it already but find it difficult to understand the underlying processes. After viewing our presentation, you should find it easier to select parameters for your networks and feel more confident in your selection of network type, as well as be encouraged to dive into more complex and powerful deep learning methods.
Getting started with Deep Reinforcement LearningJ On The Beach
Reinforcement Learning is a hot topic in Artificial Intelligence (AI) at the moment with the most prominent example of AlphaGo Zero. It shifted the boundaries of what was believed to be possible with AI. In this talk, we will have a look into Reinforcement Learning and its implementation.
Reinforcement Learning is a class of algorithms which trains an agent to act optimally in an environment. The most prominent example is AlphaGo Zero, where the agent is trained to place tokens on the board of Go in order to win the game. AlphaGo Zero has won against the world champion which was thought to be impossible at that time. This was enabled by combining Reinforcement Learning with Deep Neural Networks and is today known as Deep Reinforcement Learning. This has shifted the frontier of Artificial Intelligence and enabled multiple complex use cases, among them controlling the cooling devices in the server rooms by google. Applying Deep Reinforcement Learning saved them several million in power costs. In this talk, we will understand the basics of Deep Reinforcement Learning and implement a simple example. We will have a look at OpenAIs gym which is the defacto standard for Reinforcement Learning environments. This will enable the audience to implement both an environment and Reinforcement Learning agent on their own.
3. Agenda
● Distributed Systems and Chaos Engineering : State Of The Union
● Lineage Driven Fault Injection : A Brief Primer
● LDFI : Ordering Of Faults
● Bringing LDFI to the Enterprise
● Results
● Future Work
4. Industry + Academia = Win !!
Joint work between eBay and Disorderly Labs
● Dr. Peter Alvaro ( UCSC )
● Kamala Ramasubramanian ( UCSC )
● eBay SRE Team
Madaari : a trainer who teaches a monkey to perform tricks
5. The Problem : Testing Distributed Systems
[Figures: Combinatorial Space of Failures; Microservices Death Star]
Consider 100 Services
Fault Search Space : 2^100
Fault Cardinality | Possible Faults
1 | 100
4 | 3 Million
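The growth of the fault space can be checked with a few lines of Python. This is a sketch that assumes a crash-stop model, where a fault hypothesis is simply a subset of the 100 services; the function name is illustrative and not from the talk.

```python
from math import comb

def fault_combinations(services: int, cardinality: int) -> int:
    """Number of ways to crash exactly `cardinality` out of `services` services."""
    return comb(services, cardinality)

n = 100
for k in (1, 2, 3, 4):
    print(f"{k} simultaneous faults: {fault_combinations(n, k):,} hypotheses")

# Every subset of services is a candidate hypothesis, hence 2**n in total.
print(f"all subsets: 2**{n} = {2**n:,}")
```

For cardinality 4 the exact count is 3,921,225, the same order of magnitude as the figure on the slide.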
6. Chaos Engineering : A Possible Solution
● Failure is inevitable, let’s fail in a controlled environment
● Proactively inject failure in your system to reveal weaknesses
● Perturbation and observation of large-scale systems
7. Chaos Engineering : A Brief Primer
● Guided Fault Injection
○ A genius holds the mental model of the system
○ Doesn’t scale well !!
● Random Fault Injection
○ No model of the system
○ Can’t quantify progress
8. Lineage Driven Fault Injection aka LDFI
CLAIM : Fault Tolerance = Redundancy
● Use explanations of successful outcomes to search for faults that can drive the system into a bad state
● Observing successful executions enables LDFI to build a model of the redundancy of the system
9. Lineage Driven Fault Injection aka LDFI
Why did a good thing happen?
Consider its lineage.
What could have gone wrong?
Faults are cuts in the lineage graph.
Is there a cut that breaks all supports?
10. Lineage Driven Fault Injection aka LDFI
(RepA OR Bcast1)
AND (RepA OR Bcast2)
AND (RepB OR Bcast2)
AND (RepB OR Bcast1)
11. Lineage Driven Fault Injection aka LDFI
(RepA OR Bcast1)
AND (RepA OR Bcast2)
AND (RepB OR Bcast2)
AND (RepB OR Bcast1)
Hypothesis: {Bcast1, Bcast2}
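The hypothesis can be reproduced by brute force. In this sketch each clause is read as "to break this support, at least one of these events must fail", so a fault hypothesis is a set of events that intersects every clause. Exhaustive enumeration stands in for the SAT solver here, and the function name is illustrative.

```python
from itertools import combinations

# Each clause lists the events that support one path to the good outcome.
CLAUSES = [
    {"RepA", "Bcast1"},
    {"RepA", "Bcast2"},
    {"RepB", "Bcast2"},
    {"RepB", "Bcast1"},
]

def minimal_hitting_sets(clauses):
    """Return every minimal set of events that intersects all clauses."""
    events = sorted(set().union(*clauses))
    found = []
    # Enumerate by increasing size so supersets of earlier hits can be skipped.
    for size in range(1, len(events) + 1):
        for candidate in combinations(events, size):
            chosen = set(candidate)
            if any(prev <= chosen for prev in found):
                continue  # a smaller hitting set is already contained in it
            if all(chosen & clause for clause in clauses):
                found.append(chosen)
    return found

hypotheses = minimal_hitting_sets(CLAUSES)
for hypothesis in hypotheses:
    print(sorted(hypothesis))
```

For this formula the enumeration yields two minimal hypotheses: {Bcast1, Bcast2}, as on the slide, and {RepA, RepB}.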
12. LDFI : Building Blocks
● Witnessing a large number of successful executions allows LDFI to build a model of redundancy of the system
● How? Because it can reason about why faults were tolerated
13. LDFI : Building Blocks
Recipe:
1. Start with a successful outcome. Work backwards.
2. Ask why it happened. Answer: its lineage (traces).
3. Convert the lineage to a CNF formula and solve the decision problem (using a SAT solver).
4. Lather, rinse, repeat.
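The recipe can be sketched as a loop over a toy system. Everything here is an assumption made for illustration: the system is modelled as a list of redundant paths, a request succeeds if any path is untouched by the injected faults, and a brute-force search replaces the SAT solver. None of the names come from the talk.

```python
from itertools import combinations

def run_system(paths, faults):
    """Toy system: return the first redundant path untouched by `faults`."""
    for path in paths:
        if not set(path) & faults:
            return path
    return None  # every path was cut, so the request failed

def next_hypothesis(lineages, max_faults):
    """Smallest fault set hitting every observed lineage (stand-in for SAT)."""
    events = sorted(set().union(*lineages))
    for size in range(1, max_faults + 1):
        for candidate in combinations(events, size):
            if all(set(candidate) & lineage for lineage in lineages):
                return set(candidate)
    return None

def ldfi(paths, max_faults):
    """Return (breaking fault set or None, number of experiments run)."""
    lineage = run_system(paths, set())            # 1. successful outcome
    if lineage is None:
        raise ValueError("need a successful fault-free run to start from")
    lineages = []
    experiments = 0
    while lineage is not None:
        lineages.append(set(lineage))             # 2. record its lineage
        hypothesis = next_hypothesis(lineages, max_faults)  # 3. solve
        if hypothesis is None:
            return None, experiments              # nothing left to try
        experiments += 1
        lineage = run_system(paths, hypothesis)   # 4. inject, observe, repeat
    return hypothesis, experiments

# Two replicas that share a hidden dependency on a "tokens" service.
print(ldfi([["cache", "tokens"], ["db", "tokens"]], max_faults=2))
```

In the toy run the first experiment crashes "cache", the system survives through the second path, and the new lineage narrows the next hypothesis to the shared "tokens" dependency. A result of `None` means no hypothesis of the allowed size remains, which is the sense in which a service could be called tolerant under the chosen fault model.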
14. Encoding the Lineage
[Figure: call graphs over services A, B, C, D, E for two traces]
(A ∨ B ∨ C ∨ D ∨ E)
(A ∨ C ∨ D ∨ E)
(A ∨ B ∨ C ∨ D ∨ E) ∧ (A ∨ C ∨ D ∨ E)
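A minimal sketch of this encoding step follows, assuming each trace is available as a list of service names. Services are mapped to positive integers and each trace becomes one clause, which is the list-of-integer-lists layout that DIMACS-style SAT solvers conventionally accept. The function name is hypothetical and the solver call itself is left out.

```python
def encode_lineage(traces):
    """Map service names to positive integers and emit one clause per trace."""
    var_of = {}
    clauses = []
    for trace in traces:
        clause = []
        for service in trace:
            # Assign the next free variable id the first time a service is seen.
            var = var_of.setdefault(service, len(var_of) + 1)
            clause.append(var)
        clauses.append(clause)
    return var_of, clauses

traces = [["A", "B", "C", "D", "E"], ["A", "C", "D", "E"]]
var_of, cnf = encode_lineage(traces)
print(var_of)  # {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}
print(cnf)     # [[1, 2, 3, 4, 5], [1, 3, 4, 5]]
```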
15. Injecting Faults That Matter
● Drawbacks of the existing approach
○ LDFI (using SAT) reduces the search space, but the remaining space might still be large
○ LDFI poses a decision problem, so solutions are returned in no particular order
● We want to order solutions (run experiments) to:
○ Find the most likely faults before users do!
○ Reduce the search space as much as possible
16. Ordering Faults : Injecting Faults That Matter
LDFI assumes all faults are equally likely, the reality differs !!
Intuition : Some faults are more likely than others; incident history usually backs this claim
We want to encode our intuition of failure in LDFI
[Figure: call graph over services A, B, C, D, E, F]
17. Ordering Of Faults
(A ∨ B ∨ C ) ∧ (C ∨ D ∨ E ∨ F) ∧ (D ∨ E ∨ F ∨ G) ∧ (H ∨ I)
(A, B, C), (C, D, E, F), (D, E, F, G), (H, I)
18. Ordering Of Faults : Minimal Hitting Set
(A ∨ B ∨ C ) ∧ (C ∨ D ∨ E ∨ F) ∧ (D ∨ E ∨ F ∨ G) ∧ (H ∨ I)
(A, B, C), (C, D, E, F), (D, E, F, G), (H, I)
e.g. (C, E, H)
19. Ordering Of Faults : Minimal Hitting Set
(A ∨ B ∨ C ) ∧ (C ∨ D ∨ E ∨ F) ∧ (D ∨ E ∨ F )
Maximise: $X_A \log(P_A) + X_B \log(P_B) + X_C \log(P_C) + X_D \log(P_D) + X_E \log(P_E) + X_F \log(P_F)$
Subject to:
$X_A + X_B + X_C \geq 1$
$X_C + X_D + X_E + X_F \geq 1$
$X_D + X_E + X_F \geq 1$
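The optimisation above can be explored on a small scale without an ILP solver. The sketch below assumes each X is a 0/1 indicator for "inject a fault in this service" and that failures are independent, so the sum of log-probabilities is the log of the joint probability of the hypothesis. The probabilities are made up for illustration and are not from the talk; for real workloads an ILP library would replace the brute-force enumeration.

```python
from itertools import combinations
from math import log

CLAUSES = [{"A", "B", "C"}, {"C", "D", "E", "F"}, {"D", "E", "F"}]

# Hypothetical failure probabilities, NOT taken from the talk.
P = {"A": 0.01, "B": 0.02, "C": 0.10, "D": 0.05, "E": 0.01, "F": 0.01}

def ranked_hypotheses(clauses, prob):
    """All hitting sets, most likely first (highest sum of log-probabilities)."""
    events = sorted(prob)
    scored = []
    for size in range(1, len(events) + 1):
        for candidate in combinations(events, size):
            # Constraint: every clause must contain at least one chosen event.
            if all(set(candidate) & clause for clause in clauses):
                score = sum(log(prob[e]) for e in candidate)
                scored.append((score, candidate))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored

best_score, best = ranked_hypotheses(CLAUSES, P)[0]
print(best, round(best_score, 3))
```

Because every log-probability is negative, the objective favours small hypotheses made of likely faults. With the numbers above the top-ranked experiment is to fail C and D together.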
20. Ordering Faults : Injecting Faults That Matter
Use the structure of the Trace to prune the Solution Space :
1. Rank of the Service ( distance from the root )
2. Size of the subgraph of the Service
3. If we survive the failure of C, we will surely survive the failure of D, E and F
[Figure: call graph over services A, B, C, D, E, F]
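These three heuristics can be sketched on a small call graph. The graph below is an assumption, since the figure did not survive extraction: A is the root and calls B and C, and C calls D, E and F, so D, E and F are reachable only through C. The helper names are illustrative.

```python
from collections import deque

# Hypothetical call graph: caller -> callees.
CALLS = {"A": ["B", "C"], "B": [], "C": ["D", "E", "F"],
         "D": [], "E": [], "F": []}

def ranks(graph, root):
    """Breadth-first distance of every service from the root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph[node]:
            if child not in dist:
                dist[child] = dist[node] + 1
                queue.append(child)
    return dist

def descendants(graph, node):
    """All services reachable from `node`, excluding `node` itself."""
    seen = set()
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

def prune_after_survival(candidates, graph, survived):
    """Drop `survived` and everything only it calls from the candidate list."""
    covered = descendants(graph, survived) | {survived}
    return [c for c in candidates if c not in covered]

rank = ranks(CALLS, "A")
subgraph_size = {s: len(descendants(CALLS, s)) for s in CALLS}
# Try services closer to the root, and with larger subgraphs, first.
order = sorted((s for s in CALLS if s != "A"),
               key=lambda s: (rank[s], -subgraph_size[s]))
print(order)                                     # ['C', 'B', 'D', 'E', 'F']
print(prune_after_survival(order, CALLS, "C"))   # ['B']
```

The pruning step is only sound when the pruned services are reached exclusively through the surviving one, which holds for this assumed tree but would need checking on a graph with shared dependencies.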
21. Ordering Faults : Injecting Faults That Matter
● Not all services are created equal; some services fail more often than others
● Likelihood and Containment :
○ P(Node failure) > P(Rack failure) >> P(Data center failure)
● Historical measures :
○ Time since last release
○ History of failure and bug rate
22. LDFI in the Enterprise
Explanations → Models Of Redundancy → Fault Injection
23. Traces = Explanations
● Distributed Tracing
○ Call graphs come for free
● Less Ideal (but OK) : Structured Logging
○ We did this too !!
What are traces anyway ?
○ Ordered events with context stitched together
○ Create the call graphs using service names and endpoints
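Building a call graph from trace data can be sketched as follows. The span schema here (span_id, parent_id, service, endpoint) is a simplified assumption; real tracing systems and structured logs carry richer records, and the service names are invented for the example.

```python
from collections import defaultdict

# Hypothetical span records for a single request.
SPANS = [
    {"span_id": "1", "parent_id": None, "service": "web", "endpoint": "/checkout"},
    {"span_id": "2", "parent_id": "1", "service": "cart", "endpoint": "/items"},
    {"span_id": "3", "parent_id": "1", "service": "payments", "endpoint": "/charge"},
    {"span_id": "4", "parent_id": "3", "service": "ledger", "endpoint": "/write"},
]

def call_graph(spans):
    """Return {(caller service, endpoint): {(callee service, endpoint), ...}}."""
    by_id = {span["span_id"]: span for span in spans}
    graph = defaultdict(set)
    for span in spans:
        parent = by_id.get(span["parent_id"])
        if parent is not None:  # root spans have no caller
            caller = (parent["service"], parent["endpoint"])
            callee = (span["service"], span["endpoint"])
            graph[caller].add(callee)
    return dict(graph)

graph = call_graph(SPANS)
for caller, callees in graph.items():
    print(caller, "->", sorted(callees))
```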
24. Fault Injection Tool
● We rolled our own ( Mowgli )
○ Inspired by Trogdor ( Kafka’s FIT Tool )
○ Circuit breaker aware fault injection tool, deals with services and databases
○ Built in safety mechanisms
○ Hooks for AZ level, node level fault injection
○ Audit and Tracking capabilities
● Lots of open source options available
○ Start simple, a script to drop network traffic is also OK
○ https://github.com/dastergon/awesome-chaos-engineering
● Tip : Be safe by default
○ Always have a rollback strategy
25. Interaction Replay
● Ability to replay interactions ( Tip : E2E Tests )
● Measure of Success
○ An unambiguous binary ( yes or no ) way of saying whether the execution was successful or not
● Works for Eventually Consistent systems as well, as long as there is a finite upper bound on the eventuality
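One way to get a binary verdict from an eventually consistent system is to poll the success condition until a deadline. This is a sketch under the assumption that the upper bound on convergence is known and passed in as a timeout; the function and parameter names are illustrative.

```python
import time

def succeeded_within(check, timeout_s, interval_s=0.1):
    """Poll `check` until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True   # the outcome became visible in time
        if time.monotonic() >= deadline:
            return False  # bound exceeded: count the run as a failure
        time.sleep(interval_s)

# Simulated replica that reports the write on the third poll.
polls = {"count": 0}

def replica_caught_up():
    polls["count"] += 1
    return polls["count"] >= 3

print(succeeded_within(replica_caught_up, timeout_s=1.0, interval_s=0.01))
```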
26. LDFI in the Enterprise
Traces/Structured Logs → ( To Call Graphs ) → ( Encode For The Solver ) → LDFI → ( Fault Suggestion ) → FIT Tool
Solvers:
● PyCoSAT
● PuLP
● SAT4J
28. Comparison With Chaos Monkey
Strategy | Fault Experiment Runs (avg.) | Standard Deviation
Ordered LDFI | 17 | 0
Uniform Random | 210.35 | 111.42
How long did it take to find those 5 bugs? A few hours
(An experiment takes ~2 minutes, and we did retries to get around our infrastructure)
30. Madaari : The Road Ahead
● Scalarizing Probabilities of Failure
● SLA verification using strategic Delay Injection
● Reason about Stateful systems
● Fine Grained Fault Injection
● Microservices Only ?
○ Databases, Containers, Service Mesh .. Let’s Go !!
31. LDFI : The Road Ahead
3 W’s For Fault Injection
1. What to inject ? ( type of fault we want to inject )
2. Where to inject ? (the target component )
3. When to inject ? ( inject when there are exactly 5 items in the cart !! )
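The three W's map naturally onto a small record type. The sketch below is an illustration only: the class, field names and the "cart-service" target are assumptions, and the "when" is modelled as a predicate over application state to match the cart example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class FaultSpec:
    what: str                      # type of fault, e.g. "crash" or "delay"
    where: str                     # target component
    when: Callable[[Dict], bool]   # predicate over the application state

    def should_fire(self, state: Dict) -> bool:
        """True when the observed state matches the injection condition."""
        return self.when(state)

spec = FaultSpec(
    what="crash",
    where="cart-service",
    when=lambda state: len(state.get("cart", [])) == 5,
)

print(spec.should_fire({"cart": ["a", "b", "c", "d", "e"]}))  # True
print(spec.should_fire({"cart": ["a"]}))                      # False
```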
32. LDFI : The Road Ahead
A Journey from Time to State and back
1. What’s time anyway ??
2. Applications have state and change of state gives you implicit order.
3. A rendezvous of state and time gives us precision for fault injection.
33. Madaari : Key Takeaways
● Industry and Academia can work together for fun(d) and profit
● Limitations of LDFI w.r.t. unordered solutions and why ordering matters for chaos engineering experiments
● Understand how LDFI can be integrated in the enterprise by harnessing the observability infrastructure
● Preliminary results of prioritized LDFI and a future direction for the community
● Evangelising new techniques is hard; start small and stay simple