Trust and Confidence through Chaos Keynote for W-JAX Munich 2018

Keynote delivered for W-JAX in Munich in November 2018 on how you can use Chaos Engineering as part of establishing your own Resilience Engineering capability.

Trust and Confidence through
Chaos
Russ Miles
CEO, ChaosIQ
The Why, What, How and Who
of Chaos Engineering
or
How and why you should start doing Chaos Engineering
in your organisation today!

Russ Miles
CEO, co-founder
Your Host Today
Sylvain Hellegouarch
CTO, co-founder
ChaosIQ

“To support our users in
establishing their own  
Resilience Engineering
Capability”

“To enable EVERYONE to do chaos
engineering, safely and with the emphasis
on establishing learning through building
your own Resilience Engineering
Capability.”

2010
https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice/
Chaos Monkey

“she caused a “mission” to crash by selecting
the DSKY keys in an unexpected way, alerting
the team as to what would happen if the
prelaunch program, P01, were inadvertently
selected by a real astronaut during a real
mission, during real midcourse.”
Murphy, Niall Richard; Beyer, Betsy; Jones,
Chris; Petoff, Jennifer. Site Reliability
Engineering: How Google Runs Production
Systems. O'Reilly Media. Kindle Edition.

Distributed Systems?
External Dependencies?

Infrastructure
Platform
Applications
People, Practices & Process

TBD
Wouldn’t it be great
if there was a proactive
practice for exploring and
diminishing system
weaknesses before they
affected users?
Probably a pipe
dream…

A mindset,
a process,
some ethics,
some practices,
and some tools,
a minimum bar

"never let an outage go to
waste" -
@caseyrosenthal

“Chaos Engineering allows us to have
‘pre-mortems’ instead of post-mortems.”
- Michael Barrett, Head of Infrastructure
Engineering, Remind

1. Form a hypothesis.
2. Communicate to your team.
3. Run experiments.
4. Analyze the results.
5. Increase the scope.
6. Automate experiments.
https://blog.codeship.com/embracing-the-chaos-of-chaos-engineering/
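As a rough illustration of steps 1, 3 and 4 above (form a hypothesis, run the experiment, analyze the result), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Experiment` and `ToyService` classes, the `run` function and the journal fields are hypothetical names, not part of any ChaosIQ tool or of the talk itself. The system under test is an in-memory toy, so nothing real is harmed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Experiment:
    """Hypothetical, minimal description of one chaos experiment."""
    hypothesis: str
    steady_state: Callable[[], bool]  # True while the system looks healthy
    inject: Callable[[], None]        # introduces the turbulent condition
    rollback: Callable[[], None]      # undoes the injection


def run(experiment: Experiment) -> dict:
    """Run one experiment and return a small journal of what was observed."""
    journal = {"hypothesis": experiment.hypothesis, "ran": False, "held": None}
    # Do not experiment on a system that is already unhealthy.
    if not experiment.steady_state():
        return journal
    journal["ran"] = True
    try:
        experiment.inject()
        # The hypothesis holds if the steady state survives the injection.
        journal["held"] = experiment.steady_state()
    finally:
        # Always restore the system, even if the check itself fails.
        experiment.rollback()
    return journal


class ToyService:
    """In-memory stand-in for a replicated service (illustration only)."""

    def __init__(self, replicas: int) -> None:
        self.total = replicas
        self.healthy = replicas

    def kill_one(self) -> None:
        self.healthy = max(0, self.healthy - 1)

    def restore(self) -> None:
        self.healthy = self.total

    def is_serving(self) -> bool:
        return self.healthy >= 1


service = ToyService(replicas=2)
experiment = Experiment(
    hypothesis="The service keeps serving when one replica is lost",
    steady_state=service.is_serving,
    inject=service.kill_one,
    rollback=service.restore,
)
journal = run(experiment)
print(journal)
```

With two replicas the hypothesis holds. Constructing `ToyService(replicas=1)` instead makes the same experiment report `held` as `False`, which is the kind of finding an experiment is meant to surface before users are affected. Steps 2, 5 and 6 (communicating, widening the scope, automating) are organisational and are deliberately left out of the sketch.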

Rule 1:
Don’t talk
about Chaos Club…
ChaosIQ.io

Rule 2:
It’s about
“Learning”

Rule 3:
Chaos is not
a surprise

Rule 4:
If you know
the consequences,
don’t do the experiment

See you on the road!
Ride Safe, Ride Reliable,
Do Chaos (Engineering)
@russmiles
russ@chaosiq.io


[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...

Jason Yip

The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.

ScyllaDB Tablets: Rethinking Replication

ScyllaDB

ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.

Northern Engraving | Nameplate Manufacturing Process - 2024

Northern Engraving

Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!

"What does it really mean for your system to be available, or how to define w...

Fwdays

Apps Break Data

Ivo Velitchkov

How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?

Mutation Testing for Task-Oriented Chatbots

Pablo Gómez Abajo

Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots. To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.

Christine's Supplier Sourcing Presentaion.pptx

christinelarrosa

Nordic Marketo Engage User Group_June 13_ 2024.pptx

MichaelKnudsen27

What is an RPA CoE? Session 2 – CoE Roles

DianaGray10

Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels

Northern Engraving

The Microsoft 365 Migration Tutorial For Beginner.pptx

operationspcvita

Astute Business Solutions | Oracle Cloud Partner |

AstuteBusiness

Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...

Pitangent Analytics & Technology Solutions Pvt. Ltd

Dandelion Hashtable: beyond billion requests per second on a commodity server

Antonios Katsarakis

This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).

Recently uploaded (20)

Must Know Postgres Extension for DBA and Developer during Migration

How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf

"$10 thousand per minute of downtime: architecture, queues, streaming and fin...

GNSS spoofing via SDR (Criptored Talks 2024)

Principle of conventional tomography-Bibash Shahi ppt..pptx

PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx

[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...

ScyllaDB Tablets: Rethinking Replication

Northern Engraving | Nameplate Manufacturing Process - 2024

"What does it really mean for your system to be available, or how to define w...

Apps Break Data

Mutation Testing for Task-Oriented Chatbots

Christine's Supplier Sourcing Presentaion.pptx

Nordic Marketo Engage User Group_June 13_ 2024.pptx

What is an RPA CoE? Session 2 – CoE Roles

Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels

The Microsoft 365 Migration Tutorial For Beginner.pptx

Astute Business Solutions | Oracle Cloud Partner |

Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...

Dandelion Hashtable: beyond billion requests per second on a commodity server

Trust and Confidence through Chaos Keynote for W-JAX Munich 2018

1. Trust and Confidence through Chaos. Russ Miles, CEO, ChaosIQ. The Why, What, How and Who of Chaos Engineering, or: How and why you should start doing Chaos Engineering in your organisation today!

2. Your Host Today: Russ Miles (CEO, co-founder) and Sylvain Hellegouarch (CTO, co-founder), ChaosIQ

3. “To support our users in establishing their own Resilience Engineering Capability”

4. “To enable EVERYONE to do chaos engineering, safely and with the emphasis on establishing learning through building your own Resilience Engineering Capability.”

5. TBD @russmiles

12. Parable of Ignored Architect

13. Reliability is the GOAL

14. Resilience is CRITICAL

17. Where did chaos come from?

18. 2010: Chaos Monkey (https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice/)

19. An Alternative…

22. “she caused a ‘mission’ to crash by selecting the DSKY keys in an unexpected way, alerting the team as to what would happen if the prelaunch program, P01, were inadvertently selected by a real astronaut during a real mission, during real midcourse.” (Murphy, Niall Richard; Beyer, Betsy; Jones, Chris; Petoff, Jennifer. Site Reliability Engineering: How Google Runs Production Systems. O'Reilly Media. Kindle Edition.)

24. So what’s the problem?

25. “Production Hates You.”

26. Expectations have Grown

27. Feature Velocity != Terminal Velocity

29. You are here

30. Distributed Systems? External Dependencies?

31. You are here

32. Building complex systems…

33. … that evolve quickly …

34. (Which was the point)

35. You are here

36. Attack Vectors on “Reliability”

37. Infrastructure

38. Infrastructure, Platform

39. Infrastructure, Platform, Applications

40. Infrastructure, Platform, Applications, People, Practices & Process

41. TBD Wouldn’t it be great if there was a proactive practice for exploring and diminishing system weaknesses before they affected users? Probably a pipe dream…

42. Chaos Engineering

43. A mindset, a process, some ethics, some practices, and some tools, a minimum bar

44. The Chaos Engineering Mindset

45. "never let an outage go to waste" - @caseyrosenthal

46. “Chaos Engineering allows us to have ‘pre-mortems’ instead of post-mortems.” - Michael Barrett, Head of Infrastructure Engineering, Remind

47. The Chaos Engineering Process

48. 1. Form a hypothesis. 2. Communicate to your team. 3. Run experiments. 4. Analyze the results. 5. Increase the scope. 6. Automate experiments. (Source: https://blog.codeship.com/embracing-the-chaos-of-chaos-engineering/)
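
The six steps on slide 48 can be sketched as a small, self-contained simulation. This is only an illustrative sketch, not the speaker's tooling: `FakeService`, `steady_state`, and `run_experiment` are hypothetical names invented here, and the "system" is an in-memory pool of replicas rather than real infrastructure. It shows the core loop of steps 1, 3 and 4: state a hypothesis as a measurable steady state, inject one failure, and compare the steady state before and after.

```python
import random


class FakeService:
    """Hypothetical stand-in for a system under test: a pool of replicas."""

    def __init__(self, replicas=3):
        self.replicas = replicas
        self.healthy = set(range(replicas))

    def kill_replica(self, rng):
        # Inject the failure: remove one healthy replica, chosen at random.
        victim = rng.choice(sorted(self.healthy))
        self.healthy.discard(victim)
        return victim

    def restore(self):
        # Roll back the injected failure.
        self.healthy = set(range(self.replicas))

    def handle_request(self):
        # A request succeeds as long as at least one replica is healthy.
        return len(self.healthy) > 0


def steady_state(service, requests=100, min_success_rate=0.99):
    """Step 1 (hypothesis): the success rate stays at or above the threshold."""
    ok = sum(service.handle_request() for _ in range(requests))
    return ok / requests >= min_success_rate


def run_experiment(service, rng):
    """Steps 3 and 4: run one experiment and return a report to analyze."""
    report = {"before": steady_state(service)}
    if not report["before"]:
        # Do not inject failure into a system that is already unhealthy.
        report["aborted"] = True
        return report
    report["killed"] = service.kill_replica(rng)
    try:
        report["after"] = steady_state(service)
    finally:
        service.restore()  # always roll back, even if the check raises
    report["hypothesis_held"] = report["after"]
    return report


if __name__ == "__main__":
    print(run_experiment(FakeService(replicas=3), random.Random(0)))
    print(run_experiment(FakeService(replicas=1), random.Random(0)))
```

In this toy model the three-replica service keeps its steady state after losing one replica, while the single-replica service does not, which is the kind of weakness an experiment is meant to surface. Steps 2, 5 and 6 (communicating, widening scope, automating) are organisational and are not modelled here.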

49. The Chaos Engineering Ethics

50. “Skin in the game”

51. Two Key Practices

52. Game Days Chaos Experiments Automate

53. How do non-technical folks react???

54. Drop the term!

55. Talk Production Incidents

56. Rules of Chaos “Club”

57. Rule 1: Don’t talk about Chaos Club… ChaosIQ.io

58. Rule 2: It’s about “Learning”

59. Rule 3: Chaos is not a surprise

60. Rule 4: If you know the consequences, don’t do the experiment