Modern applications are faced with the expectation of being always up. As a result, an increasing amount of businesses are transitioning to modern architectures such as the one of reactive microservice-based systems. As compelling as this sounds, this also means to embrace distribution so as to ensure fault-tolerance and scalability - which also means embracing the uncertainty and nondeterminism that comes with building networked applications: changes in link quality, network partitions and outages of individual nodes are scenarios that need to be addressed first-hand when designing such systems.
In this talk you will learn about the fundamental mechanisms that clustered systems need in order to operate in this uncertain world, such as failure detection, gossip-based cluster state propagation, split brain resolution and convergent replicated data types (CRDTs). We will look at these mechanisms in the context of Akka Cluster.
By the end of this talk you should have a solid understanding of the trade-offs that need to be made when building clustered applications. You will also get the sense that a networked application is not just one application, but the cooperation between individual processes that need to probe for each other’s availability and decide what to do when one or more of them are maybe unavailable all while not being able to talk with one another.
Is there anybody out there? Scala Days Berlin 2018Manuel Bernhardt
Foundations of cluster construction, more particularly the memberhsip service abstraction for building reliable distributed applications. This talk introduces failure detectors, dissemination strategies and consensus protocols
Is there anybody out there? Reactive Systems HamburgManuel Bernhardt
Foundations of cluster construction, more particularly the memberhsip service abstraction for building reliable distributed applications. This talk introduces failure detectors, dissemination strategies and consensus protocols
This is the keynote talk i delivered at GeekCamp.SG 2014
The main purpose of the talk is to create an awareness, if not existent, in the community when it comes to choosing and wanting to building a distributed system.
This presentation is not meant to be a survey of distributed computing through the ages but hopefully it serves as a good starting point in which the journeyman can start from.
I want to thank Jonas, CTO of Typesafe, as his work in Akka strongly influenced my own and i hope it would help you in the way his work helped me.
The 10 minutes presentation I gave at my PhD defence on 21.9.2015 in Amsterdam. Prof. Frank van Harmelen was my promoter. Prof. Ian Horrocks, prof. Manfred Hauswirth, prof. Geert-Jan Houben, Peter Boncz and prof. Guus Schreiber were my opponents.
Is there anybody out there? Scala Days Berlin 2018Manuel Bernhardt
Foundations of cluster construction, more particularly the memberhsip service abstraction for building reliable distributed applications. This talk introduces failure detectors, dissemination strategies and consensus protocols
Is there anybody out there? Reactive Systems HamburgManuel Bernhardt
Foundations of cluster construction, more particularly the memberhsip service abstraction for building reliable distributed applications. This talk introduces failure detectors, dissemination strategies and consensus protocols
This is the keynote talk i delivered at GeekCamp.SG 2014
The main purpose of the talk is to create an awareness, if not existent, in the community when it comes to choosing and wanting to building a distributed system.
This presentation is not meant to be a survey of distributed computing through the ages but hopefully it serves as a good starting point in which the journeyman can start from.
I want to thank Jonas, CTO of Typesafe, as his work in Akka strongly influenced my own and i hope it would help you in the way his work helped me.
The 10 minutes presentation I gave at my PhD defence on 21.9.2015 in Amsterdam. Prof. Frank van Harmelen was my promoter. Prof. Ian Horrocks, prof. Manfred Hauswirth, prof. Geert-Jan Houben, Peter Boncz and prof. Guus Schreiber were my opponents.
For a Bioinformatics Discussion for Students and Post-Docs (BioDSP) meeting: Expands on Sandve's "Ten Simple Rules for Reproducible Computational Research"
The Computer Science Behind a modern Distributed DatabaseArangoDB Database
What we see in the modern data store world is a race between different approaches to achieve a distributed and resilient storage of data. Every application needs a stateful layer which holds the data. There are several different necessary components which are anything but trivial to combine, and, of course, even more challenging when attempting to optimize for performance. Over the past years there has been significant progress in both the science and practical implementations of such data stores. In this talk Dan Larkin-York will introduce the audience to some of the challenges, address the difficulties of their interplay, and cover key approaches taken by some of the industry’s leaders (ArangoDB, Cassandra, CockroachDB, MarkLogic, and more).
KEY CONCEPTS FOR SCALABLE STATEFUL SERVICESMykola Novik
Stateless service architectures can be easily scaled horizontally by adding backend servers to a front-end load balancer. Such approach is not always optimal, any application that needs to perform soft real-time work could never be built using stateless CRUD models, because state locality is required in order to achieve those response times. In this talk I'll cover benefits of statefull services, gave an overview of academic research and existing frameworks in js, scala, .net and golang worlds. Unfortunately in this area python has little to offer. To fix this we will figure out key concepts for building scalable stateful services: membership and dissemination protocols, failure detection and message routing.
This presentation will provide an overview of what a penetration test is, why companies pay for them, and what role they play in most IT security programs. It will also include a brief overview of the common skill sets and tools used by today’s security professionals. Finally, it will offer some basic advice for getting started in penetration testing. This should be interesting to aspiring pentesters trying to gain a better understanding of how penetration testing fits into the larger IT security world.
Additional resources can be found in the blog below:
https://www.netspi.com/blog/entryid/140/resources-for-aspiring-penetration-testers
More security blogs by the authors can be found @
https://www.netspi.com/blog/
Modern software projects cannot exist without open source software (OSS). It allows software projects to have rapid growth, credibility, and trust of their users. However, the wide adoption of OSS also brings huge security risks. Improper maintenance of OSS components may result in serious and costly security breaches, like the Equifax case, when the company lost 100K credit card profiles. In this talk, we will have an overview of the current problems regarding the management of third-party components of software projects, the ways how to address them, and I will also present you our methodology for identification of possible security issues coming from OSS dependencies. The methodology demonstrated its sustainability being used by SAP, a large international software development company.
This short talk presents some observations and opportunities in the space of using
data analysis to enable, accomplish, and improve software engineering tasks such
as testing.
Scala Days Copenhagen - 8 Akka anti-patterns you'd better be aware ofManuel Bernhardt
CAUTION: If you are responsible for an Akka system deployed to production, attending this talk may cause intense moments of self-doubt, stress and possibly panic.
Akka is a toolkit for building highly concurrent and distributed applications on the JVM using the actor model. Given the prevalence of frameworks over toolkits and models in the industry, it is easy to forget that the former will not prevent you from using them in any way you please – including ways that are possibly suboptimal or perhaps even harmful.
In this talk you'll learn about a few of the most common anti-patterns related to Akka usage. You'll also get to know about alternative and more appropriate solutions to use for each one of those anti-patterns. It should be noted that these suboptimal uses of Akka are not merely theoretical ponderings but real and recurring observations that the speaker made during a range of consulting projects.
For a Bioinformatics Discussion for Students and Post-Docs (BioDSP) meeting: Expands on Sandve's "Ten Simple Rules for Reproducible Computational Research"
The Computer Science Behind a modern Distributed DatabaseArangoDB Database
What we see in the modern data store world is a race between different approaches to achieve a distributed and resilient storage of data. Every application needs a stateful layer which holds the data. There are several different necessary components which are anything but trivial to combine, and, of course, even more challenging when attempting to optimize for performance. Over the past years there has been significant progress in both the science and practical implementations of such data stores. In this talk Dan Larkin-York will introduce the audience to some of the challenges, address the difficulties of their interplay, and cover key approaches taken by some of the industry’s leaders (ArangoDB, Cassandra, CockroachDB, MarkLogic, and more).
KEY CONCEPTS FOR SCALABLE STATEFUL SERVICESMykola Novik
Stateless service architectures can be easily scaled horizontally by adding backend servers to a front-end load balancer. Such approach is not always optimal, any application that needs to perform soft real-time work could never be built using stateless CRUD models, because state locality is required in order to achieve those response times. In this talk I'll cover benefits of statefull services, gave an overview of academic research and existing frameworks in js, scala, .net and golang worlds. Unfortunately in this area python has little to offer. To fix this we will figure out key concepts for building scalable stateful services: membership and dissemination protocols, failure detection and message routing.
This presentation will provide an overview of what a penetration test is, why companies pay for them, and what role they play in most IT security programs. It will also include a brief overview of the common skill sets and tools used by today’s security professionals. Finally, it will offer some basic advice for getting started in penetration testing. This should be interesting to aspiring pentesters trying to gain a better understanding of how penetration testing fits into the larger IT security world.
Additional resources can be found in the blog below:
https://www.netspi.com/blog/entryid/140/resources-for-aspiring-penetration-testers
More security blogs by the authors can be found @
https://www.netspi.com/blog/
Modern software projects cannot exist without open source software (OSS). It allows software projects to have rapid growth, credibility, and trust of their users. However, the wide adoption of OSS also brings huge security risks. Improper maintenance of OSS components may result in serious and costly security breaches, like the Equifax case, when the company lost 100K credit card profiles. In this talk, we will have an overview of the current problems regarding the management of third-party components of software projects, the ways how to address them, and I will also present you our methodology for identification of possible security issues coming from OSS dependencies. The methodology demonstrated its sustainability being used by SAP, a large international software development company.
This short talk presents some observations and opportunities in the space of using
data analysis to enable, accomplish, and improve software engineering tasks such
as testing.
Scala Days Copenhagen - 8 Akka anti-patterns you'd better be aware ofManuel Bernhardt
CAUTION: If you are responsible for an Akka system deployed to production, attending this talk may cause intense moments of self-doubt, stress and possibly panic.
Akka is a toolkit for building highly concurrent and distributed applications on the JVM using the actor model. Given the prevalence of frameworks over toolkits and models in the industry, it is easy to forget that the former will not prevent you from using them in any way you please – including ways that are possibly suboptimal or perhaps even harmful.
In this talk you'll learn about a few of the most common anti-patterns related to Akka usage. You'll also get to know about alternative and more appropriate solutions to use for each one of those anti-patterns. It should be noted that these suboptimal uses of Akka are not merely theoretical ponderings but real and recurring observations that the speaker made during a range of consulting projects.
In this talk you'll learn about some of the many ways in which you can shoot yourself in the foot using the Akka toolkit. The anti-patterns are based on observations made in real-life consulting projects.
Beyond the buzzword: a reactive web-appliction in practiceManuel Bernhardt
This talk & live-coding session gives an insight into the why, what and how of reactive applications, and web-applications in particular. The first part is theoretical, whilst the second part is a live-coding session introducing the Play Framework, Akka and the Scala programming language all whilst demonstrating reactive patterns such as Futures, Actors, Pipes and Circuit Breakers.
Beyond the Buzzword - a reactive application in practiceManuel Bernhardt
Beyond the Buzzword presentation about building reactive web applications with Play Framework, Akka and Scala at the Slovak Scala User's Group, 26th of May 2016
3 things you must know to think reactive - Geecon Kraków 2015Manuel Bernhardt
Over the past few years, web-applications have started to play an increasingly important role in our lives. We expect them to be always available and the data to be always fresh. This shift into the realm of real-time data processing is now transitioning to physical devices, and Gartner predicts that the Internet of Things will grow to an installed base of 26 billion units by 2020.
As reactive architectures gain in popularity, more and more developers find themselves faced with the challenge of "thinking reactive". To leave behind the well-known concepts of mutable, object-oriented, imperative and synchronous programming in favour of immutable, functional, declarative and asynchronous programming requires quite a mind shift and it isn't obvious to take the plunge.
In this talk we will explore three concepts from the world of functional programming that are at the core of building reactive applications: immutability, higher-order functions and manipulating immutable collections. We will first see how the "traditional" mutable, object-oriented approach of doing things can be problematic when it comes to multi-core programming, and then how to apply them to asynchronous systems.
Over the past few years, web-applications have started to play an increasingly important role in our lives. We expect them to be always available and the data to be always fresh. This shift into the realm of real-time data processing is now transitioning to physical devices, and Gartner predicts that the Internet of Things will grow to an installed base of 26 billion units by 2020.
Reactive web-applications are an answer to the new requirements of high-availability and resource efficiency brought by this rapid evolution. On the JVM, a set of new languages and tools has emerged that enable the development of entirely asynchronous request and data handling pipelines. At the same time, container-less application frameworks are gaining increasing popularity over traditional deployment mechanisms.
This talk is going to give you an introduction into one of the most trending reactive web-application stack on the JVM, involving the Scala programming language, the concurrency toolkit Akka and the web-application framework Play. It will show you how functional programming techniques enable asynchronous programming, and how those technologies help to build robust and resilient web-applications.
My experience writing "Reactive Web-Applications with Play". This is my subjective view on writing a book with Manning. I'm not finished yet so there will likely be a second version of this presentation when the book is out.
Note if you weren't at the talk: the "know nothing" part refers to discovering the depth of a topic while writing, which I find to be one of the nicest things about writing (because I really like the subject matter)
Project Phoenix - From PHP to the Play Framework in 3 monthsManuel Bernhardt
This is an experience report about Project Phoenix, aiming at porting a platform to the Play Framework with Scala in the short time period of 3 months. The presentation was given at Devoxx UK 2014
This is a short introduction to the Scala programming language and its ecosystem, giving some background about how the language came to be and a quick taste of what it looks like. It also shortly introduces a few tools and technologies related to it
Introduction to the Scala programming language @ the Java Klassentreffen 2013 in Vienna and Linz organized by http://www.javaklassentreffen.at
Introducing the background, design goals, and some examples (from a Java perspective)
http://java.at/web/guest/jkt13
Atelier - Innover avec l’IA Générative et les graphes de connaissancesNeo4j
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Allez au-delà du battage médiatique autour de l’IA et découvrez des techniques pratiques pour utiliser l’IA de manière responsable à travers les données de votre organisation. Explorez comment utiliser les graphes de connaissances pour augmenter la précision, la transparence et la capacité d’explication dans les systèmes d’IA générative. Vous partirez avec une expérience pratique combinant les relations entre les données et les LLM pour apporter du contexte spécifique à votre domaine et améliorer votre raisonnement.
Amenez votre ordinateur portable et nous vous guiderons sur la mise en place de votre propre pile d’IA générative, en vous fournissant des exemples pratiques et codés pour démarrer en quelques minutes.
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Crescat
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppGoogle
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-fusion-buddy-review
AI Fusion Buddy Review: Key Features
✅Create Stunning AI App Suite Fully Powered By Google's Latest AI technology, Gemini
✅Use Gemini to Build high-converting Converting Sales Video Scripts, ad copies, Trending Articles, blogs, etc.100% unique!
✅Create Ultra-HD graphics with a single keyword or phrase that commands 10x eyeballs!
✅Fully automated AI articles bulk generation!
✅Auto-post or schedule stunning AI content across all your accounts at once—WordPress, Facebook, LinkedIn, Blogger, and more.
✅With one keyword or URL, generate complete websites, landing pages, and more…
✅Automatically create & sell AI content, graphics, websites, landing pages, & all that gets you paid non-stop 24*7.
✅Pre-built High-Converting 100+ website Templates and 2000+ graphic templates logos, banners, and thumbnail images in Trending Niches.
✅Say goodbye to wasting time logging into multiple Chat GPT & AI Apps once & for all!
✅Save over $5000 per year and kick out dependency on third parties completely!
✅Brand New App: Not available anywhere else!
✅ Beginner-friendly!
✅ZERO upfront cost or any extra expenses
✅Risk-Free: 30-Day Money-Back Guarantee!
✅Commercial License included!
See My Other Reviews Article:
(1) AI Genie Review: https://sumonreview.com/ai-genie-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
#AIFusionBuddyReview,
#AIFusionBuddyFeatures,
#AIFusionBuddyPricing,
#AIFusionBuddyProsandCons,
#AIFusionBuddyTutorial,
#AIFusionBuddyUserExperience
#AIFusionBuddyforBeginners,
#AIFusionBuddyBenefits,
#AIFusionBuddyComparison,
#AIFusionBuddyInstallation,
#AIFusionBuddyRefundPolicy,
#AIFusionBuddyDemo,
#AIFusionBuddyMaintenanceFees,
#AIFusionBuddyNewbieFriendly,
#WhatIsAIFusionBuddy?,
#HowDoesAIFusionBuddyWorks
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows seven steps methods for delivering their services to their customers. They called it the Software development life cycle process (SDLC).
Requirement — Collecting the Requirements is the first Phase in the SSLC process.
Feasibility Study — after completing the requirement process they move to the design phase.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — in this phase when the coding of the software is done the testing team will start testing.
Installation — after completion of testing, the application opens to the live server and launches!
Maintenance — after completing the software development, customers start using the software.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Introduction to Pygame (Lecture 7 Python Game Development)
Is there anybody out there?
1. Is there anybody out there?
Voxxed Days Vienna 2018
Manuel Bernhardt
2.
3. manuel.bernhardt.io
• Helping companies to get started with
reac4ve systems ...
• ... or to get them to perform
• Lightbend consul4ng and training
partner (Akka, Akka Cluster, Akka
Streams)
@elmanu - h+ps://manuel.bernhardt.io
4. Mo#va#onal quote
Life is a single player game. You’re born alone. You’re going to die
alone. All of your interpreta8ons are alone. All your memories are
alone. You’re gone in three genera8ons and no one cares. Before
you showed up nobody cared. It’s all single player.
— Naval Ravikant
@elmanu - h+ps://manuel.bernhardt.io
5.
6.
7.
8. Key issues for building clusters
• discovery: who's there?
• fault detec0on: who's in trouble?
• load balancing: who can take up work?
@elmanu - h+ps://manuel.bernhardt.io
9. Key issues for building clusters
• discovery: who's there?
• fault detec0on: who's in trouble?
• load balancing: who can take up work?
Group membership
@elmanu - h+ps://manuel.bernhardt.io
12. Failure detector key
proper0es
• Completeness: crash-failure of any
group member is detected by all non-
faulty members
@elmanu - h+ps://manuel.bernhardt.io
13. Failure detector key
proper0es
• Completeness: crash-failure of any
group member is detected by all non-
faulty members
• Accuracy: no non-faulty group member
is declared as failed by any other non-
faulty group member (no false posi9ves)
@elmanu - h+ps://manuel.bernhardt.io
14. Failure detector key
proper0es
• Completeness: crash-failure of any group
member is detected by all non-faulty
members
• Accuracy: no non-faulty group member is
declared as failed by any other non-faulty
group member (no false posi9ves)
Also relevant in prac/ce:
• speed
• network message load
@elmanu - h+ps://manuel.bernhardt.io
15. Impossibility result
It is impossible for a failure detector algorithm to determinis3cally
achieve both completeness and accuracy over an asynchronous
unreliable network1
1
Chandra, Toueg: Unreliable failure detectors for reliable distributed systems (1996)
@elmanu - h+ps://manuel.bernhardt.io
16. Trade-offs
• Strong - Weak completeness: all / some non-faulty members
detect a crash
• Strong - Weak accuracy: there are no / some false-posi7ves
@elmanu - h+ps://manuel.bernhardt.io
17. Trade-offs
• Strong - Weak completeness: all / some non-faulty members
detect a crash
• Strong - Weak accuracy: there are no / some false-posi7ves
In prac(ce most applica(ons prefer strong completeness with a
weaker form of accuracy
@elmanu - h+ps://manuel.bernhardt.io
18. Failure Detector strategies
• heartbeat: no heartbeat = failure
• ping: no response = failure
@elmanu - h+ps://manuel.bernhardt.io
19. Phi Adap)ve Accrual
Failure Detector 3
• has a cool name
• adap.ve: adjusts to network condi.ons
• introduces the no.on of accrual failure
detec.on: suspicion value φ rather than
boolean (trusted or suspected)
3
N. Hayashibara et al: The ϕ Accrual Failure Detector (2004)
@elmanu - h+ps://manuel.bernhardt.io
20. Phi Adap)ve Accrual
Failure Detector 3
Example: master and worker processes
• φ(w) > 8 stop sending new work to
the node
• φ(w) > 10 start to rebalance current
tasks of the worker to other nodes
• φ(w) > 12 remove the worker from
the list of nodes
3
N. Hayashibara et al: The ϕ Accrual Failure Detector (2004)
@elmanu - h+ps://manuel.bernhardt.io
21. New Adap)ve accrual
Failure Detector 4
• much simpler to calculate suspicion
level than Phi
• performs slightly be7er and more
adap)ve 5
5
h$ps://manuel.bernhardt.io/2017/07/26/a-new-adap=ve-accrual-
failure-detector-for-akka/
4
B. Satzger et al: A new adap3ve accrual failure detector for dependable
distributed systems (2007)
@elmanu - h+ps://manuel.bernhardt.io
22. SWIM Failure Detector
As you swim lazily through the milieu,
The secrets of the world will infect you 6
• has both a dissemina.on and a failure detec.on component
• scalable membership protocol
• members are first suspected and not immediately flagged as
failed
6
A. Das et al: SWIM: scalable weakly-consistent infec:on-style process group membership protocol (2002)
@elmanu - h+ps://manuel.bernhardt.io
23.
24.
25.
26.
27.
28.
29. Lifeguard Failure Detector
• based on SWIM, developed by
Hashicorp 7
• memberlist implementa;on 8
• extensions to the SWIM protocol
• dras;cally reduces the amount of false-
posi;ves
8
h$ps://github.com/hashicorp/memberlist
7
A. Dadgar et al: Lifeguard: SWIM-ing with Situa:onal Awareness (2017)
@elmanu - h+ps://manuel.bernhardt.io
31. Dissemina(on
How to communicate changes in the cluster?
• members joining
• members leaving
• members failing
@elmanu - h+ps://manuel.bernhardt.io
32. Dissemina(on strategies
• hardware / IP / UDP mul1cast: not readily (or willingly) enabled
in data centres
• gossip protocols
@elmanu - h+ps://manuel.bernhardt.io
33. Gossip styles
• gossip to one node at random 10
10
van Renesse et al: A gossip-style failure detec9on service (1998)
@elmanu - h+ps://manuel.bernhardt.io
34. Gossip styles
• gossip to one node at random 10
• round-robin, binary round-robin, round-
robin with sequence check 11
11
S. Ranganathan et al: Gossip-Style Failure Detec:on and Distributed
Consensus for Scalable Heterogeneous Clusters (2000)
10
van Renesse et al: A gossip-style failure detec9on service (1998)
@elmanu - h+ps://manuel.bernhardt.io
35. Gossip styles
• gossip to one node at random 10
• round-robin, binary round-robin,
round-robin with sequence check 11
11
S. Ranganathan et al: Gossip-Style Failure Detec:on and Distributed
Consensus for Scalable Heterogeneous Clusters (2000)
10
van Renesse et al: A gossip-style failure detec9on service (1998)
@elmanu - h+ps://manuel.bernhardt.io
36. Gossip styles
• gossip to one node at random 10
• round-robin, binary round-robin, round-
robin with sequence check 11
• piggy-back on another protocol (e.g. on
a failure detector 6
): also called
infec&on-style gossip
6
A. Das et al: SWIM: scalable weakly-consistent infec:on-style process
group membership protocol (2002)
11
S. Ranganathan et al: Gossip-Style Failure Detec:on and Distributed
Consensus for Scalable Heterogeneous Clusters (2000)
10
van Renesse et al: A gossip-style failure detec9on service (1998)
@elmanu - h+ps://manuel.bernhardt.io
37. What do you even gossip about
@elmanu - h+ps://manuel.bernhardt.io
38. What do you even gossip about
@elmanu - h+ps://manuel.bernhardt.io
39. What do you even gossip about
@elmanu - h+ps://manuel.bernhardt.io
40. Example of gossip op.miza.ons
• Akka Cluster: gossip with a higher probability to nodes that have
not already seen a gossip
• Akka Cluster: speeds up gossip (3x) when less than half of the
members have seen the latest gossip
• Lifeguard: an<-entropy mechanism based on nodes doing a full
sync with a node at random (helps to speed up convergence a?er
a network par<<on)
@elmanu - h+ps://manuel.bernhardt.io
42. Impossibility result - group membership
[...] the primary-par//on group membership problem cannot be
solved in asynchronous systems with crash failures, even if one
allows the removal or killing of non-faulty processes that are
erroneously suspected to have crashed 9
9
Chandra et al: On the Impossibility of Group Membership (1996)
@elmanu - h+ps://manuel.bernhardt.io
43. It would be unwise to make
membership-related decisions while
there are processes suspected of
having crashed
@elmanu - h+ps://manuel.bernhardt.io
44. Reaching consensus: .me
• Lamport Clocks 12
: how do you order events in a distributed
system
12
L. Lamport: Time, Clocks, and the Ordering of Events in a Distributed System (1978)
@elmanu - h+ps://manuel.bernhardt.io
45. Reaching consensus: .me
• Lamport Clocks: how do you order events in a distributed
system
• Vector Clocks 13
: how do you order events in a distributed
system and flag concurrent event
13
F. Ma(ern: Virtual Time and Global States of Distributed Systems (1989)
@elmanu - h+ps://manuel.bernhardt.io
46. Reaching consensus: .me
• Lamport Clocks: how do you order events in a distributed system
• Vector Clocks: how do you order events in a distributed system and flag
concurrent events
• Version Vectors 14 15
: how do you order replicas in a distributed system and
detect conflicts
• Do3ed Version Vectors 16
: same as version vectors, but prevent false conflicts
16
N. Preguiça: Do1ed Version Vectors: Efficient Causality Tracking for Distributed Key-Value Stores (2012)
15
h%ps://haslab.wordpress.com/2011/07/08/version-vectors-are-not-vector-clocks/
14
D.S. Parker: Detec/on of mutual inconsistency in distributed systems (1983)
@elmanu - h+ps://manuel.bernhardt.io
47. Reaching consensus: .me
private[cluster] final case class Gossip(
members: immutable.SortedSet[Member],
overview: GossipOverview = GossipOverview(),
version: VectorClock = VectorClock(),
tombstones: Map[UniqueAddress, Gossip.Timestamp] = Map.empty)
@elmanu - h+ps://manuel.bernhardt.io
48. Reaching consensus:
CRDTs
• Conflict-Free Replicated Data types 17
• strong eventual consistency
• two families:
• CmRDTs (commuta+vity of
opera+ons)
• CvRDTs (convergence of state - merge
funcBon)
17
F.B. Schneider: Implemen4ng Fault Tolerant Services Using the State
Machine Approach: A Tutorial (1990)
@elmanu - h+ps://manuel.bernhardt.io
49. Reaching consensus:
replicated state machines
Any suffisciently complicated model class
contains an ad-hoc, informally-specified,
bug-ridden, slow implementa;on of half a
state machine
— Pete Forde
@elmanu - h+ps://manuel.bernhardt.io
50. Reaching consensus:
replicated state machines
This method allows one to implement any
desired form of mul4process
synchroniza4on in a distributed system
— Leslie Lamport 12
12
L. Lamport: Time, Clocks, and the Ordering of Events in a Distributed
System (1978)
@elmanu - h+ps://manuel.bernhardt.io
51. Reaching consensus:
protocols
• fault-tolerant distributed systems
• how do mul1ple servers agree on a
value?
• Paxos 19 20
, Ra@ 21
, CASPaxos 22
22
D. Rystsov: CASPaxos: Replicated State Machines without logs (2018)
21
D. Ongaro and J. Ousterhout: In Search of an Understandable
Consensus Algorithm (Extended Version) (2014)
20
L. Lamport: Paxos made simple (2001)
19
L. Lamport: The Part-Time Parliament (1998)
@elmanu - h+ps://manuel.bernhardt.io
52. Reaching consensus: conven/ons
• reaching a decision by not transmi2ng any informa4on
• example: Akka Cluster leader designa4on
/**
* INTERNAL API
* Orders the members by their address except that members with status
* Joining, Exiting and Down are ordered last (in that order).
*/
private[cluster] val leaderStatusOrdering: Ordering[Member] = ...
@elmanu - h+ps://manuel.bernhardt.io
61. Failure detectors
• Chandra, Toueg: Unreliable failure detectors for reliable distributed systems (1996)
• N. Hayashibara et al: The ϕ Accrual Failure Detector (2004)
• B. Satzger et al: A new adapNve accrual failure detector for dependable distributed systems
(2007)
• hQps://manuel.bernhardt.io/2017/07/26/a-new-adapNve-accrual-failure-detector-for-akka/
• A. Das et al: SWIM: scalable weakly-consistent infecNon-style process group membership
protocol (2002)
• A. Dadgar et al: Lifeguard: SWIM-ing with SituaNonal Awareness (2017)
• hQps://github.com/hashicorp/memberlist
@elmanu - h+ps://manuel.bernhardt.io
62. Impossibility results
• M.J. Fischer, N.A. Lynch, and M.S. Paterson: Impossibility of
distributed consensus with one faulty process (1985)
• Chandra, Toueg: Unreliable failure detectors for reliable
distributed systems (1996)
• Chandra et al: On the Impossibility of Group Membership (1996)
@elmanu - h+ps://manuel.bernhardt.io
63. Dissemina(on
• van Renesse et al: A gossip-style failure detec8on service (1998)
• S. Ranganathan et al: Gossip-Style Failure Detec8on and
Distributed Consensus for Scalable Heterogeneous Clusters
(2000)
@elmanu - h+ps://manuel.bernhardt.io
64. Consensus - )me
• L. Lamport: Time, Clocks, and the Ordering of Events in a Distributed
System (1978)
• F. MaJern: Virtual Time and Global States of Distributed Systems (1989)
• D.S. Parker: DetecNon of mutual inconsistency in distributed systems (1983)
• hJps://haslab.wordpress.com/2011/07/08/version-vectors-are-not-vector-
clocks/
• N. Preguiça: DoJed Version Vectors: Efficient Causality Tracking for
Distributed Key-Value Stores (2012)
@elmanu - h+ps://manuel.bernhardt.io
65. Consensus - protocols
• F.B. Schneider: Implemen3ng Fault Tolerant Services Using the State
Machine Approach: A Tutorial (1990)
• L. Lamport: The Part-Time Parliament (1998)
• L. Lamport: Paxos made simple (2001)
• D. Ongaro and J. Ousterhout: In Search of an Understandable
Consensus Algorithm (Extended Version) (2014)
• D. Rystsov: CASPaxos: Replicated State Machines without logs (2018)
@elmanu - h+ps://manuel.bernhardt.io
66. Consensus - misc
• M. Shapiro: Conflict-free Replicated Data Types (2011)
@elmanu - h+ps://manuel.bernhardt.io