Reactive applications are becoming a de-facto industry standard and, if employed correctly, toolkits like the Lightbend Reactive Platform make the implementation easier than ever. But the design of these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to.
In this talk, we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause trouble, about trade-offs that might not be fully understood and important choices that might be overlooked. These include pitfalls of persistent actors, handling of network partitions, proper implementation of graceful shutdown and distributed transactions, the trade-offs of microservices and actors, and more.
This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that they won’t be repeated by fellow developers. The talk is a little bit more focused on the Lightbend platform, but understanding the concepts we are going to talk about should be beneficial for everyone interested in this field.
Serverless is a hot topic in the software architecture world and also a point of contention. Serverless lets us run our code without provisioning or managing servers. We don't have to think about servers at all. Things like elasticity or resilience might no longer be our problem. On the other hand, we have to embrace a slightly different approach to designing our applications. We also have to give up a lot of control we might want and, most importantly, we have to use technology which just might not be ready. In this talk, I’d like to discuss whether it is worth using serverless in our applications and what the advantages and disadvantages of this approach are. Secondly, I'd like to describe various use cases for which we considered serverless and what the result was. And finally, I’d like to talk about how Scala fits into this. This talk should be interesting for everyone who is considering using serverless or has just heard the word somewhere and would like to learn more. The talk is a little bit more focused on AWS, but understanding the concepts I’m going to talk about should be beneficial even if you prefer a different service provider.
Journey into Reactive Streams and Akka Streams - Kevin Webber
Are streams just collections? What's the difference between Java 8 streams and Reactive Streams? How do I implement Reactive Streams with Akka? Pub/sub, dynamic push/pull, non-blocking, non-dropping; these are some of the other concepts covered. We'll also discuss how to leverage streams in a real-world application.
QCON 2015: Gearpump, Realtime Streaming on Akka - Sean Zhong
Gearpump is an Akka-based real-time streaming engine that uses actors to model everything. It offers high performance and flexibility: 18,000,000 messages/second with a latency of 8 ms on a cluster of 4 machines.
Strata Singapore: Gearpump, Real time DAG-Processing with Akka at Scale - Sean Zhong
Gearpump is an Akka-based real-time streaming engine that uses actors to model everything. It offers high performance and flexibility: 18,000,000 messages/second with a latency of 8 ms on a cluster of 4 machines.
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita... - confluent
Invitae is one of the fastest growing genetic information companies, whose mission is to bring comprehensive genetic information into mainstream medical practice to improve the quality of healthcare for billions of people. We have recently partnered with another lab, requiring an integration layer that was developed as part of a dizzying leap from a traditional Python service architecture to Scala Streaming applications on Kafka and Kubernetes. This presentation is our story, where we discuss challenges and solutions, error handling and resilience techniques, technology stack choices and compromises, tools and approaches we have developed, and general insights.
Beyond engineering itself, our team's goal is enabling others to join in. Building an application entirely of Streams is a significant and in many ways liberating paradigm shift. In addition to learning to architect and understand how the application will behave and evolve, success depends on great tooling. We will show, for example, how we extended KStreams API to seamlessly include Avro Schema as part of our build and code infrastructure, completely automating SerDe derivation, introducing typed topics, and still supporting polyglot teams.
Other highlights:
* Self-healing streams with aggregation, and deciding when to crash
* Connectors vs Streams for side effects
* Scheduling with Streams
* Deriving topology diagrams
* Monitoring and metrics as Streams
* Combining Avro, Swagger and code generation, plus avro4s vs avrohugger comparison
* Typelevel Cats and its role in our success
* http4s and hybrid testing
A dive into akka streams: from the basics to a real-world scenario - Gioia Ballin
Reactive streaming is becoming the best approach to handle data flows across asynchronous boundaries. Here, we present the implementation of a real-world application based on Akka Streams. After reviewing the basics, we will discuss the development of a data processing pipeline that collects real-time sensor data and sends it to a Kinesis stream. There are various possible points of failure in this architecture. What should happen when Kinesis is unavailable? If the data flow is not handled correctly, some information may get lost. Akka Streams is the tool that enabled us to build reliable processing logic for the pipeline that avoids data loss and maximizes the robustness of the entire system.
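To make the "what should happen when Kinesis is unavailable?" question concrete, here is a minimal plain-Python sketch of one possible answer: keep each record until the sink accepts it, retrying a bounded number of times. It does not use Akka Streams or any AWS client; `deliver_all`, `FlakySink` and the use of `ConnectionError` are hypothetical names chosen for illustration, and a real pipeline would add backoff between attempts.

```python
def deliver_all(records, send, max_attempts=5):
    """Send records in order, retrying each one; return those never delivered."""
    undelivered = []
    for record in records:
        for _ in range(max_attempts):
            try:
                send(record)
                break  # delivered, move on to the next record
            except ConnectionError:
                continue  # sink unavailable, try the same record again
        else:
            # All attempts failed: keep the record instead of dropping it.
            undelivered.append(record)
    return undelivered


class FlakySink:
    """Hypothetical sink that rejects the first `failures` calls."""

    def __init__(self, failures):
        self.failures = failures
        self.received = []

    def __call__(self, record):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("sink unavailable")
        self.received.append(record)


if __name__ == "__main__":
    sink = FlakySink(failures=3)
    leftover = deliver_all([1, 2, 3], sink)
    print(sink.received, leftover)
```

Returning the undelivered records, rather than raising or discarding them, leaves the decision about what to do next (persist, alert, retry later) to the caller.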
This talk is from Distributed Data Summit SF 2018 - http://distributeddatasummit.com/2018-sf/sessions#netflix2
Operating C* can involve a lot of required manpower, complex automation, or both. Some of this complexity comes from operational/configuration activity of the underlying kernel and hardware, but much of it is operational complexity stemming from C* itself. Some examples of this complexity are restarting the database in a safe way, reliably backing up and restoring snapshots, monitoring the health of the datastore, and even ensuring eventual consistency through repair. As a result of these complexities, C* operators end up with complicated operational setups, which are expensive to build, manage and monitor. As part of this talk, we will share lessons learned in managing such complexity via our Priam sidecar, including recent innovations in how our sidecar ensures the highest possible uptime and correctness of Cassandra. We then use this to motivate building the management sidecar directly into C* itself (CASSANDRA-14395).
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ... - Flink Forward
Flink's streaming API can be used to construct a scalable, fault-tolerant framework for buffering high-frequency time series data, with the goal of outputting larger, immutable blocks of data. While the data is being buffered into larger blocks, Flink's queryable state feature can be used to service requests for data still in the "buffering" state. The high-frequency time series data set in this example is electrocardiogram (EKG) data, which is buffered from a millisecond sample rate into multi-minute blocks.
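The buffering idea can be illustrated with a small plain-Python sketch. It is not Flink code and does not use the queryable state API; `BlockBuffer`, `add` and `query` are hypothetical names, and the sketch assumes samples arrive in timestamp order. Samples accumulate until a block boundary is crossed, the finished block is emitted as an immutable tuple, and the in-progress samples can be read at any time.

```python
class BlockBuffer:
    """Group timestamped samples into fixed-length, immutable blocks."""

    def __init__(self, block_ms):
        self.block_ms = block_ms
        self._start = None   # start timestamp of the block being filled
        self._samples = []   # samples still in the "buffering" state

    def add(self, timestamp_ms, value):
        """Add one sample; return (block_start, samples) when a block closes."""
        block_start = timestamp_ms - timestamp_ms % self.block_ms
        completed = None
        if self._start is not None and block_start != self._start:
            # The sample belongs to a new block: seal the previous one.
            completed = (self._start, tuple(self._samples))
            self._samples = []
        self._start = block_start
        self._samples.append((timestamp_ms, value))
        return completed

    def query(self):
        """Return a read-only snapshot of the samples not yet emitted."""
        return tuple(self._samples)


if __name__ == "__main__":
    buf = BlockBuffer(block_ms=1000)
    for ts, value in [(0, 0.1), (500, 0.2), (1000, 0.3)]:
        block = buf.add(ts, value)
        if block is not None:
            print("completed block:", block)
    print("still buffering:", buf.query())
```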
Slides from a presentation by Monal Daxini at Disney, Glendale CA about Netflix Open Source Software, Cloud Data Persistence, and Cassandra Best Practices
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami... - Flink Forward
Apache Flink provides powerful stream processing capabilities which can allow organizations to move directly from batch to real time analytics, skipping the lambda architecture entirely. However, getting to production is not always as simple as rewriting your job in a new API, but requires rethinking your application design with a stream first mindset. This talk will cover MediaMath’s journey in rebuilding its reporting infrastructure using Apache Flink. We will discuss high level architectural designs when building an extensible reporting platform as well as deep dive into specific technical hurdles. Topics will include managing a Flink cluster on EC2 spot instances, reconciling Flink’s consistency model with S3’s, handling massive data skew as well as tools and techniques for building performant, fault tolerant streaming applications.
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And... - Lightbend
Akka Streams and its amazing handling of streaming with back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially in use cases where the amount of work continues to increase as you’re processing it. This is where back-pressure really shines.
In this talk for Architects and Dev Managers by Akara Sucharitakul, Principal MTS for Global Platform Frameworks at PayPal, Inc., we look at how back-pressure based on Akka Streams and Kafka is being used at PayPal to handle very bursty workloads.
In addition, Akara will also share experiences in creating a platform based on Akka and Akka Streams that currently processes over 1 billion transactions per day (on just 8 VMs), with the aim of helping teams adopt these technologies. In this webinar, you will:
* Start with a sample web crawler use case to examine what happens when each processing pass expands to a larger and larger workload to process.
* Review how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
* Look at lessons learned, plus some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
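As a rough illustration of back-pressure with a bounded buffer, here is a minimal sketch using Python's `asyncio.Queue` as an analogue. It is not Akka Streams or Kafka code; `producer`, `consumer` and `run_pipeline` are hypothetical names. The producer is suspended, without blocking a thread, whenever the buffer is full, so it can never run more than `buffer_size` items ahead of the consumer.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        # Suspends here while the buffer is full: this is the back-pressure.
        await queue.put(item)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue, processed, depths):
    while True:
        depths.append(queue.qsize())  # record how full the buffer is
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0)  # stand-in for slow asynchronous work
        processed.append(item * 2)


async def run_pipeline(items, buffer_size):
    """Run producer and consumer; return results and the largest buffer depth seen."""
    queue = asyncio.Queue(maxsize=buffer_size)
    processed, depths = [], []
    await asyncio.gather(
        producer(queue, items),
        consumer(queue, processed, depths),
    )
    return processed, max(depths)


if __name__ == "__main__":
    results, max_depth = asyncio.run(run_pipeline(range(100), buffer_size=4))
    print(len(results), "items processed; buffer never exceeded", max_depth)
```

A bursty source simply spends more time suspended in `put`; memory use stays bounded by the buffer size rather than by the size of the burst.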
Lightbend Lagom: Microservices Just Right - mircodotta
Microservices architectures are becoming a de-facto industry standard, but are you satisfied with the current state of the art? We are not, as we believe that building microservices today is more challenging than it should be. Lagom is here to take on this challenge. First, Lagom is opinionated and it will take some of the hard decisions for you, guiding you to produce microservices that adhere to the Reactive tenets. Second, Lagom was built from the ground up around you, the developer, to push your productivity to the next level. If you are familiar with the Play Framework's development environment, imagine that but tuned for building microservices; we are sure you are going to love it! Third, Lagom comes with batteries included for deploying in production: going from development to production could not be easier. In this session, you will get an introduction to the Lightbend Lagom framework. There will be code and live demos to show you in practice how it works and what you can do with it, making you fully equipped to build your next microservices with Lightbend Lagom!
Event sourcing - what could possibly go wrong? Devoxx PL 2021 - Andrzej Ludwikowski
Yet another presentation about Event Sourcing? Yes and no. Event Sourcing is a really great concept. Some could say it’s a Holy Grail of the software architecture. I might agree with that, while remembering that everything comes with a price. This session is a summary of my experience with ES gathered while working on 3 different commercial products. Instead of theoretical aspects, I will focus on possible challenges with ES implementation. What could explode (very often with delayed ignition)? How and where to store events effectively? What are possible schema evolution solutions? How to achieve the highest level of scalability and live with eventual consistency? And many other interesting topics that you might face when experimenting with ES.
Keystone Data Pipeline manages several thousand Flink pipelines, with variable workloads. These pipelines are simple routers which consume from Kafka and write to one of three sinks. In order to alleviate our operational overhead, we’ve implemented autoscaling for our routers. Autoscaling has reduced our resource usage by 25% - 45% (varying by region and time), and has reduced our on call burden. This talk will take an in depth look at the mathematics, algorithms, and infrastructure details for implementing autoscaling of simple pipelines at scale. It will also discuss future work for autoscaling complex pipelines.
Bellevue Big Data meetup: Dive Deep into Spark Streaming - Santosh Sahoo
A discussion of the code and architecture involved in building a real-time streaming application using Spark and Kafka. This demo presents some use cases and patterns of different streaming frameworks.
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne... - confluent
At Stitch Fix, we maintain a distributed Kafka Connect cluster running several hundred connectors. Over the years, we've learned invaluable lessons for keeping our connectors going 24/7. As many conference goers probably know, event driven applications require a new way of thinking. With this new paradigm come unique operational considerations, which I will delve into. Specifically, this talk will be an overview of:
1) Our deployment model and use case (we have a large distributed Kafka Connect cluster that powers a self-service data integration platform tailored to the needs of our Data Scientists).
2) Our favorite operational tools that we have built for making things run smoothly (the jobs, alerts and dashboards we find most useful, and a quick rundown of the admin service we wrote that sits on top of Kafka Connect).
3) Our approach to end-to-end integrity monitoring (our tracer bullet system that we built to constantly monitor all our sources and sinks).
4) Lessons learned from production issues and painful migrations (why, oh why did we not use schemas from the beginning?? Pausing connectors doesn't do what you think it does... rebalancing is tricky... jar hell problems are a thing of the past, upgrade and use plugin.path!).
5) Future areas of improvement.
The target audience member is an engineer who is curious about Kafka Connect or currently maintains a small to medium sized Kafka Connect cluster. They should walk away from the talk with increased confidence in using and maintaining a large Kafka Connect cluster, and should be armed with the hard won experiences of our team. For the most part, we've been very happy with our Kafka Connect powered data integration platform, and we'd love to share our lessons learned with the community in order to drive adoption.
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016 - Petr Zapletal
Reactive applications are becoming a de-facto industry standard and, if employed correctly, toolkits like the Lightbend Reactive Platform make the implementation easier than ever. But the design of these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to. In this talk, we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause trouble, about trade-offs that might not be fully understood and important choices that might be overlooked, including pitfalls of persistent actors, handling of network partitions, proper implementation of graceful shutdown and distributed transactions, the trade-offs of microservices and actors, and more. This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that they won’t be repeated by fellow developers. The talk is a little bit more focused on the Lightbend platform, but understanding the concepts we are going to talk about should be beneficial for everyone interested in this field.
Netflix Keystone Pipeline at Samza Meetup 10-13-2015 - Monal Daxini
An overview of the Netflix Keystone Pipeline, which processes 600 billion events a day, and a detailed treatise on the modification and use of Samza for real-time routing of events, including Docker.
Container Orchestration from Theory to PracticeDocker, Inc.
Join Laura Frank and Stephen Day as they explain and examine technical concepts behind container orchestration systems, like distributed consensus, object models, and node topology. These concepts build the foundation of every modern orchestration system, and each technical explanation will be illustrated using Docker’s SwarmKit as a real-world example. Gain a deeper understanding of how orchestration systems like SwarmKit work in practice and walk away with more insights into your production applications.
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITOpenStack
Audience: Advanced
About: Real world lessons and war stories about Catalyst IT’s experience in rolling out an OpenStack based public cloud in New Zealand.
This presentation will provide tips and advice that may save you a lot of time, money and nights of sleep if you are planning to run OpenStack in the future. It may also bring some insights to people that are already running OpenStack in production.
Topics covered will include: selection of hardware for optimal costs, techniques that drive quality and service levels up, common deployment mistakes, in place upgrades, how to identify the maturity level of each project and decide what is ready for production, and much more!
Speaker Bio: Bruno Lago – Entrepreneur, Catalyst IT Limited
Bruno Lago is a solutions architect that has been involved with the Catalyst Cloud (New Zealand’s first public cloud based on OpenStack) from its inception. He is passionate about open source software, cloud computing and disruptive technologies.
OpenStack Australia Day - Sydney 2016
https://events.aptira.com/openstack-australia-day-sydney-2016/
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
In this presentation I talk about our motivation to converting our microservices to run on Kubernetes. I discuss many of the technical challenges we encountered along the way, including networking issues, Java issues, monitoring and alerting, and managing all of our resources!
This talk is about Taskerman, a distributed cluster task manager built on top of AWS SQS, Zookeeper and Yelp PaaSTA. The talk was given at Imperial College, London as part of its 'Application of Computing in Industry' series: http://www.imperial.ac.uk/computing/industry/aci/yelp/
The need for gleaning answers from data in real-time is moving from nicety to a necessity. There are few options to analyze the never-ending stream of unbounded data at scale. Let’s compare and contrast the core principles and technologies the different open source solutions available to help with this endeavor, and where in the future processing engines need to evolve to solve processing needs at scale. These findings are based on the experience of continuing to build a scalable solution in the cloud to process over 700 billion events at Netflix, and how we are embarking on the next journey to evolve unbounded data processing engines.
Mirko Damiani - An Embedded soft real time distributed system in Golinuxlab_conf
An embedded system usually involves low level languages like C and highly customized hardware. In this talk we will see a use case of a soft real time system which was developed taking a very different approach, written in Go. We will see what are the advantages of this choice, along with its limits.
"Stateful app as an efficient way to build dispatching for riders and drivers...Fwdays
Within the Uklon service product line, we consistently employ a widely accepted and easily understandable programming paradigm — the stateless approach. However, there exist various business scenarios where this approach may fall short when it comes to resource utilization, efficiency, and ongoing maintenance. Furthermore, its predictability in handling peak loads can also be questioned.
In this presentation, we will explore the following key areas:
Scenarios where the stateless approach demonstrates suboptimal performance;
Scaling stateful services through effective partitioning and replication strategies;
Developing strategies for comprehensive disaster recovery, including Recovery Time Objective and Recovery Point Objective;
The evolutionary path from actor-based architectures to the more inherent classic stateful architecture.
The Java Memory Model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.
It is crucial for a programmer to know how, according to Java Language Specification, write correctly synchronized, race free programs.
Netflix Open Source Meetup Season 4 Episode 2aspyker
In this episode, we will take a close look at 2 different approaches to high-throughput/low-latency data stores, developed by Netflix.
The first, EVCache, is a battle-tested distributed memcached-backed data store, optimized for the cloud. You will also hear about the road ahead for EVCache it evolves into an L1/L2 cache over RAM and SSDs.
The second, Dynomite, is a framework to make any non-distributed data-store, distributed. Netflix's first implementation of Dynomite is based on Redis.
Come learn about the products' features and hear from Thomson and Reuters, Diego Pacheco from Ilegra and other third party speakers, internal and external to Netflix, on how these products fit in their stack and roadmap.
8. Pick the Right Tool for The Job
[Diagram: Scala Future[T], Akka Stream, Akka Typed and Akka Actors arranged along two axes, Power vs. Constraints and Local Abstractions vs. Distribution]
9. Actor Use Cases
● State management
● Location transparency
● Resilience mechanisms
● Single writer
● In-memory lock-free cache
● Sharding
Akka
ACTOR
10. Future Use Cases
● Local Concurrency
● Simplicity
● Composition
● Type safety
Scala
Future[T]
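To illustrate the local concurrency and composition points above, here is a minimal sketch using only the Scala standard library. The names `fetchPrice`, `fetchQuantity` and `total` are hypothetical stand-ins for independent asynchronous lookups and do not come from the talk.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureComposition {
  // Hypothetical asynchronous lookups (assumptions for illustration only).
  def fetchPrice(item: String): Future[BigDecimal] = Future(BigDecimal(10))
  def fetchQuantity(item: String): Future[Int] = Future(3)

  // Both futures are created before the for-comprehension, so they run
  // concurrently; the comprehension only combines their results.
  def total(item: String): Future[BigDecimal] = {
    val price = fetchPrice(item)
    val quantity = fetchQuantity(item)
    for {
      p <- price
      q <- quantity
    } yield p * q
  }
}
```

The result type `Future[BigDecimal]` is checked by the compiler, which is the type safety benefit listed on the slide; no actor is needed because there is no state to protect and nothing is distributed.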
11. Avoid Java Serialization
Java serialization is the default in Akka because it is easy to get started with, but it is very slow and has a heavy footprint
17. Java Serialization Implementation
● Serializes
○ Data
○ Entire class definition
○ Definitions of all referenced classes
● It just “works”
○ Serializes almost everything (anything that implements Serializable)
○ Works with different JVMs
● Performance was not the main requirement
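The footprint cost of writing class metadata along with the data can be observed with a small experiment. The sketch below uses only the JDK; the `Point` class and the hand-written 8-byte encoding are assumptions made for comparison and are not part of the talk.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream, ObjectOutputStream}

object SerializationFootprint {
  // Case classes are Serializable, so Java serialization works out of the box.
  final case class Point(x: Int, y: Int)

  // Java serialization: writes a stream header and a class descriptor
  // in addition to the two field values.
  def javaSerialized(p: Point): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(p)
    out.close()
    bytes.toByteArray
  }

  // Hand-written encoding: just the two 4-byte integers, no metadata.
  def handWritten(p: Point): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeInt(p.x)
    out.writeInt(p.y)
    out.close()
    bytes.toByteArray
  }
}
```

Comparing `javaSerialized(Point(1, 2)).length` with `handWritten(Point(1, 2)).length` shows how much of the Java payload is metadata rather than data; the hand-written form is exactly 8 bytes.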
18. Points of Interest
● Performance
● Footprint
● Schema evolution
● Implementation effort
● Human readability
● Language bindings
● Backwards & forwards compatibility
● ...
19. JSON
● Advantages:
○ Human readability
○ Simple & well known
○ Many good libraries for all platforms
● Disadvantages:
○ Slow
○ Large
○ Object names included
○ No schema (except, e.g., JSON Schema)
○ Format and precision issues
● json4s, circe, µPickle, spray-json, argonaut, rapture-json, play-json, …
20. Binary formats [Schema-less]
● Metadata sent together with data
● Advantages:
○ Implementation effort
○ Performance
○ Footprint *
● Disadvantages:
○ No human readability
● Kryo, Binary JSON (MessagePack, BSON, ... )
21. Binary formats [Schema]
● Schema defined by some kind of DSL
● Advantages:
○ Performance
○ Footprint
○ Schema evolution
● Disadvantages:
○ Implementation effort
○ No human readability
● Protobuf (+ projects like Flatbuffers, Cap’n Proto, etc.), Thrift, Avro
22. Summary
● The default Java serialization should always be changed
● The replacement depends on the particular use case
● Quick tips:
○ json4s
○ kryo
○ protobuf
23. Flat Actor Hierarchies
Errors should be handled out of band, in a parallel process; they are not part of the main app
27. Top Level Actors
The Actor Hierarchy
[Diagram: the root actor / has the children /user and /system; the application's actors (a1, a2, b1, b2, c1 to c4) form a tree under /user]
28. Two Different Battles to Win
● Separate business logic and failure handling
○ Less complexity
○ Better supportability
● Getting our application back to life after something bad happened
○ Failure isolation
○ Recovery
○ No more midnight calls :)
29. Errors & Failures
Errors
● Common events
● The current request is affected
● Communicated to the client/caller
● Incorrect requests, errors during validations, ...
Failures
● Unexpected events
● Service/actor is not able to operate normally
● Reports to supervisor
● Client can’t do anything, might be notified
● Database failures, network partitions, hardware malfunctions, ...
30. Error Kernel Pattern
● Actor’s state is lost during restart and may not be recovered
● Delegate dangerous tasks to child actors and supervise them
[Diagram: actor /user/a1 creates a child worker /user/a1/w1 and delegates the risky work to it]
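A minimal sketch of the error kernel idea with classic Akka actors follows. The message types `Job`, `JobResult` and `GetCompleted`, the `Kernel` and `Worker` classes, and the choice of `NumberFormatException` as the failure are all assumptions made for illustration; only the general shape (valuable state in the parent, risky work in a supervised child) is taken from the slide.

```scala
import akka.actor.{Actor, ActorRef, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Restart

object ErrorKernel {
  final case class Job(input: String)
  final case class JobResult(value: Int)
  case object GetCompleted

  // The child does the dangerous work and holds no state worth keeping.
  class Worker extends Actor {
    def receive: Receive = {
      case Job(input) => sender() ! JobResult(input.toInt) // may throw
    }
  }

  // The parent keeps the important state and never runs risky code itself.
  class Kernel extends Actor {
    private var completed = 0

    // Restart only the failing child; the parent's state is untouched.
    override val supervisorStrategy: SupervisorStrategy =
      OneForOneStrategy() { case _: NumberFormatException => Restart }

    private val worker: ActorRef = context.actorOf(Props(new Worker), "w1")

    def receive: Receive = {
      case job: Job      => worker ! job
      case JobResult(_)  => completed += 1
      case GetCompleted  => sender() ! completed
    }
  }
}
```

If the worker fails on a malformed job, it is restarted and continues with the next message in its mailbox, while the `completed` counter in the parent survives because the parent itself never restarted.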
31. Backoff Supervisor
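● Restarts the child actor with a growing delay between restarts
import akka.pattern.{Backoff, BackoffSupervisor}
import scala.concurrent.duration._

// childProps: the Props of the supervised child actor
BackoffSupervisor.props(
  Backoff.onFailure(
    childProps,
    childName = "foo",
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2
  ))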
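To make the "growing delay" concrete, the sketch below computes a delay per restart, assuming the delay doubles with each restart, is capped at `maxBackoff`, and is then stretched by a random jitter of up to `randomFactor`. This is an illustrative model of the behaviour the parameters suggest, not a copy of Akka's implementation; the `delay` function and its `jitter` argument are hypothetical.

```scala
import scala.concurrent.duration._

object BackoffDelay {
  // jitter is expected to be a random value in [0, randomFactor];
  // it is a parameter here so the calculation stays deterministic.
  def delay(
      restartCount: Int,
      minBackoff: FiniteDuration,
      maxBackoff: FiniteDuration,
      jitter: Double): FiniteDuration = {
    // Double the minimum delay for every previous restart ...
    val exponential = minBackoff.toMillis * math.pow(2, restartCount)
    // ... but never exceed the configured maximum.
    val capped = math.min(exponential, maxBackoff.toMillis.toDouble)
    (capped * (1.0 + jitter)).toLong.millis
  }
}
```

Under these assumptions, with `minBackoff = 3.seconds`, `maxBackoff = 30.seconds` and no jitter, successive restarts would wait 3, 6, 12, 24 and then 30 seconds. The jitter spreads out restarts so that many failed actors do not hit a recovering resource at the same moment.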
36. Summary
● Create rich actor hierarchies
● Separate business logic and failure handling
● Backoff Supervisor
37. Graceful Shutdown
We have thousands of sharded actors on multiple nodes and we want to shut one of the nodes down
44. High-level Procedure
1. JVM gets the shutdown signal
2. Coordinator tells all local ShardRegions to shut down gracefully
3. Node leaves cluster
4. Coordinator gives singletons a grace period to migrate
5. Actor System & JVM Termination
45. Integration with Sharded Actors
● Handling of added messages
○ Passivate() message for graceful stop
○ context.stop() for immediate stop
● Priority mailbox
○ Priority message handling
○ Message retrying support
46. CoordinatedShutdown Extension
● Stops actors/services in a specific order
● Allows registering tasks and executing them during shutdown
● More generic approach
● Added in Akka 2.5 (~ a week ago)
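Registering a task with the extension might look like the sketch below. It assumes Akka 2.5 or later on the classpath; the `ShutdownTasks` object, the task name `flush-buffers`, the `flush` callback and the choice of the `PhaseBeforeServiceUnbind` phase are assumptions for illustration.

```scala
import akka.Done
import akka.actor.{ActorSystem, CoordinatedShutdown}
import scala.concurrent.Future

object ShutdownTasks {
  // Registers `flush` to run in an early shutdown phase. The shutdown
  // waits for the returned Future (up to the phase timeout) before
  // moving on to the next phase.
  def register(system: ActorSystem)(flush: () => Future[Done]): Unit =
    CoordinatedShutdown(system).addTask(
      CoordinatedShutdown.PhaseBeforeServiceUnbind,
      "flush-buffers")(flush)
}
```

Because phases run in a fixed order, a task placed in an early phase finishes before cluster-related phases begin, which is what makes the approach more generic than a hand-written coordinator.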
47. Summary
● We don’t want to lose data (usually)
● Shutdown coordinator on every node & integration with sharded actors
● Akka’s CoordinatedShutdown
48. Distributed Transactions
Any situation where a single event results in the mutation of two separate sources of data which cannot be committed atomically
49. What’s Wrong With Them
● Simple happy paths
● Fallacies of Distributed Programming
○ The network is reliable.
○ Latency is zero.
○ Bandwidth is infinite.
○ The network is secure.
○ Topology doesn't change.
○ There is one administrator.
○ Transport cost is zero.
○ The network is homogeneous.
50. Two-phase commit (2PC)
● Stage 1 - Prepare: the Transaction Manager sends Prepare to each Resource Manager, and each replies Prepared
● Stage 2 - Commit: the Transaction Manager sends Commit to each Resource Manager, and each replies Committed
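The two stages can be sketched as a small in-memory simulation. It covers only the happy path plus a refused vote: there are no timeouts, crashes or lost messages, which are exactly the cases that make the protocol hard in practice. All names (`TwoPhaseCommitSketch`, `ResourceManager`, `run`) are hypothetical and chosen for this example.

```scala
object TwoPhaseCommitSketch {
  sealed trait Outcome
  case object Committed extends Outcome
  case object RolledBack extends Outcome

  // A participant that votes in the prepare stage and then applies the decision.
  final class ResourceManager(val name: String, canPrepare: Boolean) {
    var state: String = "idle"
    def prepare(): Boolean = {
      state = if (canPrepare) "prepared" else "refused"
      canPrepare
    }
    def commit(): Unit = { state = "committed" }
    def rollback(): Unit = { state = "rolled-back" }
  }

  // The transaction manager: commits only if every participant voted yes.
  def run(participants: Seq[ResourceManager]): Outcome = {
    val votes = participants.map(_.prepare()) // Stage 1: Prepare / Prepared
    if (votes.forall(identity)) {
      participants.foreach(_.commit())        // Stage 2: Commit / Committed
      Committed
    } else {
      participants.foreach(_.rollback())      // any "no" vote aborts everywhere
      RolledBack
    }
  }
}
```

Note that the transaction manager has to wait for every vote before it can decide, so a single slow or unreachable participant holds up all the others.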
52. The Big Trade-Off
● Distributed transactions can usually be avoided
○ They are hard, expensive and fragile, and they do not scale
● Every business event needs to result in a single synchronous commit
● Other data sources should be updated asynchronously
● This introduces eventual consistency
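A minimal sketch of that trade-off, assuming an append-only journal as the single synchronous commit and a read model that is brought up to date later. It is single-threaded and in-memory, and every name in it (`EventualConsistencySketch`, `OrderService`, `OrderPlaced`, `project`) is hypothetical.

```scala
import scala.collection.mutable

object EventualConsistencySketch {
  final case class OrderPlaced(orderId: String, amount: Int)

  final class OrderService {
    private val journal = mutable.ArrayBuffer.empty[OrderPlaced]
    private val readModel = mutable.Map.empty[String, Int]
    private var projectedUpTo = 0

    // The only synchronous commit: append the business event to the journal.
    def placeOrder(event: OrderPlaced): Unit = { journal += event }

    // Meant to run later (for example from a background worker):
    // applies the events that have not been projected yet.
    def project(): Unit = {
      journal.drop(projectedUpTo).foreach(e => readModel.update(e.orderId, e.amount))
      projectedUpTo = journal.size
    }

    // Reads go to the second data source, which may lag behind the journal.
    def lookup(orderId: String): Option[Int] = readModel.get(orderId)
  }
}
```

Between `placeOrder` and the next `project` call, `lookup` does not see the new order yet. That window is the eventual consistency the slide refers to.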
53. Longtail Latencies
Consider a system where each service
typically responds in 10ms but with a 99th
percentile latency of one second
55. Longtails really matter
● Latency accumulation
● Not just noise
● Don’t have to be power users
● Real problem
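Latency accumulation can be checked with a little arithmetic. The sketch below assumes that service calls are independent and that each has a 1% chance of landing in the slow tail (the 99th percentile from the example system). `LongTailSketch` and `chanceOfSlowRequest` are hypothetical names.

```scala
object LongTailSketch {
  // Probability that at least one of `calls` independent service calls
  // lands in the slow tail, given the probability `pSlow` for a single call.
  def chanceOfSlowRequest(calls: Int, pSlow: Double = 0.01): Double =
    1.0 - math.pow(1.0 - pSlow, calls)
}
```

Under these assumptions a request that touches one service is slow 1% of the time, a request that touches 10 services about 9.6% of the time, and a request that touches 100 services about 63% of the time.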
56. Investigating Longtail Latencies
● Narrow the problem
● Isolate in a test environment
● Measure & monitor everything
● Tackle the problem
● Pretty hard job
74. Adding Shutdown Hook
val nodeShutdownCoordinatorActor = system.actorOf(Props(
  new NodeGracefulShutdownCoordinator(...)))
// Runs when the JVM receives the shutdown signal
sys.addShutdownHook {
  nodeShutdownCoordinatorActor ! StartNodeShutdown(shardRegions)
}
77. Tell Local Regions to Shut Down
when(AwaitNodeShutdownInitiation) {
  case Event(StartNodeShutdown(shardRegions), _) =>
    if (shardRegions.nonEmpty) {
      // Starts watching every shard region and sends each one a GracefulShutdown message
      stopShardRegions(shardRegions)
      goto(AwaitShardRegionsShutdown) using ManagedRegions(shardRegions)
    } else {
      // Registers OnMemberRemoved and leaves the cluster
      leaveCluster()
      goto(AwaitClusterExit)
    }
}
81. Node Leaves the Cluster
when(AwaitShardRegionsShutdown, stateTimeout = ...) {
  case Event(Terminated(actor), ManagedRegions(regions)) =>
    if (regions.contains(actor)) {
      val remainingRegions = regions - actor
      if (remainingRegions.isEmpty) {
        leaveCluster()
        goto(AwaitClusterExit)
      } else {
        goto(AwaitShardRegionsShutdown) using ManagedRegions(remainingRegions)
      }
    } else {
      stay()
    }
  case Event(StateTimeout, _) =>
    leaveCluster()
    goto(AwaitNodeTerminationSignal)
}
84. Wait for Singletons to Migrate
when(AwaitClusterExit, stateTimeout = ...) {
  case Event(NodeLeftCluster | StateTimeout, _) =>
    // Waiting on cluster singleton migration
    goto(AwaitClusterSingletonMigration)
}
when(AwaitClusterSingletonMigration, stateTimeout = ...) {
  case Event(StateTimeout, _) =>
    goto(AwaitNodeTerminationSignal)
}
onTransition {
  case AwaitClusterSingletonMigration -> AwaitNodeTerminationSignal =>
    self ! TerminateNode
}
88. Actor System & JVM Termination
when(AwaitNodeTerminationSignal, stateTimeout = ...) {
  case Event(TerminateNode | StateTimeout, _) =>
    // This is NOT an Akka thread pool (since we're shutting those down)
    val ec = scala.concurrent.ExecutionContext.global
    // Calls context.system.terminate with a registered onComplete block
    terminateSystem {
      case Success(_) =>
        System.exit(...)
      case Failure(ex) =>
        System.exit(...)
    }(ec)
    stop(Shutdown)
}