A brief overview of the technologies involved and how we've used them at Fifty3North to deliver a real-time event based products using EventStore, Orleans and Redis
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
Flink Forward San Francisco 2022.
At Bloomberg, we deal with high volumes of real-time market data. Our clients expect to be notified of any anomalies in this market data, which may indicate volatile movements in the markets, notable trades, forthcoming events, or system failures. The parameters for these alerts are always evolving and our clients can update them dynamically. In this talk, we'll cover how we utilized the open source Apache Flink and Siddhi SQL projects to build a distributed, scalable, low-latency and dynamic rule-based, real-time alerting system to solve our clients' needs. We'll also cover the lessons we learned along our journey.
by
Ajay Vyasapeetam & Madhuri Jain
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
Flink Forward San Francisco 2022.
At Stripe we have created a complete end to end exactly-once processing pipeline to process financial data at scale, by combining the exactly-once power from Flink, Kafka, and Pinot together. The pipeline provides exactly-once guarantee, end-to-end latency within a minute, deduplication against hundreds of billions of keys, and sub-second query latency against the whole dataset with trillion level rows. In this session we will discuss the technical challenges of designing, optimizing, and operating the whole pipeline, including Flink, Kafka, and Pinot. We will also share our lessons learned and the benefits gained from exactly-once processing.
by
Xiang Zhang & Pratyush Sharma & Xiaoman Dong
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
Apache Flink is a distributed stream processing framework that allows users to process and analyze data in real-time. At LinkedIn, we developed a fully managed stream processing platform on Flink running on K8s to power hundreds of stream processing pipelines in production. This platform is the backbone for other infra systems like Search, Espresso (internal document store) and feature management etc. We provide a rich authoring and testing environment which allows users to create, test, and deploy their streaming jobs in a self-serve fashion within minutes. Users can focus on their business logic, leaving the Flink platform to take care of management aspects such as split deployment, resource provisioning, auto-scaling, job monitoring, alerting, failure recovery and much more. In this talk, we will introduce the overall platform architecture, highlight the unique value propositions that it brings to stream processing at LinkedIn and share the experiences and lessons we have learned.
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022HostedbyConfluent
If you were to ask any developer, ""what's a schema and where is it used?"" Most likely, you'd get an answer involving a relational database. The truth is the domain objects used in applications represent a contract, an implied schema, whether developers choose to acknowledge them or not. But even if you recognize the need for a formal schema, what's the best way to manage them?
This presentation will contain some theory and primarily practical application for schemas with Schema Registry. I'll briefly explain what a schema is and how it's very relevant to any application working with Kafka today. It will go into the practical, introducing Schema Registry, describing how it works and how developers can leverage it to provide schemas across an organization. The discussion will cover working with Schema Registry from the command line, how to leverage it with Kafka clients, and the supported serialization formats. Some established build tools that make life easier for the Kafka developer will also be covered.
Attendees will walk away with knowledge of Schema Registry and a solid understanding of how it works, how to integrate them into Kafka clients. They'll also learn enough about the supported serialization frameworks to start implementing schemas right away in their Kafka development efforts.
Real Time UI with Apache Kafka Streaming Analytics of Fast Data and Server PushLucas Jellema
Fast data arrives in real time and potentially high volume. Rapid processing, filtering and aggregation is required to ensure timely reaction and actual information in user interfaces. Doing so is a challenge, make this happen in a scalable and reliable fashion is even more interesting. This session introduces Apache Kafka as the scalable event bus that takes care of the events as they flow in and Kafka Streams for the streaming analytics. Both Java and Node applications are demonstrated that interact with Kafka and leverage Server Sent Events and WebSocket channels to update the Web UI in real time. User activity performed by the audience in the Web UI is processed by the Kafka powered back end and results in live updates on all clients. Kafka Streams and KSQL are used to analyze the real time events in real time and publish events with the live findings.
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...confluent
Microservices are seen as the way to simplify complex systems, until you need to coordinate a transaction across services, and in that instant, the dream ends. Transactions involving multiple services can lead to a spaghetti web of interactions. Protocols such as two-phase commit come with complexity and performance bottlenecks. The Saga pattern involves a simplified transactional model. In sagas, a sequence of actions are executed, and if any action fails, a compensating action is executed for each of the actions that have already succeeded. This is particularly well suited to long-running and cross-microservice transactions. In this talk we introduce the new Simple Sagas library (https://github.com/simplesourcing/simplesagas). Built using Kafka streams, it provides a scalable fault tolerance event-based transaction processing engine. We walk through a use case of coordinating a sequence of complex financial transactions. We demonstrate the easy to use DSL, show how the system copes with failure, and discuss this overall approach to building scalable transactional systems in an event-driven streaming context.
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
Flink Forward San Francisco 2022.
At Bloomberg, we deal with high volumes of real-time market data. Our clients expect to be notified of any anomalies in this market data, which may indicate volatile movements in the markets, notable trades, forthcoming events, or system failures. The parameters for these alerts are always evolving and our clients can update them dynamically. In this talk, we'll cover how we utilized the open source Apache Flink and Siddhi SQL projects to build a distributed, scalable, low-latency and dynamic rule-based, real-time alerting system to solve our clients' needs. We'll also cover the lessons we learned along our journey.
by
Ajay Vyasapeetam & Madhuri Jain
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
Flink Forward San Francisco 2022.
At Stripe we have created a complete end to end exactly-once processing pipeline to process financial data at scale, by combining the exactly-once power from Flink, Kafka, and Pinot together. The pipeline provides exactly-once guarantee, end-to-end latency within a minute, deduplication against hundreds of billions of keys, and sub-second query latency against the whole dataset with trillion level rows. In this session we will discuss the technical challenges of designing, optimizing, and operating the whole pipeline, including Flink, Kafka, and Pinot. We will also share our lessons learned and the benefits gained from exactly-once processing.
by
Xiang Zhang & Pratyush Sharma & Xiaoman Dong
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
Apache Flink is a distributed stream processing framework that allows users to process and analyze data in real-time. At LinkedIn, we developed a fully managed stream processing platform on Flink running on K8s to power hundreds of stream processing pipelines in production. This platform is the backbone for other infra systems like Search, Espresso (internal document store) and feature management etc. We provide a rich authoring and testing environment which allows users to create, test, and deploy their streaming jobs in a self-serve fashion within minutes. Users can focus on their business logic, leaving the Flink platform to take care of management aspects such as split deployment, resource provisioning, auto-scaling, job monitoring, alerting, failure recovery and much more. In this talk, we will introduce the overall platform architecture, highlight the unique value propositions that it brings to stream processing at LinkedIn and share the experiences and lessons we have learned.
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022HostedbyConfluent
If you were to ask any developer, ""what's a schema and where is it used?"" Most likely, you'd get an answer involving a relational database. The truth is the domain objects used in applications represent a contract, an implied schema, whether developers choose to acknowledge them or not. But even if you recognize the need for a formal schema, what's the best way to manage them?
This presentation will contain some theory and primarily practical application for schemas with Schema Registry. I'll briefly explain what a schema is and how it's very relevant to any application working with Kafka today. It will go into the practical, introducing Schema Registry, describing how it works and how developers can leverage it to provide schemas across an organization. The discussion will cover working with Schema Registry from the command line, how to leverage it with Kafka clients, and the supported serialization formats. Some established build tools that make life easier for the Kafka developer will also be covered.
Attendees will walk away with knowledge of Schema Registry and a solid understanding of how it works, how to integrate them into Kafka clients. They'll also learn enough about the supported serialization frameworks to start implementing schemas right away in their Kafka development efforts.
Real Time UI with Apache Kafka Streaming Analytics of Fast Data and Server PushLucas Jellema
Fast data arrives in real time and potentially high volume. Rapid processing, filtering and aggregation is required to ensure timely reaction and actual information in user interfaces. Doing so is a challenge, make this happen in a scalable and reliable fashion is even more interesting. This session introduces Apache Kafka as the scalable event bus that takes care of the events as they flow in and Kafka Streams for the streaming analytics. Both Java and Node applications are demonstrated that interact with Kafka and leverage Server Sent Events and WebSocket channels to update the Web UI in real time. User activity performed by the audience in the Web UI is processed by the Kafka powered back end and results in live updates on all clients. Kafka Streams and KSQL are used to analyze the real time events in real time and publish events with the live findings.
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...confluent
Microservices are seen as the way to simplify complex systems, until you need to coordinate a transaction across services, and in that instant, the dream ends. Transactions involving multiple services can lead to a spaghetti web of interactions. Protocols such as two-phase commit come with complexity and performance bottlenecks. The Saga pattern involves a simplified transactional model. In sagas, a sequence of actions are executed, and if any action fails, a compensating action is executed for each of the actions that have already succeeded. This is particularly well suited to long-running and cross-microservice transactions. In this talk we introduce the new Simple Sagas library (https://github.com/simplesourcing/simplesagas). Built using Kafka streams, it provides a scalable fault tolerance event-based transaction processing engine. We walk through a use case of coordinating a sequence of complex financial transactions. We demonstrate the easy to use DSL, show how the system copes with failure, and discuss this overall approach to building scalable transactional systems in an event-driven streaming context.
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
Apache Flink is the foundation for Decodable's real-time SaaS data platform. Flink runs critical data processing jobs with strong security requirements. In addition, Decodable has to scale to thousands of tenants, power various use cases, provide an intuitive user experience and maintain cost-efficiency. We've learned a lot of lessons while building and maintaining the platform. In this talk, I'll share the top 3 toughest challenges building and operating this platform with Flink, and how we solved them.
How did we move the mountain? - Migrating 1 trillion+ messages per day across...HostedbyConfluent
Have you ever migrated Kafka clusters from one data center to another being completely transparent to client applications?
At PayPal, as part of a massive datacenter migration initiative, Kafka team successfully moved all PayPal Kafka traffic across data centers. This initiative involved migrating 20+ Kafka clusters (1000+ broker and zookeeper nodes), as well as 60+ mirrormaker groups which seamlessly handle Kafka traffic volumes as high as 1 trillion messages per day. Throughout the course of this migration, applications required no modification, encountered 0% service outage, 0% message loss and duplicated messages. The whole migration process was fully transparent to Kafka applications.
In this session, you will learn the strategies, techniques and tools the PayPal Kafka team has utilized for managing the migration process. You will also learn the lessons and pitfalls they experienced during this exercise, as well as the secret sauce of making the migration successful.
A brief introduction to Apache Kafka and describe its usage as a platform for streaming data. It will introduce some of the newer components of Kafka that will help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library.
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
This talk shares experiences from deploying and tuning Flink steam processing applications for very large scale. We share lessons learned from users, contributors, and our own experiments about running demanding streaming jobs at scale. The talk will explain what aspects currently render a job as particularly demanding, show how to configure and tune a large scale Flink job, and outline what the Flink community is working on to make the out-of-the-box for experience as smooth as possible. We will, for example, dive into - analyzing and tuning checkpointing - selecting and configuring state backends - understanding common bottlenecks - understanding and configuring network parameters
The primary requirements for OpenStack based clouds (public, private or hybrid) is that they must be massively scalable and highly available. There are a number of interrelated concepts which make the understanding and implementation of HA complex. The potential for not implementing HA correctly would be disastrous.
This session was presented at the OpenStack Meetup in Boston Feb 2014. We discussed interrelated concepts as a basis for implementing HA and examples of HA for MySQL, Rabbit MQ and the OpenStack APIs primarily using Keepalived, VRRP and HAProxy which will reinforce the concepts and show how to connect the dots.
Anatomy of a Spring Boot App with Clean Architecture - Spring I/O 2023Steve Pember
In this presentation we will present the general philosophy of Clean Architecture, Hexagonal Architecture, and Ports & Adapters: discussing why these approaches are useful and general guidelines for introducing them to your code. Chiefly, we will show how to implement these patterns within your Spring (Boot) Applications. Through a publicly available reference app, we will demonstrate what these concepts can look like within Spring and walkthrough a handful of scenarios: isolating core business logic, ease of testing, and adding a new feature or two.
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
Apache Flink is a world class stateful stream processor presents a huge variety of optional features and configuration choices to the user. Determining out the optimal choice for any production environment and use-case be challenging. In this talk, we will explore and discuss the universe of Flink configuration with respect to state and state backends.
We will start with a closer look under the hood, at core data structures and algorithms, to build the foundation for understanding the impact of tuning parameters and the costs-benefit-tradeoffs that come with certain features and options. In particular, we will focus on state backend choices (Heap vs RocksDB), tuning checkpointing (incremental checkpoints, ...) and recovery (local recovery), serializers and Apache Flink's new state migration capabilities.
Practical learnings from running thousands of Flink jobsFlink Forward
Flink Forward San Francisco 2022.
Task Managers constantly running out of memory? Flink job keeps restarting from cryptic Akka exceptions? Flink job running but doesn’t seem to be processing any records? We share practical learnings from running thousands of Flink Jobs for different use-cases and take a look at common challenges they have experienced such as out-of-memory errors, timeouts and job stability. We will cover memory tuning, S3 and Akka configurations to address common pitfalls and the approaches that we take on automating health monitoring and management of Flink jobs at scale.
by
Hong Teoh & Usamah Jassat
Building Cloud-Native App Series - Part 2 of 11
Microservices Architecture Series
Event Sourcing & CQRS,
Kafka, Rabbit MQ
Case Studies (E-Commerce App, Movie Streaming, Ticket Booking, Restaurant, Hospital Management)
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...HostedbyConfluent
When choosing an event streaming platform, Kafka shouldn’t be the only technology you look at. There are a plethora of others in the messaging space today, including open source and proprietary software as well as a range of cloud services. So how do you know you are choosing the right one? A great way to deepen our understanding of event streaming and Kafka is exploring the trade-offs in distributed system design and learning about the choices made by the Kafka project. We’ll look at how Kafka stacks up against other technologies in the space, including traditional messaging systems like Apache ActiveMQ and RabbitMQ as well as more contemporary ones, such as BookKeeper derivatives like Apache Pulsar or Pravega. This talk focuses on the technical details such as difference in messaging models, how data is stored locally as well as across machines in a cluster, when (not) to add tiers to your system, and more. By the end of the talk, you should have a good high-level understanding of how these systems compare and which you should choose for different types of use cases.
Storing State Forever: Why It Can Be Good For Your AnalyticsYaroslav Tkachenko
State is an essential part of the modern streaming pipelines: it enables a variety of foundational capabilities like windowing, aggregation, enrichment, etc. But usually, the state is either transient, so we only keep it until the window is closed, or it's fairly small and doesn't grow much. But what if we treat the state differently? The keyed state in Flink can be scaled vertically and horizontally, it's reliable and fault-tolerant... so is scaling a stateful Flink application that different from scaling any data store like Kafka or MySQL?
At Shopify, we've worked on a massive analytical data pipeline that's needed to support complex streaming joins and correctly handle arbitrarily late-arriving data. We came up with an idea to never clear state and support joins this way. We've made a successful proof of concept, ingested all historical transactional Shopify data and ended up storing more than 10 TB of Flink state. In the end, it allowed us to achieve 100% data correctness.
Developing event-driven microservices with event sourcing and CQRS (phillyete)Chris Richardson
Modern, cloud-native applications typically use a microservices architecture in conjunction with NoSQL and/or sharded relational databases. However, in order to successfully use this approach you need to solve some distributed data management problems including how to maintain consistency between multiple databases without using 2PC. In this talk you will learn more about these issues and how to solve them by using an event-driven architecture. We will describe how event sourcing and Command Query Responsibility Separation (CQRS) are a great way to realize an event-driven architecture. You will learn about a simple yet powerful approach for building, modern, scalable applications.
Spring I_O 2024 - Flexible Spring with Event Sourcing.pptxSteve Pember
Event Sourcing is a modern but non-trivial data model for building scalable and powerful systems. Instead of mapping a single Entity to a single row in a datastore, in an Event Sourced system we persist all changes for an Entity in an append-only journal. This design provides a wealth of benefits: a built-in Audit Trail, Time-Based reporting, powerful Error Recovery, and more. It creates flexible, scalable systems and can easily evolve to meet changing organizational demands. That is, once you have some experience with it. Event Sourcing is straightforward in concept, but it does bring additional complexity and a learning curve that can be intimidating. People coming from traditional ORM systems often wonder: how does one model relations between Entities? How is Optimistic Locking handled? What about datastore constraints?
Based on over eight years of experience with building ES systems in Spring applications, we will demonstrate the basics of Event Sourcing and some of the common patterns. We will see how simple it can be to model events with available tools like Spring Data JPA, JOOQ, and the integration between Spring and Axon. We’ll walk through sample code in an application that demonstrates many of these techniques. However, it’s also not strictly about the code; we’ll see how a process called Event Modeling can be a powerful design tool to align Subject Matter Experts, Product, and Engineering. Attendees will leave with an understanding of the basic Event Sourcing patterns, and hopefully a desire to start creating their own Journals.
Stateful stream processing with Apache FlinkKnoldus Inc.
Nowadays, many stream processing applications have sophisticated business logic, strict correctness guarantees, high performance, low latency, fault-tolerant, and maintain terabytes of state. There are many stream processing frameworks available in the market which helps businesses to write robust stateful stream processing applications.
In this session, we will talk about Apache Flink, a distributed stream processor with intuitive and expressive APIs to implement stateful stream processing applications. It can efficiently run such applications at a large scale in a fault-tolerant manner. In this session, we will see what is stateful stream processing in detail, and how Flink takes on stateful stream processing. We'll get to know how checkpointing mechanism works in Flink.
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
Apache Flink is the foundation for Decodable's real-time SaaS data platform. Flink runs critical data processing jobs with strong security requirements. In addition, Decodable has to scale to thousands of tenants, power various use cases, provide an intuitive user experience and maintain cost-efficiency. We've learned a lot of lessons while building and maintaining the platform. In this talk, I'll share the top 3 toughest challenges building and operating this platform with Flink, and how we solved them.
How did we move the mountain? - Migrating 1 trillion+ messages per day across...HostedbyConfluent
Have you ever migrated Kafka clusters from one data center to another being completely transparent to client applications?
At PayPal, as part of a massive datacenter migration initiative, Kafka team successfully moved all PayPal Kafka traffic across data centers. This initiative involved migrating 20+ Kafka clusters (1000+ broker and zookeeper nodes), as well as 60+ mirrormaker groups which seamlessly handle Kafka traffic volumes as high as 1 trillion messages per day. Throughout the course of this migration, applications required no modification, encountered 0% service outage, 0% message loss and duplicated messages. The whole migration process was fully transparent to Kafka applications.
In this session, you will learn the strategies, techniques and tools the PayPal Kafka team has utilized for managing the migration process. You will also learn the lessons and pitfalls they experienced during this exercise, as well as the secret sauce of making the migration successful.
A brief introduction to Apache Kafka and describe its usage as a platform for streaming data. It will introduce some of the newer components of Kafka that will help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library.
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
This talk shares experiences from deploying and tuning Flink steam processing applications for very large scale. We share lessons learned from users, contributors, and our own experiments about running demanding streaming jobs at scale. The talk will explain what aspects currently render a job as particularly demanding, show how to configure and tune a large scale Flink job, and outline what the Flink community is working on to make the out-of-the-box for experience as smooth as possible. We will, for example, dive into - analyzing and tuning checkpointing - selecting and configuring state backends - understanding common bottlenecks - understanding and configuring network parameters
The primary requirements for OpenStack based clouds (public, private or hybrid) is that they must be massively scalable and highly available. There are a number of interrelated concepts which make the understanding and implementation of HA complex. The potential for not implementing HA correctly would be disastrous.
This session was presented at the OpenStack Meetup in Boston Feb 2014. We discussed interrelated concepts as a basis for implementing HA and examples of HA for MySQL, Rabbit MQ and the OpenStack APIs primarily using Keepalived, VRRP and HAProxy which will reinforce the concepts and show how to connect the dots.
Anatomy of a Spring Boot App with Clean Architecture - Spring I/O 2023Steve Pember
In this presentation we will present the general philosophy of Clean Architecture, Hexagonal Architecture, and Ports & Adapters: discussing why these approaches are useful and general guidelines for introducing them to your code. Chiefly, we will show how to implement these patterns within your Spring (Boot) Applications. Through a publicly available reference app, we will demonstrate what these concepts can look like within Spring and walkthrough a handful of scenarios: isolating core business logic, ease of testing, and adding a new feature or two.
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
Apache Flink is a world class stateful stream processor presents a huge variety of optional features and configuration choices to the user. Determining out the optimal choice for any production environment and use-case be challenging. In this talk, we will explore and discuss the universe of Flink configuration with respect to state and state backends.
We will start with a closer look under the hood, at core data structures and algorithms, to build the foundation for understanding the impact of tuning parameters and the costs-benefit-tradeoffs that come with certain features and options. In particular, we will focus on state backend choices (Heap vs RocksDB), tuning checkpointing (incremental checkpoints, ...) and recovery (local recovery), serializers and Apache Flink's new state migration capabilities.
Practical learnings from running thousands of Flink jobsFlink Forward
Flink Forward San Francisco 2022.
Task Managers constantly running out of memory? Flink job keeps restarting from cryptic Akka exceptions? Flink job running but doesn’t seem to be processing any records? We share practical learnings from running thousands of Flink Jobs for different use-cases and take a look at common challenges they have experienced such as out-of-memory errors, timeouts and job stability. We will cover memory tuning, S3 and Akka configurations to address common pitfalls and the approaches that we take on automating health monitoring and management of Flink jobs at scale.
by
Hong Teoh & Usamah Jassat
Building Cloud-Native App Series - Part 2 of 11
Microservices Architecture Series
Event Sourcing & CQRS,
Kafka, Rabbit MQ
Case Studies (E-Commerce App, Movie Streaming, Ticket Booking, Restaurant, Hospital Management)
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...HostedbyConfluent
When choosing an event streaming platform, Kafka shouldn’t be the only technology you look at. There are a plethora of others in the messaging space today, including open source and proprietary software as well as a range of cloud services. So how do you know you are choosing the right one? A great way to deepen our understanding of event streaming and Kafka is exploring the trade-offs in distributed system design and learning about the choices made by the Kafka project. We’ll look at how Kafka stacks up against other technologies in the space, including traditional messaging systems like Apache ActiveMQ and RabbitMQ as well as more contemporary ones, such as BookKeeper derivatives like Apache Pulsar or Pravega. This talk focuses on the technical details such as difference in messaging models, how data is stored locally as well as across machines in a cluster, when (not) to add tiers to your system, and more. By the end of the talk, you should have a good high-level understanding of how these systems compare and which you should choose for different types of use cases.
Storing State Forever: Why It Can Be Good For Your AnalyticsYaroslav Tkachenko
State is an essential part of the modern streaming pipelines: it enables a variety of foundational capabilities like windowing, aggregation, enrichment, etc. But usually, the state is either transient, so we only keep it until the window is closed, or it's fairly small and doesn't grow much. But what if we treat the state differently? The keyed state in Flink can be scaled vertically and horizontally, it's reliable and fault-tolerant... so is scaling a stateful Flink application that different from scaling any data store like Kafka or MySQL?
At Shopify, we've worked on a massive analytical data pipeline that's needed to support complex streaming joins and correctly handle arbitrarily late-arriving data. We came up with an idea to never clear state and support joins this way. We've made a successful proof of concept, ingested all historical transactional Shopify data and ended up storing more than 10 TB of Flink state. In the end, it allowed us to achieve 100% data correctness.
Developing event-driven microservices with event sourcing and CQRS (phillyete)Chris Richardson
Modern, cloud-native applications typically use a microservices architecture in conjunction with NoSQL and/or sharded relational databases. However, in order to successfully use this approach you need to solve some distributed data management problems including how to maintain consistency between multiple databases without using 2PC. In this talk you will learn more about these issues and how to solve them by using an event-driven architecture. We will describe how event sourcing and Command Query Responsibility Separation (CQRS) are a great way to realize an event-driven architecture. You will learn about a simple yet powerful approach for building, modern, scalable applications.
Spring I_O 2024 - Flexible Spring with Event Sourcing.pptxSteve Pember
Event Sourcing is a modern but non-trivial data model for building scalable and powerful systems. Instead of mapping a single Entity to a single row in a datastore, in an Event Sourced system we persist all changes for an Entity in an append-only journal. This design provides a wealth of benefits: a built-in Audit Trail, Time-Based reporting, powerful Error Recovery, and more. It creates flexible, scalable systems and can easily evolve to meet changing organizational demands. That is, once you have some experience with it. Event Sourcing is straightforward in concept, but it does bring additional complexity and a learning curve that can be intimidating. People coming from traditional ORM systems often wonder: how does one model relations between Entities? How is Optimistic Locking handled? What about datastore constraints?
Based on over eight years of experience with building ES systems in Spring applications, we will demonstrate the basics of Event Sourcing and some of the common patterns. We will see how simple it can be to model events with available tools like Spring Data JPA, JOOQ, and the integration between Spring and Axon. We’ll walk through sample code in an application that demonstrates many of these techniques. However, it’s also not strictly about the code; we’ll see how a process called Event Modeling can be a powerful design tool to align Subject Matter Experts, Product, and Engineering. Attendees will leave with an understanding of the basic Event Sourcing patterns, and hopefully a desire to start creating their own Journals.
Stateful stream processing with Apache FlinkKnoldus Inc.
Nowadays, many stream processing applications have sophisticated business logic, strict correctness guarantees, high performance, low latency, fault-tolerant, and maintain terabytes of state. There are many stream processing frameworks available in the market which helps businesses to write robust stateful stream processing applications.
In this session, we will talk about Apache Flink, a distributed stream processor with intuitive and expressive APIs to implement stateful stream processing applications. It can efficiently run such applications at a large scale in a fault-tolerant manner. In this session, we will see what is stateful stream processing in detail, and how Flink takes on stateful stream processing. We'll get to know how checkpointing mechanism works in Flink.
We designed a new framework, made for Microservices. Making it easier for developers to build microservices-based systems – systems that communicate asynchronously, self-heal, scale elastically and remain responsive no matter what bad stuff is happening.
And all this without the pain of selecting and mixing components, from a plethora of libraries that were originally built for other things.
In this presentation, we reveal this new way for Java developers to not only understand and begin building microservices, but also to seamlessly push them into staging and production
Flink Forward Berlin 2018: Lasse Nedergaard - "Our successful journey with Fl...Flink Forward
At Trackunit we have based our telematic IoT processing pipeline on Flink. We started out on version 1.2 and are now on 1.5. In this session I will share the lessons learned going from one giant Flink job to many smalls and some of the problems we have seen operating Flink on AWS EMR cluster, including topics such as:
• Why external enrichment can be challenging with Flink Async operator.
• Pattern to change external enrichment into streaming join.
• Building your own source
• Why Flink restart is great but should be avoided as it will terminate your cluster.
• Why iteration can cause deadlocking when backpressure occurs.
• Kinesis rate exceeded exception
• Why throttling Flink source read during catchup is needed.
• Why we moved from EMR/Kinesis and into DC/OS and kafka.
• And much more.
Softwerkskammer Lübeck 08/2018 Event Sourcing and CQRSDaniel Bimschas
Introductory-level talk about event sourcing and command query responsibility segregation (CQRS), held at the "Softwerkskammer Lübeck" Meetup (https://www.meetup.com/Softwerkskammer-Luebeck/events/gjsxslyxlbdb/) in August 2018. Code examples that were shown live can be found at https://github.com/danbim/ledger-example and https://github.com/danbim/pwa-scoring.
Complex event processing platform handling millions of users - Krzysztof Zarz...GetInData
If you want to learn more about it, check out our webinar here: https://www.youtube.com/watch?v=EfGPY_NyYQ8&t=77s
The webinar was organized by GetinData on 2020. During the webinar, we shared our lessons learnt from building and running stream processing platform in production for over 2 years.
Watch more here: https://www.youtube.com/watch?v=EfGPY_NyYQ8
Author: Krzysztof Zarzycki
Linkedin: https://www.linkedin.com/in/kzarzycki/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Building a system for machine and event-oriented data - Velocity, Santa Clara...Eric Sammer
This talk was presented at O'Reilly's Velocity conference in Santa Clara, May 28 2015.
Abstract: http://velocityconf.com/devops-web-performance-2015/public/schedule/detail/42284
Taking the friction out of microservice frameworks with LagomMarkus Eisele
Lagom is a new framework for Java designed with microservices in mind. It aims to simplify the process of building microservice-based systems that communicate asynchronously, self-heal, scale elastically and remain responsive under load and under failure.
Many of the challenges of microservices are caused by the fact we use tools designed without them in mind. So, how can a framework made to build systems composed of microservices from the start offer us a better solution? Because Lagom is a tool that is highly opinionated and explicitly designed to make development and production with microservices easy, it brings back all the fun and productivity into programming while still enabling you to build a reactive, distributed, highly scalable and rock solid application.
By the end of this presentation, you'll have experienced first hand how creating systems of microservices on the JVM using Lagom is dead-simple, intuitive, frictionless and a lot of fun! And we’ll ask whether reactive microservices are potentially so much better than, for example, Java EE?
DevoxxUK https://cfp.devoxx.co.uk/2016/talk/UZA-8885/Taking_the_friction_out_of_microservice_frameworks_with_Lagom
Streaming of data has become the need of the hour. But do we really know how streaming exactly works? What are its benefits? Where and how to stream data in your big data architecture correctly? How to process the streamed data efficiently? What challenges do we face when we move from batch processing to stream processing? What is Stateful stream processing and what is stateless stream processing? Which one to opt and when? Let us address all these queries!
Stands for Command Query Responsibility Segregation and Event Sourcing.
● When combined together they allow us to build more resilient and elastic
hence reactive applications.
● Used for building auditable, scalable, or extremely resilient applications.
● Gives our application a competitive advantage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
18 Months of Event Sourcing and CQRS Using Microsoft Orleans
1.
2. ▪ Solution developed for Capture Education
▪ Software for nurseries to collect evidence of
learning
▪ Existing web based version built by another
supplier
3. ▪ Offline support
▪ iOS and Android
▪ Phone and tablet
▪ Companion parent app
▪ Scales to minimum 100k concurrent users
5. Command and Query
Responsibility Segregation.
Separating the reads from
the writes makes sense
because most applications
are either read heavy or
write heavy. By separating
these you can scale them
independently.
Consists of many actors made
up of their own behaviour and
optionally own private state.
Communication with actors
and between actors is message
based. By default, an actor can
process a single message at a
time and any further messages
passed to an actor will be
placed in a mailbox for future
processing. Traditionally uses
many nodes in a cluster to host
actors to scale computation.
Storing events which
triggered state changes
within the system as a
stream of facts. These facts
can be used to rehydrate
state at any time.
Greg Young’s Event Store
https://geteventstore.org is
one implementation (we
use this)
6. ▪ Virtual Actors - distributed computing with a low barrier to entry.
▪ All calls must be async using standard async await and use the standard Orleans Task Scheduler.
▪ Avoid .Wait() and .Result at all costs.
▪ All method parameters must be serializable.
▪ No need to worry about whether the actor (grain) is located
on a local or remote node (silo).
▪ Update multiple data stores concurrently
▪ Writes are instant, read models are eventually consistent.
▪ Tell (Perform an action, change state etc.)
▪ Ask (Give me information)
▪ See http://bit.do/orleankka
7. ▪ A command handler in the actor decides if the
Command can be processed.
▪ If this validation is successful, an Event is raised.
▪ This event is the source of truth for the entire
system.
▪ It is an undisputed action that happened at a point
in time. This is the most important piece of
information. All other information can be derived
from these events.
▪ Because of this, the first thing we do is write it to
the event store.
▪ We then use the event to mutate the actor’s state.
This is an in-memory update to state so is very fast.
8.
9.
10.
11. ▪ Read the list of events from EventStore for this actor
▪ Call event handler on the actor for each event in the stream
▪ Very fast – in memory updates only
▪ Snapshots enabled every 100 events
12. ▪ Forwards on events to other actors
▪ Used to create transaction as a result of a command
▪ Implicit subscriptions to actors’ Orleans streams
▪ Re-entrant & Worker attributes