In recent times, reactive programming has gained a lot of popularity. It is not a “silver bullet”, nor is it a solution for every problem. It is, however, a paradigm for building applications that are non-blocking, event-driven and asynchronous, and that need only a small number of threads to scale. Spring Framework 5 embraces Reactive Streams and Reactor for its own reactive use, as well as in many of its core APIs. It also lets you code declaratively rather than imperatively, which can result in more responsive and resilient applications. On top of that, you get a rich toolbox of functions to create, combine and filter any data stream. It becomes easy to support input and output streaming scenarios for microservices, scatter/gather, data ingestion, and so on. This presentation is about the support and building blocks for reactive programming that come with the latest versions of Spring Framework 5 and Spring Boot 2.
The document provides an overview of reactive programming and Spring WebFlux. It defines reactive programming as an asynchronous paradigm concerned with data streams and change propagation. It discusses why reactive programming is useful for handling back-pressure, communicating change, and improving scalability and performance. It also summarizes key concepts in reactive programming like Project Reactor's Mono and Flux types, and how Spring WebFlux allows developing reactive applications with annotated controllers or functional routing.
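As a rough illustration of the back-pressure idea mentioned above, the following Python sketch shows a consumer that pulls items by signalling demand, so the producer never emits more than was requested. The `Subscription` class and its `request` method are hypothetical stand-ins loosely modelled on the Reactive Streams idea of requesting `n` items; this is a synchronous toy, not the Reactor or Spring WebFlux API, which are asynchronous.

```python
from typing import Callable, Iterable, Iterator, List


class Subscription:
    """Hypothetical, synchronous stand-in for a demand-driven subscription."""

    def __init__(self, source: Iterable[int], on_next: Callable[[int], None]) -> None:
        self._items: Iterator[int] = iter(source)
        self._on_next = on_next
        self.emitted = 0  # total number of items delivered so far

    def request(self, n: int) -> int:
        """Deliver at most n items and return how many were delivered."""
        delivered = 0
        for _ in range(n):
            try:
                item = next(self._items)
            except StopIteration:
                break  # the source is exhausted
            self._on_next(item)
            delivered += 1
        self.emitted += delivered
        return delivered


received: List[int] = []
subscription = Subscription(range(10), received.append)

subscription.request(3)        # the consumer signals it can handle three items
first_batch = list(received)
subscription.request(2)        # later it asks for two more
```

The producer holds ten items but only five are ever delivered, because the consumer asked for five in total. That is the essence of back-pressure: the slower side sets the pace.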
Servlet vs Reactive Stacks in 5 Use Cases - VMware Tanzu
ROSSEN STOYANCHEV SPRING FRAMEWORK DEVELOPER
In the past year Netflix shared a story about upgrading their main gateway serving 83 million users from Servlet-stack Zuul 1 to an async and non-blocking Netty-based Zuul 2. The results were interesting and nuanced with some major benefits as well as some trade-offs. Can mere mortal web applications make this journey and, moreover, should they? The best way to explore the answer is through a specific use case. In this talk we'll take 5 common use cases in web application development and explore the impact of building on Servlet and Reactive web application stacks. For reactive programming we'll use RxJava and Reactor. For the web stack we'll pit Spring MVC vs Spring WebFlux (new in Spring Framework 5.0), allowing us to move easily between the Servlet and Reactive worlds and draw a meaningful, apples-to-apples comparison. Spring knowledge is not required and not assumed for this session.
This document provides an overview of Apache Flink, an open-source platform for distributed stream and batch data processing. Flink allows for unified batch and stream processing with a simple yet powerful programming model. It features native stream processing, exactly-once fault tolerance based on consistent snapshots, and high performance optimized for streaming workloads. The document outlines Flink's APIs, state management, fault tolerance approach, and roadmap for continued improvements in 2015.
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F... - Flink Forward
This document summarizes recent improvements to Flink SQL and Table API by Blink, Alibaba's distribution of Flink. Key improvements include support for stream-stream joins, user-defined functions, table functions and aggregate functions, retractable streams, and over/group aggregates. Blink aims to make Flink work well at large scale for Alibaba's search and recommendation systems. Many of the improvements will be included in upcoming Flink releases.
This document provides an overview of reactive programming in Java and Spring 5. It discusses reactive programming concepts like reactive streams specification, Reactor library, and operators. It also covers how to build reactive applications with Spring WebFlux, including creating reactive controllers, routing with functional endpoints, using WebClient for HTTP requests, and testing with WebTestClient.
The document discusses new features in Apache Flink 1.2, including queryable state and dynamic scaling. It provides an overview of Flink 1.2 features like security enhancements, metrics, and improvements to table API and SQL. It then examines queryable state and dynamic scaling in more detail, covering motivations and implementations for making state queryable and allowing jobs to scale resources dynamically in response to changing workloads. The document concludes by looking briefly beyond Flink 1.2 to future work on automatic scaling without restarts.
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15 - Vasia Kalavri
Apache Flink is a general-purpose platform for batch and streaming distributed data processing. This talk describes how Flink’s powerful APIs, iterative operators and other unique features make it a competitive alternative for large-scale graph processing as well. We take a close look at how one can elegantly express graph analysis tasks, using common Flink operators and how different graph processing models, like vertex-centric, can be easily mapped to Flink dataflows. Next, we get a sneak preview into Flink's upcoming Graph API, Gelly, which further simplifies graph application development in Flink. Finally, we show how to perform end-to-end data analysis, mixing common Flink operators and Gelly, without having to build complex pipelines and combine different systems. We go through a step-by-step example, demonstrating how to perform loading, transformation, filtering, graph creation and analysis, with a single Flink program.
Unify Enterprise Data Processing System Platform Level Integration of Flink a... - Flink Forward
In this talk, I will present how Flink enables enterprise customers to unify their data processing systems by using Flink to query Hive data.
Unification of streaming and batch is a main theme for Flink. Since 1.9.0, we have integrated Flink with Hive at the platform level. I will talk about:
- what features we have released so far, and what they enable our customers to do
- best practices for using Flink with Hive
- what is the latest development status of Flink-Hive integration at the time of Flink Forward Berlin (Oct 2019), and what to look for in the next release (probably 1.11)
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p... - Flink Forward
Many stream processing applications can benefit from, or need to rely on, predictions made with machine learning (ML) methods. In this presentation, new features of Apache SAMOA are presented with a real data processing scenario. These features make Apache SAMOA fully accessible for Apache Flink users: (1) the data stream processed within Apache Flink is forwarded to the Apache SAMOA stream mining engine to perform predictions with stream-oriented ML models, and (2) ML models evolve after every labelled instance while, at the same time, new predictions are sent back to Apache Flink. In both cases, Apache Kafka is used for data exchange. Hence, Apache SAMOA is used as a stream mining engine that receives input data from Apache Flink and sends predictions back to it. During the presentation, real-life aspects are illustrated with code examples, such as input and prediction stream integration and monitoring the latency of data processing and stream mining.
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea... - Flink Forward
Apache Beam is Flink’s sibling in the Apache family of streaming processing frameworks. The Beam and Flink teams work closely together on advancing what is possible in streaming processing, including Streaming SQL extensions and code interoperability on both platforms.
Beam was originally developed at Google as the amalgamation of its internal batch and streaming frameworks to power the exabyte-scale data processing for Gmail, YouTube and Ads. It now powers the fully managed, serverless Google Cloud Dataflow service, and it can also run in other public clouds and on-premises when deployed in portability mode on Apache Flink, Spark, Samza and other runners. Users regularly run distributed data processing jobs on Beam spanning tens of thousands of CPU cores and processing millions of events per second.
In this session, Sergei Sokolenko, Cloud Dataflow product manager, and Reuven Lax, the founding member of the Dataflow and Beam team, will share Google’s learnings from building and operating a global streaming processing infrastructure shared by thousands of customers, including:
- safe deployment to dozens of geographic locations,
- resource autoscaling to minimize processing costs,
- separating compute and state storage for better scaling behavior,
- dynamic rebalancing of work items away from overutilized worker nodes,
- offering a throughput-optimized batch processing capability with the same API as streaming,
- grouping and joining of 100s of terabytes in a hybrid in-memory/on-disk file system,
- integrating with the Google Cloud security ecosystem, and other lessons.
Customers benefit from these advances through faster execution of jobs, resource savings, and a fully managed data processing environment that runs in the Cloud and removes the need to manage infrastructure.
Apache Beam is a unified programming model for batch and streaming data processing. It defines concepts for describing what computations to perform (the transformations), where the data is located in time (windowing), when to emit results (triggering), and how to accumulate results over time (accumulation mode). Beam aims to provide portable pipelines across multiple execution engines, including Apache Flink, Apache Spark, and Google Cloud Dataflow. The talk will cover the key concepts of the Beam model and how it provides unified, efficient, and portable data processing pipelines.
Till Rohrmann – Fault Tolerance and Job Recovery in Apache Flink - Flink Forward
Flink provides fault tolerance guarantees through checkpointing and recovery mechanisms. Checkpoints take consistent snapshots of distributed state and data, while barriers mark checkpoints in the data flow. This allows Flink to recover jobs from failures and resume processing from the last completed checkpoint. Flink also implements high availability by persisting metadata like the execution graph and checkpoints to Apache ZooKeeper, enabling a standby JobManager to take over if the active one fails.
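The recovery idea in this abstract can be sketched in a few lines of Python. This is a single-process toy under stated assumptions, not Flink's implementation: the function `run`, the list `store`, and the `fail_at` parameter are all hypothetical, the input is assumed to be replayable from an offset, and the barrier mechanism that aligns snapshots across distributed operators is not modelled.

```python
from typing import Dict, List, Optional, Tuple

# A checkpoint pairs the next input offset with a copy of the operator state.
Checkpoint = Tuple[int, Dict[str, int]]


def run(events: List[str], store: List[Checkpoint],
        checkpoint_every: int = 2, fail_at: Optional[int] = None) -> Dict[str, int]:
    """Count events per key, resuming from the last checkpoint in `store`."""
    offset, snapshot = store[-1] if store else (0, {})
    state = dict(snapshot)  # never mutate the stored snapshot
    for i in range(offset, len(events)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated failure")
        state[events[i]] = state.get(events[i], 0) + 1
        if (i + 1) % checkpoint_every == 0:
            store.append((i + 1, dict(state)))  # offset and state saved together
    return state


events = ["a", "b", "a", "c", "b", "a"]
store: List[Checkpoint] = []

try:
    run(events, store, fail_at=5)      # crashes before processing the last event
except RuntimeError:
    pass

checkpoint_used = store[-1]            # last completed checkpoint
recovered = run(events, store)         # restart: restore state, replay from offset
expected = run(events, [])             # an uninterrupted run for comparison
```

Because the offset and the state are saved together, the event at index 4 that was processed after the last checkpoint but before the crash is replayed exactly once into the restored state, and the recovered result matches the uninterrupted run.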
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink... - Flink Forward
http://flink-forward.org/kb_sessions/flink-in-zalandos-world-of-microservices/
In this talk we present Zalando’s microservices architecture, introduce Saiki – our next generation data integration and distribution platform on AWS and show how we employ stream processing with Apache Flink for near-real time business intelligence.
Zalando is one of the largest online fashion retailers in Europe. In order to secure our future growth and remain competitive in this dynamic market, we are transitioning from a monolithic to a microservices architecture and from a hierarchical to an agile organization.
We first have a look at how business intelligence processes have been working inside Zalando for the last years and present our current approach – Saiki. It is a scalable, cloud-based data integration and distribution infrastructure that makes data from our many microservices readily available for analytical teams.
We no longer live in a world of static data sets, but are instead confronted with endless streams of events that constantly inform us about relevant happenings from all over the enterprise. The processing of these event streams enables us to do near-real-time business intelligence. In this context we have evaluated Apache Flink vs. Apache Spark in order to choose the right stream processing framework. Given our requirements, we decided to use Flink as part of our technology stack, alongside Kafka and Elasticsearch.
With these technologies we are currently working on two use cases: a near real-time business process monitoring solution and streaming ETL.
Monitoring our business processes enables us to check whether the Zalando platform is working technically. It also helps us analyze data streams on the fly (e.g. order velocities and delivery velocities) and monitor service level agreements.
Streaming ETL, on the other hand, is used to free up resources in our relational data warehouse, which struggles with increasingly high loads. It also reduces latency and makes the platform easier to scale.
Finally, we have an outlook on our future use cases, e.g. near-real time sales and price monitoring. Another aspect to be addressed is to lower the entry barrier of stream processing for our colleagues coming from a relational database background.
RxJava allows building reactive applications by treating everything as a stream of messages. Observables represent message producers and observers consume messages. Observables provide asynchronous and parallel execution via operators like subscribeOn and observeOn. This makes applications resilient to failure, scalable under varying workloads, and responsive to clients. RxJava also promotes message-driven architectures, functional programming, and handling errors as regular messages to improve these characteristics. Developers must also unsubscribe to prevent leaking resources and ensure observables only run when needed.
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry - confluent
Apache Beam (unified Batch and strEAM processing!) is a new Apache incubator project. Originally based on years of experience developing Big Data infrastructure within Google (such as MapReduce, FlumeJava, and MillWheel), it has now been donated to the OSS community at large.
Come learn about the fundamentals of out-of-order stream processing, and how Beam’s powerful tools for reasoning about time greatly simplify this complex task. Beam provides a model that allows developers to focus on the four important questions that must be answered by any stream processing pipeline:
- What results are being calculated?
- Where in event time are they calculated?
- When in processing time are they materialized?
- How do refinements of results relate?
Furthermore, by cleanly separating these questions from runtime characteristics, Beam programs become portable across multiple runtime environments, both proprietary (e.g., Google Cloud Dataflow) and open-source (e.g., Flink, Spark, and others).
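To make the first two questions concrete, here is a small Python sketch (not Beam code) that answers "what" with a sum and "where" with fixed windows in event time. The function name `sum_per_window`, the `(event_time, value)` tuple layout, and the window size are assumptions made for illustration; triggers ("when") and accumulation modes ("how") are deliberately left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sum_per_window(events: Iterable[Tuple[int, int]], size: int) -> Dict[int, int]:
    """Sum values per fixed event-time window, keyed by window start time."""
    totals: Dict[int, int] = defaultdict(int)
    for event_time, value in events:
        window_start = (event_time // size) * size  # "where" in event time
        totals[window_start] += value               # "what" is computed
    return dict(totals)


# Events arrive out of order; each tuple is (event_time, value).
arrivals = [(12, 5), (3, 1), (27, 4), (8, 2), (14, 3)]
result = sum_per_window(arrivals, size=10)
in_order = sum_per_window(sorted(arrivals), size=10)
```

Each element is assigned to a window by its own timestamp rather than by when it arrived, so the out-of-order input produces the same per-window sums as the sorted input.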
Dennis Wittekind, Confluent, Senior Customer Success Engineer
Perhaps you have heard of Kafka Connect and think it would be a great fit in your application's architecture, but you like to know how things work before you propose them to your team? Perhaps you know enough Connect to be dangerous, but you haven't had the time to really understand all the moving pieces? This meetup talk is for you! We'll briefly introduce Connect to the uninitiated, and then jump in to underlying concepts and considerations you should make when running Connect in production! We'll even run a live demo! What could go wrong!?
https://www.meetup.com/Saint-Louis-Kafka-meetup-group/events/272687113/
Event sourcing - what could possibly go wrong? Devoxx PL 2021 - Andrzej Ludwikowski
Yet another presentation about Event Sourcing? Yes and no. Event Sourcing is a really great concept. Some could say it’s a Holy Grail of the software architecture. I might agree with that, while remembering that everything comes with a price. This session is a summary of my experience with ES gathered while working on 3 different commercial products. Instead of theoretical aspects, I will focus on possible challenges with ES implementation. What could explode (very often with delayed ignition)? How and where to store events effectively? What are possible schema evolution solutions? How to achieve the highest level of scalability and live with eventual consistency? And many other interesting topics that you might face when experimenting with ES.
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen - confluent
Flink and Kafka are popular components to build an open source stream processing infrastructure. We present how Flink integrates with Kafka to provide a platform with a unique feature set that matches the challenging requirements of advanced stream processing applications. In particular, we will dive into the following points:
- Flink’s support for event-time processing, how it handles out-of-order streams, and how it can perform analytics on historical and real-time streams served from Kafka’s persistent log using the same code. We present Flink’s windowing mechanism that supports time-, count- and session-based windows, and intermixing event and processing time semantics in one program.
- How Flink’s checkpointing mechanism integrates with Kafka for fault tolerance, for consistent stateful applications with exactly-once semantics.
- We will discuss “Savepoints”, which allow users to save the state of the streaming program at any point in time. Together with a durable event log like Kafka, savepoints allow users to pause/resume streaming programs, go back to prior states, or switch to different versions of the program, while preserving exactly-once semantics.
- We explain the techniques behind the combination of low-latency and high-throughput streaming, and how the latency/throughput trade-off can be configured.
- We will give an outlook on current developments for streaming analytics, such as streaming SQL and complex event processing.
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita... - confluent
This document provides an overview of a company's first Kafka Streams project to build a streaming data pipeline. Some key lessons learned include adopting a data-first mindset where the data defines the application behavior and architecture. All business logic is modeled as data transformations. Testing was done using TopologyTestDriver for unit tests and emulators for external systems. Kafka Streams was determined to be a good fit as it provided an ordered, fault-tolerant processing pipeline with exactly-once guarantees. Future work includes open sourcing components and improving the declarative side effect handling in the KStreams DSL.
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel... - Dan Halperin
Apache Beam is a unified programming model for efficient and portable data processing pipelines. It provides abstractions like PCollections, sources/readers, ParDo, GroupByKey, side inputs, and windowing that hide complexity and allow runners to optimize efficiency. Beam supports both batch and streaming workloads on different distributed processing backends. It gives runners control over bundle size, splitting, and triggering to make tradeoffs between latency, throughput, and efficiency based on workload and cluster resources. This allows the same pipeline to be executed efficiently in different contexts without changes to the user code.
In this talk, we describe the design and implementation of the Python Streaming API support that has been submitted for inclusion in mainline Flink. Python is one of the most popular programming languages for data analysis. Its readability favors development productivity and, as a scripting language, it requires neither compilation nor a complex development environment setup. Flink already supports Python APIs for batch programming, but unfortunately the mechanism used to support batch programs (i.e., the DataSet APIs) does not work for the Streaming API. We describe the limitations of the batch implementation and provide insights into how we solved this using Jython. We will walk through some example programs using the new Python API and compare programmability and performance with the Java and Scala streaming APIs.
Spring 5 Webflux - Advances in Java 2018 - Trayan Iliev
The document discusses a presentation on functional reactive services with Spring 5 WebFlux. It introduces functional reactive programming (FRP), Project Reactor, building REST services with Spring 5 WebFlux including routers, handlers, filters, and reactive repositories. It also covers end-to-end non-blocking reactive service-oriented architecture with Netty, reactive WebClients, and real-time event streaming to JavaScript clients using server-sent events (SSE). The presentation code examples are available on GitHub.
Reactive programming by spring webflux - DN Scrum Breakfast - Nov 2018 - Scrum Breakfast Vietnam
Are you struggling to create a non-blocking REST application or reactive microservices? Spring WebFlux, a new module introduced in Spring 5, may help.
This new module:
- Is fully non-blocking
- Supports Reactive Streams back pressure
- Runs on servers such as Netty, Undertow, and Servlet 3.1+ containers
- Supports the reactive programming model
In our next Scrum Breakfast, we will discuss Spring WebFlux, its benefits and how we implement it.
Our workshop will include the following:
- What is reactive programming
- Introduction to Spring Webflux
- Tea break
- The details in Spring Webflux
- Reactive stack demonstration
- Q&A
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa... - Flink Forward
Apache Mesos allows operators to run distributed applications across an entire datacenter and is attracting ever increasing interest. As much as distributed applications see increased use enabled by Mesos, Mesos also sees increasing use due to a growing ecosystem of well integrated applications. One of the latest additions to the Mesos family is Apache Flink. Flink is one of the most popular open source systems for real-time high scale data processing and allows users to deal with low-latency streaming analytical workloads on Mesos. In this talk we explain the challenges solved while integrating Flink with Mesos, including how Flink’s distributed architecture can be modeled as a Mesos framework, and how Flink was integrated with Fenzo. Next, we describe how Flink was packaged to easily run on DC/OS.
Introducing Arc: A Common Intermediate Language for Unified Batch and Stream... - Flink Forward
Today's end-to-end data pipelines need to combine many diverse workloads such as machine learning, relational operations, stream dataflows, tensor transformations, and graphs. For each of these workload types there exist several frontends (e.g., DataFrames/SQL, Beam, Keras) based on different programming languages, as well as different runtimes (e.g., Spark, Flink, Tensorflow) that target a particular frontend and possibly a hardware architecture (e.g., GPUs). Putting all the pieces of a data pipeline together leads to excessive data materialisation, type conversions and hardware utilisation, as well as mismatches of processing guarantees.
Our research group at RISE and KTH in Sweden has founded Arc, an intermediate language that bridges the gap between any frontend and a dataflow runtime (e.g., Flink) through a set of fundamental building blocks for expressing data pipelines. Arc incorporates Flink and Beam-inspired stream semantics such as windows, state and out-of-order processing, as well as concepts found in batch computation models. With Arc, we can cross-compile and optimise diverse tasks written in any programming language into a unified dataflow program. Arc programs can run efficiently on various hardware backends and allow seamless, distributed execution on dataflow runtimes. To that end, we showcase Arcon, a concept runtime built in Rust that can execute Arc programs natively, and present a minimal set of extensions to make Flink an Arc-ready runtime.
20160609 nike techtalks reactive applications tools of the trade - shinolajla
An update to my talk about concurrency abstractions, including event loops (node.js and Vert.x), CSP (Go, Clojure), Futures, CPS/Dataflow (RxJava) and Actors (Erlang, Akka)
This document summarizes key concepts and implementation patterns related to microservices and modularity with Java. It defines microservices as smaller, separated services that communicate with lightweight mechanisms like REST APIs. Core patterns discussed include API gateways, service registries, configuration services, monitoring, and distributed tracing. While microservices improve modularity, the document advocates first considering a modular monolith approach for new projects before splitting into separate services and processes.
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Flink Forward
In this talk, I will present how Flink enables enterprise customers to unify their data processing systems by using Flink to query Hive data.
Unification of streaming and batch is a main theme for Flink. Since 1.9.0, we have integrated Flink with Hive in a platform level. I will talk about:
- what features we have released so far, and what they enable our customers to do
- best practices to use Flink with Hive
- what is the latest development status of Flink-Hive integration at the time of Flink Forward Berlin (Oct 2019), and what to look for in the next release (probably 1.11)
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward
Many stream processing applications can benefit from or need to rely on the prediction made with machine learning (ML) methods. In this presentation, new features of Apache Samoa are presented with a real data processing scenario. These features make Apache SAMOA fully accessible for Apache Flink users: (1) the data stream processed within Apache Flink is forwarded to Apache Samoa stream mining engine to perform predictions with stream-oriented ML models, (2) ML models evolve after every labelled instance and, at the same time, new predictions are sent back to Apache Flink. In both cases, Apache Kafka is used for data exchange. Hence, Apache Samoa is used as stream mining engine, provided with input data from, and sending predictions to Apache Flink. During the presentation, real life aspects are illustrated with code examples, such as input and prediction stream integration and monitoring latency of data processing and stream mining.
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Flink Forward
Apache Beam is Flink’s sibling in the Apache family of streaming processing frameworks. The Beam and Flink teams work closely together on advancing what is possible in streaming processing, including Streaming SQL extensions and code interoperability on both platforms.
Beam was originally developed at Google as the amalgamation of its internal batch and streaming frameworks to power the exabyte-scale data processing for Gmail, YouTube and Ads. It now powers a fully-managed, serverless service Google Cloud Dataflow, as well as is available to run in other Public Clouds and on-premises when deployed in portability mode on Apache Flink, Spark, Samza and other runners. Users regularly run distributed data processing jobs on Beam spanning tens of thousands of CPU cores and processing millions of events per second.
In this session, Sergei Sokolenko, Cloud Dataflow product manager, and Reuven Lax, the founding member of the Dataflow and Beam team, will share Google’s learnings from building and operating a global streaming processing infrastructure shared by thousands of customers, including:
safe deployment to dozens of geographic locations,
resource autoscaling to minimize processing costs,
separating compute and state storage for better scaling behavior,
dynamic rebalancing of work items away from overutilized worker nodes,
offering a throughput-optimized batch processing capability with the same API as streaming,
grouping and joining of 100s of Terabytes in a hybrid in-memory/on-disk file system,
integrating with the Google Cloud security ecosystem, and other lessons.
Customers benefit from these advances through faster execution of jobs, resource savings, and a fully managed data processing environment that runs in the Cloud and removes the need to manage infrastructure.
Apache Beam is a unified programming model for batch and streaming data processing. It defines concepts for describing what computations to perform (the transformations), where the data is located in time (windowing), when to emit results (triggering), and how to accumulate results over time (accumulation mode). Beam aims to provide portable pipelines across multiple execution engines, including Apache Flink, Apache Spark, and Google Cloud Dataflow. The talk will cover the key concepts of the Beam model and how it provides unified, efficient, and portable data processing pipelines.
Till Rohrmann – Fault Tolerance and Job Recovery in Apache FlinkFlink Forward
Flink provides fault tolerance guarantees through checkpointing and recovery mechanisms. Checkpoints take consistent snapshots of distributed state and data, while barriers mark checkpoints in the data flow. This allows Flink to recover jobs from failures and resume processing from the last completed checkpoint. Flink also implements high availability by persisting metadata like the execution graph and checkpoints to Apache Zookeeper, enabling a standby JobManager to take over if the active one fails.
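The checkpoint-and-recover idea in the summary above can be illustrated with a minimal sketch. It does not use Flink's API: `Checkpoint`, `CountingJob` and `run_with_failure` are hypothetical names, the input is assumed to be a replayable in-memory log, and barriers and distributed state are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """A consistent snapshot: operator state plus the input offset it reflects."""
    offset: int
    counts: dict


@dataclass
class CountingJob:
    """Toy stateful job that counts records read from a replayable input log."""
    counts: dict = field(default_factory=dict)
    offset: int = 0  # index of the next record to read

    def process(self, log, upto):
        # Consume records from the current offset up to (not including) `upto`.
        for record in log[self.offset:upto]:
            self.counts[record] = self.counts.get(record, 0) + 1
        self.offset = upto

    def snapshot(self):
        # Copy the state so later processing cannot mutate the checkpoint.
        return Checkpoint(self.offset, dict(self.counts))

    def restore(self, checkpoint):
        # Roll state and read position back to the last completed checkpoint.
        self.counts = dict(checkpoint.counts)
        self.offset = checkpoint.offset


def run_with_failure(log):
    job = CountingJob()
    job.process(log, 3)
    checkpoint = job.snapshot()   # last completed checkpoint
    job.process(log, 5)           # progress made after the checkpoint...
    job = CountingJob()           # ...is lost when the job fails
    job.restore(checkpoint)
    job.process(log, len(log))    # replay the input from the checkpointed offset
    return job.counts
```

Because the state and the input offset are saved together, replaying from the offset counts each record exactly once in the recovered state, even though some records were read twice.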
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...Flink Forward
http://flink-forward.org/kb_sessions/flink-in-zalandos-world-of-microservices/
In this talk we present Zalando’s microservices architecture, introduce Saiki – our next generation data integration and distribution platform on AWS and show how we employ stream processing with Apache Flink for near-real time business intelligence.
Zalando is one of the largest online fashion retailers in Europe. In order to secure our future growth and remain competitive in this dynamic market, we are transitioning from a monolithic to a microservices architecture and from a hierarchical to an agile organization.
We first have a look at how business intelligence processes have been working inside Zalando for the last years and present our current approach – Saiki. It is a scalable, cloud-based data integration and distribution infrastructure that makes data from our many microservices readily available for analytical teams.
We no longer live in a world of static data sets, but are instead confronted with endless streams of events that constantly inform us about relevant happenings from all over the enterprise. The processing of these event streams enables us to do near-real time business intelligence. In this context we have evaluated Apache Flink vs. Apache Spark in order to choose the right stream processing framework. Given our requirements, we decided to use Flink as part of our technology stack, alongside Kafka and Elasticsearch.
With these technologies we are currently working on two use cases: a near real-time business process monitoring solution and streaming ETL.
Monitoring our business processes enables us to check whether the Zalando platform works technically. It also helps us analyze data streams on the fly, e.g. order velocities and delivery velocities, and control service level agreements.
On the other hand, streaming ETL is used to free up resources in our relational data warehouse, as it struggles with increasingly high loads. In addition, it reduces latency and improves platform scalability.
Finally, we have an outlook on our future use cases, e.g. near-real time sales and price monitoring. Another aspect to be addressed is to lower the entry barrier of stream processing for our colleagues coming from a relational database background.
RxJava allows building reactive applications by treating everything as a stream of messages. Observables represent message producers and observers consume messages. Observables provide asynchronous and parallel execution via operators like subscribeOn and observeOn. This makes applications resilient to failure, scalable under varying workloads, and responsive to clients. RxJava also promotes message-driven architectures, functional programming, and handling errors as regular messages to improve these characteristics. Developers must also unsubscribe to prevent leaking resources and ensure observables only run when needed.
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry confluent
Apache Beam (unified Batch and strEAM processing!) is a new Apache incubator project. Originally based on years of experience developing Big Data infrastructure within Google (such as MapReduce, FlumeJava, and MillWheel), it has now been donated to the OSS community at large.
Come learn about the fundamentals of out-of-order stream processing, and how Beam’s powerful tools for reasoning about time greatly simplify this complex task. Beam provides a model that allows developers to focus on the four important questions that must be answered by any stream processing pipeline:
What results are being calculated?
Where in event time are they calculated?
When in processing time are they materialized?
How do refinements of results relate?
Furthermore, by cleanly separating these questions from runtime characteristics, Beam programs become portable across multiple runtime environments, both proprietary (e.g., Google Cloud Dataflow) and open-source (e.g., Flink, Spark, et al).
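As a rough illustration of the first two questions (what is computed, and where in event time), here is a minimal sketch that does not use the Beam SDK. The function name `fixed_window_sums` and the `(event_time_seconds, value)` input shape are assumptions made for the example; triggers and refinements (the "when" and "how" questions) are not modelled.

```python
from collections import defaultdict


def fixed_window_sums(events, window_size):
    """Sum values per fixed event-time window.

    `events` is an iterable of (event_time_seconds, value) pairs that may
    arrive out of order; the window is chosen from the event time alone,
    so arrival order does not change the result.
    """
    sums = defaultdict(int)
    for event_time, value in events:
        # Start of the window that contains this event time.
        window_start = (event_time // window_size) * window_size
        sums[window_start] += value
    return dict(sums)
```

Keying each element by the start of its window is what makes the result independent of arrival order, which is the core of out-of-order processing.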
Dennis Wittekind, Confluent, Senior Customer Success Engineer
Perhaps you have heard of Kafka Connect and think it would be a great fit in your application's architecture, but you like to know how things work before you propose them to your team? Perhaps you know enough Connect to be dangerous, but you haven't had the time to really understand all the moving pieces? This meetup talk is for you! We'll briefly introduce Connect to the uninitiated, and then jump in to underlying concepts and considerations you should make when running Connect in production! We'll even run a live demo! What could go wrong!?
https://www.meetup.com/Saint-Louis-Kafka-meetup-group/events/272687113/
Event sourcing - what could possibly go wrong ? Devoxx PL 2021Andrzej Ludwikowski
Yet another presentation about Event Sourcing? Yes and no. Event Sourcing is a really great concept. Some could say it’s a Holy Grail of the software architecture. I might agree with that, while remembering that everything comes with a price. This session is a summary of my experience with ES gathered while working on 3 different commercial products. Instead of theoretical aspects, I will focus on possible challenges with ES implementation. What could explode (very often with delayed ignition)? How and where to store events effectively? What are possible schema evolution solutions? How to achieve the highest level of scalability and live with eventual consistency? And many other interesting topics that you might face when experimenting with ES.
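For readers new to the concept, the core of Event Sourcing is that current state is derived by replaying stored events. The sketch below is a generic illustration, not taken from the talk: the account domain, the event names `deposited` and `withdrawn`, and the functions `apply_event` and `replay` are all hypothetical.

```python
def apply_event(balance, event):
    """Return the new balance after applying one event (a pure function)."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")


def replay(events, initial=0):
    """Rebuild state by folding over the event history from `initial`."""
    balance = initial
    for event in events:
        balance = apply_event(balance, event)
    return balance
```

Passing a previously computed state as `initial` and replaying only the later events gives the same result as a full replay, which is the idea behind snapshots.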
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen confluent
Flink and Kafka are popular components to build an open source stream processing infrastructure. We present how Flink integrates with Kafka to provide a platform with a unique feature set that matches the challenging requirements of advanced stream processing applications. In particular, we will dive into the following points:
Flink’s support for event-time processing, how it handles out-of-order streams, and how it can perform analytics on historical and real-time streams served from Kafka’s persistent log using the same code. We present Flink’s windowing mechanism that supports time-, count- and session- based windows, and intermixing event and processing time semantics in one program.
How Flink’s checkpointing mechanism integrates with Kafka for fault-tolerance, for consistent stateful applications with exactly-once semantics.
We will discuss “Savepoints”, which allow users to save the state of the streaming program at any point in time. Together with a durable event log like Kafka, savepoints allow users to pause/resume streaming programs, go back to prior states, or switch to different versions of the program, while preserving exactly-once semantics.
We explain the techniques behind the combination of low-latency and high-throughput streaming, and how the latency/throughput trade-off can be configured.
We will give an outlook on current developments for streaming analytics, such as streaming SQL and complex event processing.
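The session-based windows mentioned in the points above group events that occur close together in event time. The following sketch shows the grouping rule only and does not use Flink's API: `session_windows` is a hypothetical helper, it works on a finite batch of timestamps rather than a stream, and it assumes that a difference exactly equal to `gap` still belongs to the same session.

```python
def session_windows(timestamps, gap):
    """Group event timestamps into sessions separated by more than `gap`."""
    sessions = []
    # Sorting by event time makes the result independent of arrival order.
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # close enough: extend the open session
        else:
            sessions.append([ts])     # inactivity gap exceeded: new session
    return sessions
```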
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...confluent
This document provides an overview of a company's first Kafka Streams project to build a streaming data pipeline. Some key lessons learned include adopting a data-first mindset where the data defines the application behavior and architecture. All business logic is modeled as data transformations. Testing was done using TopologyTestDriver for unit tests and emulators for external systems. Kafka Streams was determined to be a good fit as it provided an ordered, fault-tolerant processing pipeline with exactly-once guarantees. Future work includes open sourcing components and improving the declarative side effect handling in the KStreams DSL.
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Dan Halperin
Apache Beam is a unified programming model for efficient and portable data processing pipelines. It provides abstractions like PCollections, sources/readers, ParDo, GroupByKey, side inputs, and windowing that hide complexity and allow runners to optimize efficiency. Beam supports both batch and streaming workloads on different distributed processing backends. It gives runners control over bundle size, splitting, and triggering to make tradeoffs between latency, throughput, and efficiency based on workload and cluster resources. This allows the same pipeline to be executed efficiently in different contexts without changes to the user code.
In this talk, we describe the design and implementation of the Python Streaming API support that has been submitted for inclusion in mainline Flink. Python is one of the most popular programming languages for data analysis. Its readability emphasizes development productivity and, as a scripting language, it requires neither compilation nor a complex development environment setup. Flink already supports Python APIs for batch programming, but unfortunately the mechanism used to support batch programs (i.e., the DataSet APIs) does not work for the Streaming API. We describe the limitations of the batch implementation and provide insights into how we solved this using Jython. We will walk through some example programs using the new Python API and compare programmability and performance with the Java and Scala streaming APIs.
Spring 5 Webflux - Advances in Java 2018Trayan Iliev
The document discusses a presentation on functional reactive services with Spring 5 WebFlux. It introduces functional reactive programming (FRP), Project Reactor, building REST services with Spring 5 WebFlux including routers, handlers, filters, and reactive repositories. It also covers end-to-end non-blocking reactive service-oriented architecture with Netty, reactive WebClients, and real-time event streaming to JavaScript clients using server-sent events (SSE). The presentation code examples are available on GitHub.
Reactive programming by spring webflux - DN Scrum Breakfast - Nov 2018Scrum Breakfast Vietnam
Are you struggling to create a non-blocking REST application or reactive microservices? Spring WebFlux, a new module introduced by Spring 5, may help.
This new module offers:
- A fully non-blocking stack
- Support for Reactive Streams back pressure
- The ability to run on servers such as Netty, Undertow, and Servlet 3.1+ containers
- Support for the reactive programming model
In our next Scrum Breakfast, we will discuss Spring WebFlux, its benefits, and how we implement it.
Our workshop will include the following:
- What is reactive programming
- Introduction to Spring Webflux
- Tea break
- The details in Spring Webflux
- Reactive stack demonstration
- Q&A
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward
Apache Mesos allows operators to run distributed applications across an entire datacenter and is attracting ever increasing interest. As much as distributed applications see increased use enabled by Mesos, Mesos also sees increasing use due to a growing ecosystem of well integrated applications. One of the latest additions to the Mesos family is Apache Flink. Flink is one of the most popular open source systems for real-time high scale data processing and allows users to deal with low-latency streaming analytical workloads on Mesos. In this talk we explain the challenges solved while integrating Flink with Mesos, including how Flink’s distributed architecture can be modeled as a Mesos framework, and how Flink was integrated with Fenzo. Next, we describe how Flink was packaged to easily run on DC/OS.
Introducing Arc: A Common Intermediate Language for Unified Batch and Stream...Flink Forward
Today's end-to-end data pipelines need to combine many diverse workloads such as machine learning, relational operations, stream dataflows, tensor transformations, and graphs. For each of these workload types, several frontends exist (e.g., DataFrames/SQL, Beam, Keras), based on different programming languages, as well as different runtimes (e.g., Spark, Flink, Tensorflow) that target a particular frontend and possibly a hardware architecture (e.g., GPUs). Simply putting all the pieces of a data pipeline together leads to excessive data materialisation, type conversions and hardware utilisation, as well as mismatches of processing guarantees.
Our research group at RISE and KTH in Sweden has developed Arc, an intermediate language that bridges the gap between any frontend and a dataflow runtime (e.g., Flink) through a set of fundamental building blocks for expressing data pipelines. Arc incorporates Flink and Beam-inspired stream semantics such as windows, state and out-of-order processing, as well as concepts found in batch computation models. With Arc, we can cross-compile and optimise diverse tasks written in any programming language into a unified dataflow program. Arc programs can run efficiently on various hardware backends and allow seamless, distributed execution on dataflow runtimes. To that end, we showcase Arcon, a concept runtime built in Rust that can execute Arc programs natively, and present a minimal set of extensions to make Flink an Arc-ready runtime.
20160609 nike techtalks reactive applications tools of the tradeshinolajla
An update to my talk about concurrency abstractions, including event loops (node.js and Vert.x), CSP (Go, Clojure), Futures, CPS/Dataflow (RxJava) and Actors (Erlang, Akka)
This document summarizes key concepts and implementation patterns related to microservices and modularity with Java. It defines microservices as smaller, separated services that communicate with lightweight mechanisms like REST APIs. Core patterns discussed include API gateways, service registries, configuration services, monitoring, and distributed tracing. While microservices improve modularity, the document advocates first considering a modular monolith approach for new projects before splitting into separate services and processes.
This document introduces Akka, an open-source toolkit for building distributed, concurrent, and resilient message-driven applications for Java and Scala. It discusses how application requirements have changed to require clustering, concurrency, elasticity, and resilience. Akka uses an actor model with message-driven actors that can be distributed and made fault-tolerant. The document provides examples of creating and communicating between actors using messages, managing failures with supervision, and load balancing with routers.
One of the most boring things in software development at large companies is following bureaucracy. Tons of developers have been worn down by that ruthless machine with its not always obvious rules. That’s why we decided to delegate all the boring work to machines instead of humans, and the talk will cover the results achieved.
Reactive microservices with eclipse vert.xRam Maddali
This document summarizes a hands-on technical workshop on transforming applications from monoliths to microservices using reactive architectures with Eclipse Vert.x. The workshop introduces reactive systems and programming, demonstrates how Vert.x uses non-blocking execution models to handle high volumes with few threads, and guides participants through a lab to develop reactive microservices with Vert.x that interact asynchronously without blocking.
This document provides an overview of microservices and how to develop them using Spring. It discusses the challenges of distributed systems and how Spring Boot and Spring Cloud Netflix address areas like configuration, service registration, load balancing, fault tolerance, and monitoring. Examples are provided for building microservices with Spring Boot, integrating configuration with Spring Cloud Config, registering services with Eureka, load balancing with Ribbon and Feign, handling faults with Hystrix, and monitoring with Hystrix Dashboard. Reactive programming with RxJava is also introduced as an approach for concurrent API integration.
This document provides an overview of reactive applications in Java using Project Reactor. It discusses the challenges of modern applications and how reactive programming addresses these challenges through asynchronous, non-blocking architectures. It introduces key concepts of reactive programming like Flux, Mono, operators, and backpressure. It also covers Project Reactor specifics like threading model, debugging, testing and learning resources. The goal is to explain why reactive programming is useful and provide an introduction to building reactive applications in Java with Project Reactor.
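Backpressure, one of the concepts listed above, means that the subscriber controls how many items the publisher may emit. The sketch below mimics the shape of the Reactive Streams signals (`onSubscribe`, `onNext`, `onComplete`, `request`) in plain Python. It is not Project Reactor's API: `RangePublisher` and `CollectingSubscriber` are hypothetical classes, and error signalling, cancellation and concurrency are omitted.

```python
class RangePublisher:
    """Emits the integers 0..count-1, but only as many as were requested."""

    def __init__(self, count):
        self._next = 0
        self._count = count
        self._done = False
        self._subscriber = None

    def subscribe(self, subscriber):
        self._subscriber = subscriber
        # Hand the subscriber a handle it can use to signal demand.
        subscriber.on_subscribe(self)
        if self._count <= 0:
            self._done = True
            subscriber.on_complete()

    def request(self, n):
        # Emit at most n items: downstream demand bounds the emission.
        for _ in range(n):
            if self._done:
                return
            self._subscriber.on_next(self._next)
            self._next += 1
            if self._next >= self._count:
                self._done = True
                self._subscriber.on_complete()


class CollectingSubscriber:
    """Stores received items; demand is signalled through `subscription`."""

    def __init__(self):
        self.items = []
        self.completed = False
        self.subscription = None

    def on_subscribe(self, subscription):
        self.subscription = subscription

    def on_next(self, item):
        self.items.append(item)

    def on_complete(self):
        self.completed = True
```

Nothing is emitted at subscription time; a call such as `subscriber.subscription.request(2)` releases exactly two items, so a slow consumer is never handed more than it asked for.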
We will introduce Airflow, an Apache Project for scheduling and workflow orchestration. We will discuss use cases, applicability and how best to use Airflow, mainly in the context of building data engineering pipelines. We have been running Airflow in production for about 2 years, so we will also go over some lessons learned, best practices and some tools we have built around it.
Speakers: Robert Sanders, Shekhar Vemuri
The document discusses Reactive Slick, a new version of the Slick database access library for Scala that provides reactive capabilities. It allows parallel database execution and streaming of large query results using Reactive Streams. Reactive Slick is suitable for composite database tasks, combining async tasks, and processing large datasets through reactive streams.
RedisConf17 - Dynomite - Making Non-distributed Databases DistributedRedis Labs
Dynomite is a framework that makes non-distributed databases distributed by adding a proxy layer, auto-sharding, replication across datacenters, and more. It is used at Netflix to power various services by sitting on top of Redis and providing high availability, scalability, and tunable consistency. Conductor is a workflow orchestration engine used by Netflix that stores workflow definitions and state in Dynomite to allow reusable and controllable workflow processes.
This document summarizes Slack's transition from Graphite to Prometheus for monitoring. It describes the issues with Graphite including difficulty discovering metrics, slow queries, lack of tagging, and inability to scale. Prometheus was chosen because it meets requirements for high availability, ease of use, fast queries, scaling, and customization. The document outlines Slack's Prometheus architecture with HA clusters and discusses challenges of monitoring many metrics from web apps and jobs. It also previews future plans including Consul for service discovery and adopting Thanos and per-service Prometheus instances.
This document discusses scalable JavaScript applications using Project Nashorn. It covers why JavaScript is useful for servers, benefits of the Java virtual machine, an overview of Nashorn and its capabilities, and how frameworks like Vert.x and Avatar.js allow building scalable architectures. It also includes a benchmark comparison and questions.
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
Prometheus is a next-generation monitoring system. Since being publicly announced last year it has seen wide-spread interest and adoption. This talk will look at the concepts behind monitoring with Prometheus, and how to use it with Kubernetes which has direct support for Prometheus.
Apache Samza is a stream processing framework that provides high-level APIs and powerful stream processing capabilities. It is used by many large companies for real-time stream processing. The document discusses Samza's stream processing architecture at LinkedIn, how it scales to process billions of messages per day across thousands of machines, and new features around faster onboarding, powerful APIs including Apache Beam support, easier development through high-level APIs and tables, and better operability in YARN and standalone clusters.
The document discusses DevOps practices for TYPO3 projects. It defines DevOps as the confluence of development and operations. It highlights the importance of communication between different roles like developers, system administrators, and integrators. It also provides examples of tools and techniques that can be used at different stages of a TYPO3 project to facilitate DevOps practices, such as automated testing, deployment automation, and content synchronization.
Comparison between zookeeper, etcd 3 and other distributed coordination systemsImesha Sudasingha
This is a comparison between popular distributed coordination systems including zookeeper (which powers Apache Hadoop), etcd 3 (which powers Kubernetes), consul and hazelcast. This comparison was made in the second half of 2016, so please note that some of these technologies have improved immensely over time. Even so, this presentation will provide an initial idea of each distributed coordination system.
How to make a high-quality Node.js app, Nikita GalkinSigma Software
This document discusses how to build high quality Node.js applications. It covers attributes of quality like understandability, modifiability, portability, reliability, efficiency, usability, and testability. For each attribute, it provides examples of what could go wrong and best practices to achieve that attribute, such as using dependency injection for modifiability, environment variables for portability, and graceful shutdown for reliability. It also discusses Node.js programming paradigms like callbacks, promises, and async/await and recommends best practices for testing Node.js applications.
Creating microservices with Java MicroProfile and TomEE - OGBTCésar Hernández
In this session, attendees were given the theoretical and practical foundation for creating microservices with Java, JakartaEE and MicroProfile, using TomEE as the application server.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframePrecisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
2. Who am I?
Java Technical Lead at Seavus
17 years in the industry
Spring Certified Professional
You can find me at:
● drazen.nikolic@seavus.com
● @drazenis
● programminghints.com
3. Changing Requirements (then and now)
Requirement | 10 years ago | Now
Server nodes | 10’s | 1000’s
Response times | seconds | milliseconds
Maintenance downtimes | hours | none
Data volume | GBs | TBs → PBs
5. Reactive Programming
Event-driven systems
Moves imperative logic to:
● asynchronous
● non-blocking
● functional-style code
Allows stable, scalable
access to external systems
Example use-cases:
Monitoring stock prices
Streaming videos
Spreadsheets
Fraud detection
6. When to Use Reactive?
● Handling networking issues, like latency or failures
● Scalability concerns
● Clients getting overwhelmed by the sent messages
(handling backpressure)
● Highly concurrent operations
8. Reactive Streams Specification
● A spec based on the Reactive Manifesto's prescriptions
● Intention to scale vertically (within JVM),
rather than horizontally (through clustering)
● A standard for async data stream processing
● Non-blocking flow control (backpressure)
● Exceptions are first-class citizens
9. Reactive Streams Specification
public interface Publisher<T> {
    public void subscribe(Subscriber<? super T> s);
}
public interface Subscriber<T> {
    public void onSubscribe(Subscription s);
    public void onNext(T t);
    public void onError(Throwable t);
    public void onComplete();
}
public interface Subscription {
    public void request(long n);
    public void cancel();
}
public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}
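The request(n) call on Subscription is what carries backpressure: the subscriber states how many elements it is ready for. A minimal sketch of that handshake follows. It assumes JDK 9+ and uses the java.util.concurrent.Flow mirror of these interfaces together with the JDK's SubmissionPublisher, so it runs without extra dependencies. The class and method names (BackpressureSketch, OneByOneSubscriber, run) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class BackpressureSketch {

    /** Hypothetical subscriber that asks for one element at a time. */
    static final class OneByOneSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);              // signal demand for the first element
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            subscription.request(1);   // ask for the next element only after handling this one
        }

        @Override
        public void onError(Throwable t) {
            done.countDown();          // errors arrive as a signal, not as a thrown exception
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    /** Publishes 1..count and returns what the subscriber received, in order. */
    static List<Integer> run(int count) {
        OneByOneSubscriber subscriber = new OneByOneSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i);   // blocks when the subscriber's buffer is full
            }
        }                              // close() sends onComplete once pending items are delivered
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("subscriber did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

Requesting one element per onNext is the most conservative policy. A real subscriber would usually request in batches to reduce signalling overhead.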
10. How it works?
Reactive Streams implementations for Java:
RxJava
Project Reactor
Akka Streams
Ratpack
Vert.x 3
11. Spring Framework 5
Another major release, GA since September 2017
Lots of improvements and new concepts introduced:
● Support for JDK 9
● Support for Java EE 8 APIs (e.g. Servlet 4.0)
● Integration with Project Reactor 3.1
● JUnit 5
● Comprehensive support for Kotlin language
● Dedicated reactive web framework - Spring WebFlux
17. Reactor Pipeline
● Lazily evaluated
● Nothing is produced until there is a subscriber
userService.getFavorites(userId)
    .timeout(Duration.ofMillis(800))
    .onErrorResume(cacheService.cachedFavoritesFor(userId))
    .flatMap(favoriteService::getDetails)
    .switchIfEmpty(suggestionService.getSuggestions())
    .take(5)
    .publishOn(UiUtils.uiThreadScheduler())
    .subscribe(uiList::show, UiUtils::errorPopup);
24. Spring Data Reactive
public interface BookRepository
        extends ReactiveCrudRepository<Book, String> {
    Flux<Book> findByAuthor(String author);
}
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb-reactive</artifactId>
</dependency>
25. WebFlux Spring Security
@EnableWebFluxSecurity
public class HelloWebfluxSecurityConfig {
    @Bean
    public MapReactiveUserDetailsService userDetailsService() {
        // Demo only: withDefaultPasswordEncoder() is not meant for production use
        UserDetails user = User.withDefaultPasswordEncoder()
            .username("user")
            .password("user")
            .roles("USER")
            .build();
        return new MapReactiveUserDetailsService(user);
    }
}
26. Reactive Method Security
@EnableWebFluxSecurity
@EnableReactiveMethodSecurity
public class SecurityConfig {
    @Bean
    public MapReactiveUserDetailsService userDetailsService() {...}
}
@Component
public class HelloWorldMessageService {
    @PreAuthorize("hasRole('ADMIN')")
    public Mono<String> findMessage() {
        return Mono.just("Hello World!");
    }
}
29. References & Attributions
Reactive Streams Specification for the JVM
Reactive Spring - Josh Long, Mark Heckler
Reactive Programming by Venkat Subramaniam
What is Reactive Programming by Martin Odersky
Reactive Streams: Handling Data-Flow the Reactive Way by Roland Kuhn
What Are Reactive Streams in Java? by John Thompson
Spring Boot Reactive Tutorial by Mohit Sinha
Doing Reactive Programming with Spring 5 by Eugen Paraschiv
30. Be proactive, go Reactive!
Spring will help you on this journey!
Thank you
Where applicable...
Editor's Notes
# Reactive is hyped lately, like microservices 2-3 years ago
# New hypes - old concepts, but resonating with modern enterprise
# 17 years of working experience implementing solutions in domains such as e-commerce, insurance, digital marketing
# Questions - near the end of the presentation
… things have changed in the meantime!
# To fight increased load, we are taught to just spin up more threads
# Even cellphones have multi-core processors
# Problem solved? Not quite!
# Many applications are just fine using the traditional threading model, thread per request
# Problems:
shared, synced state;
blocking;
strong coupling;
hard to compose; inefficient use of system resources
# Analogy: city streets with traffic lights vs highway
# Reactive programming not a “silver bullet”
# We got used to implementing CRUD for everything
# Sometimes we’re just observing how data changes
# Reactive used for event-driven systems
# imperative style to: async, non-blocking, functional
# Stable, scalable
>> # Examples: stock prices, streaming videos, spreadsheets, fraud detection
# “Old model” is good enough for many use-cases. You will still use it in the future
# Use Reactive when dealing with network latency/failures, scalability, backpressure, highly concurrent operations
# 5 years ago, people observed that scaling through buying larger servers and multithreading is too complex, inefficient and expensive
# They sat down and wrote down the main characteristics needed to satisfy today’s app demands - Reactive Manifesto
# Responsive (react to users); Resilient (react to failures); Elastic (react to load); Message Driven (react to events)
# In general, Reactive programming is about non-blocking apps that are async and event-driven, and require a small number of threads to scale vertically (within JVM)
# Reactive Streams is a spec based on the Reactive Manifesto
# Standard for async data stream processing with non-blocking backpressure (e.g. stock updates)
# Key aspect: backpressure (client: give me more; do not send me anymore)
# Exceptions are first-class citizens
# Major shift from imperative to declarative (func.) async composition of logic
# java.util.concurrent.Flow (Java 9 API)
>> # 4 interfaces: Publisher, Subscriber, Subscription, Processor
# Describe how it works, from the graph
# Not meant for you to implement yourself, not trivial
>> # Different implementations: RxJava, Reactor, Akka Streams…
# At first, all this might look complicated but becomes straightforward
# Infuses asynchrony into the system core
# Naturally leads to microservices
# Functional programming to manipulate data streams
# I’ve been using Spring Framework since version 1, and fell in love from the start
# No need to reinvent the wheel, saving time + smarter people implemented it
# You can learn a lot from observing framework source code - recommended
# Nowadays Spring has grown into a giant beast (help in day-to-day work; also shape the future)
# Reactive Streams are not enough; you need higher-order implementations - Project Reactor good fit
>> # New in Spring 5: support for JDK 9, Servlet 4, Project Reactor, Kotlin, WebFlux
# Project Reactor defines two Publisher types: Mono and Flux.
# Mono emits 0 or 1 element, completes successfully or with an error
# Reactive equivalent of CompletableFuture
# Flux emits 0 to N elements, completes successfully or with an error
# Reactor Mono and Flux intended for use in implementations and as return types
# For input parameters, stick with the raw Publisher (if possible)
# Avoid using state in lambdas with Flux operator - can be shared by multiple Subscribers
# Deferred Pull/Push model
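The comparison of Mono with CompletableFuture can be made concrete with plain JDK code. The sketch below is only an analogy, not Reactor code: a CompletableFuture yields at most one value or an error, and the chain is composed from a transformation and a fallback, much like operators on a Mono. The names SingleValueSketch, greet and run are hypothetical. One difference to keep in mind: supplyAsync starts work immediately, whereas a Mono does nothing until it is subscribed to.

```java
import java.util.concurrent.CompletableFuture;

class SingleValueSketch {

    /** Hypothetical async lookup: completes with a greeting, or fails for an empty name. */
    static CompletableFuture<String> greet(String name) {
        return CompletableFuture.supplyAsync(() -> {
            if (name.isEmpty()) {
                throw new IllegalArgumentException("empty name");
            }
            return "Hello, " + name;
        });
    }

    /** Composes the eventual value declaratively; blocks only at the very end, for the demo. */
    static String run(String name) {
        return greet(name)
                .thenApply(String::toUpperCase)      // transform the value once it arrives
                .exceptionally(error -> "fallback")  // recover from the error signal
                .join();                             // wait for the result (demo only)
    }

    public static void main(String[] args) {
        System.out.println(run("Spring"));
        System.out.println(run(""));
    }
}
```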
# On the next slides, there are various operations examples
# Are combined to form a pipeline
...
...
# Example: UserService produces a Flux<Favorite> for a given user
# Describe the code (before subscribe() line)
>> # As with Java 8 Streams API - lazy evaluated
# Nothing is produced until there is a subscriber
# Difference to Java 8 Streams: func. Prog. for collections vs arbitrary input source; push vs pull model; execute once vs infinite stream from external resource, multiple subscribers, throttle, backpressure
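The lazy-evaluation point can be tried out with the Java 8 Streams API itself, which the notes use as the point of comparison. This is a small sketch using java.util.stream, not Reactor: declaring the pipeline runs nothing, and the terminal operation plays the role that subscribe() plays in a Reactor pipeline. The class name LazyPipelineSketch and the log messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyPipelineSketch {

    /** Returns a log showing that mapping happens only once the pipeline is consumed. */
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Declaring the pipeline executes nothing: map() only records what to do later.
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .map(n -> {
                    log.add("mapped " + n);
                    return n * 10;
                });
        log.add("pipeline declared");

        // The terminal operation triggers the work, like subscribe() in Reactor.
        pipeline.forEach(n -> log.add("consumed " + n));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline declared", which shows that no element was mapped before the terminal operation ran. Unlike a Flux, a Java stream can be consumed only once and has no notion of backpressure or multiple subscribers.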
# Project Reactor is only a foundation - you need HTTP client/server, websocket endpoints etc. -> Spring 5 WebFlux
# Spring Framework 5 released in Sept 2017 with spring-webflux module
# Module contains reactive support for HTTP and WebSocket clients
# Also supports reactive server web apps
# Runs on Java 8 (takes advantage of lambdas and streams)
# WebClient class (instead of RestTemplate) - reactive requests + applying transformations on response (without blocking)
# Server-side 2 programming models: Annotation-based @Controller; Functional routing and handling
# Both models execute on the same reactive foundation (non-blocking HTTP)
# WebFlux runs on Servlet 3.1 containers (non-blocking IO API) such as Tomcat and Jetty
# Main difference: HandlerMapping and HandlerAdapter are non-blocking and work on reactive request/response types instead of the blocking HttpServletRequest / -Response
# Req/Resp body exposed as Flux<DataBuffer> rather than InputStream / OutputStream, with backpressure
# Controllers can run on Netty and Undertow, too
# New functional web framework - HandlerFunction<T>
# It is Function<Request, Response<T>>
# Endpoints (routes) can be defined in functional (builder-like) way
# Routes are linked to handler functions
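The idea of a handler being a function from request to response, with routes linked to handlers in a builder-like way, can be sketched in plain Java. This is a conceptual illustration only and does not use the WebFlux API: Request, Response and Router are hypothetical stand-ins, and the handlers here are synchronous, whereas WebFlux handler functions return a reactive type.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class FunctionalRoutingSketch {

    // Hypothetical stand-ins for request and response; these are NOT WebFlux types.
    static final class Request {
        final String method;
        final String path;

        Request(String method, String path) {
            this.method = method;
            this.path = path;
        }
    }

    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Builder-like router: every route is linked to exactly one handler function. */
    static final class Router {
        private final Map<String, Function<Request, Response>> routes = new LinkedHashMap<>();

        Router route(String method, String path, Function<Request, Response> handler) {
            routes.put(method + " " + path, handler);
            return this; // returning this allows routes to be chained
        }

        Response handle(Request request) {
            Function<Request, Response> handler = routes.get(request.method + " " + request.path);
            return handler == null ? new Response(404, "Not Found") : handler.apply(request);
        }
    }

    /** Two example routes, each backed by a lambda acting as the handler function. */
    static Router demoRouter() {
        return new Router()
                .route("GET", "/hello", request -> new Response(200, "Hello World!"))
                .route("GET", "/ping", request -> new Response(200, "pong"));
    }

    public static void main(String[] args) {
        Router router = demoRouter();
        System.out.println(router.handle(new Request("GET", "/hello")).body);
        System.out.println(router.handle(new Request("GET", "/missing")).status);
    }
}
```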
>> # Here you can see how to define routes and start Netty
>> … and Tomcat
# To write a non-blocking, reactive HTTP Client use WebClient (instead of RestTemplate)
# Allows reactive requests to server and to apply transformations and actions to the response (once it is returned)
# exchange() method immediately returns Mono<ClientResponse>, which will eventually represent response from server
# Eventual nature of this is important
# We execute this code, then go and do some other processing, confident that it will be executed in the background for us
# Reactive paradigm even more useful with WebSockets
# Easy implementation using WebSocketClient interface (of WebFlux)
# Example (on slide): simple client calls WebSocket Echo service and logs the messages received
# Even if complete web server logic is reactive, DB access is usually a blocking operation
# Spring Data Kay release train added support for reactive NoSQL DB drivers
# Reactive support for Cassandra, MongoDB and Redis
# You can use reactive repositories and reactive templates
# There are 3rd party reactive drivers for relational databases like PostgreSQL, but not supported yet by Spring Data
# E.g. use ReactiveCrudRepository instead of normal CrudRepository, returns Mono and Flux
# Spring Security’s WebFlux support relies on WebFilter
# Minimal configuration (as shown on slide) provides:
form and http basic authentication
sets up auth/auth for accessing any page
default login/logout pages (nice looking CSS)
Security HTTP headers, CSRF protection and more
# Spring Boot 2 auto-config for OAuth 2.0 Login
# Reactive Method Security works in a similar way to normal method security
# Just add @EnableReactiveMethodSecurity annotation to Spring Boot config
# As always, best to see the code in action
# Simple web endpoints, which rely on a slow remote service
Show domain model - Employee.java
Show EmployeeService
Show pom.xml (spring boot 2, spring-boot-starter-webflux)
Show ReactivespringApplication with Flux of Employees, delayed by 2 seconds
Show RemoteEmployeeServiceImpl and how it uses reactive Mono and Flux
Show EmployeeController with different endpoints
Start the spring-boot server -> show that Netty has started on port 8080
Execute (in Google Chrome or curl -s --no-buffer http://localhost:8080/employees):
http://localhost:8080/employees (returned with delay)
http://localhost:8080/employees/1
http://localhost:8080/employees/2
http://localhost:8080/employees/search/byGender/FEMALE
Show Unit tests with reactor-test library
# Before closing the presentation, if you have some questions (and they are not too hard) I would be happy to answer them.
# You can also find me during the day at the venue networking areas, to start a chat
# If you want to know more about reactive programming and Spring 5, there are plenty of resources online
# Recommend: search for talks on this topic by:
Dr. Venkat Subramaniam (award-winning author, founder of Agile Developer, Inc.),
Prof. Martin Odersky (one of Scala founders),
Josh Long (Spring Developer Advocate at Pivotal) and many others.
# So, whenever it makes sense
# Be proactive, go Reactive!
# Spring will help you on this journey!
Thank you!