Deploying a robust streaming data pipeline can be a daunting task when your company’s financial information is at risk. For starters, how do you ensure proper provisioning of resources? How do you preserve end-to-end application and data consistency? How do you make all of this work in the cloud with Kubernetes and avoid YAML hell? Answer: Cloudflow, a new open-source toolkit for simplifying the development, deployment, and operation of streaming data pipelines.
In this guest webinar by Kevin Webber, we cover the entire architecture of a Reactive system, from a responsive UI implemented with Vue.js, to a fully event sourced collection of microservices implemented with Java, Lagom, Cassandra, and Kafka.
For the full recording, visit: https://www.lightbend.com/blog/full-stack-reactive-in-practice-webinar
Live Event Debugging With ksqlDB at Reddit | Hannah Hagen and Paul Kiernan, R...HostedbyConfluent
Convincing developers to write tests for new code is hard; convincing developers to write tests for new event data is even harder. At Reddit, engineers have often deployed new app versions, only to find out later that the event wasn’t firing at all, or it was missing critical fields. So this begs the question, “How can engineers at Reddit be confident that the events they instrument are accurate and complete?”
In this session, we will learn about an internal tool developed at Reddit to QA events in real-time. This KSQL-powered web app streams events from our pipeline, allowing developers to filter events they care about using criteria like User ID, Device ID or the type of user interaction. With a backbone of KSQL and Kafka Streams, engineers can get real-time feedback on how accurate (or how erroneous) their event data is.
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OSLightbend
Apache Kafka–part of Lightbend Fast Data Platform–is a distributed streaming platform that is best suited to run close to the metal on dedicated machines in statically defined clusters. For most enterprises, however, these fixed clusters are quickly becoming extinct in favor of mixed-use clusters that take advantage of all infrastructure resources available.
In this webinar by Sean Glover, Fast Data Engineer at Lightbend, we will review leading Kafka implementations on DC/OS and Kubernetes to see how they reliably run Kafka in container orchestrated clusters and reduce the overhead for a number of common operational tasks with standard cluster resource manager features. You will learn specifically about concerns like:
* The need for greater operational knowhow to do common tasks with Kafka in static clusters, such as applying broker configuration updates, upgrading to a new version, and adding or decommissioning brokers.
* The best way to provide resources to stateful technologies while in a mixed-use cluster, noting the importance of disk space as one of Kafka’s most important resource requirements.
* How to address the particular needs of stateful services in a model that natively favors stateless, transient services.
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsLightbend
Organizations like Starbucks, HPE, and PayPal (see our customers) have selected the Akka toolkit for their enterprise scale distributed applications; and when it comes to squeezing out the best possible performance, the secret is using two particular modules in tandem: Akka Cluster and Akka Streams.
In this webinar by Nolan Grace, Senior Solution Architect at Lightbend, we look at these two Akka modules and discuss the features that will push your application architecture to the next tier of performance.
For the full blog post, including the video, visit: https://www.lightbend.com/blog/akka-at-enterprise-scale-performance-tuning-distributed-applications
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetesconfluent
Speakers: Joe Beda, Co-founder and CTO, Heptio + Gwen Shapira, Principal Data Architect, Confluent
With the rapid adoption of microservices, there is a growing need for solutions to manage deployment, resources and data for fleets of microservices. Kubernetes is a resource management framework for containers that is rapidly growing in popularity. Apache Kafka is a streaming platform that makes data accessible to the edges of an organization. It's no wonder the question of running Kafka on Kubernetes keeps coming up!
In this online talk, Joe Beda, CTO of Heptio and co-creator of Kubernetes, and Gwen Shapira, principal data architect at Confluent and Kafka PMC member, will help you navigate through the hype, address frequently asked questions and deliver critical information to help you decide if running Kafka on Kubernetes is the right approach for your organization.
You will:
-Get an introduction to the basic concepts you need to know as you plan to deploy services on Kubernetes.
-Learn which parts of the Kafka ecosystem fit Kubernetes like a glove, and which require special attention.
-Pick up useful tips for getting started.
-See why Confluent Platform for Kubernetes is the simplest solution to deploying and orchestrating Kafka on Kubernetes, using container images and a Kubernetes operator.
Watch the recording: https://videos.confluent.io/watch/yoZcuazDjDDTcj1sRnaD3J?.
(Mike Graham + Dan Carroll, Comcast) Kafka Summit SF 2018
Comcast manages over 2 million miles of fiber and coax, and over 40 million in home devices. This “outside plant” is subject to adverse conditions from severe weather to power grid outages to construction-related disruptions. Maintaining the health of this large and important infrastructure requires a distributed, scalable, reliable and fast information system capable of real-time processing and rapid analysis and response. Using Apache Kafka and the Kafka Streams Processor API, Comcast built an innovative new system for monitoring, problem analysis, metrics reporting and action response for the outside plant.
In this talk, you’ll learn how topic partitions, state stores, key mapping, source and sink topics and processors from the Kafka Streams Processor API work together to build a powerful dynamic system. We will dive into the details about the inner workings of the state store—how it is backed by a Kafka “changelog” topic, how it is scaled horizontally by partition and how the instances are rebuilt on startup or on processor failure. We will discuss how these state stores essentially become like materialized views in a SQL database but are updated incrementally as data flows through the system, and how this allows the developers to maintain the data in the optimal structures for performing the processing. The best part is that the data is readily available when needed by the processors. You will see how a REST API using Kafka Streams “interactive queries” can be used to retrieve the data in the state stores. We will explore the deployment and monitoring mechanisms used to deliver this system as a set of independently deployed components.
Leveraging services in stream processor apps at Ticketmaster (Derek Cline, Ti...confluent
Is your organization adopting Kafka as their messaging bus but you've found that it will take too long to migrate your existing service-oriented architecture to a log-oriented architecture? Some of the biggest challenges in building a new stream processor can be implementing all the business logic again. It has become increasingly common for companies with high-throughput source streams and change-data-capture logs to want to build systems fast. At Ticketmaster, we have found a solution to the problem by leveraging the business logic in our existing services and calling them from our Java based KafkaStreams processor applications in an efficient manner. In this talk, we will examine the initial challenges we faced in our transition, then we will explore the solutions we built to address the use cases at Ticketmaster. The primary focus will address our workflow around calling services to bring stream processor applications to market fast. We will review our challenges and share tips for success.
In this guest webinar by Kevin Webber, we cover the entire architecture of a Reactive system, from a responsive UI implemented with Vue.js, to a fully event sourced collection of microservices implemented with Java, Lagom, Cassandra, and Kafka.
For the full recording, visit: https://www.lightbend.com/blog/full-stack-reactive-in-practice-webinar
Live Event Debugging With ksqlDB at Reddit | Hannah Hagen and Paul Kiernan, R...HostedbyConfluent
Convincing developers to write tests for new code is hard; convincing developers to write tests for new event data is even harder. At Reddit, engineers have often deployed new app versions, only to find out later that the event wasn’t firing at all, or it was missing critical fields. So this begs the question, “How can engineers at Reddit be confident that the events they instrument are accurate and complete?”
In this session, we will learn about an internal tool developed at Reddit to QA events in real-time. This KSQL-powered web app streams events from our pipeline, allowing developers to filter events they care about using criteria like User ID, Device ID or the type of user interaction. With a backbone of KSQL and Kafka Streams, engineers can get real-time feedback on how accurate (or how erroneous) their event data is.
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OSLightbend
Apache Kafka–part of Lightbend Fast Data Platform–is a distributed streaming platform that is best suited to run close to the metal on dedicated machines in statically defined clusters. For most enterprises, however, these fixed clusters are quickly becoming extinct in favor of mixed-use clusters that take advantage of all infrastructure resources available.
In this webinar by Sean Glover, Fast Data Engineer at Lightbend, we will review leading Kafka implementations on DC/OS and Kubernetes to see how they reliably run Kafka in container orchestrated clusters and reduce the overhead for a number of common operational tasks with standard cluster resource manager features. You will learn specifically about concerns like:
* The need for greater operational knowhow to do common tasks with Kafka in static clusters, such as applying broker configuration updates, upgrading to a new version, and adding or decommissioning brokers.
* The best way to provide resources to stateful technologies while in a mixed-use cluster, noting the importance of disk space as one of Kafka’s most important resource requirements.
* How to address the particular needs of stateful services in a model that natively favors stateless, transient services.
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsLightbend
Organizations like Starbucks, HPE, and PayPal (see our customers) have selected the Akka toolkit for their enterprise scale distributed applications; and when it comes to squeezing out the best possible performance, the secret is using two particular modules in tandem: Akka Cluster and Akka Streams.
In this webinar by Nolan Grace, Senior Solution Architect at Lightbend, we look at these two Akka modules and discuss the features that will push your application architecture to the next tier of performance.
For the full blog post, including the video, visit: https://www.lightbend.com/blog/akka-at-enterprise-scale-performance-tuning-distributed-applications
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetesconfluent
Speakers: Joe Beda, Co-founder and CTO, Heptio + Gwen Shapira, Principal Data Architect, Confluent
With the rapid adoption of microservices, there is a growing need for solutions to manage deployment, resources and data for fleets of microservices. Kubernetes is a resource management framework for containers that is rapidly growing in popularity. Apache Kafka is a streaming platform that makes data accessible to the edges of an organization. It's no wonder the question of running Kafka on Kubernetes keeps coming up!
In this online talk, Joe Beda, CTO of Heptio and co-creator of Kubernetes, and Gwen Shapira, principal data architect at Confluent and Kafka PMC member, will help you navigate through the hype, address frequently asked questions and deliver critical information to help you decide if running Kafka on Kubernetes is the right approach for your organization.
You will:
-Get an introduction to the basic concepts you need to know as you plan to deploy services on Kubernetes.
-Learn which parts of the Kafka ecosystem fit Kubernetes like a glove, and which require special attention.
-Pick up useful tips for getting started.
-See why Confluent Platform for Kubernetes is the simplest solution to deploying and orchestrating Kafka on Kubernetes, using container images and a Kubernetes operator.
Watch the recording: https://videos.confluent.io/watch/yoZcuazDjDDTcj1sRnaD3J?.
(Mike Graham + Dan Carroll, Comcast) Kafka Summit SF 2018
Comcast manages over 2 million miles of fiber and coax, and over 40 million in home devices. This “outside plant” is subject to adverse conditions from severe weather to power grid outages to construction-related disruptions. Maintaining the health of this large and important infrastructure requires a distributed, scalable, reliable and fast information system capable of real-time processing and rapid analysis and response. Using Apache Kafka and the Kafka Streams Processor API, Comcast built an innovative new system for monitoring, problem analysis, metrics reporting and action response for the outside plant.
In this talk, you’ll learn how topic partitions, state stores, key mapping, source and sink topics and processors from the Kafka Streams Processor API work together to build a powerful dynamic system. We will dive into the details about the inner workings of the state store—how it is backed by a Kafka “changelog” topic, how it is scaled horizontally by partition and how the instances are rebuilt on startup or on processor failure. We will discuss how these state stores essentially become like materialized views in a SQL database but are updated incrementally as data flows through the system, and how this allows the developers to maintain the data in the optimal structures for performing the processing. The best part is that the data is readily available when needed by the processors. You will see how a REST API using Kafka Streams “interactive queries” can be used to retrieve the data in the state stores. We will explore the deployment and monitoring mechanisms used to deliver this system as a set of independently deployed components.
Leveraging services in stream processor apps at Ticketmaster (Derek Cline, Ti...confluent
Is your organization adopting Kafka as their messaging bus but you've found that it will take too long to migrate your existing service-oriented architecture to a log-oriented architecture? Some of the biggest challenges in building a new stream processor can be implementing all the business logic again. It has become increasingly common for companies with high-throughput source streams and change-data-capture logs to want to build systems fast. At Ticketmaster, we have found a solution to the problem by leveraging the business logic in our existing services and calling them from our Java based KafkaStreams processor applications in an efficient manner. In this talk, we will examine the initial challenges we faced in our transition, then we will explore the solutions we built to address the use cases at Ticketmaster. The primary focus will address our workflow around calling services to bring stream processor applications to market fast. We will review our challenges and share tips for success.
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020HostedbyConfluent
One of the key metrics to monitor when working with Apache Kafka, as a data pipeline or a streaming platform, is Consumer Groups Lag.
Lag is the delta between the last produced message and the last committed message of a partition. In other words, lag indicates how far behind your application is in processing up-to-date information.
For a long time, we used our own service to keep track of these metrics, collect them and visualize them. But this didn’t scale well.
You had to perform many manual operations, redeploy it and to do other tedious manual tasks, but most importantly, the biggest gap for us, was that its output was represented in absolute numbers (e.g - your lag is 30K), which basically tells you nothing as a human being.
We understood that we had to find a more suitable solution that will give us better visibility and will allow us to measure the lag in a time-based format that we all understand.
In this talk, I’m going to go over the core concepts of Kafka offsets and lags, and explain why lag even matters and is an important KPI to measure. I’ll also talk about the kind of research we did to find the right tool, what the options in the market were at the time, and eventually why we chose Linkedin’s Burrow as the right tool for us. And finally, I’ll take a closer look at Burrow, its building blocks, how we build and deploy it, how we monitor better with it, and eventually the most important improvement - how we transformed its output from numbers to time-based metrics.
Watch this webcast here: https://www.confluent.io/online-talks/whats-new-in-confluent-platform-55/
Join the Confluent Product Marketing team as we provide an overview of Confluent Platform 5.5, which makes Apache Kafka and event streaming more broadly accessible to developers with enhancements to data compatibility, multi-language development, and ksqlDB.
Building an event-driven architecture with Apache Kafka allows you to transition from traditional silos and monolithic applications to modern microservices and event streaming applications. With these benefits has come an increased demand for Kafka developers from a wide range of industries. The Dice Tech Salary Report recently ranked Kafka as the highest-paid technological skill of 2019, a year removed from ranking it second.
With Confluent Platform 5.5, we are making it even simpler for developers to connect to Kafka and start building event streaming applications, regardless of their preferred programming languages or the underlying data formats used in their applications.
This session will cover the key features of this latest release, including:
-Support for Protobuf and JSON schemas in Confluent Schema Registry and throughout our entire platform
-Exactly once semantics for non-Java clients
-Admin functions in REST Proxy (preview)
-ksqlDB 0.7 and ksqlDB Flow View in Confluent Control Center
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)confluent
Presenter: Tim Berglund, Senior Director of Developer Experience, Confluent
It has become a truism in the past decade that building systems at scale, using non-relational databases, requires giving up on the transactional guarantees afforded by the relational databases of yore. ACID transactional semantics are fine, but we all know you can’t have them all in a distributed system. Or can we?
In this talk, I will argue that by designing our systems around a distributed log like Apache Kafka®, we can in fact achieve ACID semantics at scale. We can ensure that distributed write operations can be applied atomically, consistently, in isolation between services, and of course with durability. What seems to be a counterintuitive conclusion ends up being straightforwardly achievable using existing technologies, as an elusive set of properties becomes relatively easy to achieve with the right architectural paradigm underlying the application.
Via Varejo taking data from legacy to a new world at Brazil Black Friday (Mar...confluent
"Use of techniques to services decomposition into a set of stages allowing code modularity and reuse. Good practices for dealing with DeadLetter, Monitoring, CorrelationID, Log, Base classes to control all software development best practices, Buffer Control in Apache Kafka and aspects related to Apache Kafka scalability and fault tolerance. Processing and management of high messages streaming on Black Friday (~ 25.4 million / day)
After a retrospective of how our structure behaved during the last Black Friday, we learned a few lessons and decided to adopt a new approach to address some specific scenarios which have millions of messages, ensuring resilience, uptime of at least 99.9%, monitoring and alerts for each module. We decided to adopt the SEDA architecture standard to traffic these millions of messages as closely as possible and deliver the desired quality to the target systems with scalability and reliability. By separating the pipeline processing modules, we were able to scale each of these modules horizontally, increasing the number of PODs (Openshift) and partitions of Kafka topics in order to process a given pipeline step faster. In addition, we also need to apply tunnings to Apache Kafka, one of which concerns the guarantee of delivery of the message. The focus of this presentation is to show the solution designed and how we use Apache Kafka and the SEDA architecture standard to orchestrate this massive stream of data we face."
Integrating Apache Kafka and Elastic Using the Connect Frameworkconfluent
As a streaming platform, Apache Kafka provides low-latency, high-throughput, fault-tolerant publish and subscribe pipelines and excels at processing streams of real-time events. Kafka provides reliable, millisecond delivery for connecting downstream systems with real-time data.
In this talk, we will show how easy it is to leverage Kafka and the Elasticsearch connector to keep your indices populated with the latest data from the rest of your enterprise, as it changes.
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
Activision Data team has been running a data pipeline for a variety of Activision games for many years. Historically we used a mix of micro-batch microservices coupled with classic Big Data tools like Hadoop and Hive for ETL. As a result, it could take up to 4-6 hours for data to be available to the end customers.
In the last few years, the adoption of data in the organization skyrocketed. We needed to de-legacy our data pipeline and provide near-realtime access to data in order to improve reporting, gather insights faster, power web and mobile applications. I want to tell a story about heavily leveraging Kafka Streams and Kafka Connect to reduce the end latency to minutes, at the same time making the pipeline easier and cheaper to run. We were able to successfully validate the new data pipeline by launching two massive games just 4 weeks apart.
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsLightbend
In this talk by Sean Glover, Principal Engineer at Lightbend, we will review how the Strimzi Kafka Operator, a supported technology in Lightbend Platform, makes many operational tasks in Kafka easy, such as the initial deployment and updates of a Kafka and ZooKeeper cluster.
See the blog post containing the YouTube video here: https://www.lightbend.com/blog/running-kafka-on-kubernetes-with-strimzi-for-real-time-streaming-applications
Real-Time Stream Processing with KSQL and Apache Kafkaconfluent
Real Time Stream Processing with KSQL and Kafka
David Peterson, Confluent APAC
APIdays Melbourne 2018
Unordered, unbounded and massive datasets are increasingly common in day-to-day business. Using this to your advantage is incredibly difficult with current system designs. We are stuck in a model where we can only take advantage of this *after* it has happened. Many times, this is too late to be useful in the enterprise.
KSQL is a streaming SQL engine for Apache Kafka. KSQL lowers the entry bar to the world of stream processing, providing a simple and completely interactive SQL interface for processing data in Kafka. KSQL (like Kafka) is open-source, distributed, scalable, and reliable.
A real time Kafka platform moves your data up the stack, closer to the heart of your business, allowing you to build scalable, mission-critical services by quickly deploying SQL-like queries in a severless pattern.
This talk will highlight key use cases for real time data, and stream processing with KSQL: Real time analytics, security and anomaly detection, real time ETL / data integration, Internet of Things, application development, and deploying Machine Learning models with KSQ.
Real time data and stream processing means that Kafka is just as important to the disrupted as it is to the disruptors.
Kafka Excellence at Scale – Cloud, Kubernetes, Infrastructure as Code (Vik Wa...HostedbyConfluent
Cloud is changing the world; Kubernetes is changing the world; real-time event streaming is changing the world. In this talk we explore some of best practices to synergistically combine the power of these paradigm shifts to achieve a much greater return on your Kafka investments. From declarative deployments, zero-downtime upgrades, elastic scaling to self-healing and automated governance, learn how you can bring the next level of speed, agility, resilience, and security to your Kafka implementations.
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...HostedbyConfluent
In this talk, we'll discuss how VillageMD is able to use Kafka topic compaction for rapidly scaling our reprocessing pipelines to encompass hundreds of feeds. Within healthcare data ecosystems, privacy and data minimalism are key design priorities. Being able to handle data deletion in a reliable, timely manner within event-driven architectures is becoming more and more necessary with key governance frameworks like the GDPR and HIPAA.
We'll be giving an overview of the building and governance of dead-letter queues for streaming data processing.
We'll discuss:
1. How to architect a data sink for failed records.
2. How topic compaction can reduce duplicate data and enable idempotency.
3. Building a tombstoning system for removing successfully reprocessed records from the queues.
4. Considerations for monitoring a reprocessing system in production -- what metrics, dataops, and SLAs are useful?
Streaming ETL to Elastic with Apache Kafka and KSQLconfluent
Companies are recognizing the importance of a low-latency, scalable, fault-tolerant data backbone, in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, enableing low latency analytics, event-driven architectures and the population of multiple downstream systems. These data pipelines can be built using configuration alone.
In this talk we’ll see how easy it is to stream data from sources such as databases and into Kafka using the Kafka Connect API. We’ll use KSQL to filter, aggregate and join it to other data, and then stream this enriched data from Kafka out into targets such as Elasticsearch. All of this can be accomplished without a single line of code!
In this session, Neil Avery covers the planning and operation of your KSQL deployment, including under-the-hood architectural details. You will learn about the various deployment models, how to track and monitor your KSQL applications, how to scale in and out and how to think about capacity planning. This is part 3 out of 3 in the Empowering Streams through KSQL series.
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...confluent
KSQL is the streaming SQL engine for Apache Kafka. It provides an easy and completely interactive SQL interface for stream processing on Kafka. Users can express their processing logic in SQL like statements and KSQL will compile and execute them as Kafka Streams applications. Although KSQL provides a rich set of features and built in functions, many use cases require more domain specific processing logic that cannot be expressed in pure SQL. To enable users to use KSQL in such scenarios, KSQL provides a framework to define complex processing logic as User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs). In this talk, we provide a deep dive into the UDF/UDAF framework in KSQL. We explain how users can define their custom UDFs/UDAFs and use them in their queries. We also describe how KSQL utilizes the provided UDFs/UDAFs under the hood to process streams and tables. This deep dive will include an insight into how UDFs process data and how UDAFs keep track of their state. Armed with such knowledge, KSQL users will be able to define and utilize complex data processing logic in their KSQL queries. They will also be able to diagnose and fix issues in defining and deploying their UDFs/UDAFs more efficiently.
Secure Kafka at scale in true multi-tenant environment ( Vishnu Balusu & Asho...confluent
Application teams in JPMC have started shifting towards building event driven architectures and real time steaming pipelines and Kafka has been at core in this journey. As application teams have started adopting Kafka rapidly, need for a centrally managed Kafka as a service has emerged. We have started delivering Kafka as a service in early 2018 and running in production for more than an year now operating 80+ clusters (and growing) in all environments together. One of the key requirements is to provide truly segregated, secured multi-tenant environment with RBAC model while satisfying financial regulations and controls at the same time. Operating clusters at large scale requires scalable self-service capabilities and cluster management orchestration. In this talk we will present - Our experiences in delivering and operating secured, multi-tenant and resilient Kafka clusters at scale. - Internals of our service framework/control plane which enables self-service capabilities for application teams, cluster build/patch orchestration and capacity management capabilities for TSE/admin teams. - Our approach in enabling automated Cross Datacenter failover for application teams using service framework and confluent replicator.
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...confluent
Mark Teehan, Principal Solutions Engineer, Confluent
Use the Debezium CDC connector to capture database changes from a Postgres database - or MySQL or Oracle; streaming into Kafka topics and onwards to an external data store. Examine how to setup this pipeline using Docker Compose and Confluent Cloud; and how to use various payload formats, such as avro, protobuf and json-schema.
https://www.meetup.com/Singapore-Kafka-Meetup/events/276822852/
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per DayAnkur Bansal
Building data pipelines is pretty hard! Building a multi-datacenter active-active real time data pipeline for multiple classes of data with different durability, latency and availability guarantees is much harder.
Real time infrastructure powers critical pieces of Uber (think Surge) and in this talk we will discuss our architecture, technical challenges, learnings and how a blend of open source infrastructure (Apache Kafka and Samza) and in-house technologies have helped Uber scale.
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020HostedbyConfluent
One of the key metrics to monitor when working with Apache Kafka, as a data pipeline or a streaming platform, is Consumer Groups Lag.
Lag is the delta between the last produced message and the last committed message of a partition. In other words, lag indicates how far behind your application is in processing up-to-date information.
For a long time, we used our own service to keep track of these metrics, collect them and visualize them. But this didn’t scale well.
You had to perform many manual operations, redeploy it and to do other tedious manual tasks, but most importantly, the biggest gap for us, was that its output was represented in absolute numbers (e.g - your lag is 30K), which basically tells you nothing as a human being.
We understood that we had to find a more suitable solution that will give us better visibility and will allow us to measure the lag in a time-based format that we all understand.
In this talk, I’m going to go over the core concepts of Kafka offsets and lags, and explain why lag even matters and is an important KPI to measure. I’ll also talk about the kind of research we did to find the right tool, what the options in the market were at the time, and eventually why we chose Linkedin’s Burrow as the right tool for us. And finally, I’ll take a closer look at Burrow, its building blocks, how we build and deploy it, how we monitor better with it, and eventually the most important improvement - how we transformed its output from numbers to time-based metrics.
Watch this webcast here: https://www.confluent.io/online-talks/whats-new-in-confluent-platform-55/
Join the Confluent Product Marketing team as we provide an overview of Confluent Platform 5.5, which makes Apache Kafka and event streaming more broadly accessible to developers with enhancements to data compatibility, multi-language development, and ksqlDB.
Building an event-driven architecture with Apache Kafka allows you to transition from traditional silos and monolithic applications to modern microservices and event streaming applications. With these benefits has come an increased demand for Kafka developers from a wide range of industries. The Dice Tech Salary Report recently ranked Kafka as the highest-paid technological skill of 2019, a year removed from ranking it second.
With Confluent Platform 5.5, we are making it even simpler for developers to connect to Kafka and start building event streaming applications, regardless of their preferred programming languages or the underlying data formats used in their applications.
This session will cover the key features of this latest release, including:
-Support for Protobuf and JSON schemas in Confluent Schema Registry and throughout our entire platform
-Exactly once semantics for non-Java clients
-Admin functions in REST Proxy (preview)
-ksqlDB 0.7 and ksqlDB Flow View in Confluent Control Center
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®)confluent
Presenter: Tim Berglund, Senior Director of Developer Experience, Confluent
It has become a truism in the past decade that building systems at scale, using non-relational databases, requires giving up on the transactional guarantees afforded by the relational databases of yore. ACID transactional semantics are fine, but we all know you can’t have them all in a distributed system. Or can we?
In this talk, I will argue that by designing our systems around a distributed log like Apache Kafka®, we can in fact achieve ACID semantics at scale. We can ensure that distributed write operations can be applied atomically, consistently, in isolation between services, and of course with durability. What seems to be a counterintuitive conclusion ends up being straightforwardly achievable using existing technologies, as an elusive set of properties becomes relatively easy to achieve with the right architectural paradigm underlying the application.
Via Varejo taking data from legacy to a new world at Brazil Black Friday (Mar...confluent
"Use of techniques to services decomposition into a set of stages allowing code modularity and reuse. Good practices for dealing with DeadLetter, Monitoring, CorrelationID, Log, Base classes to control all software development best practices, Buffer Control in Apache Kafka and aspects related to Apache Kafka scalability and fault tolerance. Processing and management of high messages streaming on Black Friday (~ 25.4 million / day)
After a retrospective of how our structure behaved during the last Black Friday, we learned a few lessons and decided to adopt a new approach to address some specific scenarios which have millions of messages, ensuring resilience, uptime of at least 99.9%, monitoring and alerts for each module. We decided to adopt the SEDA architecture standard to traffic these millions of messages as closely as possible and deliver the desired quality to the target systems with scalability and reliability. By separating the pipeline processing modules, we were able to scale each of these modules horizontally, increasing the number of PODs (Openshift) and partitions of Kafka topics in order to process a given pipeline step faster. In addition, we also need to apply tunnings to Apache Kafka, one of which concerns the guarantee of delivery of the message. The focus of this presentation is to show the solution designed and how we use Apache Kafka and the SEDA architecture standard to orchestrate this massive stream of data we face."
Integrating Apache Kafka and Elastic Using the Connect Frameworkconfluent
As a streaming platform, Apache Kafka provides low-latency, high-throughput, fault-tolerant publish and subscribe pipelines and excels at processing streams of real-time events. Kafka provides reliable, millisecond delivery for connecting downstream systems with real-time data.
In this talk, we will show how easy it is to leverage Kafka and the Elasticsearch connector to keep your indices populated with the latest data from the rest of your enterprise, as it changes.
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
Activision Data team has been running a data pipeline for a variety of Activision games for many years. Historically we used a mix of micro-batch microservices coupled with classic Big Data tools like Hadoop and Hive for ETL. As a result, it could take up to 4-6 hours for data to be available to the end customers.
In the last few years, the adoption of data in the organization skyrocketed. We needed to de-legacy our data pipeline and provide near-realtime access to data in order to improve reporting, gather insights faster, power web and mobile applications. I want to tell a story about heavily leveraging Kafka Streams and Kafka Connect to reduce the end latency to minutes, at the same time making the pipeline easier and cheaper to run. We were able to successfully validate the new data pipeline by launching two massive games just 4 weeks apart.
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsLightbend
In this talk by Sean Glover, Principal Engineer at Lightbend, we will review how the Strimzi Kafka Operator, a supported technology in Lightbend Platform, makes many operational tasks in Kafka easy, such as the initial deployment and updates of a Kafka and ZooKeeper cluster.
See the blog post containing the YouTube video here: https://www.lightbend.com/blog/running-kafka-on-kubernetes-with-strimzi-for-real-time-streaming-applications
Real-Time Stream Processing with KSQL and Apache Kafkaconfluent
Real Time Stream Processing with KSQL and Kafka
David Peterson, Confluent APAC
APIdays Melbourne 2018
Unordered, unbounded and massive datasets are increasingly common in day-to-day business. Using this to your advantage is incredibly difficult with current system designs. We are stuck in a model where we can only take advantage of this *after* it has happened. Many times, this is too late to be useful in the enterprise.
KSQL is a streaming SQL engine for Apache Kafka. KSQL lowers the entry bar to the world of stream processing, providing a simple and completely interactive SQL interface for processing data in Kafka. KSQL (like Kafka) is open-source, distributed, scalable, and reliable.
A real time Kafka platform moves your data up the stack, closer to the heart of your business, allowing you to build scalable, mission-critical services by quickly deploying SQL-like queries in a severless pattern.
This talk will highlight key use cases for real time data, and stream processing with KSQL: Real time analytics, security and anomaly detection, real time ETL / data integration, Internet of Things, application development, and deploying Machine Learning models with KSQ.
Real time data and stream processing means that Kafka is just as important to the disrupted as it is to the disruptors.
Kafka Excellence at Scale – Cloud, Kubernetes, Infrastructure as Code (Vik Wa...HostedbyConfluent
Cloud is changing the world; Kubernetes is changing the world; real-time event streaming is changing the world. In this talk we explore some of best practices to synergistically combine the power of these paradigm shifts to achieve a much greater return on your Kafka investments. From declarative deployments, zero-downtime upgrades, elastic scaling to self-healing and automated governance, learn how you can bring the next level of speed, agility, resilience, and security to your Kafka implementations.
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...HostedbyConfluent
In this talk, we'll discuss how VillageMD is able to use Kafka topic compaction for rapidly scaling our reprocessing pipelines to encompass hundreds of feeds. Within healthcare data ecosystems, privacy and data minimalism are key design priorities. Being able to handle data deletion in a reliable, timely manner within event-driven architectures is becoming more and more necessary with key governance frameworks like the GDPR and HIPAA.
We'll be giving an overview of the building and governance of dead-letter queues for streaming data processing.
We'll discuss:
1. How to architect a data sink for failed records.
2. How topic compaction can reduce duplicate data and enable idempotency.
3. Building a tombstoning system for removing successfully reprocessed records from the queues.
4. Considerations for monitoring a reprocessing system in production -- what metrics, dataops, and SLAs are useful?
Streaming ETL to Elastic with Apache Kafka and KSQLconfluent
Companies are recognizing the importance of a low-latency, scalable, fault-tolerant data backbone, in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, enableing low latency analytics, event-driven architectures and the population of multiple downstream systems. These data pipelines can be built using configuration alone.
In this talk we’ll see how easy it is to stream data from sources such as databases and into Kafka using the Kafka Connect API. We’ll use KSQL to filter, aggregate and join it to other data, and then stream this enriched data from Kafka out into targets such as Elasticsearch. All of this can be accomplished without a single line of code!
In this session, Neil Avery covers the planning and operation of your KSQL deployment, including under-the-hood architectural details. You will learn about the various deployment models, how to track and monitor your KSQL applications, how to scale in and out and how to think about capacity planning. This is part 3 out of 3 in the Empowering Streams through KSQL series.
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...confluent
KSQL is the streaming SQL engine for Apache Kafka. It provides an easy and completely interactive SQL interface for stream processing on Kafka. Users can express their processing logic in SQL like statements and KSQL will compile and execute them as Kafka Streams applications. Although KSQL provides a rich set of features and built in functions, many use cases require more domain specific processing logic that cannot be expressed in pure SQL. To enable users to use KSQL in such scenarios, KSQL provides a framework to define complex processing logic as User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs). In this talk, we provide a deep dive into the UDF/UDAF framework in KSQL. We explain how users can define their custom UDFs/UDAFs and use them in their queries. We also describe how KSQL utilizes the provided UDFs/UDAFs under the hood to process streams and tables. This deep dive will include an insight into how UDFs process data and how UDAFs keep track of their state. Armed with such knowledge, KSQL users will be able to define and utilize complex data processing logic in their KSQL queries. They will also be able to diagnose and fix issues in defining and deploying their UDFs/UDAFs more efficiently.
Secure Kafka at scale in true multi-tenant environment ( Vishnu Balusu & Asho...confluent
Application teams in JPMC have started shifting towards building event driven architectures and real time steaming pipelines and Kafka has been at core in this journey. As application teams have started adopting Kafka rapidly, need for a centrally managed Kafka as a service has emerged. We have started delivering Kafka as a service in early 2018 and running in production for more than an year now operating 80+ clusters (and growing) in all environments together. One of the key requirements is to provide truly segregated, secured multi-tenant environment with RBAC model while satisfying financial regulations and controls at the same time. Operating clusters at large scale requires scalable self-service capabilities and cluster management orchestration. In this talk we will present - Our experiences in delivering and operating secured, multi-tenant and resilient Kafka clusters at scale. - Internals of our service framework/control plane which enables self-service capabilities for application teams, cluster build/patch orchestration and capacity management capabilities for TSE/admin teams. - Our approach in enabling automated Cross Datacenter failover for application teams using service framework and confluent replicator.
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...confluent
Mark Teehan, Principal Solutions Engineer, Confluent
Use the Debezium CDC connector to capture database changes from a Postgres database - or MySQL or Oracle; streaming into Kafka topics and onwards to an external data store. Examine how to setup this pipeline using Docker Compose and Confluent Cloud; and how to use various payload formats, such as avro, protobuf and json-schema.
https://www.meetup.com/Singapore-Kafka-Meetup/events/276822852/
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per DayAnkur Bansal
Building data pipelines is pretty hard! Building a multi-datacenter active-active real time data pipeline for multiple classes of data with different durability, latency and availability guarantees is much harder.
Real time infrastructure powers critical pieces of Uber (think Surge) and in this talk we will discuss our architecture, technical challenges, learnings and how a blend of open source infrastructure (Apache Kafka and Samza) and in-house technologies have helped Uber scale.
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLScyllaDB
Event streaming applications unlock new benefits by combining various data feeds. However, getting actionable insights in a timely fashion has remained a challenge, as the data has been siloed in disparate systems. ksqlDB solves this by providing an interactive SQL interface that can seamlessly combine and transform data from various sources.
In this webinar, we will show how streaming queries of high throughput NoSQL systems can derive insights from various push/pull queries via ksqlDB's User-Defined Functions, Aggregate Functions and Table Functions.
How do you apply modern Cloud-native patterns to your apps? In this talk, you'll find how to use frameworks like Spring Boot & Spring Cloud to build agile & resilient apps, leveraging Cloud platforms. Get the app source code here: https://github.com/alexandreroman/yatc.
Cloud-Größen wie Google, Twitter und Netflix haben die Kernbausteine ihrer Infrastruktur quelloffen verfügbar gemacht. Das Resultat aus vielen Jahren Cloud-Erfahrung ist nun frei zugänglich, und jeder kann seine eigenen Cloud-nativen Anwendungen entwickeln – Anwendungen, die in der Cloud zuverlässig laufen und fast beliebig skalieren. Die einzelnen Bausteine wachsen zu einem großen Ganzen zusammen, dem Cloud-Native-Stack. In dieser Session stellen wir die wichtigsten Konzepte und aktuellen Schlüsseltechnologien kurz vor. Anschließend implementieren wir einen einfachen Microservice mit .NET Core und Steeltoe OSS und bringen ihn zusammen mit ausgewählten Bausteinen für Service-Discovery und Konfiguration schrittweise auf einem Kubernetes-Cluster zum Laufen. @BASTAcon #BASTA17 @qaware #CloudNativeNerd
https://basta.net/microservices-services/cloud-native-net-microservices-mit-kubernetes/
Cloud-native .NET Microservices mit KubernetesQAware GmbH
BASTA! 2017, Mainz: Talk von Mario-Leander Reimer (@LeanderReimer, Cheftechnologe bei QAware).
Cloud-Größen wie Google, Twitter und Netflix haben die Kernbausteine ihrer Infrastruktur quelloffen verfügbar gemacht. Das Resultat aus vielen Jahren Cloud-Erfahrung ist nun frei zugänglich, und jeder kann seine eigenen Cloud-nativen Anwendungen entwickeln – Anwendungen, die in der Cloud zuverlässig laufen und fast beliebig skalieren. Die einzelnen Bausteine wachsen zu einem großen Ganzen zusammen, dem Cloud-Native-Stack. In dieser Session stellen wir die wichtigsten Konzepte und aktuellen Schlüsseltechnologien kurz vor. Anschließend implementieren wir einen einfachen Microservice mit .NET Core und Steeltoe OSS und bringen ihn zusammen mit ausgewählten Bausteinen für Service-Discovery und Konfiguration schrittweise auf einem Kubernetes-Cluster zum Laufen.
Event streaming applications unlock new benefits by combining various data feeds. However, getting actionable insights in a timely fashion has remained a challenge, as the data has been siloed in disparate systems. ksqlDB solves this by providing an interactive SQL interface that can seamlessly combine and transform data from various sources.
In this webinar, we will show how streaming queries of high throughput NoSQL systems can derive insights from various push/pull queries via ksqlDB's User-Defined Functions, Aggregate Functions and Table Functions.
Watch this to learn:
Real-world examples of the benefits of using a streaming database like ksqlDB and seamlessly combining data from Kafka & Cassandra/Scylla (NoSQL).
The functionality of ksqlDB via push/pull queries and UDFs/UDAFs/UDTFs.
The ease with which data stored in a NoSQL database can be transformed using ksqlDB and then persisted back for long-term storage.
Bridge Your Kafka Streams to Azure Webinarconfluent
With a fully managed Apache Kafka(R) as-a-service on Microsoft Azure, businesses can focus on building applications and not managing clusters. Build a persistent bridge from on-premises data systems to the cloud with a hybrid Kafka service or stream across public clouds for multi-cloud data pipelines.
In this session for business and technical data leaders, you can learn about powering business applications with the managed Kafka service that streams data into Azure SQL Data Warehouse, Cosmos DB, Azure Data Lake Storage and Azure Blob Storage.
New Approaches for Fraud Detection on Apache Kafka and KSQLconfluent
Speakers: Dale Kim, Sr. Director, Products/Solutions, Arcadia Data + Chong Yan, Solutions Architect, Confluent
When it comes to corporate fraud, early detection is integral to mitigating and preventing drastic damage.
Modern streaming data technologies like Apache Kafka® and Confluent KSQL, the streaming SQL engine for Apache Kafka, can help companies catch and detect fraud in real time instead of after the fact. Kafka is ideal for managing fast, incoming data points, and KSQL provides the de facto standard for reading that data. Combine this with Arcadia Data visualizations designed for modern data types, and you have a powerful foundation for combating fraud.
You will learn:
-Why traditional batch-driven approaches to fraud detection are insufficient today
-Why Apache Kafka is widely used for real-time fraud detection
-How KSQL and real-time visualizations open more opportunities for searching for fraud
First Steps with Apache Kafka on Google Cloud Platformconfluent
Speakers: Jay Smith, Cloud Customer Engineer, Google Cloud + Gwen Shapira, Product Manager, Confluent
Curious about Apache Kafka®? Find out why you would want to use the de facto standard for real-time streaming, the easiest way to get started and how to leverage the extensive Apache Kafka ecosystem. In this chat, we'll talk about three common use cases, review stream processing patterns and discuss integration with important GCP services such as BigQuery. We'll also demo how to implement real-time clickstream analytics on Confluent Cloud, fully managed Apache Kafka as a service.
stackconf 2020 | The path to a Serverless-native era with Kubernetes by Paolo...NETWAYS
Serverless is one of the hottest design patterns in the cloud today, i’ll cover how the Serverless paradigms are changing the way we develop applications and the cloud infrastructures and how to implement Serveless-kind workloads with Kubernetes.
We’ll go through the latest Kubernetes-based serverless technologies, covering the most important aspects including pricing, scalability, observability and best practices
Data Engineer's Lunch #56: Spring Cloud Data Flow with CassandraAnant Corporation
In Data Engineer's Lunch #55 we will be going over how to integrate Spring Cloud Data Flow with Cassandra.
Accompanying Blog: Coming Soon!
Accompanying YouTube: Coming Soon!
Sign Up For Our Newsletter: http://eepurl.com/grdMkn
Join Data Engineer’s Lunch Weekly at 12 PM EST Every Monday:
https://www.meetup.com/Data-Wranglers-DC/events/
Cassandra.Link:
https://cassandra.link/
Follow Us and Reach Us At:
Anant:
https://www.anant.us/
Awesome Cassandra:
https://github.com/Anant/awesome-cassandra
Email:
solutions@anant.us
LinkedIn:
https://www.linkedin.com/company/anant/
Twitter:
https://twitter.com/anantcorp
Eventbrite:
https://www.eventbrite.com/o/anant-1072927283
Facebook:
https://www.facebook.com/AnantCorp/
Join The Anant Team:
https://www.careers.anant.us
KSQL is a stream processing SQL engine, which allows stream processing on top of Apache Kafka. KSQL is based on Kafka Stream and provides capabilities for consuming messages from Kafka, analysing these messages in near-realtime with a SQL like language and produce results again to a Kafka topic. By that, no single line of Java code has to be written and you can reuse your SQL knowhow. This lowers the bar for starting with stream processing significantly.
KSQL offers powerful capabilities of stream processing, such as joins, aggregations, time windows and support for event time. In this talk I will present how KSQL integrates with the Kafka ecosystem and demonstrate how easy it is to implement a solution using KSQL for most part. This will be done in a live demo on a fictitious IoT sample.
SpringBoot and Spring Cloud Service for MSAOracle Korea
Cloud 환경에서 MSA를 하기 위해서 Service Discovery, Circuit Breaker 등을 사용하여 Application을 개발하는 방법과 SpringBoot 와 Spring Cloud Service 를 사용하는데, Cloud에서 Kubernetes를 위시한 Container 생태계가 어떻게 MSA에 영향을 미치는지 알아봅니다.
Similar to Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes (20)
IoT 'Megaservices' - High Throughput Microservices with AkkaLightbend
**********
Watch this presentation on-demand!
https://info.lightbend.com/iot-megaservices-high-throughput-microservices-with-akka-register.html
**********
In this interactive presentation by Hugh McKee, Developer Advocate at Lightbend, we’ll share our experiences helping our clients create a system architecture that can support high throughput microservices (aka "Megaservices"). We’ll do that using IoT demo applications designed to push cloud service providers like Amazon and Google to their limits. Using sample code that you can later run on your own machine, we’ll look at:
* Modeling real-life digital twins for hundreds of thousands of IoT devices in the field, looking into how these megaservices are implemented in Akka.
* Visualizing Akka Actors–which represent IoT digital twins–in a “crop circle” formation that represents a complete distributed Reactive application, and watching at messages are processed across Akka Cluster nodes using cluster sharding.
* Some code behind the whole set up, which is built using OSS like Akka, Java, JavaScript, and Kubernetes.
Follow us on social:
TW: https://twitter.com/lightbend
LI: https://www.linkedin.com/company/lightbend-inc-/
FB: https://www.facebook.com/lightbendOfficial/
For more about Lightbend:
Blog: https://www.lightbend.com/blog
Newsletter: https://www.lightbend.com/newsletter
How Akka Cluster Works: Actors Living in a ClusterLightbend
Hugh McKee, Developer Advocate at Lightbend, demonstrates how Akka Actors work inside of a cluster, including the code and in-browser visualizations you need to grok it.
See the full content with videos here: https://www.lightbend.com/blog/how-akka-cluster-works-actors-living-in-a-cluster
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsLightbend
In this presentation by Jonas Bonér, creator of Akka and founder/CTO of Lightbend, we review a set of eight Reactive Principles that enable the design and implementation of Cloud Native applications–applications that are highly concurrent, distributed, performant, scalable, and resilient, while at the same time conserving resources when deploying, operating, and maintaining them.
Putting the 'I' in IoT - Building Digital Twins with Akka MicroservicesLightbend
In this webinar with Hugh McKee, Developer Advocate for Akka Platform, we’ll look at “What on Earth”, a demo exploring how Akka Microservices serves as an ideal solution for high-scale digital twinning for IoT.
For the full presentation, including video, visit: https://www.lightbend.com/blog/iot-building-digital-twins-with-akka-microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesLightbend
See the full presentation here: https://www.lightbend.com/blog/digital-transformation-kubernetes-containers-microservices
In this talk by David Ogren, Principal Enterprise Architect at Lightbend, we draw from experiences helping our clients successfully create, migrate to, and manage cloud-native system architectures.
In this webinar by Jonas Bonér, creator of Akka and CTO/Co-Founder of Lightbend, we take a look at Cloudstate, an OSS tool built on Akka, gRPC, Knative, GraalVM, and Kubernetes. Cloudstate lets you model, manage, and scale stateful services while preserving responsiveness by designing for resilience and elasticity.
Digital Transformation from Monoliths to Microservices to Serverless and BeyondLightbend
Join this highly-visual presentation by Hugh McKee, Developer Advocate at Lightbend, to learn more about the ramifications and opportunities along the evolution from monolithic systems, to microservices architectures, to serverless (FaaS).
See the video presentation on the Lightbend blog at: https://www.lightbend.com/blog/digital-transformation-from-monoliths-to-microservices-to-serverless-and-beyond
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6Lightbend
In this special guest webinar with Akka expert and Reactive System Consultant, Manuel Bernhardt, we review Akka 2.6 release highlights and a selection of 6 former anti-patterns that have now been rendered impossible by design.
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...Lightbend
In this guest webinar with Chris McDermott, Lead Data Engineer at HPE, learn how HPE InfoSight–powered by Lightbend Platform–has emerged as the go-to solution for providing real-time metrics and predictive analytics across various network, server, storage, and data center technologies.
Microservices, Kubernetes, and Application Modernization Done RightLightbend
In this talk by David Ogren, Enterprise Architect at Lightbend, we draw from experiences helping our clients successfully create, migrate to, and manage cloud-native system architectures. We look at some of the common pitfalls and anti-patterns of modernization efforts, and some of the best practices for taking an incremental approach to transforming legacy systems.
See the full post with video on the Lightbend blog: https://www.lightbend.com/blog/microservices-kubernetes-application-modernization
Akka and Kubernetes: A Symbiotic Love StoryLightbend
In this webinar by Hugh McKee, Developer Advocate at Lightbend, we take a look at how Akka and Kubernetes enjoy a symbiotic relationship, using live “crop circle” visuals to help. See the full video, slides, and additional resources here:
https://www.lightbend.com/blog/akka-and-kubernetes-a-symbiotic-love-story
Scala 3 Is Coming: Martin Odersky Shares What To KnowLightbend
Join Dr. Martin Odersky, the creator of Scala and co-founder of Lightbend, on a tour of what is in store and highlight some of his favorite features of Scala 3!
Migrating From Java EE To Cloud-Native Reactive SystemsLightbend
A lot of businesses that never before considered themselves as “technology companies” are now faced with digital modernization imperatives that force them to rethink their application and infrastructure architecture. On the path to becoming a digital, on-demand provider, development speed is the ultimate competitive advantage.
This presents challenges to many organizations that have huge investments in legacy Java EE infrastructure, where technical debt and monolithic system architectures require modernization in order to confront various business risks. Usually, changes need to be made within existing frameworks to keep pace with new web-scale organizations.
If your legacy monolith is no longer serving the expanding needs of your business, then join Markus Eisele, Director of Developer Advocacy at Lightbend, to learn what you can do to migrate from Java EE to cloud-native, Reactive systems—as defined by the Reactive Manifesto.
Designing Events-First Microservices For A Cloud Native WorldLightbend
In this talk by Jonas Bonér, Lightbend CTO/Co-Founder and creator of Akka, we will explore the nature of events, what it means to be event-driven, and how we can unleash the power of events and commands by applying an events first, domain-driven design to microservices-based architectures.
For more information, head over to lightbend.com/blog!
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For ScalaLightbend
Join Jeremy Daggett, Solutions Architect at Lightbend, to see how Fortify SCA for Scala works differently from existing Static Code Analysis tools to help you uncover security issues early in the SDLC of your mission-critical applications.
How To Build, Integrate, and Deploy Real-Time Streaming Pipelines On KubernetesLightbend
In this webinar with Craig Blitz and Kiki Carter of Lightbend, we review how Lightbend’s Pipelines module enables you to develop components ("streamlets") using the appropriate technology, wire them together as pipelines, and deploy them with Kubernetes without all the manual, time-consuming labor.
A Glimpse At The Future Of Apache Spark 3.0 With Deep Learning And KubernetesLightbend
In this special guest webinar with Holden Karau, speaker, author and Developer Advocate at Google, we’ll take a walk through some of the interesting JIRAs, look at external components being developed (like deep learning support), and also talk about the future of running real-time Spark workloads on Kubernetes.
Akka and Kubernetes: Reactive From Code To CloudLightbend
In this webinar with special guest Fabio Tiriticco, we will explore how Akka is the perfect companion to Kubernetes, providing the application level requirements needed to successfully deploy and manage your cloud-native services with technologies built specifically for cloud-native applications, like Kubernetes.
Hands On With Spark: Creating A Fast Data Pipeline With Structured Streaming ...Lightbend
In this talk by Gerard Maas, O’Reilly author and Senior Software Engineer at Lightbend, we focus on choosing the right Fast Data stream processing features of Apache Spark, taking a practical, code-driven look at the two APIs available for this: the mature Spark Streaming and its younger sibling, Structured Streaming.
How Akka Works: Visualize And Demo Akka With A Raspberry-Pi ClusterLightbend
In this webinar by Lightbend’s Eric Loots, Scala & Tooling Practice Lead, and Kikia Carter, Principal Enterprise Architect, we use a simple yet powerful visualization of a 5-node, Raspberry Pi-based cluster to reveal the inner workings of Akka Cluster. In a matter of minutes, you will gain a strong understanding of clustering, even if you don’t know anything about Akka.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntekBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Anthony Dahanne
Les Buildpacks existent depuis plus de 10 ans ! D’abord, ils étaient utilisés pour détecter et construire une application avant de la déployer sur certains PaaS. Ensuite, nous avons pu créer des images Docker (OCI) avec leur dernière génération, les Cloud Native Buildpacks (CNCF en incubation). Sont-ils une bonne alternative au Dockerfile ? Que sont les buildpacks Paketo ? Quelles communautés les soutiennent et comment ?
Venez le découvrir lors de cette session ignite
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
16. Easily integrate streamlets written in Akka Streams, Spark Structured
Streaming, and Flink
Merge
different
input streams
Validate
record
formats, field
values
Use ML for more
sophisticated analysis
Compute aggregations
(e.g., statistics)
Send results
downstream
Cloudflow API :: Blueprints
17. The Data Science Process
we’re here
Img src: https://randalscottking.com/machine-learning-overview/
27. External Batch training
Embedded Model
E.g. SparkML
External Batch training
External Model Service
E.g. TFServing
External Batch training
Managed Streams
E.g. TFjvm
28. The Mission of ML in Cloudflow
Infuse “AI/ML” or smarter real-time analytics to existing apps
▪ Loan approval, device maintenance, next best offer, recommendation engine
Mix domain logic with streaming analytics and model serving
▪ Offer a programming model and runtime that facilitates the creation of new
data-driven services
Enable the productization of ML in the Enterprise
Create a flowing “stream” between Data Science and Data Engineering
29. Get started with Cloudflow at cloudflow.io
Join our contributor community at:
http://github.com/lightbend/cloudflow