Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes


Deploying a robust streaming data pipeline can be a daunting task when your company’s financial information is at risk. For starters, how do you ensure proper provisioning of resources? How do you preserve end-to-end application and data consistency? How do you make all of this work in the cloud with Kubernetes and avoid YAML hell? Answer: Cloudflow, a new open-source toolkit for simplifying the development, deployment, and operation of streaming data pipelines.

Gerard Maas, Principal Engineer at Lightbend.

Agenda
● Intro: Productizing Data Science
● What’s Cloudflow
● Building a Fraud Detection Model
● Running the Model With Cloudflow

Cloudflow is a development toolkit that enables you to quickly
develop, orchestrate, and operate distributed streaming
applications on Kubernetes.

$> Streamlet API
$> Blueprint
$> Sandbox
$> build tool extensions
$> kubectl cloudflow

$> Streamlet API
$> Blueprints
$> build extensions
$> kubectl cloudflow
Operator

Streamlets
[Diagram: a streamlet wraps its logic between inlet(s) and outlet(s); each inlet and outlet is labelled { Schema }.]
[Diagram: several streamlets connected to one another; one connection is marked ✔ and another ❌.]
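The streamlet diagrams can be made concrete with a small sketch. This is not the Cloudflow Streamlet API: `Port`, `Streamlet`, `can_connect`, and the schema names are hypothetical, and reading the ✔/❌ marks as "schemas match / schemas do not match" is an assumption about what the slide shows.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """A named inlet or outlet that carries records of one schema."""
    name: str
    schema: str  # hypothetical schema identifier, e.g. a record type name


@dataclass
class Streamlet:
    """Processing logic plus the typed ports it exposes (illustrative only)."""
    name: str
    inlets: list = field(default_factory=list)
    outlets: list = field(default_factory=list)


def can_connect(outlet: Port, inlet: Port) -> bool:
    """Assumed rule: a connection is valid only when both ends share a schema."""
    return outlet.schema == inlet.schema


ingress = Streamlet("ingress", outlets=[Port("out", "Transaction")])
enrich = Streamlet(
    "enrich",
    inlets=[Port("in", "Transaction")],
    outlets=[Port("out", "EnrichedTransaction")],
)
scoring = Streamlet("scoring", inlets=[Port("in", "EnrichedTransaction")])

ok = can_connect(ingress.outlets[0], enrich.inlets[0])    # matching schemas
bad = can_connect(ingress.outlets[0], scoring.inlets[0])  # mismatched schemas
print(ok, bad)
```

Declaring the schema on the port rather than inside the logic is what lets a mismatch be caught before any data flows.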

Cloudflow API :: Blueprints
Easily integrate streamlets written in Akka Streams, Spark Structured Streaming, and Flink
● Merge different input streams
● Validate record formats, field values
● Use ML for more sophisticated analysis
● Compute aggregations (e.g., statistics)
● Send results downstream
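A blueprint can be thought of as a declarative wiring of streamlets that is checked before anything is deployed. The sketch below illustrates that idea in plain Python; it does not use Cloudflow's actual blueprint format, and the streamlet names, port names, schema names, and the `verify` helper are all hypothetical.

```python
# Hypothetical blueprint: streamlet name -> schemas on its inlets and outlets.
STREAMLETS = {
    "ingress": {"inlets": {}, "outlets": {"out": "Transaction"}},
    "enrich": {"inlets": {"in": "Transaction"},
               "outlets": {"out": "EnrichedTransaction"}},
    "scoring": {"inlets": {"in": "EnrichedTransaction"},
                "outlets": {"out": "ScoredTransaction"}},
    "egress": {"inlets": {"in": "ScoredTransaction"}, "outlets": {}},
}

# Each connection wires "streamlet.outlet" to "streamlet.inlet".
CONNECTIONS = [
    ("ingress.out", "enrich.in"),
    ("enrich.out", "scoring.in"),
    ("scoring.out", "egress.in"),
]


def verify(streamlets, connections):
    """Return a list of problems; an empty list means the wiring is consistent."""
    problems = []
    connected_inlets = set()
    for source, target in connections:
        src_name, src_port = source.split(".")
        dst_name, dst_port = target.split(".")
        out_schema = streamlets.get(src_name, {}).get("outlets", {}).get(src_port)
        in_schema = streamlets.get(dst_name, {}).get("inlets", {}).get(dst_port)
        if out_schema is None:
            problems.append(f"unknown outlet {source}")
        elif in_schema is None:
            problems.append(f"unknown inlet {target}")
        elif out_schema != in_schema:
            problems.append(
                f"schema mismatch {source} ({out_schema}) -> {target} ({in_schema})"
            )
        else:
            connected_inlets.add(target)
    # Every declared inlet should be fed by some valid connection.
    for name, spec in streamlets.items():
        for port in spec["inlets"]:
            if f"{name}.{port}" not in connected_inlets:
                problems.append(f"unconnected inlet {name}.{port}")
    return problems


valid_report = verify(STREAMLETS, CONNECTIONS)
broken_report = verify(STREAMLETS, [("ingress.out", "scoring.in")])
print(valid_report)
print(broken_report)
```

Running the check on the wiring instead of on live data is the design point: a mis-wired pipeline fails at build time, not in production.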

The Data Science Process
we’re here
Img src: https://randalscottking.com/machine-learning-overview/

[Diagram: Transactions → Fraud Detection Model]
Data science steps: Data Understanding, Data Preparation, Data Modelling, Validation, Deployment
Pipeline steps: Data Ingestion, Data Cleaning & Enrichment, Model Scoring (model), Result Propagation

[Diagram: Transactions → HTTP ingress [schema] → Enrichment (+features) → Fraud ML Scoring → Egress (console)]
[Diagram: the same pipeline with added labels: py, DT, Model, Persistent Volume]
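The four stages of the pipeline can be sketched as chained Python generators. This only illustrates the data flow; in Cloudflow each stage would be a streamlet. The field names, the `home_country` parameter, the thresholds, and `toy_model` (a rule standing in for the trained model) are all assumptions made for the example.

```python
import json


def ingress(raw_lines):
    """Parse JSON payloads and drop records that lack required fields."""
    required = {"id", "amount", "country"}
    for line in raw_lines:
        record = json.loads(line)
        if required.issubset(record):
            yield record


def enrich(records, home_country="NL"):
    """Add derived features to each transaction (hypothetical features)."""
    for record in records:
        yield {
            **record,
            "is_foreign": record["country"] != home_country,
            "is_large": record["amount"] > 1000,
        }


def score(records, model):
    """Attach the model's verdict to each enriched transaction."""
    for record in records:
        yield {**record, "fraud": model(record)}


def toy_model(record):
    """Stand-in for the trained model: flags large foreign transactions."""
    return record["is_foreign"] and record["is_large"]


def egress(records):
    """Print each scored record to the console and return them all."""
    results = list(records)
    for record in results:
        print(record)
    return results


raw = [
    '{"id": 1, "amount": 25.0, "country": "NL"}',
    '{"id": 2, "amount": 5000.0, "country": "US"}',
    '{"id": 3, "amount": 12.5}',  # missing "country": rejected at ingress
]
results = egress(score(enrich(ingress(raw)), toy_model))
```

Because the model is passed into `score` as a parameter, it can be swapped without touching the other stages.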

External batch training, paired with one of:
● Embedded Model (e.g. SparkML)
● External Model Service (e.g. TFServing)
● Managed Streams (e.g. TFjvm)
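Two of these options, the embedded model and the external model service, can be contrasted with a short sketch. The class names and `fake_service` are hypothetical. The request and response shapes (`instances`, `predictions`) follow the TensorFlow Serving REST predict format, but no real service is called here.

```python
from typing import Callable, Protocol


class Scorer(Protocol):
    """What the scoring stage needs, regardless of where the model runs."""

    def score(self, features: dict) -> float: ...


class EmbeddedScorer:
    """The model lives in the streamlet's process; scoring is a local call."""

    def __init__(self, model: Callable[[dict], float]):
        self._model = model

    def score(self, features: dict) -> float:
        return self._model(features)


class RemoteScorer:
    """The model lives behind a service; `send` is a hypothetical transport."""

    def __init__(self, send: Callable[[dict], dict]):
        self._send = send

    def score(self, features: dict) -> float:
        response = self._send({"instances": [features]})
        return response["predictions"][0]


def toy_model(features: dict) -> float:
    """Stand-in for a trained model."""
    return 0.9 if features["amount"] > 1000 else 0.1


def fake_service(request: dict) -> dict:
    """In-memory stand-in for a model server, so the example needs no network."""
    return {"predictions": [toy_model(f) for f in request["instances"]]}


tx = {"amount": 5000.0}
embedded_score = EmbeddedScorer(toy_model).score(tx)
remote_score = RemoteScorer(fake_service).score(tx)
print(embedded_score, remote_score)
```

The trade-off is the usual one: the embedded model avoids a network hop on every record, while the external service lets the model be updated and scaled independently of the pipeline.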

The Mission of ML in Cloudflow
● Infuse “AI/ML” or smarter real-time analytics into existing apps
  ▪ Loan approval, device maintenance, next best offer, recommendation engine
● Mix domain logic with streaming analytics and model serving
  ▪ Offer a programming model and runtime that facilitates the creation of new data-driven services
● Enable the productization of ML in the Enterprise
● Create a flowing “stream” between Data Science and Data Engineering

Get started with Cloudflow at cloudflow.io
Join our contributor community at:
http://github.com/lightbend/cloudflow

Thank You
Gerard Maas
Principal Engineer
gerard.maas@lightbend.com
@maasg


stackconf 2020 | The path to a Serverless-native era with Kubernetes by Paolo...

NETWAYS

Serverless is one of the hottest design patterns in the cloud today, i’ll cover how the Serverless paradigms are changing the way we develop applications and the cloud infrastructures and how to implement Serveless-kind workloads with Kubernetes. We’ll go through the latest Kubernetes-based serverless technologies, covering the most important aspects including pricing, scalability, observability and best practices

Data Engineer's Lunch #56: Spring Cloud Data Flow with Cassandra

Anant Corporation

In Data Engineer's Lunch #55 we will be going over how to integrate Spring Cloud Data Flow with Cassandra. Accompanying Blog: Coming Soon! Accompanying YouTube: Coming Soon! Sign Up For Our Newsletter: http://eepurl.com/grdMkn Join Data Engineer’s Lunch Weekly at 12 PM EST Every Monday: https://www.meetup.com/Data-Wranglers-DC/events/ Cassandra.Link: https://cassandra.link/ Follow Us and Reach Us At: Anant: https://www.anant.us/ Awesome Cassandra: https://github.com/Anant/awesome-cassandra Email: solutions@anant.us LinkedIn: https://www.linkedin.com/company/anant/ Twitter: https://twitter.com/anantcorp Eventbrite: https://www.eventbrite.com/o/anant-1072927283 Facebook: https://www.facebook.com/AnantCorp/ Join The Anant Team: https://www.careers.anant.us

Airflow techtonic template

Sampath Kumar

Microservices with kubernetes @190316

Jupil Hwang

Chef and OpenStack Workshop from ChefConf 2013

Matt Ray

KSQL - Stream Processing simplified!

Guido Schmutz

KSQL is a stream processing SQL engine, which allows stream processing on top of Apache Kafka. KSQL is based on Kafka Stream and provides capabilities for consuming messages from Kafka, analysing these messages in near-realtime with a SQL like language and produce results again to a Kafka topic. By that, no single line of Java code has to be written and you can reuse your SQL knowhow. This lowers the bar for starting with stream processing significantly. KSQL offers powerful capabilities of stream processing, such as joins, aggregations, time windows and support for event time. In this talk I will present how KSQL integrates with the Kafka ecosystem and demonstrate how easy it is to implement a solution using KSQL for most part. This will be done in a live demo on a fictitious IoT sample.

SpringBoot and Spring Cloud Service for MSA

Oracle Korea

Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes

1. Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes Gerard Maas, Principal Engineer at Lightbend.

7. Agenda ● Intro: Productizing Data Science ● What’s Cloudflow ● Building a Fraud Detection Model ● Running the Model With Cloudflow

8. Cloudflow is a development toolkit that enables you to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

9. $dev>

10. $> Streamlet API $> Blueprint $> Sandbox $> build tool extensions $> kubectl cloudflow

11. Operator

12. $> Streamlet API $> Blueprints $> build extensions $> kubectl cloudflow Operator

13. Streamlet inlet(s) outlet(s) { Schema } Streamlets Logic{ Schema } { Schema }

14. Streamlet Streamlets Logic Streamlet Logic Streamlet Logic ✔ ❌

15. Streamlet inlet(s) outlet(s) Streamlets Logic
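
The streamlet slides describe a unit of logic with schema-typed inlets and outlets. A minimal sketch of that idea in plain Python follows; it does not use the Cloudflow Streamlet API. The names `Streamlet`, `conforms`, and `can_connect` are hypothetical, and a dict of field names to Python types stands in for a real schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Stand-in for a real schema: field name -> expected Python type (assumption).
Schema = Dict[str, type]


def conforms(record: dict, schema: Schema) -> bool:
    """True when the record has exactly the schema's fields with matching types."""
    return record.keys() == schema.keys() and all(
        isinstance(record[name], typ) for name, typ in schema.items()
    )


@dataclass
class Streamlet:
    """Hypothetical streamlet: one inlet, one outlet, and the logic between them."""
    name: str
    inlet_schema: Schema
    outlet_schema: Schema
    logic: Callable[[dict], dict]

    def process(self, record: dict) -> dict:
        # Reject records that do not match the inlet schema.
        if not conforms(record, self.inlet_schema):
            raise TypeError(f"{self.name}: record does not match inlet schema")
        result = self.logic(record)
        # Guard the outlet as well, so downstream streamlets can trust the data.
        if not conforms(result, self.outlet_schema):
            raise TypeError(f"{self.name}: result does not match outlet schema")
        return result


def can_connect(upstream: Streamlet, downstream: Streamlet) -> bool:
    """Two streamlets may be wired together only if the schemas agree."""
    return upstream.outlet_schema == downstream.inlet_schema
```

In this sketch a connection is checked before any data flows, so a schema mismatch shows up when the streamlets are wired together rather than at runtime.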

16. Cloudflow API :: Blueprints. Easily integrate streamlets written in Akka Streams, Spark Structured Streaming, and Flink: merge different input streams; validate record formats and field values; use ML for more sophisticated analysis; compute aggregations (e.g., statistics); send results downstream.
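
The stages listed on the Blueprints slide can be sketched as a chain of small functions. This is a plain-Python illustration, not a Cloudflow blueprint: the function names and the `account`/`amount` record fields are assumptions, streams are ordinary iterables, and the ML stage is left out to keep the example short.

```python
from collections import defaultdict
from itertools import chain
from statistics import mean


def merge(*streams):
    """Merge several input streams into one (simple concatenation here)."""
    return chain(*streams)


def is_valid(record):
    """Check record format and field values (hypothetical rules)."""
    amount = record.get("amount")
    return (
        isinstance(record.get("account"), str)
        and isinstance(amount, (int, float))
        and amount >= 0
    )


def validate(records):
    """Drop records that fail validation."""
    return (r for r in records if is_valid(r))


def aggregate(records):
    """Compute a simple statistic: mean amount per account."""
    amounts = defaultdict(list)
    for r in records:
        amounts[r["account"]].append(r["amount"])
    return {account: mean(values) for account, values in amounts.items()}


def run(*streams):
    """Wire the stages together; the return value is what goes downstream."""
    return aggregate(validate(merge(*streams)))
```

For example, `run(card_payments, wire_transfers)` would merge both hypothetical sources, discard malformed records, and return the per-account averages.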

17. The Data Science Process we’re here Img src: https://randalscottking.com/machine-learning-overview/

18. Transactions Fraud Detection Model Data Understanding Data Preparation Data Modelling Validation Deployment

19. Transactions Fraud Detection Model Data Understanding Data Preparation Data Modelling Validation Deployment Data Cleaning & Enrichment Data Ingestion Result Propagation Model Scoring model

20. @maasg HTTP ingress [schema] Enrichment (+features) Fraud ML Scoring Egress (console) Transactions

21-26. @maasg HTTP ingress [schema] Enrichment (+features) Fraud ML Scoring Egress (console) Transactions py DT Model Persistent Volume
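
The pipeline on these slides (ingress, enrichment, fraud scoring, console egress, with a model stored on a persistent volume) can be sketched in plain Python. Everything here is an assumption made for illustration: the JSON tree format, the `avg_amount` and `amount_ratio` fields, the thresholds in `EXAMPLE_TREE`, and the function names. It is not the model or the code from the talk, and it does not use the Cloudflow API.

```python
import json
from pathlib import Path

# Hypothetical decision tree, serialized as nested dicts; thresholds are made up.
EXAMPLE_TREE = {
    "feature": "amount_ratio", "threshold": 5.0,
    "left": {"label": False},
    "right": {
        "feature": "amount", "threshold": 1000.0,
        "left": {"label": False},
        "right": {"label": True},
    },
}


def load_model(path):
    """Read the tree from a file, standing in for a model on a mounted volume."""
    return json.loads(Path(path).read_text())


def enrich(tx):
    """Enrichment step: add a derived feature to the raw transaction."""
    return {**tx, "amount_ratio": tx["amount"] / tx["avg_amount"]}


def score(tree, features):
    """Walk the tree until a leaf is reached and return its label."""
    node = tree
    while "label" not in node:
        branch = "left" if features[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def run_pipeline(transactions, model_path):
    """Ingress -> enrichment -> scoring -> console egress."""
    tree = load_model(model_path)
    results = []
    for tx in transactions:
        enriched = enrich(tx)
        scored = {**enriched, "fraud": score(tree, enriched)}
        print(scored)  # console egress
        results.append(scored)
    return results
```

Keeping the model in a file rather than in the code means a retrained tree can replace the old one without changing the scoring step, which is one plausible reason for placing it on a volume.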

27. External batch training + embedded model (e.g., SparkML) | External batch training + external model service (e.g., TFServing) | External batch training + managed streams (e.g., TFjvm)

28. The Mission of ML in Cloudflow: Infuse “AI/ML” or smarter real-time analytics to existing apps ▪ Loan approval, device maintenance, next best offer, recommendation engine. Mix domain logic with streaming analytics and model serving ▪ Offer a programming model and runtime that facilitates the creation of new data-driven services. Enable the productization of ML in the Enterprise. Create a flowing “stream” between Data Science and Data Engineering.

29. Get started with Cloudflow at cloudflow.io Join our contributor community at: http://github.com/lightbend/cloudflow

30. Thank You Gerard Maas Principal Engineer gerard.maas@lightbend.com @maasg
