SlideShare a Scribd company logo
1 of 16
Download to read offline
20/03/2024
Apache Flink: Building
a company-wide
self-service Streaming
Data Platform
Gleb Shipilov (Data Integration Team Leader, Exness)
20/03/2024
Data as Bedrock
20/03/2024
Kafka Summit 2024
● Exness is the largest CFD broker by
trading volume and active clients
● Every millisecond counts
● As Exness delved deeper into event-driven
architecture, the need for processing
streaming data became paramount
● Each team had to deal with processing
streaming data on their own, solving all
the problems with:
○ Scalability
○ Fault tolerance
○ State management
○ Security
2
20/03/2024
Streaming Data Platform components
20/03/2024
Kafka Summit 2024 3
20/03/2024
Why Apache Flink?
● Support for several Kafka instances
● Performance
● Fault tolerance
● Support of very large state
● Java based framework
20/03/2024
Kafka Summit 2024 4
20/03/2024
What challenges have we faced
How to provide
Python and Go
developers with a
self-service platform
based on a Java
framework?
01
How to provide
developers with a
unified deployment
process for all the
components and
make it simple?
02
How to ensure
security?
03
How to flexibly
manage and isolate
resources between
teams?
04
20/03/2024
Kafka Summit 2024 5
20/03/2024
Flink SQL usage
20/03/2024
Kafka Summit 2024 6
20/03/2024
Flink SQL challenges
● Perfect for simple cases:
○ Aggregate;
○ Union data;
○ Flat data.
● Doesn’t work so perfect
with complex cases:
○ A lot of enrichments;
○ Complex business logic;
○ No tracing support.
20/03/2024
Kafka Summit 2024 7
Go developer
20/03/2024
PyFlink usage
20/03/2024
Kafka Summit 2024 8
20/03/2024
PyFlink challenges
● Lack of Apple silicon support (M1 / M2)
● No tracing support out of the box
● Necessity of both Table API and Data
Stream API usage in one PyFlink job
20/03/2024
Kafka Summit 2024 9
● At least the 1.17 version of Flink
● Tracing using:
○ OpenTelemetry;
○ Jaeger.
● Stream table environment to work with
both Table and Data Stream APIs
20/03/2024
Unified deployment process
● Main components of the deployment
process:
○ Terraform;
○ GitLab pipeline;
○ K8S operators.
20/03/2024
Kafka Summit 2024 10
20/03/2024
Unified deployment process
20/03/2024
Kafka Summit 2024 11
20/03/2024
Streaming Data replication using
Terraform over Flink
● Templated Terraform modules
instead of multiple and similar
SQL artefacts
● One module defines
configuration of the whole
pipeline from Kafka topic to S3
20/03/2024
Kafka Summit 2024 12
20/03/2024
One team–one Flink Cluster
● Security
● Resource management
● Own development environment
● Observability
20/03/2024
Kafka Summit 2024 13
20/03/2024
Monitoring and alerting
● Separate monitoring
for each team
● Slack channels with alerts
for each Flink cluster
● One Slack channel for technical
support of all the users
20/03/2024
Kafka Summit 2024 14
20/03/2024
The most important projects delivered
on our self-service platform
● Trading data processing lag
decrease from 2 hours to 2
minutes during peak times
● 1 MLN+ bots’ activity events
are prevented in real-time
● Fraud and abuse prevention
based on real-time data
● Marketing campaigns based
on real-time data
20/03/2024
Kafka Summit 2024 15
20/03/2024
Special thanks to:
Kafka Summit 2024 16
https://medium.com/exness-blog
● Alexey Perminov
● Ilya Soin
● Yury Smirnov
● Igor Matcko

More Related Content

Similar to Apache Flink: Building a Company-wide Self-service Streaming Data Platform

Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)EOSC-hub project
 
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Elasticsearch
 
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...Miguel Pérez Colino
 
Day 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramDay 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramFIWARE
 
What_s_New_in_OpenShift_Container_Platform_4.6.pdf
What_s_New_in_OpenShift_Container_Platform_4.6.pdfWhat_s_New_in_OpenShift_Container_Platform_4.6.pdf
What_s_New_in_OpenShift_Container_Platform_4.6.pdfchalermpany
 
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent RamièreAu delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramièreconfluent
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
Flink September 2015 Community Update
Flink September 2015 Community UpdateFlink September 2015 Community Update
Flink September 2015 Community UpdateRobert Metzger
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table NotesTimothy Spann
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming appsTimothy Spann
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Appsssuser73434e
 
Edge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different PiecesEdge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different PiecesCloudify Community
 
Opnfv & odl case study slides
Opnfv & odl case study slidesOpnfv & odl case study slides
Opnfv & odl case study slidesChristopher Price
 
OpenStack and Kubernetes - A match made for Telco Heaven
OpenStack and Kubernetes - A match made for Telco HeavenOpenStack and Kubernetes - A match made for Telco Heaven
OpenStack and Kubernetes - A match made for Telco HeavenTrinath Somanchi
 
Serverless Kafka Patterns
Serverless Kafka PatternsServerless Kafka Patterns
Serverless Kafka PatternsTaras Slipets
 
Using Kubernetes to make cellular data plans cheaper for 50M users
Using Kubernetes to make cellular data plans cheaper for 50M usersUsing Kubernetes to make cellular data plans cheaper for 50M users
Using Kubernetes to make cellular data plans cheaper for 50M usersMirantis
 
data Artisans Product Announcement
data Artisans Product Announcementdata Artisans Product Announcement
data Artisans Product AnnouncementFlink Forward
 
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...Anoop Ramachandran
 
Stream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream SharingStream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream Sharingconfluent
 

Similar to Apache Flink: Building a Company-wide Self-service Streaming Data Platform (20)

Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)
 
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
 
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
 
Day 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramDay 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers Program
 
What_s_New_in_OpenShift_Container_Platform_4.6.pdf
What_s_New_in_OpenShift_Container_Platform_4.6.pdfWhat_s_New_in_OpenShift_Container_Platform_4.6.pdf
What_s_New_in_OpenShift_Container_Platform_4.6.pdf
 
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent RamièreAu delà des brokers, un tour de l’environnement Kafka | Florent Ramière
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramière
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
Flink September 2015 Community Update
Flink September 2015 Community UpdateFlink September 2015 Community Update
Flink September 2015 Community Update
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table Notes
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
 
Edge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different PiecesEdge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different Pieces
 
Opnfv & odl case study slides
Opnfv & odl case study slidesOpnfv & odl case study slides
Opnfv & odl case study slides
 
OpenStack and Kubernetes - A match made for Telco Heaven
OpenStack and Kubernetes - A match made for Telco HeavenOpenStack and Kubernetes - A match made for Telco Heaven
OpenStack and Kubernetes - A match made for Telco Heaven
 
Serverless Kafka Patterns
Serverless Kafka PatternsServerless Kafka Patterns
Serverless Kafka Patterns
 
Using Kubernetes to make cellular data plans cheaper for 50M users
Using Kubernetes to make cellular data plans cheaper for 50M usersUsing Kubernetes to make cellular data plans cheaper for 50M users
Using Kubernetes to make cellular data plans cheaper for 50M users
 
data Artisans Product Announcement
data Artisans Product Announcementdata Artisans Product Announcement
data Artisans Product Announcement
 
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...
Introduction to Anypoint Runtime Fabric on Amazon Elastic Kubernetes Service ...
 
Stream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream SharingStream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream Sharing
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...HostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
 

Recently uploaded

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 

Recently uploaded (20)

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 

Apache Flink: Building a Company-wide Self-service Streaming Data Platform

  • 1. 20/03/2024 Apache Flink: Building a company-wide self-service Streaming Data Platform Gleb Shipilov (Data Integration Team Leader, Exness)
  • 2. 20/03/2024 Data as Bedrock 20/03/2024 Kafka Summit 2024 ● Exness is the largest CFD broker by trading volume and active clients ● Every millisecond counts ● As Exness delved deeper into event-driven architecture, the need for processing streaming data became paramount ● Each team had to deal with processing streaming data on their own, solving all the problems with: ○ Scalability ○ Fault tolerance ○ State management ○ Security 2
  • 3. 20/03/2024 Streaming Data Platform components 20/03/2024 Kafka Summit 2024 3
  • 4. 20/03/2024 Why Apache Flink? ● Support for several Kafka instances ● Performance ● Fault tolerance ● Support of very large state ● Java based framework 20/03/2024 Kafka Summit 2024 4
  • 5. 20/03/2024 What challenges have we faced How to provide Python and Go developers with a self-service platform based on a Java framework? 01 How to provide developers with a unified deployment process for all the components and make it simple? 02 How to ensure security? 03 How to flexibly manage and isolate resources between teams? 04 20/03/2024 Kafka Summit 2024 5
  • 7. 20/03/2024 Flink SQL challenges ● Perfect for simple cases: ○ Aggregate; ○ Union data; ○ Flat data. ● Doesn’t work so perfect with complex cases: ○ A lot of enrichments; ○ Complex business logic; ○ No tracing support. 20/03/2024 Kafka Summit 2024 7 Go developer
  • 9. 20/03/2024 PyFlink challenges ● Lack of Apple silicon support (M1 / M2) ● No tracing support out of the box ● Necessity of both Table API and Data Stream API usage in one PyFlink job 20/03/2024 Kafka Summit 2024 9 ● At least the 1.17 version of Flink ● Tracing using: ○ OpenTelemetry; ○ Jaeger. ● Stream table environment to work with both Table and Data Stream APIs
  • 10. 20/03/2024 Unified deployment process ● Main components of the deployment process: ○ Terraform; ○ GitLab pipeline; ○ K8S operators. 20/03/2024 Kafka Summit 2024 10
  • 12. 20/03/2024 Streaming Data replication using Terraform over Flink ● Templated Terraform modules instead of multiple and similar SQL artefacts ● One module defines configuration of the whole pipeline from Kafka topic to S3 20/03/2024 Kafka Summit 2024 12
  • 13. 20/03/2024 One team–one Flink Cluster ● Security ● Resource management ● Own development environment ● Observability 20/03/2024 Kafka Summit 2024 13
  • 14. 20/03/2024 Monitoring and alerting ● Separate monitoring for each team ● Slack channels with alerts for each Flink cluster ● One Slack channel for technical support of all the users 20/03/2024 Kafka Summit 2024 14
  • 15. 20/03/2024 The most important projects delivered on our self-service platform ● Trading data processing lag decrease from 2 hours to 2 minutes during peak times ● 1 MLN+ bots’ activity events are prevented in real-time ● Fraud and abuse prevention based on real-time data ● Marketing campaigns based on real-time data 20/03/2024 Kafka Summit 2024 15
  • 16. 20/03/2024 Special thanks to: Kafka Summit 2024 16 https://medium.com/exness-blog ● Alexey Perminov ● Ilya Soin ● Yury Smirnov ● Igor Matcko