SlideShare a Scribd company logo
Kubernetes and
real-time analytics
How to connect these two worlds with
Apache Flink?
Author: Albert Lewandowski
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
About me
● Big Data DevOps Engineer - GetInData
● Focused on infrastructure, cloud, Big Data, AI, scalable
web applications
● Certified Google Cloud Architect
● Certified Kubernetes Administrator
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Content
● Principles in the Big Data world on Kubernetes
● Why real-time data streaming?
● Different faces of Apache Flink.
● Flink and Kubernetes - real life scenarios.
● Observability of the platform.
● Quick start on your computer.
Introduction to the
jungle
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
What is Kubernetes?
Open-source platform for managing
containerized workloads and services
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes - Operators
Method of deploying and managing app
Automated provisioning of resources
One setup for multiple environments
Examples: pulsar-operator, postgres-operator, prometheus-operator
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes - Custom Resource Definitions
Defining custom APIs as add-ons
Dynamic registration with Kubernetes API
CRDs can be accessed with kubectl
A CRD represents the desired state and an operator makes it
happen.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
What is Apache Flink?
Flink is an open-source stream processing framework that
supports both batch processing and data streaming programs
Flink’s Savepoints are different from Checkpoints in a similar way that
backups are different from recovery logs in traditional database systems.
A savepoint is a consistent image of the execution state of a streaming job
State of the Flink job
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
What is Apache Flink?
Job Diagram State of Flink job
Diagram
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Apache Flink Roadmap
Source: Roadmap - Apache Flink
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Perception
Business
logic
CI/CD
Idempotency
Reprocessing
Explainability
Monitoring
Testing
Serving
Infrastructure
Data Ingestion
Security
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Reality
Business logic
CI/CD
Idempotency
Reprocessing
Explainability
Monitoring
Testing
Serving
Infrastructure
Data Ingestion
Security
Real-time
data streaming
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Data Streaming vs. Batch
Events
1 2 3 4 5 6
Batch
Stream
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Use cases
User activity
Fraud detection
Logistics
Industrial IoT
Location data
Recommendations
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Use case - Kcell - Telecom company in Kazakhstan
500K
Events / second
10M
Subscribers
165K
Events / s
10M
Subscribers 22.5
TB / month
2020
2018
40
TB / month
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Use case - Kcell
SMS events
Voice usage events
Data usage events
Roaming events
Location events
Input Process Actions
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Use case - Kcell - some scenarios for Flink
Balance Top Up Case
If subscriber top-ups her balance too often in short period of time. We
can offer her a less expensive tariff or auto-payment services.
Fraud case in roaming
Send an email to the anti-fraud unit if subscriber registered in roaming
but his balance at the moment is equal to 0.
This situation is impossible in standard case.
Automatic SIM card activation
Send an email to the anti-fraud unit if subscriber registered in roaming
but his balance at the moment is equal to 0.
This situation is impossible in standard case.
Dealer Motivation Case
Trigger bonus for a dealer when we discover that purchase happened
attributable to him/her.
Apache Flink
One tool, multiple versions
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
One tool, multiple languages
Java 8 or 11
Python
SQL
Scala 2.11 or 2.12
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Where should I install?
Standalone
Kubernetes
YARN cluster
● CICD process
● Service Discovery - monitoring with Prometheus
● Scalability
● Managing resources
● A/B Testing
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
High Availability of Flink
JobManager level
Storage level
● ZooKeeper
● Kubernetes (beta)
● High Availability of
storage to/from
which Flink
writes/reads
savepoints and
checkpoints
● Performance of
storage
Job Strategy
● Data reprocessing
policy
● How to deploy new
job?
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Flink K8S Operator
Kubernetes Operator for
Apache Flink
Ververica Platform
Native Kubernetes -
Apache Flink
CRDs Yes Yes No No
CICD Kubernetes API Kubernetes API REST API or Web UI Kubernetes API
Installation
Helm chart or raw
Kubernetes manifests
Helm chart or raw
Kubernetes manifests
Helm chart or raw
Kubernetes manifests
No need to install any
component
SQL Editor No No Yes No
Dependencies No No
Persistence volume for
database
Object storage for
artifactory
No
Status beta beta production beta
Flink + Kubernetes = ?
Overview in the article here
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Why Flink on Kubernetes?
Simpler deployment process Flexible jobs management
Simple Service Discovery -
Prometheus
Flexible testing
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Installation & Configuration
Helm
A package manager
for Kubernetes
CICD tool
Example: Gitlab CI
Kubernetes API
Flink jobs
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Testing
Incubating Mode
A/B Testing
Blue Green
Deployment
Production
data
Production
job
Job
Incubating
mode
Separated
output
Standard
output
Dedicated TaskManagers
Dedicated TaskManagers
savepoi
nt
Proxy
Flink Job
#1
Flink Job
#2
Result #1
Result #2
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Deployment process
Versioning images
Unit & Integration tests
Git Flow
Deployment process
Monitoring
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes aspects
Dedicated namespaces
Secured access to Flink (RBAC) Configuration files
Storage for
savepoints&checkpoints
Resources
Secrets
Network performance
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Self-healing and autoscaling
Scale based
on metrics
Flink restarts
External tool
for fixing
Re-create
cluster
Automate
manual tasks
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Job Cluster & Session Cluster
Job Cluster Session Cluster
Full set of Flink cluster for
each individual job
Standalone Flink cluster on
Kubernetes
Short running tasks Ad-hoc queries
Long running tasks
Separate images for
different jobs
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Stories from production
● Automate in the beginning
● CICD pipeline is a must
● Verify JVM metrics
● Test different Flink configurations to get the best
performance and no restarts
● Secure access to Flink jobs
● Get logs from Flink TMs and JMs
Local Setup
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
How to start locally?
● Minikube / Docker Desktop or any different local K8s env
● Ververica Platform
● Locally started Kafka cluster or use a Datagen
APACHE FLINK
KUBERNETES
STREAMING SQL
Observability
Whitepaper - here
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Observability
Observability is about measuring how well internal states of the
system can be inferred from knowledge of its external outputs
(according to the control theory).
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Part One: Metrics
Get metrics from environment and application - but how?
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Prometheus - Kubernetes-native solution
open-source systems
monitoring and alerting toolkit
joined the Cloud Native Computing
Foundation in 2016 as the second
hosted project, after Kubernetes
a lot of exporters
you can write your own easily
mature ecosystem
PushGateway, Blackbox, AlertManager, etc.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Prometheus - simple or complex High Availability?
Simple Complex
Example solutions: Cortex (above), Thanos, M3DB
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Pull vs. push-based monitoring
Pull Push
Collector takes metrics Agents push metrics
Workload on central poller increases with the number of
devices polled.
Polling task fully distributed among agents, resulting in
linear scalability.
Polling protocol can potentially open up system to
remote access and denial of service attacks.
Push agents are inherently secure against remote
attacks since they do not listen for network connections.
Flexible: poller can ask for any metric at any time.
Relatively inflexible: pre-determined, fixed set of
measurements are periodically exported.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Prometheus - Stories
service discovery
simple on k8s
limited security
archived data
how old data is required?
monitor monitoring
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Part Two: Logs analytics
1. Get logs from app or environment.
2. Save logs.
3. Query them.
4. Make your system self-healing and discover what’s
happening inside your platform.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Logs analytics - which tool should I choose?
Logs Analytics for Developers Logs Analytics for Business
Loki ElasticSearch
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
ELK vs. Loki
ELK Loki + Promtail/Fluentd
Indexing Keys and content of each key Only labels
Query language Query DSL or Lucene QL LogQL
Tool for data visualisation Kibana Grafana
Query performances Faster due to indexed all the data Slower due to indexing only labels
Resource requirements Higher due to the need of indexing Lower due to index only labels
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
What about alerts?
Alerts signify that
a human needs to take action
immediately
in response to something that is
either happening or about to
happen, in order to improve the
situation.
Quick start
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Flink - Complex Event Processing
Article.
Codebase for example.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
DevOps best practises
Article.
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes - first setup
● Minikube
● Kind
● Use Kubernetes service from public cloud provider like
AWS, GCP, Azure during free tier
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes + Flink - Operator
$ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/crd.yaml
$ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/namespace.yaml
$ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/role.yaml
$ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/role-binding.yaml
$ kubectl create -f
https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/flinkk8soperator.yaml
Verify if it works:
$ kubectl -n flink-operator get po
Run the example job:
$ kubectl create -f
https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/examples/wordcount/flink-operator-custom-resou
rce.yaml
Verify if it is running and its status:
$ kubectl get flinkapplication.flink.k8s.io -n flink-operator wordcount-operator-example -o yaml
Requirements: Kubernetes cluster, kubectl
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Kubernetes + Ververica Platform
Install Ververica Platform locally with Helm
$ helm repo add ververica https://charts.ververica.com
$ helm install vvp ververica/ververica-platform
$ helm install vvp ververica/ververica-platform --set acceptCommunityEditionLicense=true
Verify if Ververica is up
$ kubectl get po
Access the web user interface and REST API
$ kubectl port-forward service/vvp-ververica-platform 8080:80
Do you want to test Flink SQL feature? Use Flink Faker (a data generator source connector)
https://github.com/knaufk/flink-faker/
It requires changing used image for vvp-gateway.
Requirements: Kubernetes cluster, kubectl, Helm
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Join Us!
Data Engineer
Spark, Kafka, Airflow, public cloud
Link
Backend Engineer
Java / Scala, microservices
Link
MLOps Engineer
MLOps tools, Python, public cloud
Link
DevOps / SRE
GCP, Terraform, Prometheus
Link
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Q&A
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Contact details
albert.lewandowski@getindata.com
LinkedIn:
https://www.linkedin.com/in/albert-lewandowski
Thank you for your
attention!

More Related Content

What's hot

How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
InfluxData
 
Crap. Your Big Data Kitchen Is Broken.
Crap. Your Big Data Kitchen Is Broken.Crap. Your Big Data Kitchen Is Broken.
Crap. Your Big Data Kitchen Is Broken.
Altoros
 
Getting Started: Intro to Telegraf - July 2021
Getting Started: Intro to Telegraf - July 2021Getting Started: Intro to Telegraf - July 2021
Getting Started: Intro to Telegraf - July 2021
InfluxData
 
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDBHow to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
InfluxData
 
Cloud Technical Challenges
Cloud Technical ChallengesCloud Technical Challenges
Cloud Technical Challenges
Guy Coates
 
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
InfluxData
 
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxDataInfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
InfluxData
 
An Early Evaluation of Running Spark on Kubernetes
An Early Evaluation of Running Spark on KubernetesAn Early Evaluation of Running Spark on Kubernetes
An Early Evaluation of Running Spark on Kubernetes
DataWorks Summit
 
Delivering Agile Data Science on Openshift - Red Hat Summit 2019
Delivering Agile Data Science on Openshift  - Red Hat Summit 2019Delivering Agile Data Science on Openshift  - Red Hat Summit 2019
Delivering Agile Data Science on Openshift - Red Hat Summit 2019
John Archer
 
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
VMware Tanzu
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
matteo mazzeri
 
InfluxDB + Kepware: Start Monitoring Industrial Data Quickly
InfluxDB + Kepware: Start Monitoring Industrial Data QuicklyInfluxDB + Kepware: Start Monitoring Industrial Data Quickly
InfluxDB + Kepware: Start Monitoring Industrial Data Quickly
InfluxData
 
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
InfluxData
 
Big Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David GiardBig Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David Giard
ITCamp
 
IoT Architectural Overview - 3 use case studies from InfluxData
IoT Architectural Overview - 3 use case studies from InfluxData IoT Architectural Overview - 3 use case studies from InfluxData
IoT Architectural Overview - 3 use case studies from InfluxData
InfluxData
 
Webinar Registration Getting Started with Building Your First IoT App
Webinar Registration Getting Started with Building Your First IoT AppWebinar Registration Getting Started with Building Your First IoT App
Webinar Registration Getting Started with Building Your First IoT App
InfluxData
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps Engineers
Márton Kodok
 
Automated Remediation with Rundeck + Sensu
Automated Remediation with Rundeck + SensuAutomated Remediation with Rundeck + Sensu
Automated Remediation with Rundeck + Sensu
Rundeck
 
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxDataSensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
InfluxData
 
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
Codemotion
 

What's hot (20)

How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
How to Deliver a Critical and Actionable Customer-Facing Metrics Product with...
 
Crap. Your Big Data Kitchen Is Broken.
Crap. Your Big Data Kitchen Is Broken.Crap. Your Big Data Kitchen Is Broken.
Crap. Your Big Data Kitchen Is Broken.
 
Getting Started: Intro to Telegraf - July 2021
Getting Started: Intro to Telegraf - July 2021Getting Started: Intro to Telegraf - July 2021
Getting Started: Intro to Telegraf - July 2021
 
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDBHow to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
How to Manage Your Time Series Data Pipeline at the Edge with InfluxDB
 
Cloud Technical Challenges
Cloud Technical ChallengesCloud Technical Challenges
Cloud Technical Challenges
 
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
Tanny Ng, Nadeem Syed [WP Engine] | How WP Engine Transformed Monitoring Into...
 
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxDataInfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
InfluxDB Enterprise Architectural Patterns | Craig Hobbs | InfluxData
 
An Early Evaluation of Running Spark on Kubernetes
An Early Evaluation of Running Spark on KubernetesAn Early Evaluation of Running Spark on Kubernetes
An Early Evaluation of Running Spark on Kubernetes
 
Delivering Agile Data Science on Openshift - Red Hat Summit 2019
Delivering Agile Data Science on Openshift  - Red Hat Summit 2019Delivering Agile Data Science on Openshift  - Red Hat Summit 2019
Delivering Agile Data Science on Openshift - Red Hat Summit 2019
 
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
 
InfluxDB + Kepware: Start Monitoring Industrial Data Quickly
InfluxDB + Kepware: Start Monitoring Industrial Data QuicklyInfluxDB + Kepware: Start Monitoring Industrial Data Quickly
InfluxDB + Kepware: Start Monitoring Industrial Data Quickly
 
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
 
Big Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David GiardBig Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David Giard
 
IoT Architectural Overview - 3 use case studies from InfluxData
IoT Architectural Overview - 3 use case studies from InfluxData IoT Architectural Overview - 3 use case studies from InfluxData
IoT Architectural Overview - 3 use case studies from InfluxData
 
Webinar Registration Getting Started with Building Your First IoT App
Webinar Registration Getting Started with Building Your First IoT AppWebinar Registration Getting Started with Building Your First IoT App
Webinar Registration Getting Started with Building Your First IoT App
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps Engineers
 
Automated Remediation with Rundeck + Sensu
Automated Remediation with Rundeck + SensuAutomated Remediation with Rundeck + Sensu
Automated Remediation with Rundeck + Sensu
 
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxDataSensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
Sensor Data in InfluxDB by David Simmons, IoT Developer Evangelist | InfluxData
 
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
Alison B Lowndes - Fueling the Artificial Intelligence Revolution with Gaming...
 

Similar to Kubernetes and real-time analytics - how to connect these two worlds with Apache Flink? - Albert Lewandowski, GetInData

Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
GetInData
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
GetInData
 
Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kubernetes
kloia
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...
GetInData
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
GetInData
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data edition
Daniel Zivkovic
 
Challenges In Modern Application
Challenges In Modern ApplicationChallenges In Modern Application
Challenges In Modern Application
Rahul Kumar Gupta
 
Running Stateful Apps on Kubernetes
Running Stateful Apps on KubernetesRunning Stateful Apps on Kubernetes
Running Stateful Apps on Kubernetes
Yugabyte
 
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
Datacratic
 
Industrial IoT bootcamp
Industrial IoT bootcampIndustrial IoT bootcamp
Industrial IoT bootcamp
Lothar Schubert
 
Spring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasSpring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour Dallas
VMware Tanzu
 
Open stack gbp final sn-4-slideshare
Open stack gbp final sn-4-slideshareOpen stack gbp final sn-4-slideshare
Open stack gbp final sn-4-slideshare
Sumit Naiksatam
 
The rise of microservices
The rise of microservicesThe rise of microservices
The rise of microservices
Cloud Technology Experts
 
Microservices with kubernetes @190316
Microservices with kubernetes @190316Microservices with kubernetes @190316
Microservices with kubernetes @190316
Jupil Hwang
 
dA Platform Overview
dA Platform OverviewdA Platform Overview
dA Platform Overview
Robert Metzger
 
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward
 
Take control of your dev ops dumping ground
Take control of your  dev ops dumping groundTake control of your  dev ops dumping ground
Take control of your dev ops dumping ground
Puppet
 
Spring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - BostonSpring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - Boston
VMware Tanzu
 
Puppet devops wdec
Puppet devops wdecPuppet devops wdec
Puppet devops wdec
Wojciech Dec
 
Connecting the Smart Factory to the Cloud
Connecting the Smart Factory to the CloudConnecting the Smart Factory to the Cloud
Connecting the Smart Factory to the Cloud
HiveMQ
 

Similar to Kubernetes and real-time analytics - how to connect these two worlds with Apache Flink? - Albert Lewandowski, GetInData (20)

Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
 
Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kubernetes
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data edition
 
Challenges In Modern Application
Challenges In Modern ApplicationChallenges In Modern Application
Challenges In Modern Application
 
Running Stateful Apps on Kubernetes
Running Stateful Apps on KubernetesRunning Stateful Apps on Kubernetes
Running Stateful Apps on Kubernetes
 
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
RTBkit Meetup - Developer Spotlight, Behind the Scenes of RTBkit and Intro to...
 
Industrial IoT bootcamp
Industrial IoT bootcampIndustrial IoT bootcamp
Industrial IoT bootcamp
 
Spring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasSpring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour Dallas
 
Open stack gbp final sn-4-slideshare
Open stack gbp final sn-4-slideshareOpen stack gbp final sn-4-slideshare
Open stack gbp final sn-4-slideshare
 
The rise of microservices
The rise of microservicesThe rise of microservices
The rise of microservices
 
Microservices with kubernetes @190316
Microservices with kubernetes @190316Microservices with kubernetes @190316
Microservices with kubernetes @190316
 
dA Platform Overview
dA Platform OverviewdA Platform Overview
dA Platform Overview
 
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
 
Take control of your dev ops dumping ground
Take control of your  dev ops dumping groundTake control of your  dev ops dumping ground
Take control of your dev ops dumping ground
 
Spring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - BostonSpring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - Boston
 
Puppet devops wdec
Puppet devops wdecPuppet devops wdec
Puppet devops wdec
 
Connecting the Smart Factory to the Cloud
Connecting the Smart Factory to the CloudConnecting the Smart Factory to the Cloud
Connecting the Smart Factory to the Cloud
 

More from GetInData

Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
GetInData
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
GetInData
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competition
GetInData
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team?
GetInData
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easier
GetInData
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
GetInData
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
GetInData
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
GetInData
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInData
GetInData
 
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
GetInData
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
GetInData
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
GetInData
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
GetInData
 
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
GetInData
 
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
GetInData
 
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
GetInData
 
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
GetInData
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?
GetInData
 
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
GetInData
 

More from GetInData (20)

Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competition
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team?
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easier
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInData
 
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
 
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
 
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
 
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
How to maximize profit from IoT by using data platform - Albert Lewandowski, ...
 
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?
 
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
Zdarzeniowe zasilanie i przeliczanie danych w nowoczesnej hurtowni danych opa...
 

Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 

Recently uploaded (20)

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 

Kubernetes and real-time analytics - how to connect these two worlds with Apache Flink? - Albert Lewandowski, GetInData

  • 1. Kubernetes and real-time analytics How to connect these two worlds with Apache Flink? Author: Albert Lewandowski
  • 2. © Copyright. All rights reserved. Not to be reproduced without prior written consent. About me ● Big Data DevOps Engineer - GetInData ● Focused on infrastructure, cloud, Big Data, AI, scalable web applications ● Certified Google Cloud Architect ● Certified Kubernetes Administrator
  • 3. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Content ● Principles in the Big Data world on Kubernetes ● Why real-time data streaming? ● Different faces of Apache Flink. ● Flink and Kubernetes - real life scenarios. ● Observability of the platform. ● Quick start on your computer.
  • 5. © Copyright. All rights reserved. Not to be reproduced without prior written consent. What is Kubernetes? Open-source platform for managing containerized workloads and services
  • 6. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes - Operators Method of deploying and managing app Automated provisioning of resources One setup for multiple environments Examples: pulsar-operator, postgres-operator, prometheus-operator
  • 7. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes - Custom Resource Definitions Defining custom APIs as add-ons Dynamic registration with Kubernetes API CRDs can be accessed with kubectl A CRD represents the desired state and an operator makes it happen.
  • 8. © Copyright. All rights reserved. Not to be reproduced without prior written consent. What is Apache Flink? Flink is an open-source stream processing framework that supports both batch processing and data streaming programs Flink’s Savepoints are different from Checkpoints in a similar way that backups are different from recovery logs in traditional database systems. A savepoint is a consistent image of the execution state of a streaming job State of the Flink job
  • 9. © Copyright. All rights reserved. Not to be reproduced without prior written consent. What is Apache Flink? Job Diagram State of Flink job Diagram
  • 10. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Apache Flink Roadmap Source: Roadmap - Apache Flink
  • 11. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Perception Business logic CI/CD Idempotency Reprocessing Explainability Monitoring Testing Serving Infrastructure Data Ingestion Security
  • 12. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Reality Business logic CI/CD Idempotency Reprocessing Explainability Monitoring Testing Serving Infrastructure Data Ingestion Security
  • 14. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Data Streaming vs. Batch Events 1 2 3 4 5 6 Batch Stream
  • 15. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Use cases User activity Fraud detection Logistics Industrial IoT Location data Recommendations
  • 16. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Use case - Kcell - Telecom company in Kazakhstan 500K Events / second 10M Subscribers 165K Events / s 10M Subscribers 22.5 TB / month 2020 2018 40 TB / month
  • 17. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Use case - Kcell SMS events Voice usage events Data usage events Roaming events Location events Input Process Actions
  • 18. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Use case - Kcell - some scenarios for Flink Balance Top Up Case If subscriber top-ups her balance too often in short period of time. We can offer her a less expensive tariff or auto-payment services. Fraud case in roaming Send an email to the anti-fraud unit if subscriber registered in roaming but his balance at the moment is equal to 0. This situation is impossible in standard case. Automatic SIM card activation Send an email to the anti-fraud unit if subscriber registered in roaming but his balance at the moment is equal to 0. This situation is impossible in standard case. Dealer Motivation Case Trigger bonus for a dealer when we discover that purchase happened attributable to him/her.
  • 19. Apache Flink One tool, multiple versions
  • 20. © Copyright. All rights reserved. Not to be reproduced without prior written consent. One tool, multiple languages Java 8 or 11 Python SQL Scala 2.11 or 2.12
  • 21. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Where should I install? Standalone Kubernetes YARN cluster ● CICD process ● Service Discovery - monitoring with Prometheus ● Scalability ● Managing resources ● A/B Testing
  • 22. © Copyright. All rights reserved. Not to be reproduced without prior written consent. High Availability of Flink JobManager level Storage level ● ZooKeeper ● Kubernetes (beta) ● High Availability of storage to/from which Flink writes/reads savepoints and checkpoints ● Performance of storage Job Strategy ● Data reprocessing policy ● How to deploy new job?
  • 23. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Flink K8S Operator Kubernetes Operator for Apache Flink Ververica Platform Native Kubernetes - Apache Flink CRDs Yes Yes No No CICD Kubernetes API Kubernetes API REST API or Web UI Kubernetes API Installation Helm chart or raw Kubernetes manifests Helm chart or raw Kubernetes manifests Helm chart or raw Kubernetes manifests No need to install any component SQL Editor No No Yes No Dependencies No No Persistence volume for database Object storage for artifactory No Status beta beta production beta
  • 24. Flink + Kubernetes = ? Overview in the article here
  • 25. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Why Flink on Kubernetes? Simpler deployment process Flexible jobs management Simple Service Discovery - Prometheus Flexible testing
  • 26. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Installation & Configuration Helm A package manager for Kubernetes CICD tool Example: Gitlab CI Kubernetes API Flink jobs
  • 27. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Testing Incubating Mode A/B Testing Blue Green Deployment Production data Production job Job Incubating mode Separated output Standard output Dedicated TaskManagers Dedicated TaskManagers savepoi nt Proxy Flink Job #1 Flink Job #2 Result #1 Result #2
  • 28. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Deployment process Versioning images Unit & Integration tests Git Flow Deployment process Monitoring
  • 29. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes aspects Dedicated namespaces Secured access to Flink (RBAC) Configuration files Storage for savepoints&checkpoints Resources Secrets Network performance
  • 30. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Self-healing and autoscaling Scale based on metrics Flink restarts External tool for fixing Re-create cluster Automate manual tasks
  • 31. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Job Cluster & Session Cluster Job Cluster Session Cluster Full set of Flink cluster for each individual job Standalone Flink cluster on Kubernetes Short running tasks Ad-hoc queries Long running tasks Separate images for different jobs
  • 32. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Stories from production ● Automate in the beginning ● CICD pipeline is a must ● Verify JVM metrics ● Test different Flink configurations to get the best performance and no restarts ● Secure access to Flink jobs ● Get logs from Flink TMs and JMs
  • 34. © Copyright. All rights reserved. Not to be reproduced without prior written consent. How to start locally? ● Minikube / Docker Desktop or any different local K8s env ● Ververica Platform ● Locally started Kafka cluster or use a Datagen APACHE FLINK KUBERNETES STREAMING SQL
  • 36. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Observability Observability is about measuring how well internal states of the system can be inferred from knowledge of its external outputs (according to the control theory).
  • 37. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Part One: Metrics Get metrics from environment and application - but how?
  • 38. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Prometheus - Kubernetes-native solution open-source systems monitoring and alerting toolkit joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes a lot of exporters you can write your own easily mature ecosystem PushGateway, Blackbox, AlertManager, etc.
  • 39. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Prometheus - simple or complex High Availability? Simple Complex Example solutions: Cortex (above), Thanos, M3DB
  • 40. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Pull vs. push-based monitoring Pull Push Collector takes metrics Agents push metrics Workload on central poller increases with the number of devices polled. Polling task fully distributed among agents, resulting in linear scalability. Polling protocol can potentially open up system to remote access and denial of service attacks. Push agents are inherently secure against remote attacks since they do not listen for network connections. Flexible: poller can ask for any metric at any time. Relatively inflexible: pre-determined, fixed set of measurements are periodically exported.
  • 41. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Prometheus - Stories service discovery simple on k8s limited security archived data how old data is required? monitor monitoring
  • 42. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Part Two: Logs analytics 1. Get logs from app or environment. 2. Save logs. 3. Query them. 4. Make your system self-healing and discover what’s happening inside your platform.
  • 43. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Logs analytics - which tool should I choose? Logs Analytics for Developers Logs Analytics for Business Loki ElasticSearch
  • 44. © Copyright. All rights reserved. Not to be reproduced without prior written consent. ELK vs. Loki ELK Loki + Promtail/Fluentd Indexing Keys and content of each key Only labels Query language Query DSL or Lucene QL LogQL Tool for data visualisation Kibana Grafana Query performances Faster due to indexed all the data Slower due to indexing only labels Resource requirements Higher due to the need of indexing Lower due to index only labels
  • 45. © Copyright. All rights reserved. Not to be reproduced without prior written consent. What about alerts? Alerts signify that a human needs to take action immediately in response to something that is either happening or about to happen, in order to improve the situation.
  • 47. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Flink - Complex Event Processing Article. Codebase for example.
  • 48. © Copyright. All rights reserved. Not to be reproduced without prior written consent. DevOps best practises Article.
  • 49. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes - first setup ● Minikube ● Kind ● Use Kubernetes service from public cloud provider like AWS, GCP, Azure during free tier
  • 50. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes + Flink - Operator $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/crd.yaml $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/namespace.yaml $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/role.yaml $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/role-binding.yaml $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/deploy/flinkk8soperator.yaml Verify if it works: $ kubectl -n flink-operator get po Run the example job: $ kubectl create -f https://raw.githubusercontent.com/lyft/flinkk8soperator/v0.5.0/examples/wordcount/flink-operator-custom-resou rce.yaml Verify if it is running and its status: $ kubectl get flinkapplication.flink.k8s.io -n flink-operator wordcount-operator-example -o yaml Requirements: Kubernetes cluster, kubectl
  • 51. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Kubernetes + Ververica Platform Install Ververica Platform locally with Helm $ helm repo add ververica https://charts.ververica.com $ helm install vvp ververica/ververica-platform $ helm install vvp ververica/ververica-platform --set acceptCommunityEditionLicense=true Verify if Ververica is up $ kubectl get po Access the web user interface and REST API $ kubectl port-forward service/vvp-ververica-platform 8080:80 Do you want to test Flink SQL feature? Use Flink Faker (a data generator source connector) https://github.com/knaufk/flink-faker/ It requires changing used image for vvp-gateway. Requirements: Kubernetes cluster, kubectl, Helm
  • 52. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Join Us! Data Engineer Spark, Kafka, Airflow, public cloud Link Backend Engineer Java / Scala, microservices Link MLOps Engineer MLOps tools, Python, public cloud Link DevOps / SRE GCP, Terraform, Prometheus Link
  • 53. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Q&A
  • 54. © Copyright. All rights reserved. Not to be reproduced without prior written consent. Contact details albert.lewandowski@getindata.com LinkedIn: https://www.linkedin.com/in/albert-lewandowski
  • 55. Thank you for your attention!