SlideShare a Scribd company logo

Flink in Zalando's World of Microservices

Zalando Technology
Zalando Technology
Zalando TechnologyZalando Technology

Berlin Apache Flink Meetup, May 2016 In this talk we present Zalando's microservices architecture and introduce Saiki – our next generation data integration and distribution platform on AWS. We show why we chose Apache Flink to serve as our stream processing framework and describe how we employ it for our current use cases: business process monitoring and continuous ETL. We then have an outlook on future use cases. By Javier Lopez & Mihail Vieru, Zalando, Zalando SE

Flink in Zalando's World of Microservices

1 of 54
Download to read offline
world of microservices
Flink in Zalando’s
Agenda
Zalando’s Microservices Architecture
Saiki - Data Integration and Distribution at Scale
Flink in a Microservices World
Stream Processing Use Cases:
Business Process Monitoring
Continuous ETL
Future Work
2
About us
Mihail Vieru
Big Data Engineer,
Business Intelligence
Javier López
Big Data Engineer,
Business Intelligence
3
4
15 countries
4 fulfillment centers
~18 million active customers
2.9 billion € revenue 2015
150,000+ products
10,000+ employees
One of Europe's largest
online fashion retailers
Zalando Technology
1100+ TECHNOLOGISTS
Rapidly growing
international team
http://tech.zalando.com
6
Ad

Recommended

Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mark Kromer
 
Stream Processing using Apache Flink in Zalando's World of Microservices - Re...
Stream Processing using Apache Flink in Zalando's World of Microservices - Re...Stream Processing using Apache Flink in Zalando's World of Microservices - Re...
Stream Processing using Apache Flink in Zalando's World of Microservices - Re...Zalando Technology
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & DeltaDatabricks
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Araf Karsh Hamid
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Intro to databricks delta lake
 Intro to databricks delta lake Intro to databricks delta lake
Intro to databricks delta lakeMykola Zerniuk
 

More Related Content

What's hot

Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data MeshLibbySchulze
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Building Serverless ETL Pipelines with AWS Glue
Building Serverless ETL Pipelines with AWS GlueBuilding Serverless ETL Pipelines with AWS Glue
Building Serverless ETL Pipelines with AWS GlueAmazon Web Services
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflakeSivakumar Ramar
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best PracticesEduardo Castro
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta LakeKnoldus Inc.
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
From a hack to Data Mesh (Devoxx 2022)
From a hack to Data Mesh (Devoxx 2022)From a hack to Data Mesh (Devoxx 2022)
From a hack to Data Mesh (Devoxx 2022)Simon Maurin
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company PresentationAndrewJiang18
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowDremio Corporation
 
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudHow to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudDenodo
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotXiang Fu
 
How to build an ETL pipeline with Apache Beam on Google Cloud Dataflow
How to build an ETL pipeline with Apache Beam on Google Cloud DataflowHow to build an ETL pipeline with Apache Beam on Google Cloud Dataflow
How to build an ETL pipeline with Apache Beam on Google Cloud DataflowLucas Arruda
 
Build Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingBuild Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingDatabricks
 

What's hot (20)

Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Building Serverless ETL Pipelines with AWS Glue
Building Serverless ETL Pipelines with AWS GlueBuilding Serverless ETL Pipelines with AWS Glue
Building Serverless ETL Pipelines with AWS Glue
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best Practices
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
From a hack to Data Mesh (Devoxx 2022)
From a hack to Data Mesh (Devoxx 2022)From a hack to Data Mesh (Devoxx 2022)
From a hack to Data Mesh (Devoxx 2022)
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company Presentation
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache Arrow
 
Data lake
Data lakeData lake
Data lake
 
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudHow to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
How to build an ETL pipeline with Apache Beam on Google Cloud Dataflow
How to build an ETL pipeline with Apache Beam on Google Cloud DataflowHow to build an ETL pipeline with Apache Beam on Google Cloud Dataflow
How to build an ETL pipeline with Apache Beam on Google Cloud Dataflow
 
Build Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingBuild Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks Streaming
 

Similar to Flink in Zalando's World of Microservices

Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...Flink Forward
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Value Association
 
data Artisans Product Announcement
data Artisans Product Announcementdata Artisans Product Announcement
data Artisans Product AnnouncementFlink Forward
 
What's new in Elasticsearch v5
What's new in Elasticsearch v5What's new in Elasticsearch v5
What's new in Elasticsearch v5Idan Tohami
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionRittman Analytics
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsC4Media
 
Querona Presentation 2018
Querona Presentation 2018Querona Presentation 2018
Querona Presentation 2018Synergo!
 
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...Lucas Jellema
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkFabian Hueske
 
Apache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureApache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureGyula Fóra
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingTimothy Spann
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dcBob Ward
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedInAmy W. Tang
 
Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.Fabian Hueske
 
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIsFabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIsFlink Forward
 
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and ImplyAchieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Implyconfluent
 
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...Lucas Jellema
 

Similar to Flink in Zalando's World of Microservices (20)

Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...
Javier Lopez_Mihail Vieru - Flink in Zalando's World of Microservices - Flink...
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICS
 
data Artisans Product Announcement
data Artisans Product Announcementdata Artisans Product Announcement
data Artisans Product Announcement
 
Oracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_databaseOracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_database
 
What's new in Elasticsearch v5
What's new in Elasticsearch v5What's new in Elasticsearch v5
What's new in Elasticsearch v5
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live Streams
 
Querona Presentation 2018
Querona Presentation 2018Querona Presentation 2018
Querona Presentation 2018
 
AMIS Oracle OpenWorld en Code One Review 2018 - Pillar 2: Custom Application ...
AMIS Oracle OpenWorld en Code One Review 2018 - Pillar 2: Custom Application ...AMIS Oracle OpenWorld en Code One Review 2018 - Pillar 2: Custom Application ...
AMIS Oracle OpenWorld en Code One Review 2018 - Pillar 2: Custom Application ...
 
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...
AMIS Oracle OpenWorld & CodeOne Review - Pillar 2 - Custom Application Develo...
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
Apache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureApache Flink: Past, Present and Future
Apache Flink: Past, Present and Future
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and Streaming
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dc
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 
Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.
 
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIsFabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
 
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and ImplyAchieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
 
Oow2016 review-db-dev-bigdata-BI
Oow2016 review-db-dev-bigdata-BIOow2016 review-db-dev-bigdata-BI
Oow2016 review-db-dev-bigdata-BI
 
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...
Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...
 

More from Zalando Technology

How We Made our Tech Organization and Architecture Converge Towards Scalability
How We Made our Tech Organization and Architecture Converge Towards ScalabilityHow We Made our Tech Organization and Architecture Converge Towards Scalability
How We Made our Tech Organization and Architecture Converge Towards ScalabilityZalando Technology
 
Powering Radical Agility with Docker
Powering Radical Agility with Docker Powering Radical Agility with Docker
Powering Radical Agility with Docker Zalando Technology
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
Reactive Design Patterns: a talk by Typesafe's Dr. Roland Kuhn
Reactive Design Patterns: a talk by Typesafe's Dr. Roland KuhnReactive Design Patterns: a talk by Typesafe's Dr. Roland Kuhn
Reactive Design Patterns: a talk by Typesafe's Dr. Roland KuhnZalando Technology
 
Zalando Tech: From Java to Scala in Less Than Three Months
Zalando Tech: From Java to Scala in Less Than Three MonthsZalando Tech: From Java to Scala in Less Than Three Months
Zalando Tech: From Java to Scala in Less Than Three MonthsZalando Technology
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkZalando Technology
 
Building a Reactive RESTful API with Akka Http & Slick
Building a Reactive RESTful API with Akka Http & SlickBuilding a Reactive RESTful API with Akka Http & Slick
Building a Reactive RESTful API with Akka Http & SlickZalando Technology
 
Radical Agility with Autonomous Teams and Microservices
Radical Agility with Autonomous Teams and MicroservicesRadical Agility with Autonomous Teams and Microservices
Radical Agility with Autonomous Teams and MicroservicesZalando Technology
 
Order Processing at Scale: Zalando at Camunda Community Day
Order Processing at Scale: Zalando at Camunda Community DayOrder Processing at Scale: Zalando at Camunda Community Day
Order Processing at Scale: Zalando at Camunda Community DayZalando Technology
 
ZMON: Monitoring Zalando's Engineering Platform
ZMON: Monitoring Zalando's Engineering PlatformZMON: Monitoring Zalando's Engineering Platform
ZMON: Monitoring Zalando's Engineering PlatformZalando Technology
 
Mobile Testing Challenges at Zalando Tech
Mobile Testing Challenges at Zalando TechMobile Testing Challenges at Zalando Tech
Mobile Testing Challenges at Zalando TechZalando Technology
 
Auto-scaling your API: Insights and Tips from the Zalando Team
Auto-scaling your API: Insights and Tips from the Zalando TeamAuto-scaling your API: Insights and Tips from the Zalando Team
Auto-scaling your API: Insights and Tips from the Zalando TeamZalando Technology
 
Radical Agility with Autonomous Teams and Microservices in the Cloud
Radical Agility with Autonomous Teams and Microservices in the CloudRadical Agility with Autonomous Teams and Microservices in the Cloud
Radical Agility with Autonomous Teams and Microservices in the CloudZalando Technology
 

More from Zalando Technology (13)

How We Made our Tech Organization and Architecture Converge Towards Scalability
How We Made our Tech Organization and Architecture Converge Towards ScalabilityHow We Made our Tech Organization and Architecture Converge Towards Scalability
How We Made our Tech Organization and Architecture Converge Towards Scalability
 
Powering Radical Agility with Docker
Powering Radical Agility with Docker Powering Radical Agility with Docker
Powering Radical Agility with Docker
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
Reactive Design Patterns: a talk by Typesafe's Dr. Roland Kuhn
Reactive Design Patterns: a talk by Typesafe's Dr. Roland KuhnReactive Design Patterns: a talk by Typesafe's Dr. Roland Kuhn
Reactive Design Patterns: a talk by Typesafe's Dr. Roland Kuhn
 
Zalando Tech: From Java to Scala in Less Than Three Months
Zalando Tech: From Java to Scala in Less Than Three MonthsZalando Tech: From Java to Scala in Less Than Three Months
Zalando Tech: From Java to Scala in Less Than Three Months
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
 
Building a Reactive RESTful API with Akka Http & Slick
Building a Reactive RESTful API with Akka Http & SlickBuilding a Reactive RESTful API with Akka Http & Slick
Building a Reactive RESTful API with Akka Http & Slick
 
Radical Agility with Autonomous Teams and Microservices
Radical Agility with Autonomous Teams and MicroservicesRadical Agility with Autonomous Teams and Microservices
Radical Agility with Autonomous Teams and Microservices
 
Order Processing at Scale: Zalando at Camunda Community Day
Order Processing at Scale: Zalando at Camunda Community DayOrder Processing at Scale: Zalando at Camunda Community Day
Order Processing at Scale: Zalando at Camunda Community Day
 
ZMON: Monitoring Zalando's Engineering Platform
ZMON: Monitoring Zalando's Engineering PlatformZMON: Monitoring Zalando's Engineering Platform
ZMON: Monitoring Zalando's Engineering Platform
 
Mobile Testing Challenges at Zalando Tech
Mobile Testing Challenges at Zalando TechMobile Testing Challenges at Zalando Tech
Mobile Testing Challenges at Zalando Tech
 
Auto-scaling your API: Insights and Tips from the Zalando Team
Auto-scaling your API: Insights and Tips from the Zalando TeamAuto-scaling your API: Insights and Tips from the Zalando Team
Auto-scaling your API: Insights and Tips from the Zalando Team
 
Radical Agility with Autonomous Teams and Microservices in the Cloud
Radical Agility with Autonomous Teams and Microservices in the CloudRadical Agility with Autonomous Teams and Microservices in the Cloud
Radical Agility with Autonomous Teams and Microservices in the Cloud
 

Recently uploaded

Heltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdfHeltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdfMarielaL5
 
Enhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for PartnersEnhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for PartnersThousandEyes
 
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptx
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptxEvolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptx
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptxKyle Willson
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPMemory Fabric Forum
 
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFE
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFEDNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFE
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFEandreiandasan
 
Breaking Barriers & Leveraging the Latest Developments in AI Technology
Breaking Barriers & Leveraging the Latest Developments in AI TechnologyBreaking Barriers & Leveraging the Latest Developments in AI Technology
Breaking Barriers & Leveraging the Latest Developments in AI TechnologySafe Software
 
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaBuilding Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaISPMAIndia
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr TsapFwdays
 
2024 February Patch Tuesday
2024 February Patch Tuesday2024 February Patch Tuesday
2024 February Patch TuesdayIvanti
 
Dynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringDynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringMassimo Talia
 
Curtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdfCurtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdfDomotica daVinci
 
AWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS Chicago
 
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdfQuinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdfDomotica daVinci
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build PolandGDSC PJATK
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
Bringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptxBringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptxMaarten Balliauw
 
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, GoogleISPMAIndia
 
The Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolThe Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolProduct School
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdf
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdfLLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdf
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdfThomas Poetter
 

Recently uploaded (20)

Heltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdfHeltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdf
 
Enhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for PartnersEnhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for Partners
 
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptx
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptxEvolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptx
Evolution of Chatbots: From Custom AI Chatbots and AI Chatbots for Websites.pptx
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
 
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFE
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFEDNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFE
DNA LIGASE BIOTECHNOLOGY BIOLOGY STUDY OF LIFE
 
Breaking Barriers & Leveraging the Latest Developments in AI Technology
Breaking Barriers & Leveraging the Latest Developments in AI TechnologyBreaking Barriers & Leveraging the Latest Developments in AI Technology
Breaking Barriers & Leveraging the Latest Developments in AI Technology
 
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaBuilding Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
 
2024 February Patch Tuesday
2024 February Patch Tuesday2024 February Patch Tuesday
2024 February Patch Tuesday
 
Dynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringDynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineering
 
Curtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdfCurtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdf
 
AWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user group
 
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdfQuinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build Poland
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
Bringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptxBringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptx
 
5 Tech Trend to Notice in ESG Landscape- 47Billion
5 Tech Trend to Notice in ESG Landscape- 47Billion5 Tech Trend to Notice in ESG Landscape- 47Billion
5 Tech Trend to Notice in ESG Landscape- 47Billion
 
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
 
The Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolThe Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product School
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdf
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdfLLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdf
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI.pdf
 

Flink in Zalando's World of Microservices

  • 2. Agenda Zalando’s Microservices Architecture Saiki - Data Integration and Distribution at Scale Flink in a Microservices World Stream Processing Use Cases: Business Process Monitoring Continuous ETL Future Work 2
  • 3. About us Mihail Vieru Big Data Engineer, Business Intelligence Javier López Big Data Engineer, Business Intelligence 3
  • 4. 4
  • 5. 15 countries 4 fulfillment centers ~18 million active customers 2.9 billion € revenue 2015 150,000+ products 10,000+ employees One of Europe's largest online fashion retailers
  • 6. Zalando Technology 1100+ TECHNOLOGISTS Rapidly growing international team http://tech.zalando.com 6
  • 8. Classical ETL process Vintage Business Intelligence Business Logic Data Warehouse (DWH) DatabaseDBA BI Business Logic Database Business Logic Database Business Logic Database Dev 8
  • 9. Vintage Business Intelligence Very stable architecture that is still in use in the oldest components Classical ETL process • Use-case specific • Usually outputs data into a Data Warehouse • well structured • easy to use by the end user (SQL) 9
  • 12. Autonomy Autonomous teams • can choose own technology stack • including persistence layer • are responsible for operations • should use isolated AWS accounts 12
  • 13. Supporting autonomy — Microservices Business Logic Database RESTAPI Business Logic Database RESTAPI Business Logic Database RESTAPI Business Logic Database RESTAPI Business Logic Database RESTAPI Business Logic Database RESTAPI Business Logic Database RESTAPI 13
  • 14. Supporting autonomy — Microservices Business Logic Database Team A Business Logic Database Team B RESTAPI RESTAPI Applications communicate using REST APIs Databases hidden behind the walls of AWS VPC public Internet 14
  • 15. Supporting autonomy — Microservices Business Logic Database Team A Business Logic Database Team B Classical ETL process is impossible! RESTAPI public Internet RESTAPI 15
  • 16. Supporting autonomy — Microservices Business Logic Database RESTAPI AppA Business Logic Database RESTAPI AppB Business Logic Database RESTAPI AppC Business Logic Database RESTAPI AppD 16
  • 17. Supporting autonomy — Microservices Business Intelligence Business Logic Database RESTAPI AppA Business Logic Database RESTAPI AppB Business Logic Database RESTAPI AppC Business Logic Database RESTAPI AppD 17
  • 18. Supporting autonomy — Microservices App A App B App DApp C BI Data Warehouse ? 18
  • 19. SAIKI
  • 20. SAIKI Saiki Data Platform App A App B App DApp C BI Data Warehouse 20
  • 21. SAIKI Saiki — Data Integration & Distribution App A App B App DApp C BI Data Warehouse Buku 21
  • 22. SAIKI Saiki — Data Integration & Distribution App A App B App DApp C BI Data Warehouse Buku AWS S3 Tukang REST API 22
  • 23. SAIKI Saiki — Data Integration & Distribution App A App B App DApp C BI Data Warehouse Buku Tukang REST API AWS S3 23 E.g. Forecast DB
  • 24. Saiki — Data Integration & Distribution Old Load Process New Load Process relied on Delta Loads relies on Event Stream JDBC connection RESTful HTTPS connections Data Quality could be controlled by BI independently trust for Correctness of Data in the delivery teams PostgreSQL dependent Independent of the source technology stack N to 1 data stream N to M streams, no single data sink 24
  • 25. BI Data Warehouse E.g. Forecast DB SAIKI Saiki — Data Integration & Distribution App A App B App DApp C Buku Tukang REST API AWS S3 25
  • 26. BI Data Warehouse E.g. Forecast DB SAIKI Saiki — Data Integration & Distribution App A App B App DApp C Buku Tukang REST API Stream Processing via Apache Flink AWS S3 26
  • 27. BI Data Warehouse E.g. Forecast DB SAIKI Saiki — Data Integration & Distribution App A App B App DApp C Buku Tukang REST API Stream Processing via Apache Flink Data Lake . AWS S3 27
  • 29. Current state in BI 29 • Centralized ETL tools (Pentaho DI/Kettle and Oracle Stored Procedures) • Reporting data does not arrive in real time • Data to data science & analytical teams is provided using Exasol DB • All data comes from SQL databases
  • 30. Opportunities for Next Gen BI • Cloud Computing • Distributed ETL • Scale • Access to real time data • All teams publishing data to central bus (Kafka) 30
  • 31. Opportunities for Next Gen BI • Hub for Data teams • Data Lake provides distributed access and fine grained security • Data can be transformed (aggregated, joined, etc.) before delivering it to data teams • Non structured data • “General-purpose data processing engines like Flink or Spark let you define own data types and functions.” - Fabian Hueske 31
  • 32. The right fit Stream Processing 32
  • 33. The right fit - Data processing engine (I) 33 Feature Apache Spark 1.5.2 Apache Flink 0.10.1 processing mode micro-batching tuple at a time temporal processing support processing time event time, ingestion time, processing time batch processing yes yes latency seconds sub-second back pressure handling through manual configuration implicit, through system architecture state storage distributed dataset in memory distributed key/value store state access full state scan for each microbatch value lookup by key state checkpointing yes yes high availability mode yes yes
  • 34. The right fit - Data processing engine (II) 34 Feature Apache Spark 1.5.2 Apache Flink 0.10.1 event processing guarantee exactly once exactly once Apache Kafka connector yes yes Amazon S3 connector yes yes operator library neutral ++ (split, windowByCount..) language support Java, Scala, Python Java, Scala, Python deployment standalone, YARN, Mesos standalone, YARN support neutral ++ (mailing list, direct contact & support from data Artisans) documentation + neutral maturity ++ neutral
  • 35. Saiki Data Platform Apache Flink • true stream processing framework • process events at a consistently high rate with low latency • scalable • great community and on-site support from Berlin/ Europe • university graduates with Flink skills https://tech.zalando.com/blog/apache-showdown-flink-vs.-spark/ 35
  • 36. Flink in AWS - Our appliance 36 MASTER ELB EC2 Docker Flink Master EC2 Docker Flink Shadow Master WORKERS ELB EC2 Docker Flink Worker EC2 Docker Flink Worker EC2 Docker Flink Worker
  • 39. Business Process A business process is in its simplest form a chain of correlated events: Business Events from the whole Zalando platform flow through Saiki => opportunity to process those streams in near real-time 39 start event completion event ORDER_CREATED ALL_SHIPMENTS_DONE
  • 40. Near Real-Time Business Process Monitoring • Check if technically the Zalando platform works • 1000+ Event Types • Analyze data on the fly: • Order velocities • Delivery velocities • Control SLAs of correlated events, e.g. parcel sent out after order 40
  • 41. Architecture BPM 41 App A App B Nakadi Event Bus App C Operational Systems Kafka2Kafka Unified Log Stream Processing Tokoh ZMON Cfg Service PUBLICINTERNET OAUTH Saiki BPM Elasticsearch
  • 42. How we use Flink in BPM • 1 Event Type -> 1 Kafka topic • Define a granularity level of 1 minute (TumblingWindows) • Analyze processes with multiple event types (Join & Union) • Generation and processing of Complex Events (CEP & State) 42
  • 44. Extract Transform Load (ETL) Traditional ETL process: • Batch processing • No real time • ETL tools • Heavy processing on the storage side 44
  • 45. What changed with Radical Agility? • Data comes in a semi-structured format (JSON payload) • Data is distributed in separate Kafka topics • There would be peak times, meaning that the data flow will increase by several factors • Data sources number increased by several factors 45
  • 46. Architecture Continuous ETL 46 App A App B Nakadi Event Bus App C Operational Systems Kafka2Kafka Unified Log Stream Processing Saiki Continuous ETL Tukang Oracle DWH Orang
  • 47. How we (would) use Flink in Continuous ETL • Transformation of complex payloads into simple ones for easier consumption in Oracle DWH • Combine several topics based on Business Rules (Union, Join) • Pre-Aggregate data to improve performance in the generation of reports (Windows, State) • Data cleansing • Data validation 47
  • 49. Complex Event Processing for BPM Cont. example business process: • Multiple PARCEL_SHIPPED events per order • Generate complex event ALL_SHIPMENTS_DONE, when all PARCEL_SHIPPED events received (CEP lib, State) 49
  • 50. Deployments from other BI Teams Flink Jobs from other BI Teams Requirements: • manage and control deployments • isolation of data flows • prevent different jobs from writing to the same sink • resource management in Flink • share cluster resources among concurrently running jobs StreamSQL would significantly lower the entry barrier 50
  • 51. Replace Kafka2Kafka • Python app • extracts events from REST API Nakadi Event Bus • writes them to our Kafka cluster Idea: Replace Python app with Flink Jobs, which would write directly to Saiki’s Kafka cluster (under discussion) 51
  • 52. Other Future Topics • CD integration for Flink appliance • Monitoring data flows via heartbeats • Flink <-> Kafka • Flink <-> Elasticsearch • New use cases for Real Time Analytics/ BI • Sales monitoring 52