SlideShare a Scribd company logo
1 of 51
Download to read offline
Kevin Lam
Twitter: @CodesKlam
Apache Flink
Adoption @ Shopify
Team Mission:
enable near real-time
stream processing
applications
@CodesKlam
2M+ ~175
Merchants Countries
@CodesKlam
N to 1 @CodesKlam
@CodesKlam
@CodesKlam
Problems
@CodesKlam
Too specialized
Problems
@CodesKlam
Too specialized
Broken Ownership
Problems
@CodesKlam
Too specialized
Broken Ownership
Unstable
Problems
@CodesKlam
👋
@CodesKlam
Why Flink?
@CodesKlam
Driven by pilot use case
Why Flink?
@CodesKlam
Driven by pilot use case
Large complex stateful transforms
Why Flink?
@CodesKlam
Driven by pilot use case
Large complex stateful transforms
Consolidate o
ff
ers
Why Flink?
@CodesKlam
Why Flink?
@CodesKlam
1. Streaming Primitives
Why Flink?
@CodesKlam
1. Streaming Primitives
2. Infrastructure
Why Flink?
@CodesKlam
1. Streaming Primitives
2. Infrastructure
3. Scalability
Why Flink?
@CodesKlam
1. Streaming Primitives
2. Infrastructure
3. Scalability
4. Community Support
5. Ease of Programming + Languages
Why Flink?
@CodesKlam
Enable Flink Adoption
Deprecated and shutdown the
old services
The plan
@CodesKlam
Enable Flink Adoption
Deprecated and shutdown the
old services
The plan
@CodesKlam
Our strategy:
an ecosystem of tools
to make everything
possible, then make it
easy
@CodesKlam
Foundations
@CodesKlam
Library
Foundations
@CodesKlam
Library
Deployment Infra & Tooling
Foundations
@CodesKlam
Library
Deployment Infra & Tooling
Cookbook
Foundations
@CodesKlam
Library
Deployment Infra & Tooling
Cookbook
Foundations
Hands-on Support
@CodesKlam
Library
Deployment Infra & Tooling
Cookbook
Foundations
Hands-on Support
Project Generator
@CodesKlam
@CodesKlam
@CodesKlam
@CodesKlam
@CodesKlam
@CodesKlam
@CodesKlam
@CodesKlam
implicit val env: StreamExecutionEnvironment = Trickle.createEnv()
@CodesKlam
implicit val env: StreamExecutionEnvironment = Trickle.createEnv()
val checkoutTrackSource: DataStream[CheckoutTrack_V3] =
MonorailKafkaSource.asDataStream[CheckoutTrack_V3](…)
val lineItemsSource: DataStream[Change[LineItemRecord]] =
CDCKafkaSource.asDataStream[LineItemRecord](…)
val sink: Sink[VendorPopularityTopology.Result] = BigtableSink[Result]
(…)
@CodesKlam
implicit val env: StreamExecutionEnvironment = Trickle.createEnv()
val checkoutTrackSource: DataStream[CheckoutTrack_V3] =
MonorailKafkaSource.asDataStream[CheckoutTrack_V3](…)
val lineItemsSource: DataStream[Change[LineItemRecord]] =
CDCKafkaSource.asDataStream[LineItemRecord](…)
val sink: Sink[VendorPopularityTopology.Result] = BigtableSink[Result]
(…)
val checkouts = processCheckoutTrackSource(checkoutTrackSource)
val lineItems = processLineItemsSource(lineItemsSource)
val results = aggregateJoinResults(
joinCheckoutsAndLineItems(checkouts, lineItems)
)
@CodesKlam
implicit val env: StreamExecutionEnvironment = Trickle.createEnv()
val checkoutTrackSource: DataStream[CheckoutTrack_V3] =
MonorailKafkaSource.asDataStream[CheckoutTrack_V3](…)
val lineItemsSource: DataStream[Change[LineItemRecord]] =
CDCKafkaSource.asDataStream[LineItemRecord](…)
val sink: Sink[VendorPopularityTopology.Result] = BigtableSink[Result]
(…)
val checkouts = processCheckoutTrackSource(checkoutTrackSource)
val lineItems = processLineItemsSource(lineItemsSource)
val results = aggregateJoinResults(
joinCheckoutsAndLineItems(checkouts, lineItems)
)
results.sinkTo(sink)
@CodesKlam
implicit val env: StreamExecutionEnvironment = Trickle.createEnv()
val checkoutTrackSource: DataStream[CheckoutTrack_V3] =
MonorailKafkaSource.asDataStream[CheckoutTrack_V3](…)
val lineItemsSource: DataStream[Change[LineItemRecord]] =
CDCKafkaSource.asDataStream[LineItemRecord](…)
val sink: Sink[VendorPopularityTopology.Result] = BigtableSink[Result]
(…)
val checkouts = processCheckoutTrackSource(checkoutTrackSource)
val lineItems = processLineItemsSource(lineItemsSource)
val results = aggregateJoinResults(
joinCheckoutsAndLineItems(checkouts, lineItems)
)
results.sinkTo(sink)
@CodesKlam
How did it go?
@CodesKlam
Made Flink generally available in Jan 2022
How did it go?
@CodesKlam
Made Flink generally available in Jan 2022
>75 applications in production
Millions of events processed per second
How did it go?
@CodesKlam
Made Flink generally available in Jan 2022
>75 applications in production
Millions of events processed per second
How did it go?
Legacy Systems shut down
@CodesKlam
🧠 Learnings
@CodesKlam
Treasure
f
irst customers
🧠 Learnings
@CodesKlam
Treasure
f
irst customers
Be nimble
🧠 Learnings
@CodesKlam
Treasure
f
irst customers
Be nimble
Distributed ownership
🧠 Learnings
@CodesKlam
Low Code Flink Applications
Infrastructure Automation
And more!
Next Steps
@CodesKlam
Thanks!
twitter: @CodesKlam
Also, we’re hiring! shopify.com/careers

More Related Content

What's hot

What's hot (20)

Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at AlibabaFlink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at Alibaba
 
Reading The Source Code of Presto
Reading The Source Code of PrestoReading The Source Code of Presto
Reading The Source Code of Presto
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!
 
Mvcc (oracle, innodb, postgres)
Mvcc (oracle, innodb, postgres)Mvcc (oracle, innodb, postgres)
Mvcc (oracle, innodb, postgres)
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Optimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File PruningOptimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File Pruning
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Real Time UI with Apache Kafka Streaming Analytics of Fast Data and Server Push
Real Time UI with Apache Kafka Streaming Analytics of Fast Data and Server PushReal Time UI with Apache Kafka Streaming Analytics of Fast Data and Server Push
Real Time UI with Apache Kafka Streaming Analytics of Fast Data and Server Push
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 

Similar to Apache Flink Adoption @ Shopify

Have your Cake and Eat it Too - Architecture for Batch and Real-time processing
Have your Cake and Eat it Too - Architecture for Batch and Real-time processingHave your Cake and Eat it Too - Architecture for Batch and Real-time processing
Have your Cake and Eat it Too - Architecture for Batch and Real-time processing
DataWorks Summit
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
Ben Stopford
 
Scaling Rails With Torquebox Presented at JUDCon:2011 Boston
Scaling Rails With Torquebox Presented at JUDCon:2011 BostonScaling Rails With Torquebox Presented at JUDCon:2011 Boston
Scaling Rails With Torquebox Presented at JUDCon:2011 Boston
benbrowning
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward
 
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
Kai Wähner
 
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and ConfluentEvent Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
confluent
 

Similar to Apache Flink Adoption @ Shopify (20)

ColdFusion Keynote: Building the Agile Web Since 1995
ColdFusion Keynote: Building the Agile Web Since 1995ColdFusion Keynote: Building the Agile Web Since 1995
ColdFusion Keynote: Building the Agile Web Since 1995
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Have your cake and eat it too
Have your cake and eat it tooHave your cake and eat it too
Have your cake and eat it too
 
Have your Cake and Eat it Too - Architecture for Batch and Real-time processing
Have your Cake and Eat it Too - Architecture for Batch and Real-time processingHave your Cake and Eat it Too - Architecture for Batch and Real-time processing
Have your Cake and Eat it Too - Architecture for Batch and Real-time processing
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
 
Scaling Rails With Torquebox Presented at JUDCon:2011 Boston
Scaling Rails With Torquebox Presented at JUDCon:2011 BostonScaling Rails With Torquebox Presented at JUDCon:2011 Boston
Scaling Rails With Torquebox Presented at JUDCon:2011 Boston
 
Taming WebSocket with Scarlet
Taming WebSocket with ScarletTaming WebSocket with Scarlet
Taming WebSocket with Scarlet
 
What's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoWhat's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at Cisco
 
Full-Stack Reativo com Spring WebFlux + Angular - Devs Java Girl
Full-Stack Reativo com Spring WebFlux + Angular - Devs Java GirlFull-Stack Reativo com Spring WebFlux + Angular - Devs Java Girl
Full-Stack Reativo com Spring WebFlux + Angular - Devs Java Girl
 
session_01_react_.pptx
session_01_react_.pptxsession_01_react_.pptx
session_01_react_.pptx
 
Building Advanced Serverless Workflows with AWS Step Functions | AWS Floor28
Building Advanced Serverless Workflows with AWS Step Functions | AWS Floor28Building Advanced Serverless Workflows with AWS Step Functions | AWS Floor28
Building Advanced Serverless Workflows with AWS Step Functions | AWS Floor28
 
The use case of a scalable architecture
The use case of a scalable architectureThe use case of a scalable architecture
The use case of a scalable architecture
 
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
Flink Forward San Francisco 2018: Fabian Hueske & Timo Walther - "Why and how...
 
Real time dashboards with Kafka and Druid
Real time dashboards with Kafka and DruidReal time dashboards with Kafka and Druid
Real time dashboards with Kafka and Druid
 
How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.
 
Vertica the convertro way
Vertica   the convertro wayVertica   the convertro way
Vertica the convertro way
 
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
Apache Kafka in the Telco Industry (OSS, BSS, OTT, IMS, NFV, Middleware, Main...
 
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and ConfluentEvent Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
 
Deploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePointDeploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePoint
 
Mike Taulty MIX10 Silverlight Frameworks and Patterns
Mike Taulty MIX10 Silverlight Frameworks and PatternsMike Taulty MIX10 Silverlight Frameworks and Patterns
Mike Taulty MIX10 Silverlight Frameworks and Patterns
 

Recently uploaded

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 

Recently uploaded (20)

WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
 
WSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration ToolingWSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration Tooling
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 
WSO2Con2024 - Organization Management: The Revolution in B2B CIAM
WSO2Con2024 - Organization Management: The Revolution in B2B CIAMWSO2Con2024 - Organization Management: The Revolution in B2B CIAM
WSO2Con2024 - Organization Management: The Revolution in B2B CIAM
 
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & InnovationWSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
 
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
 
WSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid EnvironmentsWSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid Environments
 
WSO2CON 2024 - Software Engineering for Digital Businesses
WSO2CON 2024 - Software Engineering for Digital BusinessesWSO2CON 2024 - Software Engineering for Digital Businesses
WSO2CON 2024 - Software Engineering for Digital Businesses
 
Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next IntegrationWSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdfAzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
 

Apache Flink Adoption @ Shopify