SlideShare a Scribd company logo
The magic behind your Lyft ride prices
A case study on machine learning and streaming
Strata Data, San Francisco, March 27th 2019
Rakesh Kumar | Engineer, Pricing
Thomas Weise | @thweise | Engineer, Streaming Platform
go.lyft.com/dynamic-pricing-strata-sf-2019
Agenda
2
● Introduction to dynamic pricing
● Legacy pricing infrastructure
● Streaming use case
● Streaming based infrastructure
● Beam & multiple languages
● Beam Flink runner
● Lessons learned
3
Dynamic Pricing
Supply/Demand curve
ETA
Pricing
Notifications
Detect Delays
Coupons
User Delight
Fraud
Behaviour Fingerprinting
Monetary Impact
Imperative to act fast
Top Destinations
Core Experience
Introduction to
Dynamic Pricing
4
What is prime time?
Location + time specific multiplier on
the base fare for a ride
e.g. "in downtown SF at 5:00pm, prime
time is 2.0"
Means we double the normal fare in
that place at that time
Location: geohash6 (e.g. ‘9q8yyq’)
Time: calendar minute
5
6
● Balance supply and demand to maintain service level
● State of marketplace is constantly changing
● "Surge pricing solves the wild goose chase" (paper)
Why do we need prime time?
Legacy Pricing
Infrastructure
7
Legacy architecture: A series of cron jobs
● Ingest high volume of client app events
(Kinesis, KCL)
● Compute features (e.g. demand,
conversation rate, supply) from events
● Run ML models on features to compute
primetime for all regions (per min, per gh6)
SFO, calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC: calendar_min_1: {gh6, 2.0, gh6: 1.0, ...}
8
Problems
1. Latency
2. Code complexity (LOC)
3. Hard to add new features involving windowing/join (i.e. arbitrary demand
windows, subregional computation)
4. No dynamic / smart triggers
9
Can we use Flink?
10
11
Streaming Stack
11
Streaming
Application
(SQL, Java)
Stream / Schema
Registry
Deployment
Tooling
Metrics &
Dashboards
Alerts Logging
Amazon
EC2
Amazon S3 Wavefront
Salt
(Config / Orca)
Docker
Source Sink
12
Streaming and Python
● Flink and many other big data ecosystem projects are Java / JVM based
○ Team wants to adopt streaming, but doesn’t have the Java skills
○ Jython != Python
● Use cases for different language environments
○ Python primary option for Machine Learning
● Cost of many API styles and runtime environments
13
Solution with Beam
Streaming
Application
(Python/Beam)
Source Sink
Streaming based
Pricing Infrastructure
14
15
Pipeline (conceptual outline)
kinesis events
(source)
aggregate and
window
filter events
run models to
generate
features
(culminating in
PT)
internal services
redis
ride_requested,
app_open, ...
unique_users_per_min,
unique_requests_per_5_
min, ...
conversion learner,
eta learner, ...
Lyft apps
(phones)
valid sessions,
dedupe, ...
Details of implementation
1. Filtering (with internal service calls)
2. Aggregation with Beam windowing: 1min, 5min (by event time)
3. Triggers: watermark or stateful processing
4. Machine learning models invoked using stateful Beam transforms
5. Final gh6:pt output from pipeline stored to Redis
16
Gains
• 60% reduction in latency
• Reuse of model code
• 10K => 4K LOC
• 300 => 120 AWS instances
17
Beam and multiple
languages
18
19
The Beam Vision
1. End users: who want to write pipelines in a
language that’s familiar.
2. SDK writers: who want to make Beam
concepts available in new languages.
Includes IOs: connectors to data stores.
3. Runner writers: who have a distributed
processing environment and want to
support Beam pipelines
Beam Model: Fn Runners
Apache
Flink
Apache
Spark
Beam Model: Pipeline Construction
Other
LanguagesBeam Java
Beam
Python
Execution Execution
Cloud
Dataflow
Execution
https://s.apache.org/apache-beam-project-overview
20
Multi-Language Support
● Initially Java SDK and Java Runners
● 2016: Start of cross-language support effort
● 2017: Python SDK on Dataflow
● 2018: Go SDK (for portable runners)
● 2018: Python on Flink MVP
● Next: Cross-language pipelines, more portable runners
21
Python Example
p = beam.Pipeline(runner=runner, options=pipeline_options)
(p
| ReadFromText("/path/to/text*") | Map(lambda line: ...)
| WindowInto(FixedWindows(120)
trigger=AfterWatermark(
early=AfterProcessingTime(60),
late=AfterCount(1))
accumulation_mode=ACCUMULATING)
| CombinePerKey(sum))
| WriteToText("/path/to/outputs")
)
result = p.run()
( What, Where, When, How )
22
⋮
input | Sum.PerKey()
Python
input.apply(
Sum.integersPerKey())
Java
SELECT key, SUM(value)
FROM input GROUP BY key
SQL (via Java)
⋮
Cloud Dataflow
Apache Spark
Apache Flink
Apache Apex
Gearpump
Apache Samza
Apache Nemo
(incubating)
IBM Streams
Sum Per Key
Java objects
Sum Per Key
Dataflow JSON API
Portability (originally)
https://s.apache.org/state-of-beam-sfo-2018
23
⋮
input | Sum.PerKey()
Python
stats.Sum(s, input)
Go
SELECT key, SUM(value)
FROM input GROUP BY key
SQL (via Java)
⋮
input.apply(
Sum.integersPerKey())
Java Apache Spark
Apache Flink
Apache Apex
Gearpump
Cloud Dataflow
Apache Samza
Apache Nemo
(incubating)
IBM Streams
Sum Per Key
Java objects
Sum Per Key
Portable protos
Portability (current)
https://s.apache.org/state-of-beam-sfo-2018
Beam Flink Runner
24
25
Portability Framework w/ Flink Runner
SDK
(Python)
Job Service
Artifact
Staging
Job Manager
Fn Services
(Beam Flink Task)
Task Manager
Executor / Fn API
Provision Control Data
Artifact
Retrieval
State Logging
gRPC
Pipeline (protobuf)
ClusterRunner
Dependencies
(optional)
python -m
apache_beam.examples.wordcount 
--input=/etc/profile 
--output=/tmp/py-wordcount-direct 
--runner=PortableRunner 
--job_endpoint=localhost:8099 
--streaming
Staging Location
(DFS, S3, …)
SDK Worker
(UDFs)
SDK Worker
(UDFs)
SDK Worker
(Python)
Flink Job
26
Portable Runner
● Provide Job Service endpoint (Job Management API)
● Translate portable pipeline representation to native (Flink) API
● Provide gRPC endpoints for control/data/logging/state plane
● Manage SDK worker processes that execute user code
● Manage bundle execution (with arbitrary user code) via Fn API
● Manage state for side inputs, user state and timers
Common implementation for JVM based runners
(/runners/java-fn-execution) and portable “Validate Runner”
integration test suite in Python!
27
Fn API - Bundle Processing
https://s.apache.org/beam-fn-api-processing-a-bundle
Bundle size
matters!
● Amortize
overhead
over many
elements
● Watermark
hold effect on
latency
28
Lyft Flink Runner Customizations
● Translator extension for streaming sources
○ Kinesis, Kafka consumers that we also use in Java Flink jobs
○ Message decoding, watermarks
● Python execution environment for SDK workers
○ Tailored to internal deployment tooling
○ Docker-free, frozen virtual envs
● https://github.com/lyft/beam/tree/release-2.11.0-lyft
29
Fn API
How slow is this ?
● Fn API Overhead 15% ?
● Fused stages
● Bundle size
● Parallel SDK workers
● TODO: Cython, protobuf
C++ bindings
decode, …, window count
(messages
| 'reshuffle' >> beam.Reshuffle()
| 'decode' >> beam.Map(lambda x: (__import__('random').randint(0, 511), 1))
| 'noop1' >> beam.Map(lambda x : x)
| 'noop2' >> beam.Map(lambda x : x)
| 'noop3' >> beam.Map(lambda x : x)
| 'window' >> beam.WindowInto(window.GlobalWindows(),
trigger=Repeatedly(AfterProcessingTime(5 * 1000)),
accumulation_mode= AccumulationMode.DISCARDING)
| 'group' >> beam.GroupByKey()
| 'count' >> beam.Map(count)
)
30
Fast enough for real Python work !
● c5.4xlarge machines (16 vCPU, 32 GB)
● 16 SDK workers / machine
● 1000 ms or 1000 records / bundle
● 280,000 transforms / second / machine (~ 17,500 per worker)
● Python user code will be gating factor
31
Beam Portability Recap
● Pipelines written in non-JVM languages on JVM runners
○ Python, Go on Flink (and others)
● Full isolation of user code
○ Native CPython execution w/o library restrictions
● Configurable SDK worker execution
○ Docker, Process, Embedded, ...
● Multiple languages in a single pipeline (future)
○ Use Java Beam IO with Python
○ Use TFX with Java
○ <your use case here>
32
Feature Support Matrix (Beam 2.11.0)
https://s.apache.org/apache-beam-portability-support-table
Lessons Learned
33
Lessons Learned
• Python Beam SDK and portable Flink runner evolving
• Keep pipeline simple - Flink tasks / shuffles are not free
• Stateful processing is essential for complex logic
• Model execution latency matters
• Instrument everything for monitoring
• Approach for pipeline upgrade and restart
• Mind your dependencies - rate limit API calls
• Testing story (integration, staging)
34
We’re Hiring! Apply at www.lyft.com/careers
or email data-recruiting@lyft.com
Data Engineering
Engineering Manager
San Francisco
Software Engineer
San Francisco, Seattle, &
New York City
Data Infrastructure
Engineering Manager
San Francisco
Software Engineer
San Francisco & Seattle
Experimentation
Software Engineer
San Francisco
Streaming
Software Engineer
San Francisco
Observability
Software Engineer
San Francisco
Please ask questions!
This presentation:
http://go.lyft.com/dynamic-pricing-strata-sf-2019

More Related Content

What's hot

UX Research Case Study - OVO History Page
UX Research Case Study - OVO History PageUX Research Case Study - OVO History Page
UX Research Case Study - OVO History Page
Annisa Kharisma
 
Amazon.com: the Hidden Empire - Update 2013
Amazon.com: the Hidden Empire - Update 2013Amazon.com: the Hidden Empire - Update 2013
Amazon.com: the Hidden Empire - Update 2013
Fabernovel
 
How Zenly Nailed It - Product Methods!
How Zenly Nailed It - Product Methods!How Zenly Nailed It - Product Methods!
How Zenly Nailed It - Product Methods!
Maxime Braud
 
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
Mozza
 
A guide to product metrics by Mixpanel
A guide to product metrics by MixpanelA guide to product metrics by Mixpanel
A guide to product metrics by Mixpanel
Harsha MV
 
History Of The Development Of Mobile Applications
History Of The Development Of Mobile ApplicationsHistory Of The Development Of Mobile Applications
History Of The Development Of Mobile Applications
emmaroberts477
 
Mobile App Development
Mobile App DevelopmentMobile App Development
Mobile App DevelopmentChris Morrell
 
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
PICKASO App Marketing
 
Live Commerce - Zia Daniell Widger
Live Commerce - Zia Daniell WidgerLive Commerce - Zia Daniell Widger
Live Commerce - Zia Daniell Widger
eCommerce Institute
 
Case study: Lab + Online Usability Testing
Case study: Lab + Online Usability TestingCase study: Lab + Online Usability Testing
Case study: Lab + Online Usability Testing
UserZoom
 
App Annie: The State of Mobile 2021
App Annie: The State of Mobile 2021App Annie: The State of Mobile 2021
App Annie: The State of Mobile 2021
Léo Xavier
 
Web accessibility - WAI-ARIA a small introduction
Web accessibility - WAI-ARIA a small introductionWeb accessibility - WAI-ARIA a small introduction
Web accessibility - WAI-ARIA a small introduction
Geoffroy Baylaender
 
Mobile app-marketing-playbook
Mobile app-marketing-playbookMobile app-marketing-playbook
Mobile app-marketing-playbook
Raj Singh
 
Advantages Of Geofencing
Advantages Of GeofencingAdvantages Of Geofencing
Advantages Of Geofencing
WebEngage
 
Android vs ios System Architecture in OS perspective
Android vs ios System Architecture in OS perspectiveAndroid vs ios System Architecture in OS perspective
Android vs ios System Architecture in OS perspective
Raj Pratim Bhattacharya
 
Amazon go- Just walk-out technology
Amazon go- Just walk-out technologyAmazon go- Just walk-out technology
Amazon go- Just walk-out technology
Avinash Vemula
 
UBER-Current Strategy, Competition Analysis and Global Expansion
UBER-Current Strategy, Competition Analysis and Global ExpansionUBER-Current Strategy, Competition Analysis and Global Expansion
UBER-Current Strategy, Competition Analysis and Global Expansion
Shaminder Saini
 
Wave 7: The Story of Why - Social media landscape in Singapore and SEA
Wave 7: The Story of Why - Social media landscape in Singapore and SEA Wave 7: The Story of Why - Social media landscape in Singapore and SEA
Wave 7: The Story of Why - Social media landscape in Singapore and SEA
madhavitumkur
 
Retail Technology Trends
Retail Technology TrendsRetail Technology Trends
Retail Technology Trends
Marie Weaver
 
Webinar - GA4 vs MATOMO.pdf
Webinar - GA4 vs MATOMO.pdfWebinar - GA4 vs MATOMO.pdf
Webinar - GA4 vs MATOMO.pdf
Julien Dereumaux
 

What's hot (20)

UX Research Case Study - OVO History Page
UX Research Case Study - OVO History PageUX Research Case Study - OVO History Page
UX Research Case Study - OVO History Page
 
Amazon.com: the Hidden Empire - Update 2013
Amazon.com: the Hidden Empire - Update 2013Amazon.com: the Hidden Empire - Update 2013
Amazon.com: the Hidden Empire - Update 2013
 
How Zenly Nailed It - Product Methods!
How Zenly Nailed It - Product Methods!How Zenly Nailed It - Product Methods!
How Zenly Nailed It - Product Methods!
 
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
Organic Acquisition: How To Acquire A Million Users With Zero Marketing (mobi...
 
A guide to product metrics by Mixpanel
A guide to product metrics by MixpanelA guide to product metrics by Mixpanel
A guide to product metrics by Mixpanel
 
History Of The Development Of Mobile Applications
History Of The Development Of Mobile ApplicationsHistory Of The Development Of Mobile Applications
History Of The Development Of Mobile Applications
 
Mobile App Development
Mobile App DevelopmentMobile App Development
Mobile App Development
 
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
2021 / 2022 ASO Guide - App Store Optimization Guide by PICKASO
 
Live Commerce - Zia Daniell Widger
Live Commerce - Zia Daniell WidgerLive Commerce - Zia Daniell Widger
Live Commerce - Zia Daniell Widger
 
Case study: Lab + Online Usability Testing
Case study: Lab + Online Usability TestingCase study: Lab + Online Usability Testing
Case study: Lab + Online Usability Testing
 
App Annie: The State of Mobile 2021
App Annie: The State of Mobile 2021App Annie: The State of Mobile 2021
App Annie: The State of Mobile 2021
 
Web accessibility - WAI-ARIA a small introduction
Web accessibility - WAI-ARIA a small introductionWeb accessibility - WAI-ARIA a small introduction
Web accessibility - WAI-ARIA a small introduction
 
Mobile app-marketing-playbook
Mobile app-marketing-playbookMobile app-marketing-playbook
Mobile app-marketing-playbook
 
Advantages Of Geofencing
Advantages Of GeofencingAdvantages Of Geofencing
Advantages Of Geofencing
 
Android vs ios System Architecture in OS perspective
Android vs ios System Architecture in OS perspectiveAndroid vs ios System Architecture in OS perspective
Android vs ios System Architecture in OS perspective
 
Amazon go- Just walk-out technology
Amazon go- Just walk-out technologyAmazon go- Just walk-out technology
Amazon go- Just walk-out technology
 
UBER-Current Strategy, Competition Analysis and Global Expansion
UBER-Current Strategy, Competition Analysis and Global ExpansionUBER-Current Strategy, Competition Analysis and Global Expansion
UBER-Current Strategy, Competition Analysis and Global Expansion
 
Wave 7: The Story of Why - Social media landscape in Singapore and SEA
Wave 7: The Story of Why - Social media landscape in Singapore and SEA Wave 7: The Story of Why - Social media landscape in Singapore and SEA
Wave 7: The Story of Why - Social media landscape in Singapore and SEA
 
Retail Technology Trends
Retail Technology TrendsRetail Technology Trends
Retail Technology Trends
 
Webinar - GA4 vs MATOMO.pdf
Webinar - GA4 vs MATOMO.pdfWebinar - GA4 vs MATOMO.pdf
Webinar - GA4 vs MATOMO.pdf
 

Similar to The magic behind your Lyft ride prices: A case study on machine learning and streaming

Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Thomas Weise
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward
 
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Thomas Weise
 
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
Flink Forward
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on Flink
Aljoscha Krettek
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
Prateek Maheshwari
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Sean Zhong
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
markgrover
 
Deep Dive into SpaceONE
Deep Dive into SpaceONEDeep Dive into SpaceONE
Deep Dive into SpaceONE
Choonho Son
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation Workbench
Rafael Ferreira da Silva
 
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
Bowen Li
 
Unified Stream Processing at Scale with Apache Samza - BDS2017
Unified Stream Processing at Scale with Apache Samza - BDS2017Unified Stream Processing at Scale with Apache Samza - BDS2017
Unified Stream Processing at Scale with Apache Samza - BDS2017
Jacob Maes
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
GoDataDriven
 
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming APIFlink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward
 
Docker serverless v1.0
Docker serverless v1.0Docker serverless v1.0
Docker serverless v1.0
Thomas Chacko
 
So you think you can stream.pptx
So you think you can stream.pptxSo you think you can stream.pptx
So you think you can stream.pptx
Prakash Chockalingam
 
nuclio Overview October 2017
nuclio Overview October 2017nuclio Overview October 2017
nuclio Overview October 2017
iguazio
 
IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018
Trayan Iliev
 
Apache Flink - a Gentle Start
Apache Flink - a Gentle StartApache Flink - a Gentle Start
Apache Flink - a Gentle Start
Liangjun Jiang
 

Similar to The magic behind your Lyft ride prices: A case study on machine learning and streaming (20)

Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
 
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
 
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
Flink Forward Berlin 2018: Thomas Weise & Aljoscha Krettek - "Python Streamin...
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on Flink
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
 
Deep Dive into SpaceONE
Deep Dive into SpaceONEDeep Dive into SpaceONE
Deep Dive into SpaceONE
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation Workbench
 
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018
 
Unified Stream Processing at Scale with Apache Samza - BDS2017
Unified Stream Processing at Scale with Apache Samza - BDS2017Unified Stream Processing at Scale with Apache Samza - BDS2017
Unified Stream Processing at Scale with Apache Samza - BDS2017
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming APIFlink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
 
Docker serverless v1.0
Docker serverless v1.0Docker serverless v1.0
Docker serverless v1.0
 
So you think you can stream.pptx
So you think you can stream.pptxSo you think you can stream.pptx
So you think you can stream.pptx
 
nuclio Overview October 2017
nuclio Overview October 2017nuclio Overview October 2017
nuclio Overview October 2017
 
IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018
 
Apache Flink - a Gentle Start
Apache Flink - a Gentle StartApache Flink - a Gentle Start
Apache Flink - a Gentle Start
 

More from Karthik Murugesan

Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
Karthik Murugesan
 
Yahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesYahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slides
Karthik Murugesan
 
Free servers to build Big Data Systems on: Bing's Approach
Free servers to build Big Data Systems on: Bing's  Approach Free servers to build Big Data Systems on: Bing's  Approach
Free servers to build Big Data Systems on: Bing's Approach
Karthik Murugesan
 
Microsoft cosmos
Microsoft cosmosMicrosoft cosmos
Microsoft cosmos
Karthik Murugesan
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER Introduction
Karthik Murugesan
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
Karthik Murugesan
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
Karthik Murugesan
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019
Karthik Murugesan
 
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Unifying Twitter around a single ML platform  - Twitter AI Platform 2019Unifying Twitter around a single ML platform  - Twitter AI Platform 2019
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Karthik Murugesan
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019
Karthik Murugesan
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
Karthik Murugesan
 
Developing a ML model using TF Estimator
Developing a ML model using TF EstimatorDeveloping a ML model using TF Estimator
Developing a ML model using TF Estimator
Karthik Murugesan
 
Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018
Karthik Murugesan
 
Netflix factstore for recommendations - 2018
Netflix factstore  for recommendations - 2018Netflix factstore  for recommendations - 2018
Netflix factstore for recommendations - 2018
Karthik Murugesan
 
Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Trends in Music Recommendations 2018
Trends in Music Recommendations 2018
Karthik Murugesan
 
Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017
Karthik Murugesan
 
State Of AI 2018
State Of AI 2018State Of AI 2018
State Of AI 2018
Karthik Murugesan
 
Spotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoverySpotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music Discovery
Karthik Murugesan
 
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
Karthik Murugesan
 
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Karthik Murugesan
 

More from Karthik Murugesan (20)

Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
 
Yahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesYahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slides
 
Free servers to build Big Data Systems on: Bing's Approach
Free servers to build Big Data Systems on: Bing's  Approach Free servers to build Big Data Systems on: Bing's  Approach
Free servers to build Big Data Systems on: Bing's Approach
 
Microsoft cosmos
Microsoft cosmosMicrosoft cosmos
Microsoft cosmos
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER Introduction
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019
 
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Unifying Twitter around a single ML platform  - Twitter AI Platform 2019Unifying Twitter around a single ML platform  - Twitter AI Platform 2019
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
 
Developing a ML model using TF Estimator
Developing a ML model using TF EstimatorDeveloping a ML model using TF Estimator
Developing a ML model using TF Estimator
 
Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018
 
Netflix factstore for recommendations - 2018
Netflix factstore  for recommendations - 2018Netflix factstore  for recommendations - 2018
Netflix factstore for recommendations - 2018
 
Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Trends in Music Recommendations 2018
Trends in Music Recommendations 2018
 
Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017
 
State Of AI 2018
State Of AI 2018State Of AI 2018
State Of AI 2018
 
Spotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoverySpotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music Discovery
 
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
 
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
 

Recently uploaded

JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
Javier Lasa
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
harveenkaur52
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
keoku
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
ufdana
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Florence Consulting
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
CIOWomenMagazine
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
eutxy
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Brad Spiegel Macon GA
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 

Recently uploaded (20)

JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
 
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
一比一原版(SLU毕业证)圣路易斯大学毕业证成绩单专业办理
 
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
一比一原版(CSU毕业证)加利福尼亚州立大学毕业证成绩单专业办理
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptx
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 

The magic behind your Lyft ride prices: A case study on machine learning and streaming

  • 1. The magic behind your Lyft ride prices A case study on machine learning and streaming Strata Data, San Francisco, March 27th 2019 Rakesh Kumar | Engineer, Pricing Thomas Weise | @thweise | Engineer, Streaming Platform go.lyft.com/dynamic-pricing-strata-sf-2019
  • 2. Agenda 2 ● Introduction to dynamic pricing ● Legacy pricing infrastructure ● Streaming use case ● Streaming based infrastructure ● Beam & multiple languages ● Beam Flink runner ● Lessons learned
  • 3. 3 Dynamic Pricing Supply/Demand curve ETA Pricing Notifications Detect Delays Coupons User Delight Fraud Behaviour Fingerprinting Monetary Impact Imperative to act fast Top Destinations Core Experience
  • 5. What is prime time? Location + time specific multiplier on the base fare for a ride e.g. "in downtown SF at 5:00pm, prime time is 2.0" Means we double the normal fare in that place at that time Location: geohash6 (e.g. ‘9q8yyq’) Time: calendar minute 5
  • 6. 6 ● Balance supply and demand to maintain service level ● State of marketplace is constantly changing ● "Surge pricing solves the wild goose chase" (paper) Why do we need prime time?
  • 8. Legacy architecture: A series of cron jobs ● Ingest high volume of client app events (Kinesis, KCL) ● Compute features (e.g. demand, conversation rate, supply) from events ● Run ML models on features to compute primetime for all regions (per min, per gh6) SFO, calendar_min_1: {gh6: 1.0, gh6: 2.0, ...} NYC: calendar_min_1: {gh6, 2.0, gh6: 1.0, ...} 8
  • 9. Problems 1. Latency 2. Code complexity (LOC) 3. Hard to add new features involving windowing/join (i.e. arbitrary demand windows, subregional computation) 4. No dynamic / smart triggers 9
  • 10. Can we use Flink? 10
  • 11. 11 Streaming Stack 11 Streaming Application (SQL, Java) Stream / Schema Registry Deployment Tooling Metrics & Dashboards Alerts Logging Amazon EC2 Amazon S3 Wavefront Salt (Config / Orca) Docker Source Sink
  • 12. 12 Streaming and Python ● Flink and many other big data ecosystem projects are Java / JVM based ○ Team wants to adopt streaming, but doesn’t have the Java skills ○ Jython != Python ● Use cases for different language environments ○ Python primary option for Machine Learning ● Cost of many API styles and runtime environments
  • 15. 15 Pipeline (conceptual outline) kinesis events (source) aggregate and window filter events run models to generate features (culminating in PT) internal services redis ride_requested, app_open, ... unique_users_per_min, unique_requests_per_5_ min, ... conversion learner, eta learner, ... Lyft apps (phones) valid sessions, dedupe, ...
  • 16. Details of implementation 1. Filtering (with internal service calls) 2. Aggregation with Beam windowing: 1min, 5min (by event time) 3. Triggers: watermark or stateful processing 4. Machine learning models invoked using stateful Beam transforms 5. Final gh6:pt output from pipeline stored to Redis 16
  • 17. Gains • 60% reduction in latency • Reuse of model code • 10K => 4K LOC • 300 => 120 AWS instances 17
  • 19. 19 The Beam Vision 1. End users: who want to write pipelines in a language that’s familiar. 2. SDK writers: who want to make Beam concepts available in new languages. Includes IOs: connectors to data stores. 3. Runner writers: who have a distributed processing environment and want to support Beam pipelines Beam Model: Fn Runners Apache Flink Apache Spark Beam Model: Pipeline Construction Other LanguagesBeam Java Beam Python Execution Execution Cloud Dataflow Execution https://s.apache.org/apache-beam-project-overview
  • 20. 20 Multi-Language Support ● Initially Java SDK and Java Runners ● 2016: Start of cross-language support effort ● 2017: Python SDK on Dataflow ● 2018: Go SDK (for portable runners) ● 2018: Python on Flink MVP ● Next: Cross-language pipelines, more portable runners
  • 21. 21 Python Example p = beam.Pipeline(runner=runner, options=pipeline_options) (p | ReadFromText("/path/to/text*") | Map(lambda line: ...) | WindowInto(FixedWindows(120) trigger=AfterWatermark( early=AfterProcessingTime(60), late=AfterCount(1)) accumulation_mode=ACCUMULATING) | CombinePerKey(sum)) | WriteToText("/path/to/outputs") ) result = p.run() ( What, Where, When, How )
  • 22. 22 ⋮ input | Sum.PerKey() Python input.apply( Sum.integersPerKey()) Java SELECT key, SUM(value) FROM input GROUP BY key SQL (via Java) ⋮ Cloud Dataflow Apache Spark Apache Flink Apache Apex Gearpump Apache Samza Apache Nemo (incubating) IBM Streams Sum Per Key Java objects Sum Per Key Dataflow JSON API Portability (originally) https://s.apache.org/state-of-beam-sfo-2018
  • 23. 23 ⋮ input | Sum.PerKey() Python stats.Sum(s, input) Go SELECT key, SUM(value) FROM input GROUP BY key SQL (via Java) ⋮ input.apply( Sum.integersPerKey()) Java Apache Spark Apache Flink Apache Apex Gearpump Cloud Dataflow Apache Samza Apache Nemo (incubating) IBM Streams Sum Per Key Java objects Sum Per Key Portable protos Portability (current) https://s.apache.org/state-of-beam-sfo-2018
  • 25. 25 Portability Framework w/ Flink Runner SDK (Python) Job Service Artifact Staging Job Manager Fn Services (Beam Flink Task) Task Manager Executor / Fn API Provision Control Data Artifact Retrieval State Logging gRPC Pipeline (protobuf) ClusterRunner Dependencies (optional) python -m apache_beam.examples.wordcount --input=/etc/profile --output=/tmp/py-wordcount-direct --runner=PortableRunner --job_endpoint=localhost:8099 --streaming Staging Location (DFS, S3, …) SDK Worker (UDFs) SDK Worker (UDFs) SDK Worker (Python) Flink Job
  • 26. 26 Portable Runner ● Provide Job Service endpoint (Job Management API) ● Translate portable pipeline representation to native (Flink) API ● Provide gRPC endpoints for control/data/logging/state plane ● Manage SDK worker processes that execute user code ● Manage bundle execution (with arbitrary user code) via Fn API ● Manage state for side inputs, user state and timers Common implementation for JVM based runners (/runners/java-fn-execution) and portable “Validate Runner” integration test suite in Python!
  • 27. 27 Fn API - Bundle Processing https://s.apache.org/beam-fn-api-processing-a-bundle Bundle size matters! ● Amortize overhead over many elements ● Watermark hold effect on latency
  • 28. 28 Lyft Flink Runner Customizations ● Translator extension for streaming sources ○ Kinesis, Kafka consumers that we also use in Java Flink jobs ○ Message decoding, watermarks ● Python execution environment for SDK workers ○ Tailored to internal deployment tooling ○ Docker-free, frozen virtual envs ● https://github.com/lyft/beam/tree/release-2.11.0-lyft
  • 29. 29 Fn API How slow is this ? ● Fn API Overhead 15% ? ● Fused stages ● Bundle size ● Parallel SDK workers ● TODO: Cython, protobuf C++ bindings decode, …, window count (messages | 'reshuffle' >> beam.Reshuffle() | 'decode' >> beam.Map(lambda x: (__import__('random').randint(0, 511), 1)) | 'noop1' >> beam.Map(lambda x : x) | 'noop2' >> beam.Map(lambda x : x) | 'noop3' >> beam.Map(lambda x : x) | 'window' >> beam.WindowInto(window.GlobalWindows(), trigger=Repeatedly(AfterProcessingTime(5 * 1000)), accumulation_mode= AccumulationMode.DISCARDING) | 'group' >> beam.GroupByKey() | 'count' >> beam.Map(count) )
  • 30. 30 Fast enough for real Python work ! ● c5.4xlarge machines (16 vCPU, 32 GB) ● 16 SDK workers / machine ● 1000 ms or 1000 records / bundle ● 280,000 transforms / second / machine (~ 17,500 per worker) ● Python user code will be gating factor
  • 31. 31 Beam Portability Recap ● Pipelines written in non-JVM languages on JVM runners ○ Python, Go on Flink (and others) ● Full isolation of user code ○ Native CPython execution w/o library restrictions ● Configurable SDK worker execution ○ Docker, Process, Embedded, ... ● Multiple languages in a single pipeline (future) ○ Use Java Beam IO with Python ○ Use TFX with Java ○ <your use case here>
  • 32. 32 Feature Support Matrix (Beam 2.11.0) https://s.apache.org/apache-beam-portability-support-table
  • 34. Lessons Learned • Python Beam SDK and portable Flink runner evolving • Keep pipeline simple - Flink tasks / shuffles are not free • Stateful processing is essential for complex logic • Model execution latency matters • Instrument everything for monitoring • Approach for pipeline upgrade and restart • Mind your dependencies - rate limit API calls • Testing story (integration, staging) 34
  • 35. We’re Hiring! Apply at www.lyft.com/careers or email data-recruiting@lyft.com Data Engineering Engineering Manager San Francisco Software Engineer San Francisco, Seattle, & New York City Data Infrastructure Engineering Manager San Francisco Software Engineer San Francisco & Seattle Experimentation Software Engineer San Francisco Streaming Software Engineer San Francisco Observability Software Engineer San Francisco
  • 36. Please ask questions! This presentation: http://go.lyft.com/dynamic-pricing-strata-sf-2019