4. What is prime time?
A location- and time-specific multiplier on the base fare for a ride.
e.g. "in downtown SF at 5:02pm, prime time is 2.0"
means we double the normal fare in that place at that time.
location: geohash6 (e.g. 'abc123')
time: calendar minute (i.e. UTC epoch time rounded to the nearest minute)
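The definition above can be sketched in a few lines of Python (an illustration only; the function names are hypothetical, not Lyft's actual code):

```python
def calendar_minute(epoch_seconds: float) -> int:
    """UTC epoch time rounded to the nearest minute, kept as epoch seconds."""
    return round(epoch_seconds / 60) * 60

def prime_time_fare(base_fare: float, multiplier: float) -> float:
    """Apply the location/time-specific prime time multiplier to the base fare."""
    return base_fare * multiplier

# A prime time lookup is keyed by (geohash6, calendar minute):
key = ("abc123", calendar_minute(125))  # -> ("abc123", 120)
fare = prime_time_fare(10.0, 2.0)       # a PT of 2.0 doubles the fare
```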
5. Why do we need prime time?
● Balance supply and demand to maintain service level
● State of marketplace is constantly changing
● "Surge pricing solves the wild goose chase" (paper)
7. Existing architecture: a series of cron jobs
● Ingest a high volume of client app events
(Kinesis)
● Compute features (e.g. demand,
conversion rate) from events
● Run ML models on features to compute
prime time for all regions (per minute, per gh6)
SFO: calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC: calendar_min_1: {gh6: 2.0, gh6: 1.0, ...}
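The per-region output shown above can be read as a nested map; a minimal lookup sketch (the shape and geohash values are illustrative, and the neutral default of 1.0 is an assumption):

```python
# Illustrative shape of the cron jobs' output: region -> calendar minute -> gh6 -> PT
prime_time = {
    "SFO": {"calendar_min_1": {"9q8yyk": 1.0, "9q8yym": 2.0}},
    "NYC": {"calendar_min_1": {"dr5ru7": 2.0, "dr5ruu": 1.0}},
}

def lookup_pt(region: str, minute: str, gh6: str, default: float = 1.0) -> float:
    """Fall back to a neutral 1.0 multiplier when no PT is set for a cell."""
    return prime_time.get(region, {}).get(minute, {}).get(gh6, default)
```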
12. Pipeline (conceptual outline)
kinesis events (source) → filter events → aggregate and window → run models to generate features (culminating in PT) → redis
● Source: Lyft apps (phones) emit events to Kinesis: ride_requested, app_open, ...
● Filtering (with internal services): valid sessions, dedupe, ...
● Aggregations: unique_users_per_min, unique_requests_per_5_min, ...
● Models: conversion learner, eta learner, ...
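The aggregate-and-window stage above (e.g. unique_users_per_min, keyed per gh6) can be sketched without Beam as a plain-Python simulation; the event tuples and values are hypothetical:

```python
from collections import defaultdict

def unique_users_per_min(events):
    """events: iterable of (event_type, user_id, event_time_secs, gh6).
    Returns {(window_start, gh6): unique user count} over 1-min fixed
    windows assigned by event time."""
    windows = defaultdict(set)
    for _etype, user, t, gh6 in events:
        window_start = int(t // 60) * 60  # event-time window assignment
        windows[(window_start, gh6)].add(user)
    return {k: len(v) for k, v in windows.items()}

events = [
    ("app_open", "u1", 5, "9q8yyk"),
    ("ride_requested", "u1", 30, "9q8yyk"),  # same user: still 1 unique
    ("app_open", "u2", 40, "9q8yyk"),
    ("app_open", "u3", 70, "9q8yyk"),
]
```

In the real pipeline Beam/Flink handles this keying and windowing; the sketch just makes the per-(window, gh6) set semantics concrete.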
13. Details of implementation
1. Windowing: 1min, 5min (by event time)
2. Triggers: via watermark, via state
3. Aggregation handled by Beam/Flink
4. Filtering (with internal service calls) done within Beam operators
5. Machine learning models invoked within Beam operators
6. Final gh6:pt output from pipeline stored to Redis
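Point 2 above, firing via watermark, can be sketched in plain Python (a simplification of event-time triggering, which Beam/Flink manages in the real pipeline; the window layout is hypothetical):

```python
def fire_closed_windows(windows, watermark, window_size=60):
    """Emit counts for windows whose end the event-time watermark has passed;
    keep later windows open so they can still absorb (possibly late) data."""
    closed, still_open = {}, {}
    for (start, gh6), users in windows.items():
        if start + window_size <= watermark:
            closed[(start, gh6)] = len(users)
        else:
            still_open[(start, gh6)] = users
    return closed, still_open

# With the watermark at t=60, only the [0, 60) window fires:
windows = {(0, "9q8yyk"): {"u1", "u2"}, (60, "9q8yyk"): {"u3"}}
closed, still_open = fire_closed_windows(windows, watermark=60)
```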
14. Progress
1. Started with proof of concepts for major ideas
2. All code ported
3. Now running tests in some regions (to validate effect on business metrics)
17. Lessons learned
1. Shuffles are not free (KISS)
2. Know how to profile memory to find leaks
3. Instrument everything
4. Python Beam on Flink is alpha
5. Think early about staging/deployment
22. Wish list
1. Better Python documentation for Beam features (e.g. stateful processing)
2. Built-in support for kv logging (Kibana)
3. Better ability to diagnose failures
24. Join Us!
Dynamic pricing = the "sweet spot" where streaming, machine learning, and data science intersect.
Work on a critical, high-volume system with immediate impact.
Lots to explore: integration w/ Kafka, scalability, GPUs, expanding streaming use cases.
Ride sharing is making the world a better place!
lyft.com/careers