Dynamic pricing of Lyft rides using streaming
Presented by: Amar Pai (apai@lyft.com)
02-07-19
Overview
● Introduction to Dynamic Pricing (a.k.a Prime Time)
● Overview of existing infrastructure
● Why move to Beam/Flink?
● Overview of streaming-based infrastructure
● Lessons learned
Introduction to Dynamic Pricing
What is prime time?
A location- and time-specific multiplier on the base fare for a ride.
e.g. "in downtown SF at 5:02pm, prime time is 2.0" means we double the normal fare in that place at that time.
Location: geohash6 (e.g. 'abc123')
Time: calendar minute (i.e. UTC epoch time rounded to the nearest minute)
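To make the keying concrete, here is a minimal sketch of a prime time lookup keyed on (geohash6, calendar minute). The table contents, geohash values, and fare math are made up for illustration; they are not Lyft's.

import time

# Hypothetical in-memory table: (geohash6, UTC minute) -> prime time multiplier.
PRIME_TIME = {
    ("9q8yyk", 25825022): 2.0,  # e.g. downtown SF at one specific minute
}

def calendar_minute(epoch_seconds):
    # UTC epoch time rounded to the nearest minute, as in the definition above.
    return int(round(epoch_seconds / 60.0))

def prime_time(geohash6, epoch_seconds):
    # Default to 1.0 (no multiplier) when no prime time is set for that cell/minute.
    return PRIME_TIME.get((geohash6, calendar_minute(epoch_seconds)), 1.0)

# A prime time of 2.0 doubles the normal fare at that place and time.
total_fare = 12.50 * prime_time("9q8yyk", time.time())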
Why do we need prime time?
● Balance supply and demand to maintain service level
● The state of the marketplace is constantly changing
● "Surge pricing solves the wild goose chase" (paper)
Overview of Existing Infrastructure
Existing architecture: a series of cron jobs
● Ingest a high volume of client app events (Kinesis)
● Compute features (e.g. demand, conversion rate) from the events
● Run ML models on the features to compute prime time for all regions (per minute, per gh6)
SFO, calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC, calendar_min_1: {gh6: 2.0, gh6: 1.0, ...}
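The per-region output shown above can be read as a nested mapping; a small illustration (the region names, minute keys, and geohashes are placeholders):

# region -> calendar minute -> geohash6 -> prime time multiplier
prime_time_by_region = {
    "SFO": {25825022: {"9q8yyk": 1.0, "9q8yym": 2.0}},
    "NYC": {25825022: {"dr5ru7": 2.0, "dr5ru6": 1.0}},
}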
Why move to Beam/Flink?
Problems...
1. Latency
2. Code complexity (LOC)
3. Hard to add new features involving windowing/joins (e.g. arbitrary demand windows, subregional computation)
4. No dynamic / smart triggers
Streaming Architecture: Tech Stack
Overview of Streaming-based Infrastructure
Pipeline (conceptual outline)
Stages: kinesis events (source) → filter events → aggregate and window → run models to generate features (culminating in PT) → redis (sink)
● Input: Lyft apps (phones) emitting ride_requested, app_open, ...
● Filtering (calls internal services): valid sessions, dedupe, ...
● Aggregation/windowing: unique_users_per_min, unique_requests_per_5_min, ...
● Models: conversion learner, eta learner, ...
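As a rough illustration of this shape, here is a minimal Apache Beam (Python SDK) sketch. The event fields, the toy aggregation, the stand-in model, and the bounded Create source are all assumptions for readability; the real pipeline reads continuously from Kinesis, calls internal services during filtering, and writes gh6:PT to Redis.

import apache_beam as beam
from apache_beam.transforms import window

# Illustrative events; in production these arrive continuously from Kinesis.
EVENTS = [
    {'type': 'ride_requested', 'geohash6': '9q8yyk', 'user_id': 'u1', 'ts': 0},
    {'type': 'app_open',       'geohash6': '9q8yyk', 'user_id': 'u2', 'ts': 10},
    {'type': 'ride_requested', 'geohash6': '9q8yym', 'user_id': 'u3', 'ts': 70},
]

def is_valid(event):
    # Stand-in for session validation / dedupe (the real version calls internal services).
    return event['type'] in ('ride_requested', 'app_open')

def to_prime_time(kv):
    # Stand-in for the ML models: map per-geohash demand to a multiplier.
    geohash6, demand = kv
    return geohash6, 1.0 + 0.25 * demand

with beam.Pipeline() as p:
    (p
     | 'Events' >> beam.Create(EVENTS)                        # production: Kinesis source
     | 'Timestamp' >> beam.Map(lambda e: window.TimestampedValue(e, e['ts']))
     | 'Filter' >> beam.Filter(is_valid)
     | 'Window1Min' >> beam.WindowInto(window.FixedWindows(60))
     | 'KeyByGeohash' >> beam.Map(lambda e: (e['geohash6'], 1))
     | 'DemandPerMin' >> beam.CombinePerKey(sum)              # toy stand-in for the real features
     | 'Models' >> beam.Map(to_prime_time)
     | 'Print' >> beam.Map(print))                            # production: write gh6:PT to Redis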
Details of implementation
1. Windowing: 1 min, 5 min (by event time)
2. Triggers: via watermark, via state (see the sketch below)
3. Aggregation handled by Beam/Flink
4. Filtering (with internal service calls) done within Beam operators
5. Machine learning models invoked within Beam operators
6. Final gh6:PT output from the pipeline stored in Redis
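For items 1 and 2, a minimal Beam Python sketch of event-time windowing with a watermark trigger; the specific early-firing and accumulation choices here are illustrative assumptions, not the configuration described in the talk:

import apache_beam as beam
from apache_beam.transforms import trigger, window

with beam.Pipeline() as p:
    counts = (p
        | 'Events' >> beam.Create([('9q8yyk', 1), ('9q8yyk', 1), ('9q8yym', 1)])
        | 'Timestamp' >> beam.Map(lambda kv: window.TimestampedValue(kv, 0))
        # 5-minute event-time windows; fire when the watermark passes the end of
        # the window, with speculative early firings every 30s of processing time.
        | 'Window5Min' >> beam.WindowInto(
            window.FixedWindows(5 * 60),
            trigger=trigger.AfterWatermark(early=trigger.AfterProcessingTime(30)),
            accumulation_mode=trigger.AccumulationMode.DISCARDING)
        | 'Sum' >> beam.CombinePerKey(sum)
        | 'Print' >> beam.Map(print))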
Progress
1. Started with proofs of concept for the major ideas
2. All code ported
3. Now running tests in some regions (to validate effect on business metrics)
Still to do
1. Productionization
2. Kafka
3. Stateful processing
Lessons learned
Lessons learned
1. Shuffles are not free (KISS)
2. Know how to profile memory to find leaks
3. Instrument everything (see the metrics sketch below)
4. Python Beam on Flink is alpha
5. Think early about staging/deployment
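One concrete way to act on "Instrument everything" is Beam's Metrics API, which surfaces custom counters through the runner (a small sketch; the namespace and counter names are made up):

import apache_beam as beam
from apache_beam.metrics import Metrics

class CountedFilter(beam.DoFn):
    # A filter that also counts what it keeps and drops, so the job is observable.
    def __init__(self):
        self.kept = Metrics.counter('pricing', 'events_kept')
        self.dropped = Metrics.counter('pricing', 'events_dropped')

    def process(self, event):
        if event.get('type') == 'ride_requested':
            self.kept.inc()
            yield event
        else:
            self.dropped.inc()

# Usage: events | beam.ParDo(CountedFilter())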
Wish list
1. Better Python documentation for Beam features (e.g. stateful processing)
2. Built-in support for key-value logging (Kibana)
3. Better ability to diagnose failures
We Are Hiring
Join Us!
Dynamic pricing = "sweet spot" where streaming, machine learning, and data science intersect
Work on a critical, high-volume system with immediate impact
Lots to explore: integration with Kafka, scalability, GPUs, expanding streaming use cases
Ride sharing is making the world a better place!
lyft.com/careers
Q&A
