Dynamic pricing of Lyft rides using streaming
Presented by: Amar Pai (apai@lyft.com)
02-07-19
Overview
● Introduction to Dynamic Pricing (a.k.a Prime Time)
● Overview of existing infrastructure
● Why move to Beam/Flink?
● Overview of streaming-based infrastructure
● Lessons learned
Introduction to Dynamic Pricing
What is prime time?
A location- and time-specific multiplier on the base fare for a ride.
e.g. "in downtown SF at 5:02pm, prime time is 2.0" means we double the normal fare in that place at that time.
Location: geohash6 (e.g. 'abc123')
Time: calendar minute (i.e. UTC epoch time rounded to the nearest minute)
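To make the keying concrete, here is a minimal sketch of a prime time lookup keyed on (geohash6, calendar minute). The table contents, geohash values, and fare math are made up for illustration; they are not Lyft's.

import time

# Hypothetical in-memory table: (geohash6, UTC minute) -> prime time multiplier.
PRIME_TIME = {
    ("9q8yyk", 25825022): 2.0,  # e.g. downtown SF at one specific minute
}

def calendar_minute(epoch_seconds):
    # UTC epoch time rounded to the nearest minute, as in the definition above.
    return int(round(epoch_seconds / 60.0))

def prime_time(geohash6, epoch_seconds):
    # Default to 1.0 (no multiplier) when no prime time is set for that cell/minute.
    return PRIME_TIME.get((geohash6, calendar_minute(epoch_seconds)), 1.0)

# A prime time of 2.0 doubles the normal fare at that place and time.
total_fare = 12.50 * prime_time("9q8yyk", time.time())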
Why do we need prime time?
● Balance supply and demand to maintain service level
● The state of the marketplace is constantly changing
● "Surge pricing solves the wild goose chase" (paper)
Overview of Existing Infrastructure
Existing architecture: a series of cron jobs
● Ingest a high volume of client app events (Kinesis)
● Compute features (e.g. demand, conversion rate) from the events
● Run ML models on the features to compute prime time for all regions (per minute, per gh6)
SFO, calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC, calendar_min_1: {gh6: 2.0, gh6: 1.0, ...}
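The per-region output shown above can be read as a nested mapping; a small illustration (the region names, minute keys, and geohashes are placeholders):

# region -> calendar minute -> geohash6 -> prime time multiplier
prime_time_by_region = {
    "SFO": {25825022: {"9q8yyk": 1.0, "9q8yym": 2.0}},
    "NYC": {25825022: {"dr5ru7": 2.0, "dr5ru6": 1.0}},
}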
Why move to Beam/Flink?
Problems...
1. Latency
2. Code complexity (LOC)
3. Hard to add new features involving windowing/joins (e.g. arbitrary demand windows, subregional computation)
4. No dynamic / smart triggers
Streaming Architecture: Tech Stack
Overview of Streaming-based Infrastructure
Pipeline (conceptual outline)
Stages: kinesis events (source) → filter events → aggregate and window → run models to generate features (culminating in PT) → redis (sink)
● Input: Lyft apps (phones) emitting ride_requested, app_open, ...
● Filtering (calls internal services): valid sessions, dedupe, ...
● Aggregation/windowing: unique_users_per_min, unique_requests_per_5_min, ...
● Models: conversion learner, eta learner, ...
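As a rough illustration of this shape, here is a minimal Apache Beam (Python SDK) sketch. The event fields, the toy aggregation, the stand-in model, and the bounded Create source are all assumptions for readability; the real pipeline reads continuously from Kinesis, calls internal services during filtering, and writes gh6:PT to Redis.

import apache_beam as beam
from apache_beam.transforms import window

# Illustrative events; in production these arrive continuously from Kinesis.
EVENTS = [
    {'type': 'ride_requested', 'geohash6': '9q8yyk', 'user_id': 'u1', 'ts': 0},
    {'type': 'app_open',       'geohash6': '9q8yyk', 'user_id': 'u2', 'ts': 10},
    {'type': 'ride_requested', 'geohash6': '9q8yym', 'user_id': 'u3', 'ts': 70},
]

def is_valid(event):
    # Stand-in for session validation / dedupe (the real version calls internal services).
    return event['type'] in ('ride_requested', 'app_open')

def to_prime_time(kv):
    # Stand-in for the ML models: map per-geohash demand to a multiplier.
    geohash6, demand = kv
    return geohash6, 1.0 + 0.25 * demand

with beam.Pipeline() as p:
    (p
     | 'Events' >> beam.Create(EVENTS)                        # production: Kinesis source
     | 'Timestamp' >> beam.Map(lambda e: window.TimestampedValue(e, e['ts']))
     | 'Filter' >> beam.Filter(is_valid)
     | 'Window1Min' >> beam.WindowInto(window.FixedWindows(60))
     | 'KeyByGeohash' >> beam.Map(lambda e: (e['geohash6'], 1))
     | 'DemandPerMin' >> beam.CombinePerKey(sum)              # toy stand-in for the real features
     | 'Models' >> beam.Map(to_prime_time)
     | 'Print' >> beam.Map(print))                            # production: write gh6:PT to Redis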
Details of implementation
1. Windowing: 1 min, 5 min (by event time)
2. Triggers: via watermark, via state (see the sketch below)
3. Aggregation handled by Beam/Flink
4. Filtering (with internal service calls) done within Beam operators
5. Machine learning models invoked within Beam operators
6. Final gh6:PT output from the pipeline stored in Redis
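For items 1 and 2, a minimal Beam Python sketch of event-time windowing with a watermark trigger; the specific early-firing and accumulation choices here are illustrative assumptions, not the configuration described in the talk:

import apache_beam as beam
from apache_beam.transforms import trigger, window

with beam.Pipeline() as p:
    counts = (p
        | 'Events' >> beam.Create([('9q8yyk', 1), ('9q8yyk', 1), ('9q8yym', 1)])
        | 'Timestamp' >> beam.Map(lambda kv: window.TimestampedValue(kv, 0))
        # 5-minute event-time windows; fire when the watermark passes the end of
        # the window, with speculative early firings every 30s of processing time.
        | 'Window5Min' >> beam.WindowInto(
            window.FixedWindows(5 * 60),
            trigger=trigger.AfterWatermark(early=trigger.AfterProcessingTime(30)),
            accumulation_mode=trigger.AccumulationMode.DISCARDING)
        | 'Sum' >> beam.CombinePerKey(sum)
        | 'Print' >> beam.Map(print))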
Progress
1. Started with proofs of concept for the major ideas
2. All code ported
3. Now running tests in some regions (to validate effect on business metrics)
Still to do
1. Productionization
2. Kafka
3. Stateful processing
Lessons learned
Lessons learned
1. Shuffles are not free (KISS)
2. Know how to profile memory to find leaks
3. Instrument everything (see the metrics sketch below)
4. Python Beam on Flink is alpha
5. Think early about staging/deployment
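One concrete way to act on "Instrument everything" is Beam's Metrics API, which surfaces custom counters through the runner (a small sketch; the namespace and counter names are made up):

import apache_beam as beam
from apache_beam.metrics import Metrics

class CountedFilter(beam.DoFn):
    # A filter that also counts what it keeps and drops, so the job is observable.
    def __init__(self):
        self.kept = Metrics.counter('pricing', 'events_kept')
        self.dropped = Metrics.counter('pricing', 'events_dropped')

    def process(self, event):
        if event.get('type') == 'ride_requested':
            self.kept.inc()
            yield event
        else:
            self.dropped.inc()

# Usage: events | beam.ParDo(CountedFilter())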
Wish list
1. Better Python documentation for Beam features (e.g. stateful processing)
2. Built-in support for key-value logging (Kibana)
3. Better ability to diagnose failures
We Are Hiring
Join Us!
Dynamic pricing = "sweet spot" where streaming, machine learning, and data science intersect
Work on a critical, high-volume system with immediate impact
Lots to explore: integration with Kafka, scalability, GPUs, expanding streaming use cases
Ride sharing is making the world a better place!
lyft.com/careers
Q&A
