4. What is prime time?
A location- and time-specific multiplier on the base fare for a ride.
e.g. "in downtown SF at 5:02pm, prime time is 2.0"
means we double the normal fare in that place at that time.
location: geohash6 (e.g. 'abc123')
time: calendar minute (i.e. UTC epoch time rounded to the nearest minute)
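The definition above can be sketched in a few lines of Python (an illustration only; the function names are hypothetical, not Lyft's actual code):

```python
def calendar_minute(epoch_seconds: float) -> int:
    """UTC epoch time rounded to the nearest minute, kept as epoch seconds."""
    return round(epoch_seconds / 60) * 60

def prime_time_fare(base_fare: float, multiplier: float) -> float:
    """Apply the location/time-specific prime time multiplier to the base fare."""
    return base_fare * multiplier

# A prime time lookup is keyed by (geohash6, calendar minute):
key = ("abc123", calendar_minute(125))  # -> ("abc123", 120)
fare = prime_time_fare(10.0, 2.0)       # a PT of 2.0 doubles the fare
```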
5. Why do we need prime time?
● Balance supply and demand to maintain service level
● State of marketplace is constantly changing
● "Surge pricing solves the wild goose chase" (paper)
7. Existing architecture: a series of cron jobs
● Ingest a high volume of client app events
(Kinesis)
● Compute features (e.g. demand,
conversion rate) from events
● Run ML models on features to compute
prime time for all regions (per minute, per gh6)
SFO: calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC: calendar_min_1: {gh6: 2.0, gh6: 1.0, ...}
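The per-region output shown above can be read as a nested map; a minimal lookup sketch (the shape and geohash values are illustrative, and the neutral default of 1.0 is an assumption):

```python
# Illustrative shape of the cron jobs' output: region -> calendar minute -> gh6 -> PT
prime_time = {
    "SFO": {"calendar_min_1": {"9q8yyk": 1.0, "9q8yym": 2.0}},
    "NYC": {"calendar_min_1": {"dr5ru7": 2.0, "dr5ruu": 1.0}},
}

def lookup_pt(region: str, minute: str, gh6: str, default: float = 1.0) -> float:
    """Fall back to a neutral 1.0 multiplier when no PT is set for a cell."""
    return prime_time.get(region, {}).get(minute, {}).get(gh6, default)
```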
12. Pipeline (conceptual outline)
kinesis events (source) → filter events → aggregate and window → run models to generate features (culminating in PT) → redis
● Source: Lyft apps (phones) emit events to Kinesis: ride_requested, app_open, ...
● Filtering (with internal services): valid sessions, dedupe, ...
● Aggregations: unique_users_per_min, unique_requests_per_5_min, ...
● Models: conversion learner, eta learner, ...
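The aggregate-and-window stage above (e.g. unique_users_per_min, keyed per gh6) can be sketched without Beam as a plain-Python simulation; the event tuples and values are hypothetical:

```python
from collections import defaultdict

def unique_users_per_min(events):
    """events: iterable of (event_type, user_id, event_time_secs, gh6).
    Returns {(window_start, gh6): unique user count} over 1-min fixed
    windows assigned by event time."""
    windows = defaultdict(set)
    for _etype, user, t, gh6 in events:
        window_start = int(t // 60) * 60  # event-time window assignment
        windows[(window_start, gh6)].add(user)
    return {k: len(v) for k, v in windows.items()}

events = [
    ("app_open", "u1", 5, "9q8yyk"),
    ("ride_requested", "u1", 30, "9q8yyk"),  # same user: still 1 unique
    ("app_open", "u2", 40, "9q8yyk"),
    ("app_open", "u3", 70, "9q8yyk"),
]
```

In the real pipeline Beam/Flink handles this keying and windowing; the sketch just makes the per-(window, gh6) set semantics concrete.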
13. Details of implementation
1. Windowing: 1min, 5min (by event time)
2. Triggers: via watermark, via state
3. Aggregation handled by Beam/Flink
4. Filtering (with internal service calls) done within Beam operators
5. Machine learning models invoked within Beam operators
6. Final gh6:pt output from pipeline stored to Redis
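Point 2 above, firing via watermark, can be sketched in plain Python (a simplification of event-time triggering, which Beam/Flink manages in the real pipeline; the window layout is hypothetical):

```python
def fire_closed_windows(windows, watermark, window_size=60):
    """Emit counts for windows whose end the event-time watermark has passed;
    keep later windows open so they can still absorb (possibly late) data."""
    closed, still_open = {}, {}
    for (start, gh6), users in windows.items():
        if start + window_size <= watermark:
            closed[(start, gh6)] = len(users)
        else:
            still_open[(start, gh6)] = users
    return closed, still_open

# With the watermark at t=60, only the [0, 60) window fires:
windows = {(0, "9q8yyk"): {"u1", "u2"}, (60, "9q8yyk"): {"u3"}}
closed, still_open = fire_closed_windows(windows, watermark=60)
```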
14. Progress
1. Started with proof of concepts for major ideas
2. All code ported
3. Now running tests in some regions (to validate effect on business metrics)
17. Lessons learned
1. Shuffles are not free (KISS)
2. Know how to profile memory to find leaks
3. Instrument everything
4. Python Beam on Flink is alpha
5. Think early about staging/deployment
22. Wish list
1. Better Python documentation for Beam features (e.g. stateful processing)
2. Built-in support for kv logging (Kibana)
3. Better ability to diagnose failures
24. Join Us!
Dynamic pricing = the "sweet spot" where streaming, machine learning, and data science intersect.
Work on a critical, high-volume system with immediate impact.
Lots to explore: integration w/ Kafka, scalability, GPUs, expanding streaming use cases.
Ride sharing is making the world a better place!
lyft.com/careers