SlideShare a Scribd company logo
1 of 31
Download to read offline
Threading Needles in a Haystack:
Sessionizing Uber Trips in Real Time
Amey Chaugule
Uber Technologies, Inc.
Agenda
● The Marketplace Org @ Uber
● Sessionization use cases at Uber
● Some unique challenges about Sessions @ Uber
● Anatomy of an Uber Session
● Our Sessions DSL
● Sessions in Production
● Scale
● Q & A
The Uber Marketplace
The Uber Marketplace
In our marketplace, there are models that describe the world and the
decision engines that act upon them.
Marketplace @ Uber
Real-time events in the physical world drive marketplace dynamics
which then affect the algorithmic engines in the Marketplace which in
turn influence the events in the real world in a continuous feedback
loop.
Some examples of marketplace dynamics:
Supply
Forecast
Trips
Need for a sessionized view of the Uber experience
Given the scale and complexity of our systems, events are distributed
across multiple disparate real-time streams over the twin dimensions of
time and space.
How do we contextualize these event streams so they can be logically
grouped together and quickly surface useful information to downstream
decision engines?
Real-time Use cases
For instance, the some algorithms need to adjust to spikes in demand to
balance the supply in near real-time.
While some machine learning-models need information about
pre-request activities in real-time.
To understand rider behaviour we need to understand the full user
experience from start to finish.
Post-Dispatch Rider
Cancel
Completed Trip
Pre-Dispatch Rider
Cancel
Unfulfilled
Dispatched
Exit pre-request
screen
Exit at request
screen
RequestSession Start
Driver Cancel
Get to Request
Screen
R/S
C/R
C/S
The definition of this experience, a SESSION, is critical to understanding our
internal business operations.
Some unique challenges
Given the ride-sharing marketplace Uber operates, our Sessions state
machine needs to model interactions between riders, driver partners
and back-end systems
The Anatomy of an Uber Session
The Sessions State Machine
The Sessions State Machine - The shopping state
The Sessions State Machine - Requesting a ride
The Sessions State Machine - Cancellation (the sad path ☹ )
The Sessions State Machine - On Trip (The happy path )
The Sessions State Machine - Session End (The happy path )
The Sessions State Machine - On trip (the happy path )
Implementation
Pickup Experience at Super Bowl 2017
Sessions DSL
type Condition = SessionInput => Boolean
/**
* This class defines a single transition. It takes in two functions:
*
* @param condition Function of type (SessionInput => Boolean) representing conditions that trigger this transition.
* @param nextState Function of type (State, SessionInput) => State, which results in the next state.
*/
class Transition(condition: Condition, nextState: (State, SessionInput) => State) {
def isConditionValid(event: SessionInput) = condition(event)
def transition(event: SessionInput, fromState: State): State = nextState(fromState, event)
}
e.g.
val requestRideAndRiderLookingTransition = new Transition(isRequestEvent and isRiderLooking, RequestRideState.transitionTo)
Sessions DSL
sealed trait State {
val ts: Long
val name: RiderSessionStateName.Value
val transitionReason: Option[String]
val jobUuid: Option[String]
val originLocation: Option[GeoLocation]
val destinationLocation: Option[GeoLocation]
// List of Transitions out of this state. They are evaluated in the order of precedence they appear in this list.
val transitions: List[Transition]
/**
*
* @param event Current session input.
* @return The next state resulting from the input event.
*/
def withEvent(event: SessionInput): State = {
// Find the transition with condition that event holds valid.
val transition = transitions.find(t => t.isConditionValid(event))
transition map(_.transition(event, this)) getOrElse this
}
}
Sessions DSL - Putting it all together
case class RequestRideState(ts: Long,
override val vehicleViewId: VehicleView,
transitionReason: Option[String] = None,
jobUuid: Option[String] = None,
originLocation: Option[GeoLocation] = None,
destinationLocation: Option[GeoLocation] = None) extends State with AssignedVehicleView {
override val name: SessionState.Value = SessionState.RequestRide
override val transitions: List[Transition] = List(Transition.onTripTransition,
Transition.shoppingStateFromRequestTransition,
Transition.requestRideSelfTransition,
Transition.shoppingStateOnRequestExpiration)
}
Sessions - Moving from Spark to Flink
unionedStreams
.assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor[SessionInput](Time.seconds(30)) {
override def extractTimestamp(event: SessionInput): Long = event.timestamp })
.keyBy(_.riderUuid)
.window(EventTimeSessionWindows.withDynamicGap(Time.minutes(30)))
.evictor(DeltaEvictor.of(30000.0D, FlinkSessionsPipeline.deltaFunction, true))
.process(new ProcessWindowFunction[SessionInput, List[RiderSessionObject], String, TimeWindow]() {
…
}
Spark’s stateFunc abstraction just fit nicely into ProcessWindow
Sessions in Production
“Time is relative… and clocks are hard.” - I. Brodsky
Our state machine models interactions between riders, drivers, and
internal back-end systems each with their own notion of time!
We essentially need to keep track of per-key watermarks.
Checkpointing
Checkpointing to HDFS can be unreliable.
What levels of backpressure can your downstream applications
tolerate?
At times, it’s just easier to store per-key state to Kafka and restart a
new pipeline, letting backfill take care of any gaps.
Backfills
We rely on upserts into Elasticsearch and backfilling can introduce subtle
bugs by being “more correct.”
In our implementation each session is indexed by an anonymized rider
UUID and its start time, i.e. event time of the first event to kick off a
session.
Re-ordering the event consumption in a backfill can easily give you two
nearly identical sessions with slightly different start times
Schema Evolution
Uber’s ride sharing products are constantly evolving and our pipeline
needs to keep in sync.
A new pipeline without a state needs to warm up the state before we
can trust it to reliably write to our production indices.
Production hand off between the old and the new pipelines needs to be
carefully choreographed.
Observability
Keep track of any and every metric you can possibly think of for data
reliability as well as processing metrics such as Flink & JVM stats.
We use M3, our open source metrics platform for Prometheus.
Upstream data can and will change on you.
Imputing State
Uber’s first and foremost responsibility is to ensure a reliable, safe ride
for our users from request/dispatch all the way to a completed trip.
Mobile logging is buffered and the best option, especially when running
on low tech phones and in areas with poor network connectivity.
Our sessionization pipelines need to be resilient to dropped events
Sessions in Numbers
Scale
We ingest tens of billions of events daily.
We generate tens of millions of sessions.
Currently the production pipeline is running in Spark Streaming and
we’re comparing it against a Flink successor.
Thank you.
Learn more: uber.com/marketplace

More Related Content

Similar to Sessionizing Uber Trips in Realtime - Flink Forward '18, Berlin

Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016Danny Yuan
 
Big Data Pipelines and Machine Learning at Uber
Big Data Pipelines and Machine Learning at UberBig Data Pipelines and Machine Learning at Uber
Big Data Pipelines and Machine Learning at UberSudhir Tonse
 
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2
 
Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features WSO2
 
Patterns & Practices for Cloud-based Microservices
Patterns & Practices for Cloud-based MicroservicesPatterns & Practices for Cloud-based Microservices
Patterns & Practices for Cloud-based MicroservicesC4Media
 
Goto meetup Stockholm - Let your microservices flow
Goto meetup Stockholm - Let your microservices flowGoto meetup Stockholm - Let your microservices flow
Goto meetup Stockholm - Let your microservices flowBernd Ruecker
 
AWS Cost Opt Meetup 2 - News corp - Spot On deep dive
AWS Cost Opt Meetup 2 - News corp - Spot On deep diveAWS Cost Opt Meetup 2 - News corp - Spot On deep dive
AWS Cost Opt Meetup 2 - News corp - Spot On deep divePeter Shi
 
JavaBin Oslo: Open source workflow and rule management with Camunda
JavaBin Oslo: Open source workflow and rule management with CamundaJavaBin Oslo: Open source workflow and rule management with Camunda
JavaBin Oslo: Open source workflow and rule management with CamundaBernd Ruecker
 
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?João Pedro Martins
 
NEW LAUNCH! Building Distributed Applications with AWS Step Functions
NEW LAUNCH! Building Distributed Applications with AWS Step FunctionsNEW LAUNCH! Building Distributed Applications with AWS Step Functions
NEW LAUNCH! Building Distributed Applications with AWS Step FunctionsAmazon Web Services
 
Programming models for event controlled programs
Programming models for event controlled programsProgramming models for event controlled programs
Programming models for event controlled programsPriya Kaushal
 
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션AWS Step Functions을 활용한 서버리스 앱 오케스트레이션
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션Amazon Web Services Korea
 
Loadrunner interview questions and answers
Loadrunner interview questions and answersLoadrunner interview questions and answers
Loadrunner interview questions and answersGaruda Trainings
 
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..Matt Walters
 
Declarative presentations UIKonf
Declarative presentations UIKonfDeclarative presentations UIKonf
Declarative presentations UIKonfNataliya Patsovska
 
Serverless Days 2019 - Lost in transaction
Serverless Days 2019 - Lost in transactionServerless Days 2019 - Lost in transaction
Serverless Days 2019 - Lost in transactionBernd Ruecker
 
An introduction to Spot Instances and AWS Fleet - Webinar
An introduction to Spot Instances and AWS Fleet - WebinarAn introduction to Spot Instances and AWS Fleet - Webinar
An introduction to Spot Instances and AWS Fleet - WebinarCMPUTE
 
Architecting Microservices in .Net
Architecting Microservices in .NetArchitecting Microservices in .Net
Architecting Microservices in .NetRichard Banks
 
Azure appservice
Azure appserviceAzure appservice
Azure appserviceRaju Kumar
 

Similar to Sessionizing Uber Trips in Realtime - Flink Forward '18, Berlin (20)

Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016Streaming Processing in Uber Marketplace for Kafka Summit 2016
Streaming Processing in Uber Marketplace for Kafka Summit 2016
 
Big Data Pipelines and Machine Learning at Uber
Big Data Pipelines and Machine Learning at UberBig Data Pipelines and Machine Learning at Uber
Big Data Pipelines and Machine Learning at Uber
 
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
 
Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features
 
Patterns & Practices for Cloud-based Microservices
Patterns & Practices for Cloud-based MicroservicesPatterns & Practices for Cloud-based Microservices
Patterns & Practices for Cloud-based Microservices
 
Goto meetup Stockholm - Let your microservices flow
Goto meetup Stockholm - Let your microservices flowGoto meetup Stockholm - Let your microservices flow
Goto meetup Stockholm - Let your microservices flow
 
AWS Cost Opt Meetup 2 - News corp - Spot On deep dive
AWS Cost Opt Meetup 2 - News corp - Spot On deep diveAWS Cost Opt Meetup 2 - News corp - Spot On deep dive
AWS Cost Opt Meetup 2 - News corp - Spot On deep dive
 
JavaBin Oslo: Open source workflow and rule management with Camunda
JavaBin Oslo: Open source workflow and rule management with CamundaJavaBin Oslo: Open source workflow and rule management with Camunda
JavaBin Oslo: Open source workflow and rule management with Camunda
 
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
 
NEW LAUNCH! Building Distributed Applications with AWS Step Functions
NEW LAUNCH! Building Distributed Applications with AWS Step FunctionsNEW LAUNCH! Building Distributed Applications with AWS Step Functions
NEW LAUNCH! Building Distributed Applications with AWS Step Functions
 
Programming models for event controlled programs
Programming models for event controlled programsProgramming models for event controlled programs
Programming models for event controlled programs
 
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션AWS Step Functions을 활용한 서버리스 앱 오케스트레이션
AWS Step Functions을 활용한 서버리스 앱 오케스트레이션
 
Loadrunner interview questions and answers
Loadrunner interview questions and answersLoadrunner interview questions and answers
Loadrunner interview questions and answers
 
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..
How to CQRS in node: Eventually Consistent, Distributed Microservice Systems..
 
Windows Workflow Foundation
Windows Workflow FoundationWindows Workflow Foundation
Windows Workflow Foundation
 
Declarative presentations UIKonf
Declarative presentations UIKonfDeclarative presentations UIKonf
Declarative presentations UIKonf
 
Serverless Days 2019 - Lost in transaction
Serverless Days 2019 - Lost in transactionServerless Days 2019 - Lost in transaction
Serverless Days 2019 - Lost in transaction
 
An introduction to Spot Instances and AWS Fleet - Webinar
An introduction to Spot Instances and AWS Fleet - WebinarAn introduction to Spot Instances and AWS Fleet - Webinar
An introduction to Spot Instances and AWS Fleet - Webinar
 
Architecting Microservices in .Net
Architecting Microservices in .NetArchitecting Microservices in .Net
Architecting Microservices in .Net
 
Azure appservice
Azure appserviceAzure appservice
Azure appservice
 

Recently uploaded

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 

Recently uploaded (20)

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 

Sessionizing Uber Trips in Realtime - Flink Forward '18, Berlin

  • 1. Threading Needles in a Haystack: Sessionizing Uber Trips in Real Time Amey Chaugule Uber Technologies, Inc.
  • 2. Agenda ● The Marketplace Org @ Uber ● Sessionization use cases at Uber ● Some unique challenges about Sessions @ Uber ● Anatomy of an Uber Session ● Our Sessions DSL ● Sessions in Production ● Scale ● Q & A
  • 4. The Uber Marketplace In our marketplace, there are models that describe the world and the decision engines that act upon them.
  • 5. Marketplace @ Uber Real-time events in the physical world drive marketplace dynamics which then affect the algorithmic engines in the Marketplace which in turn influence the events in the real world in a continuous feedback loop. Some examples of marketplace dynamics: Supply Forecast Trips
  • 6. Need for a sessionized view of the Uber experience Given the scale and complexity of our systems, events are distributed across multiple disparate real-time streams over the twin dimensions of time and space. How do we contextualize these event streams so they can be logically grouped together and quickly surface useful information to downstream decision engines?
  • 7. Real-time Use cases For instance, the some algorithms need to adjust to spikes in demand to balance the supply in near real-time. While some machine learning-models need information about pre-request activities in real-time.
  • 8. To understand rider behaviour we need to understand the full user experience from start to finish. Post-Dispatch Rider Cancel Completed Trip Pre-Dispatch Rider Cancel Unfulfilled Dispatched Exit pre-request screen Exit at request screen RequestSession Start Driver Cancel Get to Request Screen R/S C/R C/S The definition of this experience, a SESSION, is critical to understanding our internal business operations.
  • 9. Some unique challenges Given the ride-sharing marketplace Uber operates, our Sessions state machine needs to model interactions between riders, driver partners and back-end systems
  • 10. The Anatomy of an Uber Session
  • 12. The Sessions State Machine - The shopping state
  • 13. The Sessions State Machine - Requesting a ride
  • 14. The Sessions State Machine - Cancellation (the sad path ☹ )
  • 15. The Sessions State Machine - On Trip (The happy path )
  • 16. The Sessions State Machine - Session End (The happy path )
  • 17. The Sessions State Machine - On trip (the happy path ) Implementation Pickup Experience at Super Bowl 2017
  • 18. Sessions DSL type Condition = SessionInput => Boolean /** * This class defines a single transition. It takes in two functions: * * @param condition Function of type (SessionInput => Boolean) representing conditions that trigger this transition. * @param nextState Function of type (State, SessionInput) => State, which results in the next state. */ class Transition(condition: Condition, nextState: (State, SessionInput) => State) { def isConditionValid(event: SessionInput) = condition(event) def transition(event: SessionInput, fromState: State): State = nextState(fromState, event) } e.g. val requestRideAndRiderLookingTransition = new Transition(isRequestEvent and isRiderLooking, RequestRideState.transitionTo)
  • 19. Sessions DSL sealed trait State { val ts: Long val name: RiderSessionStateName.Value val transitionReason: Option[String] val jobUuid: Option[String] val originLocation: Option[GeoLocation] val destinationLocation: Option[GeoLocation] // List of Transitions out of this state. They are evaluated in the order of precedence they appear in this list. val transitions: List[Transition] /** * * @param event Current session input. * @return The next state resulting from the input event. */ def withEvent(event: SessionInput): State = { // Find the transition with condition that event holds valid. val transition = transitions.find(t => t.isConditionValid(event)) transition map(_.transition(event, this)) getOrElse this } }
  • 20. Sessions DSL - Putting it all together case class RequestRideState(ts: Long, override val vehicleViewId: VehicleView, transitionReason: Option[String] = None, jobUuid: Option[String] = None, originLocation: Option[GeoLocation] = None, destinationLocation: Option[GeoLocation] = None) extends State with AssignedVehicleView { override val name: SessionState.Value = SessionState.RequestRide override val transitions: List[Transition] = List(Transition.onTripTransition, Transition.shoppingStateFromRequestTransition, Transition.requestRideSelfTransition, Transition.shoppingStateOnRequestExpiration) }
  • 21. Sessions - Moving from Spark to Flink unionedStreams .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor[SessionInput](Time.seconds(30)) { override def extractTimestamp(event: SessionInput): Long = event.timestamp }) .keyBy(_.riderUuid) .window(EventTimeSessionWindows.withDynamicGap(Time.minutes(30))) .evictor(DeltaEvictor.of(30000.0D, FlinkSessionsPipeline.deltaFunction, true)) .process(new ProcessWindowFunction[SessionInput, List[RiderSessionObject], String, TimeWindow]() { … } Spark’s stateFunc abstraction just fit nicely into ProcessWindow
  • 23. “Time is relative… and clocks are hard.” - I. Brodsky Our state machine models interactions between riders, drivers, and internal back-end systems each with their own notion of time! We essentially need to keep track of per-key watermarks.
  • 24. Checkpointing Checkpointing to HDFS can be unreliable. What levels of backpressure can your downstream applications tolerate? At times, it’s just easier to store per-key state to Kafka and restart a new pipeline, letting backfill take care of any gaps.
  • 25. Backfills We rely on upserts into Elasticsearch and backfilling can introduce subtle bugs by being “more correct.” In our implementation each session is indexed by an anonymized rider UUID and its start time, i.e. event time of the first event to kick off a session. Re-ordering the event consumption in a backfill can easily give you two nearly identical sessions with slightly different start times
  • 26. Schema Evolution Uber’s ride sharing products are constantly evolving and our pipeline needs to keep in sync. A new pipeline without a state needs to warm up the state before we can trust it to reliably write to our production indices. Production hand off between the old and the new pipelines needs to be carefully choreographed.
  • 27. Observability Keep track of any and every metric you can possibly think of for data reliability as well as processing metrics such as Flink & JVM stats. We use M3, our open source metrics platform for Prometheus. Upstream data can and will change on you.
  • 28. Imputing State Uber’s first and foremost responsibility is to ensure a reliable, safe ride for our users from request/dispatch all the way to a completed trip. Mobile logging is buffered and the best option, especially when running on low tech phones and in areas with poor network connectivity. Our sessionization pipelines need to be resilient to dropped events
  • 30. Scale We ingest tens of billions of events daily. We generate tens of millions of sessions. Currently the production pipeline is running in Spark Streaming and we’re comparing it against a Flink successor.
  • 31. Thank you. Learn more: uber.com/marketplace