1. Real-Time City-Scale Taxi Ridesharing
Shuo Ma, Yu Zheng and Ouri Wolfson
IEEE Transactions on Knowledge and Data Engineering – July 2015
2. Why Ridesharing?
• Taxi demand exceeds the number of available taxis during
peak hours
• People spend more time on the road
• Obvious solution?
Increasing number of taxis
1. Additional traffic
2. More energy consumption
3. More pollution
4. Decrease taxi driver’s income
3. Taxi-sharing system
• A passenger submits a ride request using a mobile app
– Origin
– Destination
– Pickup time (present in most cases)
– Drop-off time (may not be present)
• The system schedules a suitable taxi for pickup, subject to these conditions
– Time
– Capacity
– Monetary
4. Taxi-sharing system
• Existing passengers are asked whether they accept the change
– Increase in travel time
– Decrease in fare
• Once the existing passengers agree, the new schedule is
sent to all passengers and the driver
5. Constraints
• Vehicle Capacity Constraint
– Number of riders does not exceed number of seats
• Time Window Constraints
– All riders should depart from the origin and arrive at the destination
within the corresponding pickup and delivery time windows
• Monetary Constraints
– Riders pay no more than they would without taxi-sharing for the same distance
– The driver earns no less than without taxi-sharing for the same distance
– The fare of existing riders decreases when a new rider joins the trip
6. Data Model
• Ride request – Q
1. Q.t – request submitted time
2. Q.o – origin point
3. Q.d – destination point
4. Q.pw.l – latest time of the pickup window
5. Q.dw.l – latest time of the drop-off window
• The rider specifies Q.d and Q.dw.l
• Other properties are obtained automatically
– Q.pw.l is obtained by adding a fixed value to Q.t
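The ride request record can be sketched as a small Python data class. This is only an illustration: the field types, the use of seconds as the time unit, and the 300-second value of the hypothetical constant `PICKUP_WINDOW_S` are assumptions, not values from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed fixed allowance added to Q.t to obtain Q.pw.l (hypothetical value).
PICKUP_WINDOW_S = 300.0

Point = Tuple[float, float]  # (latitude, longitude); this representation is an assumption


@dataclass(frozen=True)
class RideRequest:
    t: float     # Q.t: time the request was submitted
    o: Point     # Q.o: origin point
    d: Point     # Q.d: destination point
    dw_l: float  # Q.dw.l: latest acceptable drop-off time

    @property
    def pw_l(self) -> float:
        """Q.pw.l: latest pickup time, derived by adding a fixed value to Q.t."""
        return self.t + PICKUP_WINDOW_S


q = RideRequest(t=1000.0, o=(39.90, 116.40), d=(39.95, 116.45), dw_l=2500.0)
print(q.pw_l)  # 1300.0
```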
7. Data Model
• Taxi Status – V
1. V.ID – unique ID of a taxi
2. V.t – timestamp
3. V.l – geographical location of the taxi
4. V.s – Current schedule of the taxi
Temporally ordered sequence of the origins and destinations of n ride
requests such that, for each ride Qi, Qi.o precedes Qi.d
5. V.r – Current projected route
Sequence of road network nodes derived from V.s
• V is dynamic
– For example, the schedule for two ride requests Q1 and Q2 can be
Q1.o -> Q2.o -> Q1.d -> Q2.d at a certain time, and
Q2.o -> Q1.d -> Q2.d once the taxi has passed Q1.o
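A minimal sketch of the taxi status and its dynamic schedule, assuming stops are stored as `(request id, "o" or "d")` pairs. The class layout, the helper names `is_valid_schedule` and `pass_next_stop`, and the omission of V.r are illustrative choices, not the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Stop = Tuple[str, str]  # (request id, "o" for origin or "d" for destination)


def is_valid_schedule(schedule: List[Stop]) -> bool:
    """Return True if no destination appears before its own origin.

    A destination whose origin is absent is allowed: the rider is already
    on board because the taxi has passed that origin.
    """
    origins_in_schedule = {rid for rid, kind in schedule if kind == "o"}
    seen_origins = set()
    for rid, kind in schedule:
        if kind == "o":
            seen_origins.add(rid)
        elif rid in origins_in_schedule and rid not in seen_origins:
            return False
    return True


@dataclass
class TaxiStatus:
    taxi_id: str                                 # V.ID
    t: float                                     # V.t
    l: Tuple[float, float]                       # V.l
    s: List[Stop] = field(default_factory=list)  # V.s

    def pass_next_stop(self) -> Stop:
        """Remove and return the first stop once the taxi has reached it."""
        return self.s.pop(0)


v = TaxiStatus("taxi-1", 0.0, (39.9, 116.4),
               [("Q1", "o"), ("Q2", "o"), ("Q1", "d"), ("Q2", "d")])
v.pass_next_stop()  # the taxi has passed Q1.o
print(v.s)          # [('Q2', 'o'), ('Q1', 'd'), ('Q2', 'd')]
```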
8. System Architecture
• Multiple servers for different purposes
• Monitor for administrators
• Role-specific UI for the mobile app
• A taxi automatically reports its location to the
cloud when:
1. The taxi establishes a connection with the
system
2. A rider gets on or off the taxi
3. Periodically while the taxi is
connected to the system
• The rider submits the request Q to the
communication server (1)
• Incoming requests are streamed into a
queue (FCFS)
• The communication server sends a request to the
indexing server (2)
• The indexing server finds candidate taxis and
returns them (3)
9. System Architecture
• The communication server sends Q and the
candidate taxis to the scheduling server (4)
• The scheduling server finds and returns the
optimum taxi with its modified schedule
(5)
• Existing riders are asked whether
they accept the new rider
joining (6)
• Upon approval by all existing riders,
the new rider gets a confirmation with
- Taxi Id
- Estimated pickup time
- Fare
- Scheduled route
- Reservation code
• The driver receives the new schedule and the same
reservation code (7)
10. System Architecture
• What if an existing rider rejects the new rider's request to join?
– The system remembers the rider’s choice
– It automatically rejects a route change if the ratio of fare change to
time delay is smaller than the largest value the rider has ever
rejected
– The paper does not explain how the next taxi is then chosen
• A single machine is used to implement the servers
– Answers queries in a few milliseconds
– Single point of failure
• No parallel processing of the request queue
12. Monitor
• Two views
1. Ride request
- Scheduled and unscheduled requests
- Origin and destination of the requests
- Search request using request Id
2. Taxi
- Location of the taxis
- Updated with taxi status update
- Search using taxi Id
14. Index of Taxi
• A spatio-temporal index of taxis is needed to speed up searching
• Partition the road network into a grid
• gi represents a grid cell
• Choose the road network node that is
closest to the center of the grid cell
• This node is called the “anchor node” ci
• Find the travel distance dij and travel
time tij for each pair of anchor nodes
• Travel distance is computed only
once
• Travel time is recomputed frequently
using historical data
15. Index of Taxi
• Each pair (dij, tij) is stored in a matrix called the “grid distance
matrix”
• The distance between two nodes is approximated by the distance
between the anchor nodes of their grid cells
• This avoids expensive computation
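The anchor-node approximation can be illustrated with a toy lookup. This is a sketch under stated assumptions: a planar 3-column grid with unit-sized cells, a dictionary-based matrix, and invented distance and time values; the names `cell_of` and `approx_distance_time` are hypothetical.

```python
from typing import Dict, Tuple

CELL_SIZE = 1.0  # side length of a grid cell, arbitrary units (assumed)
GRID_COLS = 3    # number of columns in the toy grid (assumed)


def cell_of(x: float, y: float) -> int:
    """Map a planar coordinate to the index of the grid cell containing it."""
    return int(y // CELL_SIZE) * GRID_COLS + int(x // CELL_SIZE)


# Grid distance matrix: (i, j) -> (d_ij, t_ij) between anchor nodes c_i and c_j.
# The numbers are made up for illustration.
grid_matrix: Dict[Tuple[int, int], Tuple[float, float]] = {
    (0, 0): (0.0, 0.0),
    (0, 1): (1.2, 3.0),
    (1, 0): (1.3, 3.5),
    (1, 1): (0.0, 0.0),
}


def approx_distance_time(p: Tuple[float, float],
                         q: Tuple[float, float]) -> Tuple[float, float]:
    """Approximate distance and travel time from p to q by the values
    stored for the anchor nodes of the cells containing p and q."""
    return grid_matrix[(cell_of(*p), cell_of(*q))]


print(approx_distance_time((0.2, 0.7), (1.9, 0.1)))  # (1.2, 3.0)
```

The lookup replaces a shortest-path computation with a constant-time table access, at the cost of an error bounded by the cell size.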
16. Index of Taxi
• Each grid cell gi maintains three lists
1. Temporally ordered grid cell list lt
2. Spatially ordered grid cell list ls
3. Taxi list lv
• lt and ls are constructed using grid distance matrix
• ls is computed only once
• lt is updated whenever the grid
distance matrix is updated
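The two ordered lists can be derived from the grid distance matrix by sorting. The sketch below uses an invented three-cell matrix and assumes the lists of cell gi are ordered by distance and travel time from the other cells to gi; that direction is an assumption here.

```python
from typing import Dict, List, Tuple

# Toy grid distance matrix for three cells: (i, j) -> (d_ij, t_ij).
# Values are invented; travel time need not be proportional to distance.
matrix: Dict[Tuple[int, int], Tuple[float, float]] = {
    (0, 1): (2.0, 9.0), (0, 2): (5.0, 6.0),
    (1, 0): (2.0, 8.0), (1, 2): (3.0, 4.0),
    (2, 0): (5.0, 7.0), (2, 1): (3.0, 5.0),
}
cells = [0, 1, 2]


def spatial_list(i: int) -> List[int]:
    """l_s of cell i: other cells in ascending order of distance d_ji."""
    return sorted((j for j in cells if j != i), key=lambda j: matrix[(j, i)][0])


def temporal_list(i: int) -> List[int]:
    """l_t of cell i: other cells in ascending order of travel time t_ji."""
    return sorted((j for j in cells if j != i), key=lambda j: matrix[(j, i)][1])


print(spatial_list(0))   # [1, 2]
print(temporal_list(0))  # [2, 1]
```

In this toy data cell 1 is closer to cell 0 in distance but cell 2 is closer in travel time, so the two lists differ.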
17. Index of Taxi
• Neighboring grid cells may not be adjacent in lt and ls
• Taxi list lv holds the taxis scheduled to enter gi in the near future, each
with its arrival timestamp ta
• Not all taxis are kept: the scope is limited to 1 or 2 hours
• List lv is updated dynamically
• A taxi is removed from lv when it
leaves the grid cell
• A taxi is added to lv when it is newly
scheduled to enter the grid cell
18. Taxi searching (single-side)
• Ride request Q is received
• Select the grid cell where Q.o is located (g7)
• Select any other grid cell gi if it satisfies
ti7 + tcur < Q.pw.l
19. Taxi searching (single-side)
• For each selected gi, find taxis whose ta is no later than
Q.pw.l – ti7
• The algorithm may retrieve many taxis for the later
scheduling module
• Not desirable for a real-time application
• Solution – dual-side taxi searching
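The single-side search described above can be sketched as follows. The function name, argument layout and sample numbers are hypothetical; the temporal list is assumed to be pre-sorted by travel time to the origin cell, and the origin cell itself is given a travel time of zero.

```python
from typing import Dict, List, Tuple


def single_side_search(
    origin_cell: int,
    t_cur: float,
    pw_l: float,
    temporal_list: List[Tuple[int, float]],
    taxi_lists: Dict[int, List[Tuple[str, float]]],
) -> List[str]:
    """Return ids of taxis that could reach the origin cell by pw_l.

    temporal_list: (cell id, travel time to origin cell), ascending by time.
    taxi_lists: cell id -> [(taxi id, t_a)], t_a = time the taxi enters the cell.
    """
    candidates: List[str] = []
    for cell, t_to_origin in [(origin_cell, 0.0)] + temporal_list:
        if t_cur + t_to_origin >= pw_l:
            break  # the list is sorted, so every remaining cell is too far
        latest_entry = pw_l - t_to_origin
        for taxi_id, t_a in taxi_lists.get(cell, []):
            if t_a <= latest_entry and taxi_id not in candidates:
                candidates.append(taxi_id)
    return candidates


temporal = [(3, 120.0), (8, 240.0), (2, 600.0)]
taxis = {
    7: [("A", 1010.0)],
    3: [("B", 1100.0), ("C", 1250.0)],
    8: [("D", 1050.0)],
    2: [("E", 1000.0)],
}
found = single_side_search(7, 1000.0, 1300.0, temporal, taxis)
print(found)  # ['A', 'B', 'D']
```

Taxi C is skipped because it enters cell 3 too late, and cell 2 is never examined because it is more than the remaining time away.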
20. Taxi searching (dual-side)
• Bi-directional searching process
• Select grid cells and taxis from the origin side and the destination side
simultaneously
• Select a grid cell on the origin side if it satisfies
ti7 + tcur < Q.pw.l
• Select a grid cell on the
destination side if it satisfies
tj2 + tcur < Q.dw.l
21. Taxi searching (dual-side)
• Maintain two sets So and Sd
• Initially both are empty
• Pop one grid cell at a time and add the taxis that
satisfy the time constraint to the corresponding set
• Compute the intersection of the two sets
• If the intersection is not empty, return
it
• Otherwise continue popping from the grid
cell lists
• May not return the optimum candidates
• However, this is compensated by the
reduction in computation time
• Experiments show this reduces the number of
taxis retrieved by 50%, while travel time increases
by only 1% compared with single-side search
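A simplified sketch of the dual-side idea: expand one cell per side per round, collect eligible taxis in So and Sd, and stop as soon as the two sets intersect. The strict alternation, the helper name `eligible`, and the sample data are assumptions made for illustration; the paper's algorithm may differ in its details.

```python
from typing import Dict, List, Set, Tuple

CellTimes = List[Tuple[int, float]]             # (cell id, travel time to the side's cell)
TaxiLists = Dict[int, List[Tuple[str, float]]]  # cell id -> [(taxi id, t_a)]


def eligible(cell_times: CellTimes, idx: int, t_cur: float,
             deadline: float, taxi_lists: TaxiLists) -> Set[str]:
    """Taxis in the idx-th cell that satisfy the side's time constraint."""
    cell, t = cell_times[idx]
    if t_cur + t >= deadline:
        return set()
    return {tid for tid, t_a in taxi_lists.get(cell, []) if t_a <= deadline - t}


def dual_side_search(t_cur: float, pw_l: float, dw_l: float,
                     origin_cells: CellTimes, dest_cells: CellTimes,
                     taxi_lists: TaxiLists) -> Set[str]:
    s_o: Set[str] = set()
    s_d: Set[str] = set()
    for k in range(max(len(origin_cells), len(dest_cells))):
        if k < len(origin_cells):
            s_o |= eligible(origin_cells, k, t_cur, pw_l, taxi_lists)
        if k < len(dest_cells):
            s_d |= eligible(dest_cells, k, t_cur, dw_l, taxi_lists)
        common = s_o & s_d
        if common:
            return common  # stop early: later cells are never examined
    return set()


origin_side = [(7, 0.0), (3, 120.0), (8, 240.0)]
dest_side = [(2, 0.0), (5, 200.0)]
lists = {
    7: [("A", 1010.0)], 3: [("B", 1100.0)], 8: [("D", 1050.0)],
    2: [("B", 1900.0)], 5: [("D", 1500.0), ("A", 2400.0)],
}
result = dual_side_search(1000.0, 1300.0, 2500.0, origin_side, dest_side, lists)
print(result)  # {'B'}
```

Taxi D would also qualify if the search continued to cell 8, which illustrates why the early stop can miss the best candidate while examining fewer taxis.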
22. Taxi scheduling
• Given a set of taxi statuses Sv, find the taxi status that
satisfies Q with the minimum increase in travel distance
• In theory, all possible ways of inserting Q into the
schedule have to be tried
– Reorder the schedule
– Insert Q.o
– Insert Q.d
– Check capacity and time constraints while inserting
– Check monetary constraints after inserting
• f
n!/2m insertions
23. Taxi scheduling
• Simply ignore reordering of the existing schedule (this may
even be more convenient for the existing passengers)
• O(n^2) ways of inserting the new ride request
• Use the fastest path for each insertion (which may not be the
shortest)
• Travel time increase due to inserting Q3.o between Q1.o and Q2.o
(tw: waiting time at the inserted stop)
td = (Q1.o -> Q3.o) + (Q3.o -> Q2.o) + tw - (Q1.o -> Q2.o)
• If td causes a late arrival for any Q, the insertion fails
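A small worked example of the delay caused by inserting a new origin between two consecutive stops. The function name and the travel times (in minutes) are invented for illustration.

```python
def insertion_delay(t_prev_to_new: float, t_new_to_next: float,
                    t_prev_to_next: float, wait: float = 0.0) -> float:
    """Extra travel time from detouring via a new stop placed between two
    consecutive stops, plus any waiting time at the new stop."""
    return t_prev_to_new + t_new_to_next + wait - t_prev_to_next


# Q1.o -> Q3.o takes 4 min, Q3.o -> Q2.o takes 5 min, the direct leg
# Q1.o -> Q2.o takes 6 min, and the taxi waits 1 min at Q3.o.
td = insertion_delay(4.0, 5.0, 6.0, wait=1.0)
print(td)  # 4.0
```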
24. Taxi scheduling
• Use slack time for an efficient time constraint check
(ap, ad: projected arrival times at the origin and destination)
– Slack time on origin arrival: (Q.o)st = Q.pw.l - ap
– Slack time on destination arrival: (Q.d)st = Q.dw.l - ad
• Insertion fails if td exceeds the smallest slack time among the
subsequent stops in the schedule
• If the insertion succeeds, update the slack times and proceed to
insert Q.d
• Check the capacity constraint before insertion
• The resulting insertion satisfies both the time and capacity
constraints
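The slack-time check can be sketched as a comparison against the smallest remaining slack. The helper names and the deadline and arrival values below are hypothetical.

```python
from typing import List


def slack_times(deadlines: List[float], arrivals: List[float]) -> List[float]:
    """Slack of each stop: latest allowed time minus projected arrival time."""
    return [late - arr for late, arr in zip(deadlines, arrivals)]


def can_absorb_delay(delay: float, slacks: List[float]) -> bool:
    """The insertion is feasible only if the delay does not exceed the
    smallest slack among the stops after the insertion position."""
    return delay <= min(slacks, default=float("inf"))


def apply_delay(delay: float, slacks: List[float]) -> List[float]:
    """After a successful insertion every later stop loses `delay` of slack."""
    return [s - delay for s in slacks]


slacks = slack_times([1300.0, 2000.0, 2600.0], [1250.0, 1800.0, 2590.0])
print(slacks)                          # [50.0, 200.0, 10.0]
print(can_absorb_delay(8.0, slacks))   # True
print(can_absorb_delay(15.0, slacks))  # False
print(apply_delay(8.0, slacks))        # [42.0, 192.0, 2.0]
```

Checking one minimum replaces recomputing the arrival time at every later stop for each candidate insertion.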
25. Taxi scheduling
• Check monetary constraints
– Any rider who participates in taxi-sharing should pay no more than
he or she would pay when taking a taxi alone
– Each rider whose travel time is lengthened because a new
rider Q joins should get a proportional decrease in taxi fare
– The driver should be paid for the distance of the reroutes incurred
• This encourages both drivers and riders to participate in ride-
sharing
• Notation used in the monetary constraints:
– di : distance between Qi.o and Qi.d
– F : fare calculation function
– fi : taxi fare of rider i after Qn joins
– M : new revenue of the driver
– D : new total distance
26. Taxi scheduling
• First monetary condition
fi ≤ F(di)
• Third monetary condition
M ≥ F(D)
• Since M = ∑fi,
F(D) ≤ M = ∑fi ≤ ∑F(di)
• Take the lower bound of M, M = F(D), to minimize the total taxi fare
of riders
• Distribute the total fare M among the n riders
– ∆fi : decrease in fare for rider i
– ∆Ti : increase in travel time for rider i
– ∆D : increase in total travel distance
– f : positive constant
27. Taxi scheduling
• Fare of the nth (new) rider,
fn = F(dn) - f
• Fare reduction of the other riders,
∆fi = ∆Ti/∑∆Ti * (F(dn) - f - F(∆D))
Existing riders share the fare reduction
• As ∆fi > 0,
F(dn) > f + F(∆D)
• This is the necessary and sufficient condition for a taxi to
satisfy all three monetary conditions
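The fare split can be checked numerically. The sketch assumes a linear fare function with a hypothetical rate of 2.0 per km and at least one delayed existing rider; `fare_update` and all numbers are illustrative.

```python
from typing import List, Optional, Tuple

RATE_PER_KM = 2.0  # hypothetical linear fare: F(d) = RATE_PER_KM * d


def F(distance_km: float) -> float:
    return RATE_PER_KM * distance_km


def fare_update(d_new: float, delta_D: float, delta_T: List[float],
                f_const: float) -> Optional[Tuple[float, List[float]]]:
    """Return (f_n, [delta_f_i]) or None if F(d_n) > f + F(delta_D) fails.

    delta_T holds the travel-time increase of each existing rider; at
    least one entry is assumed to be positive.
    """
    pool = F(d_new) - f_const - F(delta_D)  # amount shared by existing riders
    if pool <= 0:
        return None
    total_delay = sum(delta_T)
    if total_delay <= 0:
        raise ValueError("expected at least one delayed existing rider")
    f_n = F(d_new) - f_const
    reductions = [pool * dt / total_delay for dt in delta_T]
    return f_n, reductions


# New rider travels 10 km, the detour adds 3 km, two riders are delayed
# by 2 and 4 minutes, and the new rider's discount f is 2.
print(fare_update(10.0, 3.0, [2.0, 4.0], 2.0))  # (18.0, [4.0, 8.0])
print(fare_update(2.0, 3.0, [2.0, 4.0], 2.0))   # None
```

In the first case the new rider pays 18 and the existing riders save 12 in total, so the driver's revenue rises by 6, which equals F(∆D) for the 3 km detour.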
28. Taxi scheduling
• When a new rider is proposed, an existing rider sees the fare reduction and the
increase in travel time and may reject the proposal
• Keep this rejection rate Q.r for each ride request
• For later proposals, check
∆fi/∆Ti ≥ Qi.r
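The rejection-rate check can be sketched as a threshold on the ratio of fare reduction to added travel time. The function names are hypothetical, and keeping the largest rejected ratio as Q.r is taken from the earlier description of the system's behaviour.

```python
def passes_threshold(delta_f: float, delta_T: float, r: float) -> bool:
    """True if the offer satisfies delta_f / delta_T >= r and should be
    shown to the rider; False if it is rejected automatically."""
    if delta_T <= 0:
        return True  # no added travel time, so nothing to compensate
    return delta_f / delta_T >= r


def update_rate(r: float, rejected_delta_f: float, rejected_delta_T: float) -> float:
    """After an explicit rejection, keep the largest ratio rejected so far."""
    return max(r, rejected_delta_f / rejected_delta_T)


r = 0.0
r = update_rate(r, 2.0, 4.0)          # rider rejects a ratio of 0.5
print(r)                              # 0.5
print(passes_threshold(1.0, 4.0, r))  # False
print(passes_threshold(3.0, 4.0, r))  # True
```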
30. Evaluation
• Road network - real network of Beijing
• Taxi trajectories - trajectory data set contains the GPS
trajectory of over 33,000 taxis during a period of 87 days
• Requests are distributed sparsely over the network
31. Evaluation
• Generate a realistic ride request stream for validation
1. Learn the distribution of ride requests over the time of day
– Discretize a day into time frames
– Learn a Poisson distribution for each pair of road segment and time
frame of the day
– Generate Q.o and Q.t
– Q.pw.l is obtained by adding a constant to Q.t
– Q.dw.l is obtained by adding the average travel time
between Q.o and Q.d to Q.pw.l
– Simulate unsuccessful taxi requests with a parameter ∆ that multiplies
the number of successful taxi requests in the historical data
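Request submission times for one road segment and time frame can be sampled from a Poisson process by drawing exponential inter-arrival gaps. This is a sketch: the hourly rate, the 15-minute frame length, and the function name are assumptions, and the paper's generator may be implemented differently.

```python
import random
from typing import List


def generate_request_times(rate_per_hour: float, frame_start_s: float,
                           frame_len_s: float, delta: float,
                           rng: random.Random) -> List[float]:
    """Sample request times (Q.t) within one time frame from a Poisson
    process whose historical rate is multiplied by delta."""
    rate_per_s = rate_per_hour * delta / 3600.0
    times: List[float] = []
    t = frame_start_s + rng.expovariate(rate_per_s)
    while t < frame_start_s + frame_len_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


# 12 historical requests per hour on this segment, scaled by delta = 2,
# for a 15-minute frame starting at t = 0 s.
times = generate_request_times(12.0, 0.0, 900.0, 2.0, random.Random(42))
print(len(times), "requests generated")
```

Seeding the generator makes a simulated stream reproducible across runs.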
38. Evaluation Results (Effectiveness)
• Taxi sharing saves up to 12% in travel distance, depending
on ∆
• Saves 1.5 billion km per year
• Reduces gas consumption by 120 million liters per year
• Reduces carbon dioxide emissions by 2.2 million kg per year
39. Evaluation Results (Efficiency)
• Average taxi searching time is 0.15 ms
• Average scheduling time is 10.33 ms
1. Number of taxis accessed per ride request (TAPR)
41. Evaluation Results (Efficiency)
• Average execution time per ride request using the DB
taxi-sharing method with and without schedule
reordering before insertion (20% longer with reordering)
42. Related Work
1. Taxi Recommendation and Dispatching
– Suggesting parking places to drivers using historical trajectories [1]
– Suggesting pickup points to drivers [2]
– Several taxi dispatching algorithms that find the closest free taxi [3][4]
2. Carpool and Dial-A-Ride
– Carpool solutions for routine commutes [5][6]
– Several static DAP solutions [7]
– Little research on dynamic DAP [8]
3. Real-time taxi-sharing
– Taxi-sharing without considering time constraints [9]
– Taxi-sharing with time constraints [10][11]
43. Contribution of the work
• Authors’ previous work - T-Share: A Large-Scale Dynamic
Taxi Ridesharing Service [12]
• Contributions of the new work
1. A cloud-mobile based taxi-sharing system is implemented
2. Monetary constraints for both drivers and riders are introduced
3. Comprehensive experiments are performed
New measurements are introduced: taxi-sharing rate and seat
occupancy rate
44. Strengths of the paper
• Well written
• A working application has been built
• Broad coverage of experiments, including a generated taxi
request stream
• The proposed monetary constraints encourage passengers to use
taxi-sharing
45. Weaknesses of the paper
• The failure scenario of a taxi request is not explained
• Single-machine implementation of the architecture
• No parallel processing of requests. No evaluation results for
parallel requests
• The travel time update of the grid distance matrix is not clearly
explained
• The lower bound is used for driver revenue, which may
discourage drivers from participating in taxi-sharing
• Rejection rates are associated with a ride request rather than with a rider
46. Possible Future Work
• Keep customer profiles
1. Maintain a rejection rate for each profile
2. Allow taxi selection based on gender preference
• Add a constant revenue increment for the driver in the
monetary constraint validation
• Scalable architecture with parallel processing of requests
47. References
[1] J. Yuan, Y. Zheng, L. Zhang, X. Xie, and G. Sun, “Where to find my next passenger,” in
Proc. 13th Int. Conf. Ubiquitous Comput., 2011, pp. 109–118.
[2] Y. Ge, H. Xiong, A. Tuzhilin, K. Xiao, M. Gruteser, and M. Pazzani, “An energy-efficient
mobile recommender system,” in Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data
Mining, 2010, pp. 899–908.
[3] D. Zhang and T. He, “CallCab: A unified recommendation system for carpooling and
regular taxicab services,” in Proc. IEEE Int. Conf. Big Data, 2013, pp. 439–447.
[4] W. Wu, W. S. Ng, S. Krishnaswamy, and A. Sinha, “To Taxi or Not to Taxi?—Enabling
personalised and real-time transportation decisions for mobile users,” in Proc. IEEE 13th
Int. Conf. Mob. Data Manage., Jul. 2012, pp. 320–323.
[5] R. Baldacci, V. Maniezzo, and A. Mingozzi, “An exact method for the car pooling problem
based on lagrangean column generation,” Oper. Res., vol. 52, no. 3, pp. 422–439, 2004.
[6] R. W. Calvo, F. de Luigi, P. Haastrup, and V. Maniezzo, “A distributed geographic
information system for the daily carpooling problem,” Comput. Oper. Res., vol. 31, pp.
2263–2278, 2004.
48. References
[7] J.-F. Cordeau and G. Laporte, “A tabu search heuristic for the static multivehicle dial-a-ride
problem,” Transp. Res. Part B Methodol., vol. 37, no. 6, pp. 579–594, 2003.
[8] M. E. T. Horn, “Fleet scheduling and dispatching for demand-responsive passenger
services,” Transp. Res. Part C Emerg. Technol., vol. 10, no. 1, pp. 35–63, 2002.
[9] G. Gidofalvi, T. B. Pedersen, T. Risch, and E. Zeitler, “Highly scalable trip grouping for
large-scale collective transportation systems,” in Proc. 11th Int. Conf. Extending Database
Technol.: Adv. Database Technol., 2008, pp. 678–689.
[10] P. M. d’Orey, R. Fernandes, and M. Ferreira, “Empirical evaluation of a dynamic and
distributed taxi-sharing system,” in Proc. 15th Int. IEEE Conf. Intell. Transp. Syst., Sep.
2012, pp. 140–146.
[11] S. Ma and O. Wolfson, “Analysis and evaluation of the slugging form of ridesharing,” in
Proc. 21st ACM SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst., 2013, pp. 64–73.
[12] S. Ma, Y. Zheng, and O. Wolfson, “T-Share: A large-scale dynamic taxi ridesharing service,”
in Proc. 29th IEEE Int. Conf. Data Eng., 2013, pp. 410–421.