acmgis 2008


Similarity search for historical scheduled trajectory data

Published in: Business, Economy & Finance

  1. 1. Similarity-Based Prediction of Travel Times for Vehicles Traveling on Known Routes Christian S. Jensen and Dalia Tiesyte Aalborg University, Denmark ACMGIS, November 6-8, 2008
  2. 2. Bus Arrival Time Prediction <ul><li>Modern collective transport infrastructures </li></ul><ul><ul><li>encompass online, geo-positioned buses, a central server, and online variable displays </li></ul></ul><ul><ul><li>inform the users of the anticipated arrival times of buses </li></ul></ul><ul><ul><li>reward/penalize bus companies based on their compliance with service agreements </li></ul></ul><ul><li>Accurate arrival time prediction is of the essence </li></ul><ul><ul><li>important for the companies that deliver the software to the bus companies </li></ul></ul><ul><ul><li>currently deployed techniques typically do not offer the desired accuracy </li></ul></ul><ul><ul><li>motivated by the availability of large collections of historical data, we propose a data-driven approach to arrival time prediction </li></ul></ul>
  3. 3. Prediction of Travel Times <ul><li>Provide users with more accurate real-time information </li></ul><ul><li>Improve individual journey planning and reduce waiting times </li></ul><ul><li>Assist in the planning of routes and schedules </li></ul><ul><li>Enable carriers to provide the expected service (on-time arrivals, predictable delays) </li></ul><ul><li>This contributes to making collective transport more attractive. </li></ul> Goal: To predict the near-future arrival times at timing points of vehicles traveling along known routes.
  4. 4. Proposed Approach (IWCTS, Dublin, 21 April 2008; Summer School, Agder, July 1, 2008) <ul><ul><li>Find the historical trajectory most similar to the partial real-time trajectory of the vehicle </li></ul></ul><ul><ul><li>Use the “future” of the historical trajectory to predict the vehicle’s future movement. </li></ul></ul>
  5. 5. Prediction System [Figure: system architecture diagram. Components: Server, Communication Infrastructure, Historical Trajectory Data. Flow labels: Position, Update trajectory, Store trajectory, Similarity Search, Retrieve a similar trajectory, Prediction, New prediction.]
  6. 6. Outline <ul><li>Problem statement </li></ul><ul><ul><li>Data representation </li></ul></ul><ul><ul><li>Nearest neighbor trajectories </li></ul></ul><ul><li>Similarity measures for trajectories of vehicles </li></ul><ul><li>Similarity search </li></ul><ul><li>Results </li></ul><ul><li>Conclusion </li></ul>
  7. 7. Route and Trajectory Representation <ul><li>Routes are mapped from 2D to 1D: locations are given as the distance from the start of the route. </li></ul><ul><li>A vehicle’s trajectory is represented as a sequence of travel times between the timing points on route R: $(\Delta t_1, \dots, \Delta t_n)$ </li></ul>[Figure: a route mapped from 2D to 1D, with timing points $p_0, \dots, p_5$ and travel times $\Delta t_1, \dots, \Delta t_5$.]
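A minimal sketch of this representation, assuming the raw input is a list of arrival timestamps (in seconds) at the timing points p_0, p_1, ... of a route. The function name `to_travel_times` and the sample values are hypothetical and only serve the illustration.

```python
from typing import List, Sequence


def to_travel_times(arrival_times: Sequence[float]) -> List[float]:
    """Turn arrival times at timing points p_0..p_n into travel times dt_1..dt_n.

    Assumption: arrival_times[i] is the time at which the vehicle reached
    timing point p_i, measured from the start of the trip.
    """
    if len(arrival_times) < 2:
        return []  # fewer than two timing points means no segment yet
    deltas = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if any(d < 0 for d in deltas):
        raise ValueError("arrival times must be non-decreasing")
    return deltas


# Hypothetical trip: four timing points reached at 0 s, 120 s, 300 s and 390 s.
arrivals = [0.0, 120.0, 300.0, 390.0]
print(to_travel_times(arrivals))  # [120.0, 180.0, 90.0]
```

Storing per-segment travel times rather than absolute timestamps makes trajectories that started at different times directly comparable segment by segment.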
  8. 8. Nearest Neighbor Trajectory <ul><li>We expect that a similar trajectory from the past can predict the future movement of the vehicle. </li></ul>[Figure: a real-time trajectory and a historical trajectory. Labels: Past, Future; points $p_0$, $p_i$, $p_{cur}$, $p_n$; times $0$, $t_i$, $t_{cur}$, $t_n$.]
  9. 9. Problem Statement <ul><li>Travel-time prediction by similar historical trajectories </li></ul><ul><ul><li>Define a similarity (distance) measure d that enables the selection of the most similar historical trajectory (NNT), which would serve as an accurate predictor of the vehicle’s future movement. </li></ul></ul><ul><li>Efficient similar historical trajectory retrieval. </li></ul><ul><ul><li>Retrieve the trajectory from the database that minimizes d between a historical and the (partial) real-time trajectory. </li></ul></ul><ul><ul><li>Enable variable-length queries. </li></ul></ul><ul><ul><li>Incrementally update the NNT as new points arrive in the real-time trajectory. </li></ul></ul><ul><ul><li>Do this efficiently! </li></ul></ul>
  10. 10. Outline <ul><li>Problem statement </li></ul><ul><ul><li>Data representation </li></ul></ul><ul><ul><li>Nearest neighbor trajectories </li></ul></ul><ul><li>Similarity measures for trajectories of vehicles </li></ul><ul><li>Similarity search </li></ul><ul><li>Results </li></ul><ul><li>Conclusion </li></ul>
  11. 11. Similarity Measures: Requirements <ul><li>Fundamental assumption: similar past implies similar future. </li></ul><ul><li>A distance, or similarity, measure is needed for finding a historical trajectory to predict the future movement. </li></ul><ul><li>Requirements for similarity/distance measures </li></ul><ul><ul><li>support comparison of fixed-length trajectories </li></ul></ul><ul><ul><li>support sub-trajectories </li></ul></ul><ul><ul><li>is a metric </li></ul></ul><ul><ul><li>amenable to efficient, scalable computation </li></ul></ul><ul><ul><li>enable prioritization of either long- or short-term prediction </li></ul></ul>
  12. 12. Weighted $L_P$ Distance (WLP) <ul><li>Weighted Euclidean Distance </li></ul><ul><ul><li>efficient to compute </li></ul></ul><ul><ul><li>can be applied to sub-trajectories </li></ul></ul><ul><ul><li>outliers are tolerated to some extent (controlled by varying $P$) </li></ul></ul><ul><ul><li>weights can be added to prioritize the past segments that are more relevant for the prediction of the future </li></ul></ul><ul><li>We use a weighted $L_P$-norm based distance </li></ul><ul><ul><li>The $\Delta t_i$ are from the real-time trajectory and the $\Delta t_i'$ are from a historical trajectory. </li></ul></ul>
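The formula itself did not survive in this transcript. The sketch below therefore assumes the common form of a weighted L_P distance, d = (sum_i w_i * |dt_i - dt_i'|^P)^(1/P), computed over the segments observed so far. The exact placement of the weights in the authors' definition is not confirmed here, and the name `wlp` and its parameters are chosen for illustration.

```python
from typing import Sequence


def wlp(query: Sequence[float], hist: Sequence[float],
        weights: Sequence[float], p: float = 2.0) -> float:
    """Weighted L_P distance between a partial real-time trajectory (query)
    and the matching prefix of a historical trajectory (hist).

    Assumption: weights are non-negative and multiply each term inside the sum.
    """
    if len(weights) != len(query):
        raise ValueError("one weight per query segment is required")
    if len(hist) < len(query):
        raise ValueError("historical trajectory is shorter than the query")
    # zip stops at the end of the query, so only the observed prefix is compared.
    total = sum(w * abs(a - b) ** p for w, a, b in zip(weights, query, hist))
    return total ** (1.0 / p)


# With unit weights and P = 2 this reduces to the Euclidean distance.
print(wlp([3.0, 4.0], [0.0, 0.0, 7.0], [1.0, 1.0], p=2.0))  # 5.0
```

A smaller P reduces the influence of a single large deviation, which is one way to read the remark that outlier tolerance is controlled by varying P.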
  13. 13. Correlation-Based Weights <ul><li>Trajectory representation </li></ul><ul><li>The weight $w_i$ for segment $i$ is the sum of the correlation coefficients $k_{ij}$, $j = \mathit{cur}+1, \dots, \mathit{cur}+k$, where $k$ is the number of future segments to be predicted </li></ul><ul><ul><li>We propose to use the Kendall $\tau$ rank correlation coefficient </li></ul></ul>
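A rough sketch of how such weights could be derived from a set of historical trajectories. It assumes the tau-a variant of Kendall's coefficient (no tie correction), computed across trajectories between the travel times of a past segment i and a future segment j. The slide does not say which variant is used or how negative sums are handled, so both choices here are assumptions. Function names and sample data are hypothetical.

```python
from typing import List, Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for a in range(n):
        for b in range(a + 1, n):
            s = (x[a] - x[b]) * (y[a] - y[b])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


def correlation_weights(history: Sequence[Sequence[float]],
                        cur: int, k: int) -> List[float]:
    """Weight for each of the first `cur` segments: the sum of its rank
    correlations with the next `k` segments (0-based slicing)."""
    columns = list(zip(*history))          # columns[i] = segment i over all trips
    future = columns[cur:cur + k]
    return [sum(kendall_tau(columns[i], f) for f in future) for i in range(cur)]


# Hypothetical history: four trips over a route with four segments.
history = [
    [10, 20, 30, 12],
    [12, 25, 28, 14],
    [15, 22, 25, 18],
    [11, 30, 27, 13],
]
print(correlation_weights(history, cur=3, k=1))
```

In this toy data the first segment ranks the trips exactly like the future segment and gets the largest weight. A negative sum, as for the third segment, would need clamping or another treatment before it is used as a distance weight; that step is left open here.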
  14. 14. Outline <ul><li>Problem statement </li></ul><ul><ul><li>Data representation </li></ul></ul><ul><ul><li>Nearest neighbor trajectories </li></ul></ul><ul><li>Similarity measures for trajectories of vehicles </li></ul><ul><li>Similarity search </li></ul><ul><li>Results </li></ul><ul><li>Conclusion </li></ul>
  15. 15. Prediction by Nearest Neighbor <ul><li>Dynamically choose a trajectory from the available database that minimizes WLP . </li></ul><ul><li>NNT is the initial trajectory </li></ul><ul><li>while the vehicle is on the route do </li></ul><ul><ul><li>Receive a new position ( p,t ) on tr </li></ul></ul><ul><ul><li>Evaluate d = WLP( NNT , tr ) </li></ul></ul><ul><ul><li>if d exceeds threshold thr </li></ul></ul><ul><ul><li>then find a new NNT that minimizes WLP </li></ul></ul><ul><li> provide NNT as the new prediction </li></ul><ul><li>end while </li></ul>
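The loop above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the distance uses unit weights, the initial NNT is simply the first stored trajectory, the new NNT is found by a plain scan, and the names `predict_online` and `threshold` are made up for the example.

```python
from typing import List, Sequence


def wlp(query: Sequence[float], hist: Sequence[float],
        weights: Sequence[float], p: float = 2.0) -> float:
    """Weighted L_P distance over the observed prefix (assumed form)."""
    total = sum(w * abs(a - b) ** p for w, a, b in zip(weights, query, hist))
    return total ** (1.0 / p)


def predict_online(history: Sequence[Sequence[float]],
                   observed: Sequence[float],
                   threshold: float, p: float = 2.0) -> List[List[float]]:
    """After each observed segment, return the predicted remaining travel times."""
    nnt = history[0]            # assumption: any default works as initial NNT
    partial: List[float] = []
    predictions = []
    for dt in observed:         # "receive a new position (p, t) on tr"
        partial.append(dt)
        weights = [1.0] * len(partial)
        if wlp(partial, nnt, weights, p) > threshold:
            # The current NNT drifted too far: search for a better one.
            nnt = min(history, key=lambda h: wlp(partial, h, weights, p))
        predictions.append(list(nnt[len(partial):]))
    return predictions


history = [[10, 10, 10, 10], [20, 20, 20, 20]]
print(predict_online(history, observed=[19, 21, 20], threshold=5.0))
# [[20, 20, 20], [20, 20], [20]]
```

The threshold trades search cost against accuracy: a new search is triggered only when the current NNT no longer matches the observed part of the trip well enough.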
  16. 16. List-Based Indexing <ul><li>Assumptions </li></ul><ul><ul><li>A trajectory is a sequence $(\Delta t_1, \dots, \Delta t_n)$, where $\Delta t_i$ represents the travel time for the $i$th segment of the route. </li></ul></ul><ul><ul><li>Each trajectory is of length $n$. </li></ul></ul><ul><ul><li>(In most cases) there does exist a trajectory in the database that is similar to the current real-time trajectory (i.e., trajectories are non-random). </li></ul></ul><ul><li>Requirements </li></ul><ul><ul><li>The index must be able to answer queries of varying length. </li></ul></ul><ul><ul><li>The search should be incremental. </li></ul></ul><ul><ul><li>Perfect precision is required (the most similar trajectory must be found). </li></ul></ul>
  17. 17. List-Based Indexing, Cont. <ul><li>Data structure </li></ul><ul><ul><li>A sorted list for each timing point on the route, and an entry in each list for each trajectory. </li></ul></ul><ul><ul><li>Random access is possible (a sequential-access algorithm exists as well). </li></ul></ul><ul><li>Non-incremental algorithm </li></ul><ul><ul><li>Perform a binary search in each corresponding list and locate the points that are closest to the query points. </li></ul></ul><ul><ul><li>Access all lists in parallel (next closest point in each) and calculate the distances to the accessed trajectories. </li></ul></ul><ul><ul><li>Track the current NNT (i.e., the NNT seen so far): the trajectory at the minimum distance from the query trajectory. </li></ul></ul><ul><ul><li>Calculate the bound: the distance between the query trajectory and the set of the most recently accessed entries in each list. This is the minimum possible distance from the query to any trajectory not yet accessed. </li></ul></ul><ul><ul><li>Stop when the bound exceeds the distance to the current NNT. </li></ul></ul><ul><li>Incremental algorithm </li></ul><ul><ul><li>When a new point arrives, re-use the bound calculated in the previous iteration. </li></ul></ul>
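The non-incremental search can be sketched as a threshold-style scan over the sorted lists. The code below is a simplified illustration under assumptions: one sorted list per segment holding (travel time, trajectory id) pairs, random access into an in-memory `history` list, and the assumed weighted L_P form with non-negative weights. The incremental re-use of the bound is not shown, and all names are hypothetical.

```python
import bisect
from typing import List, Sequence, Tuple

INF = float("inf")


def build_index(history: Sequence[Sequence[float]]) -> List[List[Tuple[float, int]]]:
    """One sorted list per segment, with one (value, trajectory id) entry per trajectory."""
    n = len(history[0])
    return [sorted((traj[i], tid) for tid, traj in enumerate(history)) for i in range(n)]


def wlp(query, hist, weights, p=2.0):
    """Weighted L_P distance over the query prefix (assumed form)."""
    return sum(w * abs(a - b) ** p for w, a, b in zip(weights, query, hist)) ** (1.0 / p)


def nearest_ta(history, index, query, weights, p=2.0):
    """Return (trajectory id, distance) of the nearest neighbor of the query."""
    m = len(query)
    lo, hi = [], []                       # per-list cursors left/right of the query value
    for i in range(m):
        pos = bisect.bisect_left(index[i], (query[i], -1))
        lo.append(pos - 1)
        hi.append(pos)
    best_id, best_d, seen = -1, INF, set()
    while True:
        last_diffs = []
        for i in range(m):
            lst = index[i]
            left = abs(lst[lo[i]][0] - query[i]) if lo[i] >= 0 else INF
            right = abs(lst[hi[i]][0] - query[i]) if hi[i] < len(lst) else INF
            if left == INF and right == INF:
                return best_id, best_d    # list exhausted: every trajectory was seen
            if left <= right:             # take the next closest entry in this list
                diff, tid = left, lst[lo[i]][1]
                lo[i] -= 1
            else:
                diff, tid = right, lst[hi[i]][1]
                hi[i] += 1
            last_diffs.append(diff)
            if tid not in seen:           # random access: compute the full distance once
                seen.add(tid)
                d = wlp(query, history[tid], weights, p)
                if d < best_d:
                    best_id, best_d = tid, d
        # Unseen trajectories differ by at least last_diffs[i] in every list i.
        bound = sum(w * diff ** p for w, diff in zip(weights, last_diffs)) ** (1.0 / p)
        if bound >= best_d:               # no unseen trajectory can be strictly closer
            return best_id, best_d


history = [[10, 10, 10], [20, 20, 20], [12, 30, 11], [19, 22, 25]]
index = build_index(history)
print(nearest_ta(history, index, query=[18, 21], weights=[1.0, 1.0]))
```

In this small example the search stops after one round, having computed full distances for only two of the four trajectories. The sketch stops as soon as the bound reaches the best distance, which can return one of several equally close trajectories slightly earlier than the strict "exceeds" rule on the slide.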
  18. 18. List-Based Indexing, Cont. [Figure: example of the list-based index with a maximum query length of 4. Labels: query, current, nearest neighbors, "has to be searched at most".]
  19. 19. Outline <ul><li>Problem statement </li></ul><ul><ul><li>Data representation </li></ul></ul><ul><ul><li>Nearest neighbor trajectories </li></ul></ul><ul><li>Similarity measures for trajectories of vehicles </li></ul><ul><li>Similarity search </li></ul><ul><li>Results </li></ul><ul><li>Conclusion </li></ul>
  20. 20. Empirical Evaluation <ul><li>Both real and generated data were used. </li></ul><ul><ul><li>In the generated data, the clustering of the data, the average variance of the travel times, and the size of the database were varied. </li></ul></ul><ul><li>Evaluation of similarity measures (accuracy of prediction). </li></ul><ul><ul><li>Euclidean, weighted Euclidean (with pre-set and with correlation-based weights), and LCSS distances were evaluated. </li></ul></ul><ul><ul><li>Correlation-based weights give the most accurate prediction. </li></ul></ul><ul><ul><li>The optimal query length is around 5. </li></ul></ul><ul><li>Evaluation of performance. </li></ul><ul><ul><li>ITA (iterative threshold algorithm), TA (threshold algorithm), and SS (sequential scan) were compared. </li></ul></ul><ul><ul><li>In most cases ITA outperforms TA and SS by orders of magnitude, especially when queries are long (more than 5 points) and the data is clustered. </li></ul></ul><ul><ul><li>SS can be beneficial with non-clustered (e.g., random) data. </li></ul></ul>
  21. 21. Conclusions <ul><li>Fundamental assumption: A similar past trajectory can predict the future trajectory of a vehicle. </li></ul><ul><li>We have proposed </li></ul><ul><ul><li>to use a weighted $L_P$ norm-based distance (WLP) as a trajectory similarity measure (more measures are discussed in the paper) </li></ul></ul><ul><ul><li>to index the trajectories with a sorted list-based index and to access them using an Iterative Threshold Algorithm (ITA) </li></ul></ul><ul><li>Experimental results suggest that the correlation-based WLP together with ITA yields vehicle travel time prediction that is satisfactorily accurate and efficient. </li></ul>
  22. 22. Future Work <ul><li>Currently </li></ul><ul><ul><li>Dynamic choice of the nearest neighbor trajectory (NNT) that minimizes the distance to the real-time trajectory. </li></ul></ul><ul><li>Proposed extension </li></ul><ul><ul><li>Dynamic choice of the prediction algorithm (including NNT) that minimizes the real-time trajectory prediction error. </li></ul></ul>
  23. 23. Thank you for your attention. 
  24. 24. Related Work <ul><li>Existing approaches to travel time prediction </li></ul><ul><ul><li>Autoregressive models/Kalman filtering </li></ul></ul><ul><ul><li>[Shalaby and Farhan 2001, Cathey and Dailey 2003, Dailey et al. 2004, Mishalani 2008] </li></ul></ul><ul><ul><li>Machine learning: </li></ul></ul><ul><ul><li>Artificial Neural Networks </li></ul></ul><ul><ul><li>[Chien et al. 2002, Park et al. 2004, Hee and Rilett 2004] </li></ul></ul><ul><ul><li>Support Vector Machines </li></ul></ul><ul><ul><li>[Bin et al. 2006] </li></ul></ul><ul><ul><li>Historical speed/time patterns </li></ul></ul><ul><ul><li>[Predic et al. 2007, Sun et al. 2007]. </li></ul></ul>
  25. 25. Experimental Results <ul><li>Correlation-based weights give the most accurate prediction. </li></ul><ul><li>ITA significantly outperforms TA when queries are long (more than 5 points). </li></ul>[Plots, labels only: prediction error per point for query lengths l = 1..25; CPU time as a ratio vs. sequential scan for l = 1..25; difference between TA and ITA; correlation-based weights.]
