Where Next

  • We have found a way
  • We did an experiment
  • Made so far
  • Transcript

    • 1. Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009. Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa), www-kdd.isti.cnr.it
    • 2.
      • Wireless network infrastructures are the nerves of our territory
      • Besides offering their services, they gather highly informative traces of human mobile activities
      • Miniaturization, wearability, pervasiveness will produce traces of increasing
        • positioning accuracy
        • semantic richness
    • 3.
      • From the analysis of the traces of our mobile phones it is possible to reconstruct our mobile behaviour, the way we collectively move
      • This knowledge may help us improve decision-making in many mobility-related issues:
        • Planning traffic and public mobility systems in metropolitan areas;
        • Planning physical communication networks
        • Forecasting traffic-related phenomena
        • Organizing logistics systems
        • Prediction
    • 4.  
    • 5.
      • Predicting the next location of a trajectory can improve a large set of services such as:
      • Navigational services.
      • Traffic management.
      • Location-based advertising.
      • Service pre-fetching.
      • Simulation.
      [Figure: three candidate next locations, each marked "?", labelled .4, .8 and .35]
    • 6.
      • How to realize this idea:
      • Extract patterns from all the available movements in a certain area rather than from the individual history of an object.
      • Use these local movement patterns as predictive rules.
      • Build a prediction tree as the global model.
      [Figure: Trajectory dataset, Local patterns, Prediction Tree]
    • 7.
      • Select the set of interesting trajectories
      • Validation
      • Evaluation
      • Extract T-Patterns (a set of local models)
      • Merge T-Patterns (global model)
      • Use the condensed model as predictor
    • 8.
      • The local pattern we use is the T-Pattern. It describes the common behavior of a group of users in space and time.
      F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. KDD 2007: 330-339.
    • 9.
      • Generating all rules from each T-pattern and using them to build a classifier is too expensive.
      [Figure: a T-Pattern over regions R1, R2, R3, R4 with transition times α1, α2, α3, and the rules generated from it]
    • 10.
      • To avoid generating the rules, the T-Pattern set is organized as a prefix tree.
      • For each node v:
        • Id identifies the node v
        • Region is a spatial component of the T-Pattern
        • Support is the support of the T-Pattern
      • For each edge j:
        • [a,b] corresponds to the time interval α_n of the T-Pattern
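The node and edge fields listed above can be sketched as a small data structure. This is only an illustration, not the authors' implementation: the names `Node`, `PrefixTree` and `insert` are hypothetical, regions are plain strings, the placeholder interval on the edge from the root is an assumption, and keeping the largest support on a shared prefix is an assumption as well.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Interval = Tuple[float, float]   # [a, b] time interval carried by an edge
Step = Tuple[str, Interval]      # (region reached, interval of the edge leading to it)

ROOT_EDGE: Interval = (0.0, 0.0)  # placeholder label for edges leaving the root (assumption)


@dataclass
class Node:
    node_id: int                  # Id: identifies the node
    region: str                   # Region: spatial component of the T-Pattern
    support: float                # Support: support of the T-Pattern
    # Children are keyed by (region, interval); the key doubles as the edge label.
    children: Dict[Step, "Node"] = field(default_factory=dict)


class PrefixTree:
    def __init__(self) -> None:
        self.size = 0
        self.root = self._new_node("ROOT", 0.0)

    def _new_node(self, region: str, support: float) -> Node:
        node = Node(self.size, region, support)
        self.size += 1
        return node

    def insert(self, first_region: str, steps: List[Step], support: float) -> Node:
        """Insert one T-Pattern: first_region -[a,b]-> region -[a,b]-> ..."""
        path: List[Step] = [(first_region, ROOT_EDGE)] + list(steps)
        node = self.root
        for key in path:
            child = node.children.get(key)
            if child is None:
                child = self._new_node(key[0], support)
                node.children[key] = child
            else:
                # Shared prefix: keep the largest support seen (assumption).
                child.support = max(child.support, support)
            node = child
        return node
```

Patterns that share a prefix share nodes, so inserting A→B, A→B→C and A→D creates four pattern nodes below the root rather than seven.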
    • 11.
      • Three steps:
        • Search for best match
        • Candidate generation
        • Make predictions
      [Figure: Best Match, Prediction] How is the Best Match computed?
    • 12.
      • The spatio-temporal distance is computed between the segment of the trajectory (bounded in time using the previous transition time) and the current node of the path.
        • Case a: the trajectory segment intersects the region of the node
        • Case b: the enlarged trajectory segment intersects the region
        • Case c: the enlarged trajectory segment does not intersect the region
      • th_t is the time tolerance window defined by the user.
    • 13.
      • The path score is the aggregation of all punctual scores along a path.
      • The Best Match is the path having:
        • the maximum path score;
        • at least one admissible prediction.
      [Figure: example path with time labels 10 min, 15 min, 8 min, 10 min, 11 min, 16 min; punctual scores 1, .58 and .8; path score .79]
    • 14.
      • Average generalizes distances between the trajectory and each node
      • Sum is based on the concept of depth
      • Max is the optimistic one, the best punctual score is selected as path score
      • Context-dependent aggregations can take into consideration other aspects of the problem.
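The three aggregations can be illustrated with a short sketch. The function name `path_score` and the string keys are hypothetical; the only input taken from the slides is the example with punctual scores 1, .58 and .8, whose average gives the path score .79.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Aggregation functions named on the slide: Avg, Sum and Max.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "sum": sum,
    "max": max,
}


def path_score(punctual_scores: Sequence[float], th_agg: str = "avg") -> float:
    """Aggregate the punctual scores along one path of the prediction tree."""
    if not punctual_scores:
        raise ValueError("a path needs at least one punctual score")
    return AGGREGATIONS[th_agg](punctual_scores)


scores = [1.0, 0.58, 0.8]
print(round(path_score(scores, "avg"), 2))  # 0.79, as in the slide example
print(round(path_score(scores, "sum"), 2))  # 2.38, grows with the depth of the path
print(path_score(scores, "max"))            # 1.0, the optimistic choice
```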
    • 15.
      • The WhereNext algorithm can be tuned using its parameters:
        • th_t: time window tolerance
        • th_s: space window tolerance
        • th_score: minimum prediction score threshold
        • th_agg: the aggregation function used to compute the path score (Avg, Sum or Max)
    • 16.
      • It is very hard to know which set of T-patterns is best for building our model:
        • a big set of T-patterns → very slow prediction
        • a small set of T-patterns → gaps in coverage
      • For this reason we have defined a way to measure the prediction power of a T-Pattern set.
    • 17.
      • An evaluation function is defined to estimate the predictive power of a T-Pattern set.
      • SpatialCoverage: the space coverage of the regions contained in the T-Pattern set;
      • DatasetCoverage: measures how well the T-Pattern set represents the trajectories;
      • RegionSeparation: the precision of the regions in the T-Pattern set.
      [Figure: Model 1, Model 2, testing the a priori evaluation]
    • 18. You are here
    • 19.
      • The results are evaluated using the following measures:
      • Accuracy: the number of correctly predicted locations (in space and time) divided by the total number of trajectories to be predicted.
      • Average Error: the average distance between the real trajectories in the predicted interval and the predicted region.
      • Prediction rate: the number of trajectories which have a prediction divided by the total number of trajectories to be predicted.
      [Figure: original trajectory, cut point, predicted location and error]
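Accuracy and prediction rate, as defined on this slide, reduce to two ratios over the same denominator. The sketch below assumes a hypothetical encoding of each test trajectory's outcome: `None` when no prediction was produced, `True` when the predicted location was correct, `False` otherwise. Average Error is left out because it depends on a distance between trajectories and regions that the slide does not spell out.

```python
from typing import Optional, Sequence, Tuple


def accuracy_and_prediction_rate(
    outcomes: Sequence[Optional[bool]],
) -> Tuple[float, float]:
    """Return (accuracy, prediction rate) over all trajectories to be predicted."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trajectories to evaluate")
    predicted = sum(o is not None for o in outcomes)  # trajectories with a prediction
    correct = sum(o is True for o in outcomes)        # correctly predicted locations
    return correct / total, predicted / total


# Four test trajectories: two correct, one wrong, one without a prediction.
print(accuracy_and_prediction_rate([True, False, None, True]))  # (0.5, 0.75)
```

Both measures divide by the total number of trajectories, so accuracy can never exceed the prediction rate.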
    • 20.
      • We used a real-life GPS dataset obtained from 17,000 vehicles in the urban area of the city of Milan.
      • Training set: 4000 trajectories between 7 am and 10 am on Wednesday.
      • Test set: 500 trajectories between 7 am and 10 am on Thursday.
    • 21.
      • Predicted vs th_score
      Average Error vs th_space
    • 22.
      • Accuracy vs Average Error
      Single Users Accuracy and Prediction rate
    • 23.
      • A visual example of the application on Milan mobility data. The context is traffic management and we want to predict how the traffic will move in the city center.
      • We have built a predictor on a “good” set of T-patterns which includes the city gates of Milan.
      Part of the GeoPKDD integrated platform. F. Giannotti, D. Pedreschi, et al. GeoPKDD: Geographic privacy-aware knowledge discovery and delivery (European project), 2008.
    • 24.
      • A new technique to predict the next locations of a trajectory based on the previous movements of all the objects, without considering any information about the users.
      • The time information is used not only to order the events but is also intrinsic to the T-Patterns used to build the prediction tree.
      • The user can tune the method to obtain good accuracy and a good prediction rate.
      • We are experimenting with the method in real-world applications.
    • 25.  
    • 26. [Figure: Trajectories Dataset, Regions of Interest, T-Patterns]
    • 27.  
    • 28.
      • The exact same spatial location (x,y) almost never occurs twice
      • The same exact transition times usually do not occur twice
      • Solution: allow approximation
        • a notion of spatial neighborhood
        • a notion of temporal tolerance
    • 29.
      • Two points match if one falls within a spatial neighborhood N() of the other
      • Two transition times match if their temporal difference is ≤ τ
      • Example:
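The two matching rules can be sketched as follows. The slide does not define the neighborhood N(), so a disk of radius `eps` is assumed here purely for illustration; `points_match`, `transitions_match` and `eps` are hypothetical names.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def points_match(p: Point, q: Point, eps: float) -> bool:
    """True if q falls within the neighborhood of p (assumed: a disk of radius eps)."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= eps


def transitions_match(t1: float, t2: float, tau: float) -> bool:
    """True if the two transition times differ by at most tau."""
    return abs(t1 - t2) <= tau


print(points_match((0.0, 0.0), (3.0, 4.0), eps=5.0))  # True: the distance is exactly 5
print(transitions_match(10.0, 13.0, tau=2.0))         # False: the difference is 3 > 2
```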
    • 32.
      • T-pattern mining can be mapped to a density estimation problem over R^(3n-1)
        • 2 dimensions for each (x,y) in the pattern (2n)
        • 1 dimension for each transition (n-1)
      • Density computed by
        • mapping each sub-sequence of n points of each input trajectory to R^(3n-1)
        • drawing an influence area for each point (composition of N() and τ)
      • Too computationally expensive, heuristics needed
      • Our solution: a combination of sequential pattern mining and density-based clustering
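The dimension count 2n + (n-1) = 3n-1 can be made concrete with a small sketch of the mapping step. It assumes each trajectory is a time-ordered list of (x, y, t) samples and, for simplicity, takes only contiguous windows of n points as sub-sequences; the function name `embed_subsequences` is hypothetical. The density estimation itself is not shown.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, t), assumed sorted by t


def embed_subsequences(trajectory: Sequence[Sample], n: int) -> List[List[float]]:
    """Map each contiguous window of n samples to a vector with 3n-1 components."""
    if n < 1:
        raise ValueError("n must be at least 1")
    vectors = []
    for i in range(len(trajectory) - n + 1):
        window = trajectory[i:i + n]
        # 2 dimensions for each (x, y) in the window: 2n values
        coords = [c for (x, y, _) in window for c in (x, y)]
        # 1 dimension for each transition time: n-1 values
        gaps = [window[k + 1][2] - window[k][2] for k in range(n - 1)]
        vectors.append(coords + gaps)
    return vectors


traj = [(0.0, 0.0, 0.0), (1.0, 1.0, 10.0), (2.0, 0.0, 25.0), (3.0, 1.0, 33.0)]
for v in embed_subsequences(traj, n=3):
    print(len(v), v)  # each vector has 3*3 - 1 = 8 components
```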