- 1. Social Event Detection. V.A. Traag¹, A. Browet¹, F. Calabrese², F. Morlot³. ¹Department of Applied Mathematics, UCL, Louvain-la-Neuve, Belgium; ²SENSEable City Lab, MIT, Cambridge, USA; ³Orange Labs, Issy-les-Moulineaux, France. 24 February 2011.
- 2. Outline: (1) Motivation; (2) Bayesian Location Inference; (3) Identification of frequent locations; (4) Event detection; (5) Presence probability.
- 3. Introduction
    - Purpose: analyze the mobility and social behaviour of mobile phone users:
        1. Detect social events, i.e. unusually large gatherings of people.
        2. Identify frequent locations such as home or office.
    - Motivation:
        1. Between 70% and 80% of human mobility is explained by the daily home-office routine (Barabasi et al.); here we analyze the out-of-the-ordinary behaviour.
        2. Anticipate the impact of large events on urban transit, for traffic regulation or public transportation.
        3. Identify and classify users and their habits for telecommunication companies.
- 7. Introduction
    - Available data:
        1. Precise locations of the antennas, but no orientation information.
        2. A record for each connection to the network (calls, text messages, mobile internet, ...).
    - Compute two probability measures:
        1. $\phi_i(x)$: the probability of being connected to antenna $i$ given a position $x$.
        2. $\psi_i(x)$: the probability of being at position $x$ given that the user was connected to antenna $i$.
- 8. Location Inference. The signal strength at position $x$ of an antenna $i$ located at $X_i$ is determined by:
    - the power of the antenna, $p_i$, taken to be the same for all antennas: $p_i = p$;
    - the loss of signal strength over distance: $L_i(x) = \dfrac{1}{\|x - X_i\|^{\beta}}$;
    - a stochastic fading of the signal, i.e. the Rayleigh fading $R_i$: $\Pr(R_i \le r) = F(r) = 1 - e^{-r}$.
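- 9. Location Inference. The signal strength of antenna $i$ is then given by $S_i(x) = p_i L_i(x) R_i$. Further assumptions:
    - The fadings are independent: $R_i \perp\!\!\!\perp R_j \ \ \forall i \neq j$.
    - Given a position $x$, the user connects to the antenna $i$ with the highest signal strength: $S_i(x) \ge S_j(x) \ \ \forall j \in \mathcal{X}$, i.e. $S_i(x) = \max_{j \in \mathcal{X}} S_j(x)$, where $\mathcal{X}$ is the set of antennas.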
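- 10. Location Inference. Let $a_i$ denote the event that a user connects to antenna $i$.
    - By definition, $\Pr(a_i \mid x) = \Pr\big(S_i(x) = \max_{j \in \mathcal{X}} S_j(x)\big) = \Pr\big(p_i L_i(x) R_i \ge p_j L_j(x) R_j \ \ \forall j \in \mathcal{X},\ j \neq i\big)$.
    - If we condition on the random variable $R_i$ taking a specific value $r$, the events for different $j$ are independent and the powers $p_i = p_j = p$ cancel, so that $\Pr(a_i \mid x, R_i = r) = \prod_{j \in \mathcal{X},\, j \neq i} \Pr\!\left(R_j \le \dfrac{L_i(x)}{L_j(x)}\, r\right) = \prod_{j \in \mathcal{X},\, j \neq i} F\!\left(\dfrac{L_i(x)}{L_j(x)}\, r\right)$.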
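- 11. Location Inference. With $f(r) = e^{-r}$ the density of $R_i$, it follows that
    - $\phi_i(x) = \Pr(a_i \mid x) = \displaystyle\int_0^\infty f(r)\, \Pr(a_i \mid x, R_i = r)\, dr = \int_0^\infty e^{-r} \prod_{j \in \mathcal{X},\, j \neq i} \left[1 - \exp\!\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right] dr$
    - $\phantom{\phi_i(x)} \approx \displaystyle\int_0^\infty e^{-r} \prod_{j \in \mathcal{X}_i,\, j \neq i} \left[1 - \exp\!\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right] dr$, where $\mathcal{X}_i$ is a local neighbourhood of antenna $i$.
    - How should the local neighbourhood be chosen, and what is its impact?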
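
The integral for $\phi_i(x)$ can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function name `phi`, the path-loss exponent `beta=3.0`, the truncation point `r_max` and the trapezoidal rule are all assumptions made for illustration. The product runs over whatever list of antenna positions is passed in, so passing only a local neighbourhood gives the approximate form.

```python
import math

def phi(i, x, antennas, beta=3.0, r_max=40.0, steps=4000):
    """Approximate phi_i(x): probability of connecting to antenna i from position x.

    antennas: list of (x, y) antenna positions (hypothetical input format).
    beta, r_max, steps: illustrative values, not taken from the slides.
    """
    d_i = math.dist(x, antennas[i])
    if d_i == 0.0:
        return 1.0  # standing exactly at antenna i: its signal dominates
    # Ratios ||x - X_j||^beta / ||x - X_i||^beta for all other antennas j.
    ratios = [(math.dist(x, a) / d_i) ** beta
              for j, a in enumerate(antennas) if j != i]

    def integrand(r):
        value = math.exp(-r)
        for c in ratios:
            value *= 1.0 - math.exp(-r * c)
        return value

    # Trapezoidal rule on [0, r_max]; the tail beyond r_max is negligible.
    h = r_max / steps
    total = 0.5 * (integrand(0.0) + integrand(r_max))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h
```

As a sanity check, halfway between two antennas the integral reduces to $\int_0^\infty e^{-r}(1 - e^{-r})\,dr = 1/2$, and for any position the values $\phi_i(x)$ over all antennas should sum to one.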
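- 12. Location Inference
    - Delaunay radius: $\rho_i = \max\{\, d(X_i, X_j) \mid j \text{ is a Delaunay neighbour of } i \,\}$.
    - The domain $D_i$ is defined by $D_i = \{\, x \mid r \rho_i \ge d(x, X_i) \,\}$, where $r$ is here a scaling factor (not the fading value used earlier).
    - The neighbourhood is computed as $\mathcal{X}_i = \{\, j \mid X_j \in D_i,\ j \in \mathcal{X} \,\}$.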
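
The neighbourhood construction can be sketched as follows. This sketch assumes the Delaunay neighbours of each antenna have already been computed elsewhere (for example with a triangulation library) and are supplied as a dictionary. The function name, the input format and the default `r=2.0` are illustrative assumptions.

```python
import math

def local_neighbourhood(i, positions, delaunay_neighbours, r=2.0):
    """Return the indices j with X_j inside D_i = {x : d(x, X_i) <= r * rho_i}.

    positions: list of (x, y) antenna positions.
    delaunay_neighbours: dict mapping an antenna index to the indices of its
        Delaunay neighbours (assumed precomputed; hypothetical input format).
    r: scaling factor of the Delaunay radius (illustrative default).
    """
    # Delaunay radius: distance to the farthest Delaunay neighbour of i.
    rho_i = max(math.dist(positions[i], positions[j])
                for j in delaunay_neighbours[i])
    return [j for j, X_j in enumerate(positions)
            if math.dist(X_j, positions[i]) <= r * rho_i]
```

Note that with this definition antenna $i$ itself always belongs to its own neighbourhood, so it has to be skipped when forming the product in $\phi_i(x)$.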
- 13. Location Inference. [Figure: average error on 1000 random points as a function of $r$; x-axis: $r$ (1 to 3), y-axis: average error (0 to 0.014).]
- 14. Location Inference. Based on Bayes' rule, we obtain $\psi_i(x) = \Pr(x \mid a_i) = \dfrac{\Pr(a_i \mid x)\, \Pr(x)}{\Pr(a_i)}$. The ratio $\dfrac{\Pr(x)}{\Pr(a_i)}$ is not known, but can be assumed constant over the domain $D_i$. It follows that $\psi_i(x) = \dfrac{\phi_i(x)}{\int_{D_i} \phi_i(x)\, dx}$.
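
In practice the normalising integral over $D_i$ has to be approximated. A minimal sketch, assuming $\phi_i$ has already been evaluated on a regular grid of cells covering $D_i$ (the grid, the dictionary format and the function name are assumptions for illustration), replaces the integral by a Riemann sum:

```python
def psi_on_grid(phi_values, cell_area):
    """Normalise phi_i over D_i to obtain the density psi_i on a grid.

    phi_values: dict mapping grid points inside D_i to phi_i at that point
        (hypothetical input format).
    cell_area: area of one grid cell, used for the Riemann-sum integral.
    """
    integral = sum(phi_values.values()) * cell_area
    if integral <= 0.0:
        raise ValueError("phi_i vanishes on the whole grid")
    return {point: value / integral for point, value in phi_values.items()}
```

The returned values are densities, so they can exceed one; only their sum multiplied by the cell area equals one.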
- 15. Location Inference. [Figure: probability density $\psi_i(x)$.]
- 16. Frequent Location Identification
    - The probability that a user at position $x$ connects to antenna $i$ is $\phi_i(x)$.
    - The probability that he made $k_i$ calls with antenna $i$ is then $\phi_i(x)^{k_i}$.
    - The likelihood of observing those calling frequencies is $L(x \mid k) = \prod_{i \in H} \phi_i(x)^{k_i}$, so that $\log L(x \mid k) = \sum_{i \in H} k_i \log \phi_i(x)$.
    - Maximum likelihood estimator (MLE): $\hat{x}_h(u) = \arg\max_x \log L(x \mid k(u))$.
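
The maximum likelihood estimate can be illustrated with a simple search over candidate positions. This is a sketch under stated assumptions: the slides do not say how the maximisation is carried out, so the exhaustive search over a candidate list is a stand-in. `toy_phi` is a made-up connection probability used only to make the example runnable. It is not the $\phi_i$ derived above.

```python
import math

def log_likelihood(x, call_counts, phi):
    """log L(x | k) = sum_i k_i * log(phi_i(x)) over the antennas used."""
    total = 0.0
    for i, k_i in call_counts.items():
        p = phi(i, x)
        if p <= 0.0:
            return float("-inf")  # position x cannot explain a call on antenna i
        total += k_i * math.log(p)
    return total

def mle_location(candidates, call_counts, phi):
    """Return the candidate position that maximises the log-likelihood."""
    return max(candidates, key=lambda x: log_likelihood(x, call_counts, phi))

# Hypothetical stand-in for phi_i(x): two antennas, closer means more likely.
TOY_ANTENNAS = [(0.0, 0.0), (4.0, 0.0)]

def toy_phi(i, x):
    weights = [math.exp(-math.dist(x, a)) for a in TOY_ANTENNAS]
    return weights[i] / sum(weights)

candidates = [(0.0, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
best = mle_location(candidates, {0: 3, 1: 1}, toy_phi)
```

With three calls on the first antenna and one on the second, the estimate lands between the two antennas but closer to the first, which is the behaviour the likelihood weighting by $k_i$ is meant to produce.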
- 17. Overview Event Detection
    - General:
        - We are looking for unusually large gatherings of people.
        - Which people are likely to be attending a (possible) event?
        - They should be present at the event location with high probability.
        - They should not be there often.
    - Presence probability: given calls in the neighbourhood, what is the probability that the user was present during the time interval of an event?
    - Ordinary probability: what is the average probability that a user was present during other weeks?
- 18. Presence probability
    - Derivation:
        - The probability that the user is in area $A$ at the time $t_c$ of a call $c$ is $p_c$.
        - Assume a constant leave and arrival rate $\gamma$.
        - Then for $t \neq t_c$ we have $e^{-\gamma |t - t_c|}\, p_c$.
        - Take the maximum over all calls $c$ of a user and average over the event interval $[t_s, t_e]$: $p_p = \dfrac{1}{t_e - t_s} \displaystyle\int_{t_s}^{t_e} \max_c\, e^{-\gamma |t - t_c|}\, p_c \, dt$.
    - Motivation:
        - More calls ⇒ higher presence probability.
        - Calls close by ⇒ higher presence probability.
        - Calls outside of the area are not taken into account.
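
The presence probability $p_p$ is a time average that is straightforward to approximate numerically. The sketch below uses a midpoint rule; the function name, the `(t_c, p_c)` tuple format and the default values of `gamma` and `steps` are assumptions for illustration, since the slides do not give a value for $\gamma$.

```python
import math

def presence_probability(calls, t_start, t_end, gamma=1.0, steps=1000):
    """Approximate p_p = 1/(t_e - t_s) * int max_c exp(-gamma*|t - t_c|) * p_c dt.

    calls: list of (t_c, p_c) pairs for calls made in the area, where p_c is
        the probability that the user was in the area at call time t_c.
    gamma: leave/arrival rate (illustrative default).
    """
    if t_end <= t_start:
        raise ValueError("the event interval must have positive length")
    if not calls:
        return 0.0  # no calls in the area: no evidence of presence
    h = (t_end - t_start) / steps
    total = 0.0
    for k in range(steps):
        t = t_start + (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += max(p_c * math.exp(-gamma * abs(t - t_c)) for t_c, p_c in calls)
    return total * h / (t_end - t_start)
```

For a single call at $t_c = 0$ with $p_c = 1$, $\gamma = 1$ and the interval $[0, 1]$, the result should be close to $1 - e^{-1}$. Adding further calls can only raise the value, because of the maximum.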
- 19. Presence probability. [Figure: presence probability over time, with the first and second call marked; x-axis: time (13 to 19), y-axis: probability (0 to 0.35).]
- 20. Ordinary probability. How regularly is the user in the area? (Consider only the same weekday and the same time of day.)
    - [Figure: calendar of April with the relevant days annotated: was not present, i.e. $p_p(i) = 0$; was in the area with probability $p_p(2)$; was in the area with probability $p_p(5)$.]
    - The ordinary probability is defined as the average presence probability over the $W$ weeks, i.e. $p_o = \dfrac{1}{W} \displaystyle\sum_{i=1}^{W} p_p(i)$.
- 21. Probability of attending
    - Maximum ordinary probability:
        - The user should be present with relatively high probability.
        - The user should be present relatively rarely ⇒ small $p_o$ (i.e. present only for the event).
        - What is the theoretical maximum ordinary probability $\bar{p}_o$?
        - Theoretical maximum: make an infinite number of calls with the 'best' antenna.
    - Probability of attending: the probability that the user attended is then calculated as $p_a = p_p\,(1 - p_o / \bar{p}_o)$.
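
The two quantities combine in a few lines. A minimal sketch, with hypothetical function names, that takes the weekly presence probabilities $p_p(i)$ and the theoretical maximum $\bar{p}_o$ as given inputs (how $\bar{p}_o$ is obtained is not sketched here):

```python
def ordinary_probability(weekly_presence):
    """p_o: average of the weekly presence probabilities p_p(1), ..., p_p(W)."""
    if not weekly_presence:
        raise ValueError("need at least one week")
    return sum(weekly_presence) / len(weekly_presence)

def attendance_probability(p_p, p_o, p_o_max):
    """p_a = p_p * (1 - p_o / p_o_max), with p_o_max the theoretical maximum."""
    if p_o_max <= 0.0:
        raise ValueError("p_o_max must be positive")
    return p_p * (1.0 - p_o / p_o_max)

# Illustrative numbers only: present in two of five comparable weeks.
p_o = ordinary_probability([0.0, 0.3, 0.0, 0.0, 0.5])
p_a = attendance_probability(p_p=0.5, p_o=p_o, p_o_max=0.8)
```

A user who is in the area almost every week has $p_o$ close to $\bar{p}_o$, so $p_a$ is driven towards zero even when the presence probability during the event is high.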
- 22. Event detection: number of attendees
    - Mark a user as a (possible) attendee if $p_a$ is high enough.
    - The number of (possible) attendees in week $w$ is denoted by $n_w$.
    - Mark week $w$ as an event if $n_w$ is high enough.
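
These thresholding steps can be sketched as follows. The slides do not state the thresholds; the cut-off `p_threshold=0.5`, the use of a Z-score of the weekly counts with `z_threshold=2.0`, and the function names are assumptions chosen for illustration (the example plots do report Z-scores per week).

```python
from statistics import mean, pstdev

def count_attendees(attendance_probs, p_threshold=0.5):
    """n_w: number of users whose attendance probability p_a is high enough."""
    return sum(1 for p_a in attendance_probs if p_a >= p_threshold)

def detect_events(weekly_counts, z_threshold=2.0):
    """Return the indices of weeks whose attendee count has a high Z-score."""
    mu = mean(weekly_counts)
    sigma = pstdev(weekly_counts)
    if sigma == 0.0:
        return []  # all weeks identical: nothing stands out
    return [w for w, n_w in enumerate(weekly_counts)
            if (n_w - mu) / sigma >= z_threshold]

# Illustrative counts only: week 5 has far more possible attendees than usual.
events = detect_events([10, 12, 11, 9, 10, 50, 11, 10])
```

Using a Z-score makes the second threshold relative to the typical weekly count of each location, so the same `z_threshold` can be applied to busy and quiet areas alike.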
- 23. Example: Stadium. [Figure: Z-score per week; x-axis: week (0 to 60), y-axis: Z-score (−2 to 5).]
- 24. Example: Stadium. [Figure: number of calls per hour of the day; x-axis: hour (0 to 24), y-axis: no. of calls (0 to 350); series: not attending, attending, regular.]
- 25. Example: Park. [Figure: Z-score per week; x-axis: week (0 to 60), y-axis: Z-score (−4 to 4).]
- 26. Example: Park. [Figure: number of calls per hour of the day; x-axis: hour (0 to 24), y-axis: no. of calls (0 to 350); series: not attending, attending, regular.]
- 27. Example: Rural area. [Figure: Z-score per week; x-axis: week (0 to 60), y-axis: Z-score (−4 to 4).]
- 28. Sensitivity
- 29. Conclusions
    - Conclusions:
        - It is possible to detect 'social events' in mobile phone data.
        - The method is robust to antenna positioning and switching.
        - Interesting observation: non-routine behaviour seems massive.
    - Further considerations:
        - Use a simpler (faster) method to detect irregularities.
        - Refine location estimation by likelihood inference.
    - Questions? Suggestions? Remarks?