
# FDSE2015


Published on

http://dx.doi.org/10.1007/978-3-319-26135-5_10

Published in: Engineering

### FDSE2015

1. 1. Traffic Speed Data Investigation with Hierarchical Modeling Tomonari MASADA Nagasaki University masada@nagasaki-u.ac.jp
2. 2. Real-Time Traffic Speed Data | NYC Open Data https://data.cityofnewyork.us/Transportation/Real-Time-Traffic-Speed-Data/xsat-x5sa Traffic speed measurements at 128 streets (Regrettably, no longer maintained)
3. 3. Problem 1 • Traffic speed data show a clear one-day periodicity. • However, many different traffic speed distribution patterns can also be observed within each period.
4. 4. Solution 1 [Masada+ 14] • We take our intuition from topic models in text mining. – The data set of each day should be modeled as a mixture of many different speed distributions.
5. 5. Latent Dirichlet Allocation (LDA) [Blei+ 03] • LDA clusters at the word-token level, not at the document level. • Each document is modeled as a mixture of many different word probability distributions. – topic <-> word probability distribution – document <-> topic probability distribution
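
The token-level view of LDA on the slide above can be illustrated with a minimal sketch. It assumes fixed, hand-picked values for the document's topic probabilities (`theta`) and the per-topic word distributions (`phi`); these numbers and the function name `generate_document` are illustrative and do not come from the slides.

```python
import random

def generate_document(theta, phi, vocab, n_tokens, rng):
    """Generate tokens the way LDA assumes: one topic draw per token,
    then one word draw from that topic's word distribution."""
    topic_ids = list(range(len(theta)))
    tokens = []
    for _ in range(n_tokens):
        k = rng.choices(topic_ids, weights=theta)[0]   # topic for this token
        w = rng.choices(vocab, weights=phi[k])[0]      # word from topic k
        tokens.append((k, w))
    return tokens

# Illustrative (assumed) values: 3 topics over a 4-word vocabulary.
vocab = ["v1", "v2", "v3", "v4"]
phi = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
theta = [0.5, 0.3, 0.2]  # topic probabilities of one document

doc = generate_document(theta, phi, vocab, 9, random.Random(0))
print(doc)
```

Because each token gets its own topic, two occurrences of the same word in one document may belong to different topics.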
6. 6. [Figure: LDA illustration. Three topics t1, t2, t3, each with word probabilities φk1, φk2, φk3, φk4 over the words v1, v2, v3, v4; document j mixes the topics with probabilities θj1, θj2, θj3.]
7. 7. An important difference • Words are discrete entities. – LDA uses a multinomial distribution to model each per-topic word distribution. • Speeds (in mph) are continuous entities. – Our model uses a gamma distribution instead.
8. 8. gamma distribution
9. 9. Comparison with LDA • word token <-> speed measurement (in mph) • topic (multinomial) <-> topic (gamma) • document <-> document (24 hrs from midnight)
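
The correspondence above can be sketched as a generative process in which each speed measurement first picks a topic and then draws a speed from that topic's gamma distribution. The shape and rate values below are assumptions chosen only so that the topic means are roughly 8, 30, and 45 mph; they are not estimates from the slides.

```python
import random

def generate_day(theta, shapes, rates, n_measurements, rng):
    """Generate one 'document' (a day of speeds) from a mixture of gamma topics."""
    topic_ids = list(range(len(theta)))
    measurements = []
    for _ in range(n_measurements):
        k = rng.choices(topic_ids, weights=theta)[0]
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        speed = rng.gammavariate(shapes[k], 1.0 / rates[k])
        measurements.append((k, speed))
    return measurements

# Illustrative (assumed) parameters: mean of topic k is shapes[k] / rates[k].
shapes = [4.0, 30.0, 90.0]
rates = [0.5, 1.0, 2.0]
theta = [0.2, 0.5, 0.3]  # per-day topic probabilities

day = generate_day(theta, shapes, rates, 10, random.Random(1))
for k, speed in day:
    print(f"topic {k}: {speed:.1f} mph")
```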
10. 10. Full joint distribution • We estimated the parameters by variational Bayesian inference. [Masada+ 14]
11. 11. Problem 2 • Traffic speeds may be similar at the same time of day. • Traffic speeds may be similar on streets that are located close to one another.
12. 12. Solution 2 [Masada+ FDSE15] • We use metadata in topic models. – time points – geographic locations
13. 13. TRINH = TRaffic speed INvestigation with Hierarchical modeling • Make topic probabilities dependent on time points and on locations – Probability that the speed measured by sensor $s$ at time point $t$ is assigned to topic $k$: $\theta_{dtk} \equiv \dfrac{\exp(m_{dk} + \lambda_{ks} + \tau_{kt})}{\sum_{k'} \exp(m_{dk'} + \lambda_{k's} + \tau_{k't})}$
14. 14. Parameters • $m_{dk}$ – How often the document $d$ provides the topic $k$ • $\lambda_{ks}$ – How often the sensor $s$ provides the topic $k$ • $\tau_{kt}$ – How often the time point $t$ (of day) provides the topic $k$
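
The topic probability defined on slide 13 is a softmax over the sum of the three factors. A minimal sketch follows, where the function name `topic_probabilities` and the example values are assumptions made for illustration; each argument holds one value per topic for a fixed document, sensor, and time point.

```python
import math

def topic_probabilities(m_d, lam_s, tau_t):
    """Softmax of (m_dk + lambda_ks + tau_kt) over topics k."""
    scores = [m + l + t for m, l, t in zip(m_d, lam_s, tau_t)]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative (assumed) factor values for K = 3 topics.
m_d = [0.2, -0.1, 0.0]     # document factor
lam_s = [0.5, 0.0, -0.5]   # sensor (location) factor
tau_t = [1.0, 0.0, 0.0]    # time-of-day factor

probs = topic_probabilities(m_d, lam_s, tau_t)
print(probs)
```

Since the factors are added before normalization, a topic can become likely because of the day, the sensor, the time of day, or any combination of them.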
15. 15. Priors for parameters ("hierarchical") • $m_{dk}$ – $K$ Gaussian priors • $\lambda_{ks}$ – $K$ Gaussian process priors • $\tau_{kt}$ – $K$ Gaussian process priors
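
A Gaussian process prior encodes the idea that nearby time points (or nearby sensors) should have similar factor values, by giving them a large prior covariance. The slides do not state which covariance function is used, so the sketch below assumes a squared-exponential kernel purely for illustration; the length scale and the time points are also assumed.

```python
import math

def squared_exponential(x1, x2, variance=1.0, length_scale=1.0):
    """Assumed kernel: covariance decays smoothly with the distance |x1 - x2|."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def covariance_matrix(points, variance=1.0, length_scale=1.0):
    """Prior covariance between the factor values at the given points."""
    return [
        [squared_exponential(a, b, variance, length_scale) for b in points]
        for a in points
    ]

# Illustrative time points in hours after midnight.
time_points = [0.0, 1.0, 2.0, 12.0]
cov = covariance_matrix(time_points, length_scale=2.0)

for row in cov:
    print(["%.3f" % v for v in row])
```

Under this assumed kernel, the values at hours 0 and 1 are strongly correlated a priori, while the values at hours 0 and 12 are nearly independent.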
16. 16. Full joint distribution
17. 17. Inference by MCMC • Sample from the posterior distribution – Slice sampling for the topic probability parameters $m_{dk}$, $\lambda_{ks}$, and $\tau_{kt}$ – Metropolis-Hastings for the hyperparameters
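
For readers unfamiliar with slice sampling, here is a minimal univariate sketch using the common stepping-out and shrinkage procedure. The slides do not say which variant is used, and the target below is a standard normal standing in for a real conditional posterior, so this is an illustration of the technique rather than the TRINH sampler itself.

```python
import math
import random

def slice_sample(log_density, x0, rng, width=1.0, max_steps=50):
    """One slice-sampling update of a scalar x0 under an unnormalized log density."""
    # Draw the slice height: log of a uniform value in (0, f(x0)].
    log_y = log_density(x0) + math.log(1.0 - rng.random())
    # Stepping out: place an interval of the given width around x0 and widen it.
    left = x0 - width * rng.random()
    right = left + width
    steps = max_steps
    while steps > 0 and log_density(left) > log_y:
        left -= width
        steps -= 1
    steps = max_steps
    while steps > 0 and log_density(right) > log_y:
        right += width
        steps -= 1
    # Shrinkage: propose uniformly in the interval, shrink it on rejection.
    while True:
        x1 = left + (right - left) * rng.random()
        if log_density(x1) >= log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

def log_standard_normal(x):
    """Stand-in target (assumed for the example): unnormalized log N(0, 1)."""
    return -0.5 * x * x

rng = random.Random(0)
x = 0.0
draws = []
for _ in range(5000):
    x = slice_sample(log_standard_normal, x, rng)
    draws.append(x)

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"mean={mean:.3f} var={var:.3f}")
```

Slice sampling needs only the unnormalized log density and has no acceptance rate to tune, which makes it convenient for parameters such as those inside a softmax.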
18. 18. Context dependency Observations of the same mph are assigned to different topics.
19. 19. Context dependency On May 27, this topic is dominant. On May 28, this topic is dominant.
20. 20. Comparison experiment • Log likelihood per measurement – Larger is better. • Data – May 27 to June 16, 2013 (three weeks) • Data files were downloaded every minute. – 20% of the measurements were held out for testing.
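
The evaluation metric can be sketched as follows, assuming each held-out speed is scored under a mixture of gamma topics with known topic probabilities. The function names and the parameter values are hypothetical; in the actual model the topic probabilities differ per measurement, which this simplified sketch does not capture.

```python
import math

def gamma_log_pdf(x, shape, rate):
    """Log density of a gamma distribution with the given shape and rate."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)

def log_likelihood_per_measurement(speeds, theta, shapes, rates):
    """Average of log sum_k theta_k * Gamma(x; shape_k, rate_k) over the speeds."""
    total = 0.0
    for x in speeds:
        terms = [math.log(w) + gamma_log_pdf(x, a, b)
                 for w, a, b in zip(theta, shapes, rates) if w > 0.0]
        top = max(terms)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total / len(speeds)

# Illustrative (assumed) held-out speeds in mph and mixture parameters.
test_speeds = [12.0, 28.5, 31.0, 44.0]
theta = [0.2, 0.5, 0.3]
shapes = [4.0, 30.0, 90.0]
rates = [0.5, 1.0, 2.0]

score = log_likelihood_per_measurement(test_speeds, theta, shapes, rates)
print(score)
```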
21. 21. Prior as regularization Too strong?
22. 22. What we achieved • We obtained an MCMC inference for a topic model whose topic probabilities are defined by combining multiple factors. • The factors are correlated via Gaussian process priors. – Our model can also be applied to other types of metadata that indicate an intrinsic similarity of data.
23. 23. Summary • We proposed a topic model for traffic data analysis. • Sensor locations and measurement timestamps affect topic assignment. • TRINH achieves a better likelihood in earlier iterations. • However, TRINH gives a worse likelihood in later iterations.
24. 24. Future work • Control the strength of regularization – e.g. by weighting the factors in $\theta_{dtk} \equiv \dfrac{\exp(m_{dk} + \lambda_{ks} + \tau_{kt})}{\sum_{k'} \exp(m_{dk'} + \lambda_{k's} + \tau_{k't})}$ • Look for other data sets – Location information should be more relevant.