A clustering method based on repeated trip behaviour to identify road user classes using bluetooth data

Presentation Fiona Crawford - winner of the Smeed prize for best student paper at the UTSG Conference 2017
www.its.leeds.ac.uk/people/f.crawford
www.utsg.net/web/index.php?page=annual-conference

Published in: Science

  1. Institute for Transport Studies, Faculty of Environment
     A clustering method based on repeated trip behaviour to identify road user classes using Bluetooth data
     F. Crawford, Institute for Transport Studies, University of Leeds
     Email: ts12fc@leeds.ac.uk
  2. Repeated trip making
     It is often assumed that urban traffic consists of commuters who drive between home and work at the same times each weekday.
     But…
     • increases in part-time, flexible and home working?
     • longer shop opening hours?
     What proportion of travellers on the roads are these mythical regular commuters?
  3. Point-to-point sensors, e.g. Bluetooth
  4. Methodology overview
     Traveller 1: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 2: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 3: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     …
     Sensors: Sensor 1, Sensor 2, Sensor 3, …
  6. Methodology overview
     Traveller 1: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 2: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 3: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     …
     Sensors: Sensor 1, Sensor 2, Sensor 3, …
     Traveller 1: freq1, spat1, tod1
     Traveller 2: freq2, spat2, tod2
     Traveller 3: freq3, spat3, tod3
     …
  7. Methodology overview
     Traveller 1: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 2: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     Traveller 3: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
     …
     Sensors: Sensor 1, Sensor 2, Sensor 3, …
     Traveller 1: freq1, spat1, tod1
     Traveller 2: freq2, spat2, tod2
     Traveller 3: freq3, spat3, tod3
     …
     Clusters: Cluster A, Cluster B, Cluster C, Cluster D, …
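The first step in this overview, regrouping per-sensor detections into one time-ordered sequence per traveller, could be sketched roughly as follows. This is only an illustration and not the author's implementation: the record layout, the identifiers and the reading of (s, t) as a (sensor, time) pair are all assumptions.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from the sensors:
# (sensor_id, traveller_id, detection_time_in_seconds). All values are made up.
records = [
    ("S2", "t1", 160),
    ("S1", "t1", 100),
    ("S3", "t2", 400),
    ("S1", "t2", 250),
    ("S2", "t2", 310),
]

# Regroup by traveller, keeping one (s, t) pair per detection.
by_traveller = defaultdict(list)
for sensor, traveller, time in records:
    by_traveller[traveller].append((sensor, time))

# Order each traveller's detections by time so they read as a sequence.
for detections in by_traveller.values():
    detections.sort(key=lambda pair: pair[1])

for traveller, detections in sorted(by_traveller.items()):
    print(traveller, detections)
```

Per-traveller summaries such as freq, spat and tod would then be computed from these sequences before any clustering.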
  8. Trip frequency
     • Simply look at the number of trips per traveller in the data
     • Assume individual trips missing at random
     • Using data in this format can we calculate other measures to provide other types of information?
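Counting trips per traveller is the simplest of the measures. A minimal sketch, assuming a hypothetical list of trips where each trip is labelled with an anonymised traveller identifier (the identifiers below are invented):

```python
from collections import Counter

# Hypothetical trip list: one traveller identifier per observed trip.
trip_owners = ["mac_a", "mac_b", "mac_a", "mac_c", "mac_a", "mac_b"]

# Trip frequency: number of trips per traveller.
trips_per_traveller = Counter(trip_owners)

# Share of travellers seen exactly once, the kind of summary a
# frequency analysis might report.
single_trip_share = sum(
    1 for n in trips_per_traveller.values() if n == 1
) / len(trips_per_traveller)

print(trips_per_traveller.most_common())
print(f"share of travellers with one trip: {single_trip_share:.2f}")
```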
  9. Spatial variability: Sequence Alignment
     [Diagram: network with nodes A, B, C, D, E, F]
     - OD pairs?
  10. Spatial variability: Sequence Alignment
      [Diagram: network with nodes A, B, C, D, E, F]
      - OD pairs?
      - Trip sequences?
  11. Sequence Alignment
      [Diagram: network with nodes A, B, C, D, E, F]
      Seq1: ABDC
  12. Spatial variability: Sequence Alignment
      [Diagram: network with nodes A, B, C, D, E, F]
      Seq1: ABDC
      Seq2: BEDF
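  13. Spatial variability: Sequence Alignment
      Dissimilarity between sequence x and y:
      $$\sum_i \mathit{dissim}_i \quad \text{where} \quad \mathit{dissim}_i = \begin{cases} \mathit{matchcost} & \text{if } x_i = y_i \\ \mathit{dist\_xy}_i & \text{if } x_i \neq y_i \end{cases}$$
      Seq1: A B - D C
      Seq2: - B E D F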
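To make the dissimilarity definition concrete, here is a small sketch that scores the aligned pair shown on the slide. The slide does not give values for the match cost or the distance term, so a match cost of 0 and a unit distance for every mismatch or gap are assumptions chosen for illustration; the function names are hypothetical too. The alignment itself is taken as given rather than computed.

```python
from typing import Callable, Sequence

def aligned_dissimilarity(
    x: Sequence[str],
    y: Sequence[str],
    dist: Callable[[str, str], float],
    match_cost: float = 0.0,
) -> float:
    """Sum the per-position dissimilarity of two already aligned sequences."""
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    total = 0.0
    for xi, yi in zip(x, y):
        # match_cost when the symbols agree, otherwise a distance term
        total += match_cost if xi == yi else dist(xi, yi)
    return total

def unit_dist(a: str, b: str) -> float:
    # Assumed placeholder: every mismatch or gap costs 1.
    return 1.0

seq1 = ["A", "B", "-", "D", "C"]
seq2 = ["-", "B", "E", "D", "F"]
score = aligned_dissimilarity(seq1, seq2, unit_dist)
print(score)  # three differing positions (A/-, -/E, C/F) give 3.0
```

A distance that reflects how far apart two sites are could be passed in place of `unit_dist` without changing the function.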
  14. Time of day variability
      - Which are ‘comparable trips’? No information about trip purpose etc.
      - Use as much data as possible
      - Time at most common site (likely to be near home/work?)
      - Avoid arbitrary cut-offs
  15. The times of day I walk along my street
      [Chart: frequency against time of day; times marked: 7am, 8am, 1pm, 4pm, 5pm, 8pm]
  16. The times of day I walk along my street
      [Chart: frequency against time of day; times marked: 7am, 8am, 1pm, 4pm, 5pm, 8pm]
      Mixture of Gaussian Distributions?
  17. Model-based clustering using Maximum Likelihood Estimation
      Which cluster does each observation belong to?
      What are the parameters associated with each cluster?
      Likelihood function: P(X, Z | Θ)
      - Expectation-Maximisation algorithm
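The two questions on this slide map onto the two steps of Expectation-Maximisation: the E-step estimates which cluster each observation belongs to, and the M-step re-estimates each cluster's parameters. The sketch below fits a one-dimensional Gaussian mixture to made-up times of day. It is a generic textbook EM loop, not the author's code; the data, starting means, iteration count and variance floor are assumptions, and it ignores the fact that time of day wraps around at midnight.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, n_iter=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and variance of each cluster.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids collapse to zero
    return weights, means, variances

# Hypothetical detection times in hours since midnight.
times = [7.8, 8.0, 8.1, 8.3, 7.9, 16.9, 17.0, 17.2, 17.1, 13.0]
weights, means, variances = em_gmm_1d(times, means=[8.0, 17.0])
print([round(m, 2) for m in means])
```

The number of mixture components is fixed at two here; choosing it from the data would need an additional criterion that this sketch does not cover.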
  18. Overall clustering
      Traveller 1: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
      Traveller 2: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
      Traveller 3: (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) (s, t) …
      …
      Sensors: Sensor 1, Sensor 2, Sensor 3, …
      Traveller 1: freq1, spat1, tod1
      Traveller 2: freq2, spat2, tod2
      Traveller 3: freq3, spat3, tod3
      …
      Clusters: Cluster A, Cluster B, Cluster C, Cluster D, …
  19. Empirical example - Wigan
      Data from the 23 fixed Bluetooth detectors in and around the town of Wigan (Figure 3) are analysed for a full year (2015).
  20. Trip frequency
      The data for 2015 included:
      • 7.5 million trips
      • 327,264 unique MAC addresses
      • almost 28% of the travellers had only 1 trip
      • just 2% had 260 or more trips (equivalent to at least one trip per working day in the year)
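A quick back-of-envelope check puts these figures in context. The inputs are the rounded values quoted on the slide, so the results are approximate; reading the 260-trip threshold as 52 weeks of 5 working days is an assumption about how it was derived.

```python
total_trips = 7_500_000   # "7.5 million trips" (rounded on the slide)
travellers = 327_264      # unique MAC addresses

# Mean number of trips per traveller over the year.
mean_trips = total_trips / travellers

# Assumed origin of the 260-trip threshold: 52 weeks x 5 working days.
working_days = 52 * 5

# Approximate head counts behind the quoted percentages.
single_trip_travellers = round(0.28 * travellers)
daily_travellers = round(0.02 * travellers)

print(f"mean trips per traveller: {mean_trips:.1f}")
print(f"working days per year: {working_days}")
print(f"about {single_trip_travellers} travellers with one trip")
print(f"about {daily_travellers} travellers with 260 or more trips")
```

The mean of roughly 23 trips per traveller sits far above the single trip recorded for more than a quarter of travellers, which suggests a strongly skewed frequency distribution.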
  21. Spatial variability
  22. 15 most common sequences in one spatial cluster
      A-B-M-N-R-T-W, A-B-G-N-R-T-W, A-B-M-R-W, A-B-G-M-N-R-T-W, A-B-R-T-W, A-B-M-N-S-W, A-B-N-R-T-W, A-B-M-R-T-W, A-B-M-N-S-T-W, B-G-M-N-R-T-W, A-B-R-W, A-B-G-M-R-T-W, A-B-M-N-R-W, A-B-N-R-W, A-B-G-M-N-R-W
      [Diagram: sites A, B, G, M, N, R, S, T, W]
  23. Road user classes
      Using the Elbow Method, decided on 9 road user classes.
      Approximately 3 groups of 3:
      • infrequent (< 1 / week),
      • frequent, and
      • very frequent (> 1.5 / day)
      [Bar chart: average trips per person in 2015 for the nine classes, in ascending order: 2, 4, 12, 92, 226, 415, 685, 1177, 2308]
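The Elbow Method picks the number of clusters at the point where adding another cluster stops reducing the within-cluster sum of squares by much. The slides do not say which clustering algorithm produced the curve, so the sketch below uses a plain one-dimensional k-means as a stand-in, with invented trips-per-year values and a simple deterministic initialisation; all of these are assumptions.

```python
def kmeans_1d(values, k, n_iter=100):
    """Tiny 1-D k-means; returns (centres, within-cluster sum of squares)."""
    values = sorted(values)
    # Deterministic start: centres spread evenly through the sorted data.
    centres = [values[(2 * j + 1) * len(values) // (2 * k)] for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: (v - centres[j]) ** 2)
            groups[nearest].append(v)
        # Keep the old centre if a group ends up empty.
        new_centres = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centres)]
        if new_centres == centres:
            break
        centres = new_centres
    wcss = sum(min((v - c) ** 2 for c in centres) for v in values)
    return centres, wcss

# Hypothetical trips-per-year values with three obvious groups.
values = [1, 2, 3, 50, 52, 55, 200, 205, 210]
wcss_by_k = {k: kmeans_1d(values, k)[1] for k in range(1, 6)}

for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 1))
```

On this toy data the curve drops sharply up to k = 3 and flattens afterwards, which is the kind of bend the method looks for.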
  24. Infrequent travellers (ABC)
      • 92% of travellers
      • 23% of trips
      • Less than 1 trip per week (6 trips per year on average)
      • Intrapersonal variability?
      [Bar chart: average trips per person in 2015 by class: A 1.5, B 4.2, C 12.3]
  25. More frequent travellers

      | Measure | Freq travellers (DEF) |
      |---|---|
      | Total trips observed | 57% |
      | Travellers observed | 8% |
      | Frequency | 1/week to 1.5/day (50-550) |
      | Average trips per spatial cluster | 4-10 |
      | % trips in most common spatial cluster | 29% |
      | Average number of time of day clusters | 2-4 |
      | Average time of day cluster variance | More trips -> more clusters with smaller variance |
  26. More frequent travellers

      | Measure | Freq travellers (DEF) | Very freq travellers (GHI) |
      |---|---|---|
      | Total trips observed | 57% | 20% |
      | Travellers observed | 8% | 0.5% |
      | Frequency | 1/week to 1.5/day (50-550) | >1.5/day (550-6155) |
      | Average trips per spatial cluster | 4-10 | 12-23 |
      | % trips in most common spatial cluster | 29% | 25-20% |
      | Average number of time of day clusters | 2-4 | 4.5-5.5 |
      | Average time of day cluster variance | More trips -> more clusters with smaller variance | Smaller variance on average than DEF, but fairly constant by trips |
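The shares quoted for the three groups can be cross-checked against the yearly totals from the trip frequency slide. The sketch below derives an approximate average number of trips per traveller for each group. It uses only the rounded percentages and totals from the slides, so the results are indicative; treating the bracketed frequency ranges as trips per year is an assumption.

```python
TOTAL_TRIPS = 7_500_000      # "7.5 million trips" (rounded on the slide)
TOTAL_TRAVELLERS = 327_264   # unique MAC addresses

# (share of trips, share of travellers) as quoted on the slides.
groups = {
    "ABC": (0.23, 0.92),
    "DEF": (0.57, 0.08),
    "GHI": (0.20, 0.005),
}

# Average trips per traveller implied by each pair of shares.
avg_trips = {
    name: (trip_share * TOTAL_TRIPS) / (traveller_share * TOTAL_TRAVELLERS)
    for name, (trip_share, traveller_share) in groups.items()
}

for name, value in avg_trips.items():
    print(f"{name}: about {value:.0f} trips per traveller in 2015")
```

The implied averages are consistent with the slides: a little under 6 for ABC, against the quoted 6 trips per year, and values inside the 50-550 and 550-6155 ranges for DEF and GHI.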
  27. Conclusions
      • A method to identify road user classes was presented
      • Method was successfully applied to a fairly large case study area
      • User classes depend on trip frequency and tell us about spatial and temporal variability
      • Future work
  28. Acknowledgements
      Supervised by Professor David Watling and Dr Richard Connors at ITS
      Funded by [logo]
      Data from [logo]
  29. Thank you for listening! Any questions?
      http://www.its.leeds.ac.uk/people/f.crawford
