Turning Oyster Cards into Information

Presented @ Leeds Intelligent Transport Systems, Feb 2012

Published in: Technology, Education

  1. @neal_lathia, computer laboratory: university of cambridge
  2. online / offline
  3. urban data mining (web: urbanmining.wordpress.com)
  4. online: user data + algorithms → relevance ☺
  5. public transport: user data + algorithms → relevance
  6. “smart” cards: (1) facilitate payment; (2) collect user data
  7. “smart” cards: time-stamped locations, modality, payments, user categories; anonymised with persistent user ids
  8. “smart” cards datasets: 100% (1 month): ~5.1 million people, ~78.8 million trips; 5% (2 x 83 days): ~300k people, ~7.7 million trips
  9. [Figure: "Purchase Geography" and "Mobility Flow" panels; series PAYG and Travel Cards; Zones 1 to 6; arrivals]
  10. using transport data for… (1) predicting disruption relevance; (2) personalised travel time; (3) fare purchase recommendation
  11. can we use transport data for… (1) predicting disruption relevance, i.e., rank station importance correctly?
  12. can we use transport data for… predicting disruption relevance, i.e., rank station importance correctly? (where you will go in the future)
  13. percentile ranking: 0.0 (best) … 0.5 (random) … 1.0 (inverse)
  14. percentile ranking: 0.0 (best) … 0.25 (rank stations by popularity) … 0.5 (random) … 1.0 (inverse)
  15. percentile ranking: 0.0 (best) … 0.06 (factor in user's history) … 0.25 (rank stations by popularity) … 0.5 (random) … 1.0 (inverse)
  16. percentile ranking: 0.0 (best) … 0.05 (“those who touch in here also touch in at…”) … 0.06 (factor in user's history) … 0.25 (rank stations by popularity) … 0.5 (random) … 1.0 (inverse)
  17. accurate ranking without (1) explicitly asking; (2) network topology, rail schedule
  18. using transport data for… (1) predicting disruption relevance; (2) personalised travel time
  19. can we use transport data for… (2) predict your travel time, i.e., time between touch in/out?
  20. mean absolute error (minutes): 0.0 (best) …
  21. mean absolute error (minutes): 0.0 (best) … 9.82 (timetabled)
  22. mean absolute error (minutes): 0.0 (best) … 3.30 (mean time) … 9.82 (timetabled)
  23. mean absolute error (minutes): 0.0 (best) … 3.28 (“people who travel at this time…”), 3.30 (mean time) … 9.82 (timetabled)
  24. mean absolute error (minutes): 0.0 (best) … 3.17 (“people who are as familiar as you…”), 3.28 (“people who travel at this time…”), 3.30 (mean time) … 9.82 (timetabled)
  25. mean absolute error (minutes): 0.0 (best) … 3.13 (“your trips in the past…”), 3.17 (“people who are as familiar as you…”), 3.28 (“people who travel at this time…”), 3.30 (mean time) … 9.82 (timetabled)
  26. accurate predictions without (1) explicitly asking; (2) network topology, rail schedule; (3) ongoing disruptions, delays
  27. using transport data for… (1) predicting disruption relevance; (2) personalised travel time; (3) fare purchase recommendation
  28. [Figure: "Purchase Behaviour" panel, % purchases by day of week (Mon to Sun), series Travel Cards and PAYG; "Purchase Geography" and "Mobility Flow" panels, series PAYG and Travel Cards, Zones 1 to 6, arrivals]
  29. (a) high regularity in purchases & movements; (b) small increments, short terms; (c) purchase on refused entry?
  30. are people making the right choice?
  31. £200 million overspend
  32. (a) failure to predict your movements; (b) failing to match mobility with fares
  33. can we use transport data for… (3) predict the fares you should buy, i.e., what will be cheapest?
  34. classification accuracy: 0.0% (worst) … 100% (oracle)
  35. classification accuracy: 0.0% (worst) … 77% (everyone on pay as you go) … 100% (oracle)
  36. classification accuracy: 0.0% (worst) … 77% (everyone on pay as you go), 80% (naïve bayes) … 100% (oracle)
  37. classification accuracy: 0.0% (worst) … 77% (everyone on pay as you go), 80% (naïve bayes) … 97% (“people like you should have bought…”), 100% (oracle)
  38. classification accuracy: 0.0% (worst) … 77% (everyone on pay as you go), 80% (naïve bayes) … 97% (“people like you should have bought…”), 98% (decision trees), 100% (oracle)
  39. money saved: £0.0 (worst) … £326,447.95 (everyone on pay as you go), £393,585.81 (naïve bayes) … £465,822.17 (“people like you…”), £473,918.38 (decision trees), £479,583.91 (oracle)
  40. “smart” cards: (1) facilitate payment; (2) collect user data; (3) enable powerful, personalised information systems
  41. using transport data for… (1) behaviours ~ policy & incentives; (2) community well-being
  42. References
      - N. Lathia, J. Froehlich, L. Capra. Mining Public Transport Usage for Personalised Intelligent Transport Systems. In IEEE International Conference on Data Mining. December 2010, Sydney, Australia.
      - N. Lathia, C. Smith, J. Froehlich, L. Capra. Individuals Among Commuters: Building Personalised Transport Information Systems from Fare Collection Systems. Under submission.
      - N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers Spending on Public Transport. In ACM International Conference on Knowledge Discovery and Data Mining. August 2011, San Diego, USA.
      - N. Lathia, L. Capra. How Smart is Your Smart Card? Measuring Travel Behaviours, Perceptions, and Incentives. In ACM International Conference on Ubiquitous Computing. September 2011, Beijing, China.
      - N. Lathia, D. Quercia, J. Crowcroft. The Hidden Image of the City: Sensing Community Well-Being from Urban Mobility. To appear, 10th International Conference on Pervasive Computing. June 2012, Newcastle, UK.
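
The percentile ranking scale on slides 13 to 16 (0.0 best, 0.5 random, 1.0 inverse) can be illustrated with a minimal sketch. The slides do not spell out the formula, so this assumes the common definition: the visit-weighted average position of the stations a user later visits, where position 0.0 is the top of the ranked list and 1.0 the bottom. The function name, the station labels, and the toy counts are hypothetical.

```python
def percentile_rank(ranked_stations, future_visits):
    """Visit-weighted mean position of visited stations in a ranking.

    ranked_stations: station ids, most relevant first (assumed input).
    future_visits: dict of station id -> number of later visits.
    Assumes every visited station appears in the ranking.
    """
    n = len(ranked_stations)
    if n < 2:
        raise ValueError("need at least two ranked stations")
    total = sum(future_visits.values())
    if total == 0:
        raise ValueError("need at least one future visit")
    # Map each station to its relative position: 0.0 (top) to 1.0 (bottom).
    position = {s: i / (n - 1) for i, s in enumerate(ranked_stations)}
    return sum(c * position[s] for s, c in future_visits.items()) / total


# Toy example: the user later visits A three times and B once.
ranking = ["A", "B", "C", "D", "E"]
visits = {"A": 3, "B": 1}
good = percentile_rank(ranking, visits)
inverse = percentile_rank(list(reversed(ranking)), visits)
```

Under this definition a ranking that places the visited stations first scores near 0.0, and reversing the same ranking scores near 1.0, which matches the endpoints shown on the slides.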
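
The "mean time" baseline and the mean absolute error metric on slides 20 to 25 can be sketched as follows. This is an illustration under assumptions, not the method from the talk: it assumes trips are available as (origin, destination, minutes) tuples, predicts each trip with the historical mean for its origin-destination pair, and uses made-up station names and durations.

```python
from collections import defaultdict
from statistics import mean


def fit_mean_times(trips):
    """Mean travel time per (origin, destination) pair.

    trips: iterable of (origin, destination, minutes) tuples (assumed format).
    """
    by_pair = defaultdict(list)
    for origin, destination, minutes in trips:
        by_pair[(origin, destination)].append(minutes)
    return {pair: mean(times) for pair, times in by_pair.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute difference, in the same unit as the inputs."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Toy data: historical trips to fit on, later trips to evaluate on.
train = [("A", "B", 10), ("A", "B", 14), ("B", "C", 20)]
test = [("A", "B", 13), ("B", "C", 18)]

model = fit_mean_times(train)
predictions = [model[(o, d)] for o, d, _ in test]
mae = mean_absolute_error(predictions, [m for _, _, m in test])
```

The personalised variants on the slides (time of day, familiarity, a user's own past trips) would replace the grouping key or weighting; the metric stays the same.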
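
The fare question on slides 33 to 39 (what will be cheapest?) can be sketched with a two-option toy model. Everything here is an assumption for illustration: the prices are invented, real fares depend on zones, caps, and pass durations, and "money saved" is taken to mean the cost of what was actually purchased minus the cost of the recommendation, which the slides do not define.

```python
PAYG_FARE = 2.0      # hypothetical cost per trip on pay as you go
TRAVEL_CARD = 30.0   # hypothetical flat cost of a travel card for the period


def cost(option, n_trips):
    """Cost of one option for a traveller making n_trips in the period."""
    return PAYG_FARE * n_trips if option == "payg" else TRAVEL_CARD


def oracle(n_trips):
    """Cheapest option given perfect knowledge of the trips (ties go to payg)."""
    return min(("payg", "travel_card"), key=lambda o: cost(o, n_trips))


def accuracy(recommended, trips):
    """Fraction of travellers for whom the recommendation is the cheapest."""
    hits = sum(r == oracle(n) for r, n in zip(recommended, trips))
    return hits / len(trips)


def money_saved(recommended, purchased, trips):
    """Total purchased cost minus recommended cost (assumed definition)."""
    return sum(cost(p, n) - cost(r, n)
               for r, p, n in zip(recommended, purchased, trips))


# Toy data: trips per traveller and what each one actually bought.
trips = [5, 10, 20, 40]
purchased = ["travel_card", "payg", "payg", "travel_card"]

all_payg = ["payg"] * len(trips)
best = [oracle(n) for n in trips]
```

In this toy setup the "everyone on pay as you go" baseline is right for half the travellers, while the oracle is right for all of them by construction; a learned classifier would sit between the two.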
