Mining Millions of Oyster Card Trips

Presentation @ London Data Science Inaugural Meetup

  1. @neal_lathia: cambridge computer lab
  2. research: how data can help in our everyday lives. mobility: getting from a to b (habit), finding z (discovery)
  3. our everyday lives
  4. time-stamped locations, modality, payments, user categories; anonymised with persistent user ids
  5. what tools can be built using this data? i.e., what if TfL had a data science team?
  6. ● what is the relation between mobility and fare purchase? ● are we buying the best fares? (no) ● can our data help us? (yes)
  7. [figures: Purchase Behaviour (% purchases by day of week, Mon to Sun; Travel Cards vs PAYG); Purchase Geography and Mobility Flow (Zones 1 to 6; PAYG vs Travel Cards)]
  8. (a) high regularity in purchases & movements (b) small increments, short terms (c) purchase on refused entry?
  9. are people making the right choice?
  10. £200 million overspend
  11. (a) failure to predict your movements (b) failing to match mobility with fares
  12. recommender systems: matching “users” with “items” of interest
  13. recommender systems: matching travellers (“users”) with fares (“items”) that are cheap for them
  14. as the machine sees it: given {d, f, b, r, pt, ot, N} predict F
  15. time to get out the algorithms
  16. 0. baseline: everyone on pay as you go; 1. naïve bayes; 2. k-nearest neighbours; 3. decision trees (c4.5); 5. oracle
  17. 0. baseline ~ 75.95%; 1. naïve bayes ~ 79.09%; 2. k-nearest neighbours ~ 96.91%; 3. decision trees (c4.5) ~ 98.15%; 5. oracle: 100%
  18. final words
  19. not my work:
  20. customer trust; data science
  21. @neal_lathia. Is #recsys your thing? http://www.meetup.com/london-recsys/
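
Slides 14 to 17 frame fare choice as classification: predict the cheapest fare F for each traveller, with "everyone on pay as you go" as the baseline and an oracle as the upper bound. The sketch below is only an illustration of how the oracle label, the baseline accuracy, and a per-traveller overspend could be computed. The two-fare price model, the prices, the function names, and the trip counts are all hypothetical; they are not TfL's fares or the talk's data, and the classifiers from slide 16 are not implemented here.

```python
# Hypothetical fares (NOT real TfL prices): cost model for one week of travel.
PAYG_PRICE_PER_TRIP = 2.0   # assumed flat pay-as-you-go price per trip
WEEKLY_TRAVEL_CARD = 30.0   # assumed flat price of a 7-day travel card


def weekly_costs(trips_per_week):
    """Return the cost of each (hypothetical) fare option for a week of trips."""
    return {
        "payg": trips_per_week * PAYG_PRICE_PER_TRIP,
        "travel_card": WEEKLY_TRAVEL_CARD,
    }


def cheapest_fare(trips_per_week):
    """Oracle label F: the cheapest fare when the week's trips are known in advance."""
    costs = weekly_costs(trips_per_week)
    return min(costs, key=costs.get)


def baseline_accuracy(travellers):
    """Share of travellers for whom 'everyone on pay as you go' is the cheapest fare."""
    hits = sum(1 for trips in travellers if cheapest_fare(trips) == "payg")
    return hits / len(travellers)


def overspend(trips_per_week, fare_bought):
    """Amount paid above the cheapest option, given the fare actually bought."""
    costs = weekly_costs(trips_per_week)
    return costs[fare_bought] - min(costs.values())


# Toy data: trips made in one week by five hypothetical travellers.
travellers = [4, 10, 14, 20, 25]
labels = [cheapest_fare(t) for t in travellers]
```

Under these assumed prices, a traveller who makes 20 trips on pay as you go overspends by `overspend(20, "payg")`, and summing that quantity over all travellers gives an aggregate overspend in the spirit of slide 10. A classifier would replace `cheapest_fare` at prediction time, since the week's trips are not known in advance.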
