
Media Sharing on Urban Transport



  1. 1. Media Sharing based on Colocation Prediction in Urban Transport <ul><ul><li>Liam McNamara </li></ul></ul><ul><ul><li>l.mcnamara @ </li></ul></ul><ul><ul><li>Joint work with Cecilia Mascolo and Licia Capra </li></ul></ul>
  2. 3. Talk Outline <ul><ul><li>Challenges </li></ul></ul><ul><ul><li>Approach </li></ul></ul><ul><ul><li>- Selection Process </li></ul></ul><ul><ul><li>- Temporal Mean </li></ul></ul><ul><ul><li>Implementation </li></ul></ul><ul><ul><li>Evaluation </li></ul></ul><ul><ul><li>- Urban Transport Dataset </li></ul></ul><ul><ul><li>- Content Generation </li></ul></ul><ul><ul><li>Results </li></ul></ul><ul><ul><li>Conclusions </li></ul></ul>
  3. 4. Isn't this already possible? <ul><ul><li>It is still manual, awkward and time-consuming. </li></ul></ul><ul><ul><li>We need intelligent automated wireless content sharing. </li></ul></ul><ul><ul><li>Which files should be shared with whom? </li></ul></ul>
  4. 6. Main Challenges <ul><ul><li>Human proximity networks can have very high churn. </li></ul></ul><ul><ul><li>Finding content that matches the user's interests is hard: some people have very niche tastes. </li></ul></ul><ul><ul><li>Most peers will be strangers, especially in cities. </li></ul></ul><ul><ul><li>The required penetration would increase contention. </li></ul></ul>
  5. 7. Approach: Colocation Prediction <ul><ul><li>Human movement is seasonal, as are colocations. </li></ul></ul><ul><ul><li>We estimate colocation length using the mean of past durations. </li></ul></ul><ul><ul><li>Are colocations the same length throughout the day? </li></ul></ul><ul><ul><li>We keep a mean for each time period of the day. </li></ul></ul><ul><ul><li>A two-tier system of these profiles is employed. </li></ul></ul>
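  6. 8. Selection Process [Figure: Alice, Bob, Carol and Daniel, with the genre labels Rock, Metal and Blues.]
  7. 9. Selection Process [Figure: Alice, Bob, Carol and Daniel, with the queries Rock? Metal? Blues? and the genre labels Rock, Blues, Country, Pop, Jazz, Classical, Pop, Rock.]
  8. 10. Selection Process [Figure: Alice, Bob, Daniel and Carol, with the queries Rock? Metal? Blues? and the genre labels Rock, Blues, Country, Pop, Rock, Pop, Jazz, Classical.]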
  9. 11. Selection Process: Bob or Carol? [Figure: Bob and Carol, with the genre labels Rock, Blues, Country, Pop, Rock.] Carol may offer more appropriate content, but transfer completion is paramount. Therefore, choose the host with the longest remaining colocation time: remaining = predictedLength(host, time) - colocationStart(host). Should a transfer be initiated if remaining is very small?
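The host choice on this slide could be sketched as below. This is only an illustration under stated assumptions: the slide's formula is read as the predicted length minus the time elapsed since the colocation started, and the `min_remaining` threshold is a hypothetical parameter reflecting the slide's closing question about very short remaining times.

```python
def remaining_time(predicted_length, colocation_start, now):
    """Predicted colocation time left with a host (same unit as the inputs).

    Assumption: the slide's subtraction is interpreted as
    predicted length minus elapsed time (now - colocation_start).
    """
    elapsed = now - colocation_start
    return predicted_length - elapsed


def choose_host(candidates, now, min_remaining=0.0):
    """Return the host with the longest predicted remaining colocation.

    candidates maps host name -> (predicted_length, colocation_start).
    Returns None when no host has more than min_remaining left
    (min_remaining is a hypothetical cut-off, not from the slides).
    """
    best_host, best_remaining = None, min_remaining
    for host, (predicted_length, colocation_start) in candidates.items():
        left = remaining_time(predicted_length, colocation_start, now)
        if left > best_remaining:
            best_host, best_remaining = host, left
    return best_host
```

For example, with `{"Bob": (12.0, 100.0), "Carol": (7.0, 104.0)}` and `now=106.0`, Bob has 6 units left and Carol 5, so Bob is chosen even if Carol's content is a better match.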
  10. 12. Approach: Temporal Mean <ul><ul><li>A global anonymous profile is maintained. </li></ul></ul><ul><ul><li>It records all colocation lengths in the relevant time slot. </li></ul></ul>[Table: mean colocation length by hour of day (columns 0-4, 4-8, 8-12, 12-16, 16-20, 20-24 and Overall). Global row values: 3.5, 5, 25.7, 14.3, 2.8, 18, 15.9. Unlabelled row values: 23, 10, -, 13.8, -, 12.4.]
  11. 13. Approach: Temporal Mean <ul><ul><li>If Bob is encountered frequently, he gains his own profile. </li></ul></ul>[Table: mean colocation length by hour of day (columns 0-4, 4-8, 8-12, 12-16, 16-20, 20-24 and Overall). Global row values: 3.5, 5, 25.7, 14.3, 2.8, 18, 15.9. Bob row values: 12, -, -, -, -, 12, -.]
  12. 14. Approach: Temporal Mean <ul><ul><li>Colocations with Bob are then added to both the global and personal profiles. </li></ul></ul><ul><ul><li>Lookup precedence: Personal timed > Personal overall > Global timed </li></ul></ul>[Table: mean colocation length by hour of day (columns 0-4, 4-8, 8-12, 12-16, 16-20, 20-24 and Overall). Global row values: 3.5, 5, 25.7, 14.3, 2.8, 18, 15.9. Bob row values: 13, 7, -, 13.8, -, 12.4, -.]
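A minimal sketch of the two-tier temporal mean, assuming 4-hour slots as in the table headers. The class names, the `personal_threshold` that decides when a peer counts as "encountered frequently", and the final fallback to the global overall mean are all assumptions made for illustration; the slides only state the precedence Personal timed > Personal overall > Global timed.

```python
class Profile:
    """Running mean of colocation lengths, per 4-hour slot and overall."""

    SLOT_HOURS = 4

    def __init__(self):
        self.totals = {}  # key (slot index or "overall") -> sum of lengths
        self.counts = {}  # key -> number of recorded colocations

    def add(self, hour, length):
        for key in (hour // self.SLOT_HOURS, "overall"):
            self.totals[key] = self.totals.get(key, 0.0) + length
            self.counts[key] = self.counts.get(key, 0) + 1

    def mean(self, key):
        if key not in self.counts:
            return None
        return self.totals[key] / self.counts[key]


class Predictor:
    """Global anonymous profile plus personal profiles for frequent peers."""

    def __init__(self, personal_threshold=3):
        self.global_profile = Profile()
        self.personal = {}
        self.encounters = {}
        self.personal_threshold = personal_threshold  # hypothetical value

    def record(self, peer, hour, length):
        # Every colocation goes into the global profile.
        self.global_profile.add(hour, length)
        self.encounters[peer] = self.encounters.get(peer, 0) + 1
        # A peer gains a personal profile once met often enough.
        if self.encounters[peer] >= self.personal_threshold:
            self.personal.setdefault(peer, Profile())
        if peer in self.personal:
            self.personal[peer].add(hour, length)

    def predict(self, peer, hour):
        slot = hour // Profile.SLOT_HOURS
        estimates = []
        own = self.personal.get(peer)
        if own is not None:
            estimates += [own.mean(slot), own.mean("overall")]
        # The last entry (global overall) is an assumed final fallback.
        estimates += [self.global_profile.mean(slot),
                      self.global_profile.mean("overall")]
        for estimate in estimates:
            if estimate is not None:
                return estimate
        return None
```

In this sketch only colocations from the threshold encounter onwards enter the personal profile; backfilling earlier encounters would be an equally plausible design.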
  13. 15. Talk Outline <ul><ul><li>Challenges </li></ul></ul><ul><ul><li>Approach </li></ul></ul><ul><ul><li>- Selection Process </li></ul></ul><ul><ul><li>- Temporal Mean </li></ul></ul><ul><ul><li>Implementation </li></ul></ul><ul><ul><li>Evaluation </li></ul></ul><ul><ul><li>- Urban Transport Dataset </li></ul></ul><ul><ul><li>- Content Generation </li></ul></ul><ul><ul><li>Results </li></ul></ul><ul><ul><li>Conclusions </li></ul></ul>
  14. 16. Implementation <ul><ul><li>Nokia N70 Smartphones </li></ul></ul><ul><ul><li>2GB MicroSD cards </li></ul></ul><ul><ul><li>PyS60 Python Interpreter (<2KLOC) </li></ul></ul><ul><ul><li>Experimental runs were performed on busy commuter trains. </li></ul></ul><ul><ul><li>Achieved data rates went as low as 100 Kbps. </li></ul></ul>
  15. 17. Urban Transport Dataset <ul><ul><li>We used anonymous passenger movement traces from a large city's subway system. </li></ul></ul><ul><ul><li>Recorded over 1 month </li></ul></ul><ul><ul><li>Millions of journeys </li></ul></ul><ul><ul><li>Hundreds of thousands of passengers </li></ul></ul><ul><ul><li>Many passengers make regular commuting journeys at the same time each day. </li></ul></ul>
  16. 18. Dataset features
  17. 19. Dataset processing <ul><ul><li>We needed to convert the format of the traceset. </li></ul></ul><ul><ul><li>From journey : </li></ul></ul><ul><ul><li>userid, start_station, start_time, end_station, end_time </li></ul></ul><ul><ul><li>To colocation instance: </li></ul></ul><ul><ul><li>userid1, userid2, start_time, end_time </li></ul></ul>
  18. 20. Dataset processing <ul><ul><li>Would Alice and Bob have been on the same train? </li></ul></ul><ul><ul><li>For how long would they have shared the train? </li></ul></ul><ul><ul><li>Each journey's progress was timed using official timetables, and intersected with all others. </li></ul></ul>[Figure: timeline along the stations Elephant & Castle, Lambeth North, Waterloo, Embankment, Charing Cross and Piccadilly Circus. Alice in 8.30, Bob in 8.45, Alice out 9.30, Bob out 9.45; Alice and Bob are colocated from 8.45 to 9.30.]
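The intersection step can be illustrated with a small sketch. It assumes the two journeys have already been matched to the same train (the timetable matching described on the slide is not reproduced here), and it drops the station fields of the journey record. The `Journey`, `Colocation`, `colocation` and `minutes` names are illustrative only.

```python
from collections import namedtuple

Journey = namedtuple("Journey", "userid start_time end_time")
Colocation = namedtuple("Colocation", "userid1 userid2 start_time end_time")


def colocation(a, b):
    """Overlap of two journeys assumed to share a train, or None if disjoint."""
    start = max(a.start_time, b.start_time)
    end = min(a.end_time, b.end_time)
    if start >= end:
        return None
    return Colocation(a.userid, b.userid, start, end)


def minutes(hhmm):
    """Convert an 'H.MM' clock string such as '8.45' to minutes after midnight."""
    hours, mins = hhmm.split(".")
    return int(hours) * 60 + int(mins)


# The Alice and Bob example from the slide.
alice = Journey("Alice", minutes("8.30"), minutes("9.30"))
bob = Journey("Bob", minutes("8.45"), minutes("9.45"))
shared = colocation(alice, bob)  # overlap runs from 8.45 to 9.30
```

Applied to every pair of journeys on a train, this yields the `userid1, userid2, start_time, end_time` colocation instances described on the previous slide.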
  19. 21. Content Modelling <ul><ul><li>Data from a social music website was used to generate realistic music collections for users. </li></ul></ul><ul><ul><li>Playlist information was recorded from over 500,000 users. </li></ul></ul><ul><ul><li>User libraries are constructed to contain relative proportions of genres, and the artists in those genres. </li></ul></ul>
  20. 22. Library Creation <ul><ul><li>Each simulated node is given the taste of a user. </li></ul></ul><ul><ul><li>An interest list is created to contain all the genres of the assigned user's top 50 chart. </li></ul></ul><ul><ul><li>The library is filled by adding files: </li></ul></ul><ul><ul><li>A genre is selected from the interest list. </li></ul></ul><ul><ul><li>An artist is selected from the genre according to popularity. </li></ul></ul><ul><ul><li>The track number is derived from a value y, with 0 ≤ y ≤ 1: </li></ul></ul><ul><ul><li>track = exp( (y-0.85) / -0.2 ) </li></ul></ul>
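The library-filling loop might look like the sketch below. Several details are assumptions rather than facts from the slides: y is drawn uniformly at random, genres are picked uniformly from the interest list, and the formula's result is rounded to an integer track rank of at least 1. Under the formula, y = 0.85 gives rank 1, and y = 0 gives exp(4.25), roughly 70.

```python
import math
import random


def track_rank(y):
    """Formula from the slide: maps y in [0, 1] to a track popularity rank."""
    return math.exp((y - 0.85) / -0.2)


def build_library(interests, artists_by_genre, size, rng):
    """Fill a library with distinct (genre, artist, track) tuples.

    interests: list of genres from the assigned user's chart.
    artists_by_genre: genre -> list of (artist, popularity weight) pairs.
    rng: a random.Random instance, passed in for reproducibility.
    """
    library = set()
    # Cap the attempts so the loop ends even if few distinct files exist.
    for _ in range(size * 100):
        if len(library) >= size:
            break
        genre = rng.choice(interests)  # assumption: uniform genre choice
        names, weights = zip(*artists_by_genre[genre])
        artist = rng.choices(names, weights=weights, k=1)[0]
        # Assumption: y is uniform and the rank is rounded, minimum 1.
        track = max(1, int(round(track_rank(rng.random()))))
        library.add((genre, artist, track))
    return library
```

Because the formula is exponential in y, most draws land on low ranks, so an artist's most popular tracks dominate the generated libraries.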
  21. 23. Library Creation: track popularity Distribution of proportional track popularity from artists
  22. 24. Results: Selection Methods <ul><ul><li>Random – Sources are chosen at random, and a transfer is always initiated. </li></ul></ul><ul><ul><li>Prediction – Our approach, using the temporal mean colocation data. </li></ul></ul><ul><ul><li>Oracle – Selects the neighbour that will remain available longest, using future knowledge of every colocation. </li></ul></ul>
  23. 25. Results: Prediction success
  24. 26. Advertising Policy <ul><ul><li>shortlist = library (sent U received) </li></ul></ul><ul><ul><li>Random – Files are chosen from the shortlist at random. </li></ul></ul><ul><ul><li>Uncommon – Files are sent in increasing order of the number of times the source has seen each one (from an ad). </li></ul></ul><ul><ul><li>Unpopular – Files are sent in increasing order of the number of times the source has transferred each one. </li></ul></ul>
  25. 27. Results: Advertising Policy. Using data from the Reality Mining project, spanning a whole year.
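The three advertising policies can be sketched as orderings over the shortlist. The shortlist expression on the slide has lost its operator in the transcript; the sketch assumes it means the library minus the files already sent or received. The function names and the count dictionaries are hypothetical.

```python
import random


def shortlist(library, sent, received):
    """Assumed reading of the slide: library minus (sent union received)."""
    return set(library) - (set(sent) | set(received))


def advertise(candidates, policy, seen_counts=None, transfer_counts=None, rng=None):
    """Return the candidate files in the order the policy would send them.

    seen_counts: file -> times the source has seen it in an ad (Uncommon).
    transfer_counts: file -> times the source has transferred it (Unpopular).
    """
    files = sorted(candidates)  # deterministic base order for tie-breaking
    if policy == "random":
        (rng or random).shuffle(files)
        return files
    if policy == "uncommon":
        counts = seen_counts or {}
    elif policy == "unpopular":
        counts = transfer_counts or {}
    else:
        raise ValueError("unknown policy: %s" % policy)
    # Least-counted files first; unseen files count as zero.
    return sorted(files, key=lambda f: counts.get(f, 0))
```

For instance, with a shortlist of `b`, `c`, `d` and seen counts `{"b": 3, "c": 1}`, the Uncommon policy sends `d`, then `c`, then `b`.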
  26. 28. Results: Genre availability
  27. 29. Related Work <ul><ul><li>Bluespots: &quot;Bluetooth Content Distribution Stations on Public Transport&quot;, J. LeBrun and CN. Chuah, Mobishare 2006. </li></ul></ul><ul><ul><li>Uses infrastructure installed on buses to distribute data. </li></ul></ul><ul><ul><li>&quot;BlueTorrent: Cooperative Content Sharing for Bluetooth Users&quot;, S. Jung, U. Lee, A. Chang, DK. Cho, and M. Gerla, Percom 2007. </li></ul></ul><ul><ul><li>Swarming is employed to overcome short colocations. </li></ul></ul><ul><ul><li>&quot;Wireless Ad Hoc Podcasting&quot;, V. Lenders, G. Karlsson and M. May, SECON 2007. </li></ul></ul><ul><ul><li>Users subscribe to channels of content interest types. </li></ul></ul>
  28. 30. Conclusions <ul><ul><li>Selection of download source has a large impact on the efficacy of the system. </li></ul></ul><ul><ul><li>Advertising policy can have an even larger effect. </li></ul></ul><ul><ul><li>Not downloading can improve overall system performance. </li></ul></ul><ul><ul><li>Movements of a subset of the strangers we meet in a city can be predicted. </li></ul></ul>
  29. 31. Questions?
  30. 32. Results: Taste vs Increase
  31. 33. Results: Taste vs Utilisation