
Network Intelligence Driven Human Behavior Modeling

9,375 views

Published in: Data & Analytics

  1. 1. Network Intelligence Driven Behaviour Modelling for a Sustainable Connected World Fahim Kawsar Internet of Things Research Bell Labs @raswak
  2. 2. POPULATION DATA. More developed countries: 1.2 billion (2013), 1.3 billion (2050). Less developed countries: 5.9 billion (2013), 8.4 billion (2050). World: 7.1 billion (2013), 9.7 billion (2050). Source: Population Reference Bureau
  3. 3. 72%: the expected growth of the world’s urban population by 2050. Source: UN Department of Economic and Social Affairs
  4. 4. Increased demand on infrastructure projected for 2020: 30% more water, 100% more vehicles, 150% more energy. $1T. Source: McKinsey
  5. 5. But Wait - We Have Got IoT, Right? Home, Automobile, Transport, Environment, Security, Signage, Industry, Healthcare
  6. 6. Source : CISCO
  7. 7. “The pitch is that the IoT will make our world a greener place. Environmental sensors can detect pollution, the voices say. Smart thermostats can help us save money on our electric bills. A new breed of agriculture tech can save water by giving crops exactly the amount they need and no more.” And so many other problems of our society will be solved by IoT... AHEM. - Wired Magazine
  8. 8. Waste, Pollution, Traffic. Quantified Self, Quantified Home, Quantified City. A Lot of Connected Things + Big Data = A Lot of Savings - ?? Sense, Learn, Act, Share
  9. 9. Can we have a conversation about what the real environmental impact of these devices will be and how we can minimize it? Image Source: Wired Magazine
  10. 10. It takes only 8 weeks for a Nest thermostat to save enough energy to become carbon neutral, based on the amount of energy required to manufacture and distribute it.
  11. 11. But how about the rest of the IoT…
  12. 12. How can we shrink the footprint of IoT devices? How can we increase the lifespan of IoT devices? How can we leverage existing infrastructures for sensing, learning and sharing?
  13. 13. How can we leverage existing infrastructures for sensing, learning and sharing?
  14. 14. The Circular Economy. From a linear “take-make-dispose” economy to a circular pattern. Collaborative Consumption (“Shareconomy”): from ownership (product) to access (service). Connect with people online and carry those connections into the offline world. $3.5B in revenues from transactions in the sharing economy (ZipCar, Airbnb, TaskRabbit)
  15. 15. Network Sensing for Opportunistic Observation: “Your Noise is my Signal”. By observing an individual’s engagement with the network, annotated with temporal and spatial information, we can learn and infer behaviour. Sense, Learn, Act, Share. How can we leverage existing infrastructures for Sensing and Learning?
  16. 16. Location + Time + Venue Type = Activity
  17. 17. Location + Time + Application Type = Activity
  18. 18. Location + Time + Application Type = Activity
  19. 19. “Your Noise is my Signal”. Network Sensing for Opportunistic Observation: by observing an individual’s engagement with the network, annotated with temporal and spatial information, we can learn and infer behaviour. Implications
  20. 20.
  21. 21. Quantifying Yourself
  22. 22. COPYRIGHT © 2012 ALCATEL-LUCENT. ALL RIGHTS RESERVED. Activity Aware Search Experience
  23. 23. Better Experience with Pervasive Spaces
  24. 24. Contextual Notification
  25. 25. Predictive Appliance Management of Homes
  26. 26. Smart Pricing and Personalised Content Delivery
  27. 27. Collective Measure: Network Resource Planning
  28. 28. Business Intelligence
  29. 29. Network Sensing for Opportunistic Observation: “Your Noise is my Signal”. By observing an individual’s engagement with the network, annotated with temporal and spatial information, we can learn and infer behaviour. Sense, Learn, Act, Share
  30. 30. Story of Three Cities
  31. 31. The Story of Seoul: Understanding User Behaviour from Mobile Network Traces
  32. 32. Dataset: Mobile Web Dataset from an anonymized operator. 10,000 users, 30 days, 77,000,000 records. User Data Records (UDR) include: web URL, up and down traffic, start and end time, eNodeB ID, user demography. Tools: R + Java + Hadoop + Hive
  33. 33. UDR to Activity Trace Transformation Pipeline. UDR trace (time, URL): 03/03 10:02:05 http://www.bbc.co.uk; 03/03 10:02:06 http://www.img1.bbc.co.uk; 03/03 10:12:05 http://www.facebook.com; 03/03 10:02:06 http://www.img2.bbc.co.uk; ... Steps: (1) Activity burst detection: detect a session by duration, or with time-weighted TF-IDF for multiple small bursts. (2) Landmark clustering at each burst: landmark detection gives 90% coverage of the transactions across all users. (3) Tagging each landmark with an activity type for each session. Resulting activity trace (session start time, session end time, activity): 03/03 10:02:05, 03/03 10:06:05, News; 03/03 10:12:05, 03/03 10:17:05, Social Network.
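The pipeline on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a fixed inactivity gap (a hypothetical 5 minutes) for burst detection and a hypothetical `DOMAIN_ACTIVITY` lookup in place of the landmark clustering and time-weighted TF-IDF steps, which the slides do not specify. Session end times here are simply the last record in a burst.

```python
from datetime import datetime, timedelta

# Hypothetical lookup from landmark domain to activity type (not from the slides).
DOMAIN_ACTIVITY = {"bbc.co.uk": "News", "facebook.com": "Social Network"}


def landmark(host):
    """Collapse a host such as 'www.img1.bbc.co.uk' to a known landmark domain."""
    for domain in DOMAIN_ACTIVITY:
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def to_activity_trace(records, gap=timedelta(minutes=5)):
    """Turn (timestamp, host) records into (start, end, activity) sessions.

    Records are grouped into bursts whenever consecutive requests are at most
    `gap` apart (an assumed threshold); each burst is tagged with the activity
    of its most frequent landmark.
    """
    bursts = []
    for ts, host in sorted(records):
        if bursts and ts - bursts[-1][-1][0] <= gap:
            bursts[-1].append((ts, host))
        else:
            bursts.append([(ts, host)])

    trace = []
    for burst in bursts:
        counts = {}
        for _, host in burst:
            lm = landmark(host)
            if lm is not None:
                counts[lm] = counts.get(lm, 0) + 1
        if not counts:
            continue  # no known landmark in this burst
        top = max(counts, key=counts.get)
        trace.append((burst[0][0], burst[-1][0], DOMAIN_ACTIVITY[top]))
    return trace


# The year is arbitrary; the slide only shows day/month and time.
records = [
    (datetime(2014, 3, 3, 10, 2, 5), "www.bbc.co.uk"),
    (datetime(2014, 3, 3, 10, 2, 6), "www.img1.bbc.co.uk"),
    (datetime(2014, 3, 3, 10, 12, 5), "www.facebook.com"),
    (datetime(2014, 3, 3, 10, 2, 6), "www.img2.bbc.co.uk"),
]
trace = to_activity_trace(records)
```

With the four sample requests from the slide, the three BBC requests fall into one burst and the Facebook request into a second one.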
  34. 34. Feature 1: Activity Diversity. Activity Diversity represents two aspects. Richness: the number of different activities a user is engaged with. Evenness: the relative contribution of each activity to the user’s overall web time. $H = -\sum_i p_i \ln(p_i)$, where $p_i$ is the proportion of activity sessions of activity type $i$. A higher H value indicates a diverse user who engages with many activities and spends time across activities evenly. A lower H value indicates a more stable and periodic user who spends a large amount of time on a small number of activities. [Figure: 50 random users.] 64% of the users have an index of 2 or less. As activity increases, diversity and evenness mostly also increase, but not always.
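The diversity index H on this slide reads as the Shannon index over a user's activity sessions. A minimal sketch under that assumption (the function name and input shape are illustrative, not taken from the slides):

```python
import math
from collections import Counter


def activity_diversity(session_activities):
    """Shannon index H = -sum(p_i * ln(p_i)) over activity types.

    `session_activities` is a list with one activity label per session;
    p_i is the share of sessions belonging to activity type i.
    """
    counts = Counter(session_activities)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


single = activity_diversity(["News"] * 10)
even = activity_diversity(["News", "Video", "Games", "Social"])
```

A user with a single activity gets H = 0, and a user who splits sessions evenly over n activities gets H = ln(n), so H grows with both richness and evenness.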
  35. 35. Feature 2: Activity Density. Activity Density captures the level of user engagement as density at different parts of the day. It measures the activity session time at different parts of the day: $D_i = S_d^i / |S_d|$, where $S_d^i$ is the total session duration during the $i$th hours ($i$ = Night | Morning | Afternoon | Evening) and $|S_d|$ is the total duration of all the sessions. Night hours: 00:00 - 05:59. Morning hours: 06:00 - 11:59. Afternoon hours: 12:00 - 17:59. Evening hours: 18:00 - 23:59. A pattern of activity density (low or high) indicates the hours that are the dominant period for web activities for an individual. Using this feature for clustering segments subscribers based on their temporal activity footprint. We used hierarchical agglomerative clustering with cosine similarity to segment the users into four temporal groups.
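A minimal sketch of the density vector and the cosine similarity that a hierarchical clustering step could use. It assumes, for simplicity, that each session is attributed entirely to the part of the day in which it starts; the slides do not say how sessions spanning two periods are handled, and the function names are illustrative.

```python
import math

# Six-hour parts of the day, in the order given on the slide.
PARTS = ("Night", "Morning", "Afternoon", "Evening")


def activity_density(sessions):
    """Return [D_night, D_morning, D_afternoon, D_evening].

    `sessions` is a list of (start_hour, duration_minutes) pairs, with
    start_hour in 0..23. Each D_i is the duration falling in that part of
    the day divided by the total duration of all sessions.
    """
    totals = [0.0] * len(PARTS)
    for start_hour, minutes in sessions:
        totals[start_hour // 6] += minutes
    overall = sum(totals)
    if overall == 0:
        return [0.0] * len(PARTS)
    return [t / overall for t in totals]


def cosine_similarity(u, v):
    """Cosine similarity between two density vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


evening_user = activity_density([(1, 10), (7, 30), (20, 60)])
night_user = activity_density([(2, 45)])
```

Two users with similar temporal footprints have a cosine similarity close to 1, which is what lets an agglomerative procedure merge them into the same temporal group.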
  36. 36. Feature 3: Activity Popularity. Activity Popularity represents the relative popularity of different activities to an individual user and captures two aspects. Session Frequency: the number of sessions of a specific activity. Session Duration: the total session duration of a specific activity. $P = w\,a_d^i + (1 - w)\,a_f^i$, where $a_f^i$ is the number of sessions of a specific activity divided by the total number of sessions, and $a_d^i$ is the total session duration of a specific activity divided by the maximum session duration for a single activity across all activities. A higher P value indicates a popular activity, and a lower P value indicates the reverse. [Figure axes: number of sessions; duration of sessions (in min).]
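The popularity score can be sketched as a weighted sum of a normalised duration term and a normalised frequency term. Two assumptions are made here that the slides do not confirm: the weight defaults to a hypothetical w = 0.5, and "maximum session duration for a single activity" is read as the largest per-activity total duration.

```python
def activity_popularity(sessions, w=0.5):
    """Return {activity: P} with P = w * a_d + (1 - w) * a_f.

    `sessions` is a list of (activity, duration_minutes) pairs.
    a_f: sessions of the activity / total number of sessions.
    a_d: total duration of the activity / largest per-activity total duration
         (an assumed reading of the slide's normaliser).
    """
    freq, dur = {}, {}
    for activity, minutes in sessions:
        freq[activity] = freq.get(activity, 0) + 1
        dur[activity] = dur.get(activity, 0.0) + minutes

    total_sessions = len(sessions)
    max_duration = max(dur.values(), default=0.0)
    scores = {}
    for activity in freq:
        a_f = freq[activity] / total_sessions
        a_d = dur[activity] / max_duration if max_duration else 0.0
        scores[activity] = w * a_d + (1 - w) * a_f
    return scores


scores = activity_popularity([("News", 10), ("News", 20), ("Video", 60)])
```

In this example News has more sessions but Video has more total time, so with equal weights Video scores slightly higher; shifting w towards 0 would favour frequent activities instead.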
  37. 37. Some Observations. We have observed a strong positive correlation between the different temporal user groups and sets of activities. Temporal groups: Night Users, Morning Users, Afternoon Users, Evening Users. Activities: Online Games, Online Video, Online Music, Finance, News, Social Network, Blogs, Online Shopping. We have also observed a strong positive correlation between diverse users and evening users.
  38. 38. Activity Prediction Algorithm. The algorithm predicts the activity patterns of future hour slots of the current day by matching patterns of similar days in the past. [Figure: a visual explanation of the algorithm with three sample activity types.]
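The slide gives only the idea of matching the current day against similar past days, so the following is one possible reading rather than the authors' algorithm. It assumes each day is a list of per-slot activity sets, uses Jaccard similarity over the slots observed so far, and predicts an activity for a future slot when a majority of the k most similar past days contain it. All names and the choice of k are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two activity sets (two empty sets count as identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def predict_remaining_slots(today_prefix, past_days, k=3):
    """Predict activity sets for the slots after `today_prefix`.

    `today_prefix`: list of activity sets for the slots observed so far today.
    `past_days`: list of full days, each a list of activity sets per slot.
    """
    n = len(today_prefix)

    def similarity(day):
        if n == 0:
            return 0.0
        return sum(jaccard(t, d) for t, d in zip(today_prefix, day[:n])) / n

    nearest = sorted(past_days, key=similarity, reverse=True)[:k]
    if not nearest:
        return []

    horizon = min(len(day) for day in nearest)
    predictions = []
    for slot in range(n, horizon):
        votes = Counter(act for day in nearest for act in day[slot])
        # Keep activities seen in a strict majority of the nearest days.
        predictions.append({a for a, v in votes.items() if v * 2 > len(nearest)})
    return predictions


past = [
    [{"News"}, {"News"}, {"Video"}, {"Video"}],
    [{"News"}, {"News"}, {"Video"}, {"Social"}],
    [{"Games"}, {"Games"}, {"Games"}, {"Games"}],
]
predicted = predict_remaining_slots([{"News"}, {"News"}], past, k=2)
```

Here the two News-heavy days are the nearest matches; both contain Video in the third slot, so it is predicted, while the fourth slot has no majority and stays empty.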
  39. 39. Prediction Performance. Over 75% of the users have 65% activity coverage with an accuracy of over 60%. Here, accuracy considers only the activities that are inferred to occur. Highly active users are the most predictable ones, and the prediction accuracy can reach up to 85% if we increase the slot period. [Figures: prediction performance across all users (cumulative distribution function, CDF); precision, recall and F-score (CDF).]
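Precision, recall and F-score for slot-wise activity predictions can be computed as in the sketch below. It assumes predictions and ground truth are given as one activity set per slot, which is an illustrative representation rather than one stated in the slides.

```python
def precision_recall_f(predicted, actual):
    """Micro-averaged precision, recall and F-score over aligned slots.

    `predicted` and `actual` are equally long lists of activity sets,
    one set per time slot.
    """
    true_positives = sum(len(p & a) for p, a in zip(predicted, actual))
    n_predicted = sum(len(p) for p in predicted)
    n_actual = sum(len(a) for a in actual)

    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_actual if n_actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


p, r, f = precision_recall_f(
    [{"News"}, {"Video", "Social"}],
    [{"News"}, {"Video"}],
)
```

In this example every actual activity is predicted (recall 1.0), but one of the three predictions is wrong (precision 2/3), giving an F-score of 0.8.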
  40. 40. The Story of Kortrijk: Understanding User Behaviour from Home Network Traces
  41. 41. Dataset: Project LeYLab, In-Home Internet Activity Traces. Living lab for fiber-based services in the city of Kortrijk, Belgium. An ALU 7750 Service Router with Report and Analysis Manager (RAM) was used in the backbone. 86 households, 75 applications, 60 days, 9,288,000 data points
  42. 42. Application to Activity Mapping. Web activity types: Web Communication, Online Gaming, Web Browsing, File Sharing, Online Shopping, Video Watching, Home Working, Social Networking. Semantically identical applications were grouped together into 8 distinct activity types, and the 6 most popular activities were selected for subsequent study. This selection was based on a combination of accumulated network traffic, frequency, duration and temporal regularity.
  43. 43. Family Activity Trajectory. Accumulated activity footprint of a representative household: activity is spread throughout the day, with higher engagement during the later hours. [Figure: number of days (0 to 25) against time of the day (6 AM to 4 AM) for Web Communication, Social Networking, Online Gaming, Home Working, Online Shopping and Video Watching.]
  44. 44. Feature 1: Activity Diversity. Activity Diversity represents two aspects. Richness: the number of different activities a user is engaged with. Evenness: the relative contribution of each activity to the user’s overall web time. $H = -\sum_i p_i \ln(p_i)$, where $p_i$ is the proportion of activity sessions of activity type $i$. A higher H value indicates a diverse user who engages with many activities and spends time across activities evenly. A lower H value indicates a more stable and periodic user who spends a large amount of time on a small number of activities. [Figure: 50 random users.] 64% of the users have an index of 2 or less. As activity increases, diversity and evenness mostly also increase, but not always.
  45. 45. Feature 2: Activity Periodicity. Activity Periodicity represents one aspect: the temporal regularity of the engagement. Example: an individual usually engages with online shopping during the evening hours of every other weekend. D = Days of the Week, H = Hours of the Day, P = Periods
  46. 46. Feature 3: Activity Popularity. Activity Popularity represents two aspects. Session Frequency: the number of sessions of a specific activity. Session Duration: the total session duration of a specific activity. $P = w\,a_d^i + (1 - w)\,a_f^i$, where $a_f^i$ is the number of sessions of a specific activity divided by the total number of sessions, and $a_d^i$ is the total session duration of a specific activity divided by the maximum session duration for a single activity across all activities. A higher P value indicates a popular activity, and a lower P value indicates the reverse.
  47. 47. Household Segmentation. We have identified four segments (10% of the households were not included) with distinct behavioural features, representing internet-savvy families, families with little digital footprint, socially interactive families and families with working adults. [Chart labels: Heavy Weight Households, Light Weight Households, Socially Interactive Households, Semi Heavy Weight Households, Independent Households; shares: 10%, 12%, 14%, 23%, 41%. Second chart: collective interaction of different segments with different activities.] Credit: Xueli An
  48. 48. Prediction Performance. 60% of household activities can be predicted accurately 70% of the time. [Figure: cumulative distribution function (CDF) of the F-measure.]
  49. 49. The Story of London Understanding Urban Dynamics with Travel Network Trace
  50. 50. Dataset: London TFL Transit Data. 4,313,954 cards, 30 days, 66,237,633 trips, 577 stations. Each trip record includes: unique Oyster card ID, Oyster card type, start and end time, check-in and check-out stations. Tools: R + Java + Hadoop + Hive
  51. 51. Spatial Travel Intensity
  52. 52. Transition Modeling Demographically
  53. 53. Significant Place Detection. We have observed that individuals who are regulars in, or visitors to, a specific area can be clearly identified. [Figure panels: Single Individual, Single Station; Resident, Visitor.]
  54. 54. Dataset: London OpenStreetMap. 3,264,820 points of interest, 298 amenity types
  55. 55. Segmenting London Based on Functional Signature. Early hours, work hours, evening hours: 3 morning clusters, 9 day clusters and 6 evening clusters
  56. 56. Central Londoners Show High Introversion
  57. 57. Central Londoners Show Limited Spatial Diversity but High Temporal Periodicity
  58. 58. City Hotspots have Strong Temporal Periodicity and Limited Origin Diversity
  59. 59. Internet of Things Research @ Bell Labs. Behaviour Modeling, Smart Object Modeling, Mobile Sensing, Participatory Sensing, Evolutionary Graph, Real World Search, Indoor Localisation, HCI Studies, Pervasive Display, Pervasive Privacy. Novel Services for the Retail Community, Novel Services for the Enterprise Community, Novel Services for the Urban Community
  60. 60. Thank You Fahim Kawsar @raswak eMail: fahim.kawsar@bell-labs.com
