Bridging the Gap Between Physical Location and Online Social Networks, at Ubicomp 2010
 


This paper examines the location traces of 489 users of a location sharing social network for relationships between the users' mobility patterns and structural properties of their underlying social network. We introduce a novel set of location-based features for analyzing the social context of a geographic region, including location entropy, which measures the diversity of unique visitors of a location. Using these features, we provide a model for predicting friendship between two users by analyzing their location trails. Our model achieves significant gains over simpler models based only on direct properties of the co-location histories, such as the number of co-locations. We also show a positive relationship between the entropy of the locations the user visits and the number of social ties that user has in the network. We discuss how the offline mobility of users can have implications for both researchers and designers of online social networks.

Authors are Justin Cranshaw, Eran Toch, Jason Hong, Aniket Kittur, and Norman Sadeh

Usage Rights

© All Rights Reserved

    Presentation Transcript

    • 1 Bridging the Gap Between Physical Location and Online Social Networks. Justin Cranshaw, Eran Toch, Jason Hong, Aniket Kittur, Norman Sadeh. Carnegie Mellon University, School of Computer Science.
    • 2-3 On Facebook, we maintain a set of social connections that we typically call Facebook friends. [Diagram: nodes A, B, C, D, E]
    • 4 There may be some people we know in real life with whom we are not Facebook friends. [Diagram: nodes A, B, C, D, E]
    • 5 Similarly, we may have Facebook friends whom we do not know in real life. [Diagram: nodes A, B, C, D, E]
    • 6-10 [Diagram only: nodes A, B, C, D, E drawn in two arrangements; no text on these slides]
    • 11 The purpose of this work is to explore the area between online social networks and the real-world mobility patterns of their users.
    • 12
    • 13 Outline: Goal: Define a set of observable properties of physical places that convey information about the people who visit a location and the social interactions that occur there. Evaluation: We evaluate these properties on a prediction task, attempting to distinguish Facebook friendships from non-friendships based on the users' co-location network. Results: We show that using these location-based features significantly improves the performance of a classifier.
    • 14 Related Work: Several results affiliated with Sandy Pentland’s group: [Eagle & Pentland, 2009], [Eagle, Pentland, and Lazer, 2009]. Several results from Microsoft Research: [Zheng et al., UbiComp, 2008], [Zheng et al., GIS, 2008], [Kostakos & Venkatanathan, 2010]. Our main point of difference in this work is our focus on contextual properties of the location histories.
    • 15-16 Co-location: Suppose A and B are co-located. How might we deduce whether they are actually friends? 1. We can infer based on how they socialize and interact. 2. We can infer based on how many other times they’ve been co-located in the past. 3. We can infer based on the context (where they are and what they’re doing). [Diagram: A and B were co-located]
    • 17 Co-location: Suppose A and B are co-located. How might we deduce whether they are actually friends? 1. We can infer based on how they socialize and interact. 2. We can infer based on how many other times they’ve been co-located in the past. 3. We can infer based on the context (where they are and what they’re doing). Example: A and B were observed together on 100 occasions, on the same bus. If we infer based on 2 alone, we might guess that they are friends, when it’s very likely they are not.
    • 18 Co-location: Suppose A and B are co-located. How might we deduce whether they are actually friends? 1. We can infer based on how they socialize and interact. 2. We can infer based on how many other times they’ve been co-located in the past. 3. We can infer based on the context (where they are and what they’re doing). Example: A and B were observed together on 4 occasions: 3 times at A’s house and 1 time at B’s house. If we infer based on 2 alone, we might guess that they are not friends, when in fact it’s much more likely that they are.
    • 19 Co-location: Suppose A and B are co-located. How might we deduce whether they are actually friends? This example motivates two hypotheses: that the number of co-locations of two people is a poor indicator of the relationship between them, and that context about the location can help in prediction. [Diagram: A and B were co-located]
    • 20-21 How can we derive context on a large scale, only from location data? One option: location diversity.
    • 22 Location Diversity: For a given location we define: Frequency: the total number of observations at the location. User Count: the total number of distinct users observed at the location. Entropy: the entropy of the distribution of observations over distinct users. Location diversity helps us identify the locations where chance co-locations are most likely. Locations with high diversity have more chance encounters.
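One way to write the entropy defined on this slide, using notation that is assumed here and does not appear in the slides: let $O_L$ be the set of observations at location $L$, and $O_{u,L}$ the subset belonging to user $u$. The slide does not state the base of the logarithm.

```latex
P_L(u) = \frac{|O_{u,L}|}{|O_L|},
\qquad
H(L) = -\sum_{u \,:\, |O_{u,L}| > 0} P_L(u)\,\log P_L(u)
```

In this notation the frequency is $|O_L|$ and the user count is the number of users with $|O_{u,L}| > 0$. $H(L)$ is zero when one user accounts for every observation, and it reaches its maximum, the logarithm of the user count, when observations are spread evenly across users.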
    • 23 Location Diversity. Frequency: LOW. User count: LOW. Entropy: LOW. [Figure: a map cell with corners (40.46,-80.0), (40.46,-79.9), (40.45,-80.0), (40.45,-79.9), containing a few observations of user A only, timestamped 9/14 9:00AM, 9/18 10:00AM, and 9/18 10:05AM. Legend: A, B, and C mark observations of users A, B, and C.] Observation = (user id, latitude, longitude, time). We look at all observations of users over time at a given location.
    • 24 Location Diversity. Frequency: HIGH. User count: LOW. Entropy: LOW. [Figure: the same map cell with many observations, all of user A.] We look at all observations of users over time at a given location.
    • 25 Location Diversity. Frequency: HIGH. User count: HIGH. Entropy: LOW. [Figure: the same map cell with ten observations of user A, one of user B, and one of user C.] Here, co-locations are more likely to mean friendship. We look at all observations of users over time at a given location.
    • 26 Location Diversity. Frequency: HIGH. User count: HIGH. Entropy: HIGH. [Figure: the same map cell with four observations each of users A, B, and C.] Here, co-locations are more likely to be due to chance. We look at all observations of users over time at a given location.
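The three measures can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `location_diversity` is made up for this example, entropy is computed in bits (the slides do not give a log base), and the per-slide observation counts are read off the flattened slide text, so they should be treated as approximate.

```python
import math
from collections import Counter

def location_diversity(user_ids):
    """Return (frequency, user_count, entropy_in_bits) for one location.

    `user_ids` holds one entry per observation made at the location.
    """
    counts = Counter(user_ids)
    frequency = sum(counts.values())   # total number of observations
    user_count = len(counts)           # number of distinct users
    # Shannon entropy of the per-user share of observations.
    entropy = sum((n / frequency) * math.log2(frequency / n)
                  for n in counts.values())
    return frequency, user_count, entropy

# Assumed observation counts for the four example slides.
slide_23 = ["A"] * 3                 # few observations, one user
slide_24 = ["A"] * 12                # many observations, one user
slide_25 = ["A"] * 10 + ["B", "C"]   # three users, dominated by A
slide_26 = ["A", "B", "C"] * 4       # three users, evenly spread

for name, obs in [("23", slide_23), ("24", slide_24),
                  ("25", slide_25), ("26", slide_26)]:
    freq, users, h = location_diversity(obs)
    print(f"slide {name}: frequency={freq} users={users} entropy={h:.3f} bits")
```

Under these assumed counts, slides 25 and 26 have the same frequency and user count, and only the entropy separates them (about 0.82 bits versus about 1.58 bits). That difference is the point the slides make about user count alone being insufficient.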
    • 27 Connection to Biological Diversity: Ecologists have been using entropy to study location for over 50 years. Uses: habitat determination, health of an ecosystem, land use determinations for conservation
    • 28 How does location diversity relate to predicting (Facebook) friendships from co-location?
    • 29 [Figure: the observation histories of two locations, with users A through E; an edge between A and B indicates a co-location.] Location 1 history (HIGH entropy), Case 1: It’s difficult to conclude that A and B are friends. Location 2 history (LOW entropy), Case 2: It’s more likely that A and B are actually friends. Recall that these diagrams show all historical observations at the location over time. An edge indicates that the users were there at the same time.
    • 30 [Figure: the history of A and B’s co-location across three location histories (Locations 1, 2, and 3), each containing observations of many distinct users (A through E); an edge indicates a co-location.] Here it is difficult to conclude that A and B are friends.
    • 31 [Figure: the history of A and B’s co-location across three location histories (Locations 1, 2, and 3), each containing observations of few distinct users (A, B, and D); an edge indicates a co-location.] Here it is much more likely that A and B are friends.
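The idea in slides 29 to 31 can be sketched as follows: detect co-locations, then describe each pair of users by the entropy of the places where they met. Everything specific here is an assumption made for illustration: the grid size `CELL_DEG`, the time slot `SLOT_SEC`, the helper names, and the choice of minimum and mean entropy as pair features. The slides do not specify how locations or time windows were discretized.

```python
import math
from collections import defaultdict
from itertools import combinations

CELL_DEG = 0.01   # assumed grid cell size in degrees (hypothetical)
SLOT_SEC = 600    # assumed co-location time window in seconds (hypothetical)

def cell_of(lat, lon):
    """Map a coordinate to a discrete grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))

def location_entropy(observations):
    """Entropy (bits) of the per-user observation shares in each cell."""
    counts = defaultdict(lambda: defaultdict(int))
    for user, lat, lon, _t in observations:
        counts[cell_of(lat, lon)][user] += 1
    entropy = {}
    for cell, per_user in counts.items():
        total = sum(per_user.values())
        entropy[cell] = sum((n / total) * math.log2(total / n)
                            for n in per_user.values())
    return entropy

def colocations(observations):
    """Map each user pair to the cells where both appeared in the same slot."""
    present = defaultdict(set)   # (cell, time slot) -> users seen there
    for user, lat, lon, t in observations:
        present[(cell_of(lat, lon), int(t // SLOT_SEC))].add(user)
    pairs = defaultdict(list)
    for (cell, _slot), users in present.items():
        for a, b in combinations(sorted(users), 2):
            pairs[(a, b)].append(cell)
    return pairs

def pair_features(observations):
    """Per-pair features: co-location count plus entropy of the meeting places."""
    entropy = location_entropy(observations)
    features = {}
    for pair, cells in colocations(observations).items():
        values = [entropy[c] for c in cells]
        features[pair] = {
            "n_colocations": len(cells),
            "n_places": len(set(cells)),
            "min_entropy": min(values),
            "mean_entropy": sum(values) / len(values),
        }
    return features
```

Each observation is a `(user id, latitude, longitude, time)` tuple, matching the definition on slide 23, with time taken here as seconds. A pair that meets only in high-entropy cells (the bus) ends up with a high `min_entropy`, while a pair that meets in a low-entropy cell (someone's home) does not.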
    • 32 Location Entropy: Pittsburgh, PA. [Figure: map of location entropy]
    • 33 Location Entropy: Pittsburgh, PA. [Figure: the same map with regions annotated as Shopping and Dining, Universities, Bars and Pubs, and Residential, each marked as HIGH or LOW entropy]
    • 34-35 The history of unique people who visit a location over time tells us a great deal about that location. This in turn provides insight into the individuals who visit the location and the social interactions that occur there. We used this general principle to define other potentially useful features of co-location data.
    • 36-38 Feature categories:
        • Intensity and Duration: the size and the spatial and temporal range of the set of co-locations. These features use shallow properties of the co-location history: how many times, how many places, what time of day, etc.
        • Location Diversity: location diversity measures of the locations where the users were co-located.
        • Specificity: whether the locations where the users were co-located are “shared” with the community or “specific” to them.
        • Structural Properties: relevant structural properties of the co-location graph that are indicative of friendship.
      The Location Diversity, Specificity, and Structural Properties features predominantly use properties derived from the history of location observations, such as the location entropy.
    • 39 The Data: 489 users with at least 1 month of tracking data from Locaccino. Area: restricted to users in the Pittsburgh metro area. Recruitment: some users came from formal user studies, some were invited friends of participants, and others joined on their own. Users' periods of system use may not overlap in time. About 90% of the users were laptop users. In all, over 4 million location observations.
    • 40 Comparing the networks. Number of edges: social network 1007; co-location network 3636; intersection (co-located friends) 360. Our goal is to differentiate meaningful edges in the co-location network from chance co-locations. Co-location among users is pervasive, yet co-location among friends is comparatively rare. We would like to predict whether two users are friends from their co-location history alone.
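The edge counts on this slide imply two ratios that make the "pervasive yet rare" remark concrete. The sketch below treats both networks as sets of undirected edges; the helper names and the toy edge lists are hypothetical, and only the three counts (1007, 3636, 360) come from the slide.

```python
def edge_set(pairs):
    """Normalize undirected edges so that (a, b) and (b, a) compare equal."""
    return {frozenset(pair) for pair in pairs}

def overlap_summary(n_social, n_colocation, n_both):
    """Shares implied by the edge counts of two networks and their intersection."""
    return {
        "colocated_pairs_that_are_friends": n_both / n_colocation,
        "friend_pairs_that_are_colocated": n_both / n_social,
    }

# Toy edge lists (made up) showing how the intersection is formed.
social = edge_set([("A", "B"), ("B", "C")])
colocation = edge_set([("B", "A"), ("A", "C"), ("C", "D")])
co_located_friends = social & colocation   # only the A-B edge survives

# Edge counts reported on the slide.
summary = overlap_summary(n_social=1007, n_colocation=3636, n_both=360)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With the slide's numbers, roughly 10% of co-located pairs are friends, and roughly 36% of friend pairs were ever observed co-located.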
    • 41 Evaluation. Classifiers: we trained 3 AdaBoost classifiers (with decision stumps). 1. One used only Intensity and Duration features. 2. One used Diversity, Structural, and Specificity features. 3. One used all features. Baseline: we classify solely based on the number of times the users were co-located. Goal: compare Intensity and Duration features to Diversity, Structural, and Specificity features.
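For readers unfamiliar with the classifier named on this slide, here is a minimal textbook sketch of discrete AdaBoost over decision stumps in plain Python. It is not the authors' implementation, and the function names, the number of rounds, and the feature layout are all assumptions. Labels must be +1 (friends) or -1 (not friends).

```python
import math

def stump_predict(row, feature, threshold, sign):
    """A decision stump: predict `sign` if the feature exceeds the threshold."""
    return sign if row[feature] > threshold else -sign

def best_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(row, feature, threshold, sign) != label)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, sign)
    return best

def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost: returns a list of (alpha, feature, threshold, sign)."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        error, feature, threshold, sign = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, feature, threshold, sign))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * label *
                                stump_predict(row, feature, threshold, sign))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(row, feature, threshold, sign)
                for alpha, feature, threshold, sign in model)
    return 1 if score > 0 else -1
```

A usage sketch with made-up rows of the form `[number of co-locations, minimum location entropy]`: `adaboost_fit([[100, 1.5], [4, 0.2], [5, 1.4], [90, 0.3]], [-1, 1, -1, 1])` cannot separate the classes with a single threshold on the co-location count, but it can with a threshold on the entropy column, which mirrors the bus versus house example from slides 17 and 18.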
    • 42-43 Using features such as location entropy significantly improves performance over shallow features such as the number of co-locations.
    • 44 This highlights the variability in online social network ties with respect to behavior. Overall classifier performance was good for testing our hypotheses, but was not great for classification purposes. Accuracy is high, but precision/recall trade-offs are poor due to unbalanced class proportions (many more non-friends than friends). If the end goal is classification, more specialized approaches might be best.
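The gap between accuracy and precision/recall under class imbalance can be illustrated with a trivial classifier. The sketch assumes, for illustration only, that the evaluation set is the 3636 co-located pairs from slide 40, of which 360 are friends; the slides do not state the exact evaluation set, and the function name is made up.

```python
def accuracy_precision_recall(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = friends)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Precision and recall are reported as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Assumed class proportions: 360 friend pairs among 3636 co-located pairs.
y_true = [1] * 360 + [0] * (3636 - 360)
always_not_friends = [0] * len(y_true)

acc, prec, rec = accuracy_precision_recall(y_true, always_not_friends)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

A classifier that never predicts friendship already scores about 90% accuracy on these proportions while finding no friends at all, which is why accuracy alone says little here.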
    • 45 Additional Findings: We also looked at the relationship between an individual's location history and the number of Facebook friends that user has. We found a convincing positive relationship between the entropy of the places a user goes to and the number of friends the user has.
    • 46 Correlation of mobility features with number of friends: The location diversity variables and the mobility regularity variables show very strong correlations. Users who have irregular routines and users who visit diverse locations have more connections in the Locaccino social network.
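A user-level version of this analysis might look like the sketch below. It is only an illustration under stated assumptions: the slides do not say which correlation statistic was used (Pearson's r is assumed here), and both function names and the way a user's visited places are summarized (a plain mean of location entropies) are hypothetical.

```python
import math

def mean_visited_entropy(visited_cells, entropy_by_cell):
    """Average location entropy over the cells a user was observed in."""
    return sum(entropy_by_cell[cell] for cell in visited_cells) / len(visited_cells)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

Given one entropy summary per user and that user's friend count, `pearson(entropies, friend_counts)` returns a value between -1 and 1, where a positive value corresponds to the relationship described on slides 45 and 46.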
    • 47 Limitations: Many users, spread over different time periods. Most of the users were laptop users, which offers only a coarse approximation of mobility. The population is homogeneous.
    • 48 Future Work: Non-binary ties: numeric ties (tie strength from co-location) and categorical ties (relationship types). More data from smartphones. More specialized learning models.
    • 49 I’d be happy to take your questions! Thank you for your time and attention. Justin Cranshaw jcransh@cs.cmu.edu Illustration by David Pearson, in William Safire, On Language, New York Times Magazine, June 26, 2009.
    • 50
    • 51 Extra Slides:
    • 52 User Mobility: Look at the history of locations of each user. We define a set of features of each user's location history that is predictive of the number of friends they have in the Locaccino network.
    • 53 User Mobility Features:
        • Intensity and Duration: these features describe the size and the spatial and temporal range of the set of observations of the user.
        • Location Diversity: these features describe the diversity of observations collected at the locations the user visits.
        • Regularity: these features describe the temporal regularity of the user's location observations. Do their observations follow a regular routine, or are they random?
    • 54 Structural Comparisons
        Metric                      Social Network   Co-location Network   Intersection (co-located friends)
        Num Vertices                489              489                   489
        Num Non-Isolate Vertices    366              245                   127
        Num Edges                   1007             3636                  360
        Num Connected Components    44               91                    99
        Largest Component Size      299              293                   84
        Density                     0.013            0.063                 0.005
        Connectedness               0.59             0.56                  0.06
        Transitivity                0.41             0.48                  0.42
    • 55 Why do we want to do this? The relationship between online social networks and physical location is understudied. Partitioning the social graph is a hard and important problem. This work could have implications for creating better (context-based) social network privacy controls.