
Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations



Presentation slides from the KDE seminar on 2013/04/24, introducing the paper "Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations."



  1. Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations
     Rui Li, Shengjie Wang, Hongbo Deng, Rui Wang, Kevin Chen-Chuan Chang (University of Illinois at Urbana-Champaign), KDD '12
     Paper Introduction. Speaker: Yuto Yamaguchi
     13/04/24 KDE Seminar: Yuto Yamaguchi
  2. Introduction
     •  Users' locations are important to many applications
        •  e.g., advertisement, recommendation
     •  But most users do not provide their location information
        •  On Twitter, only 16% of users register city-level locations in their profiles
     •  The objective of this paper is to profile users' home locations in social networks.
  3. General Ideas for Location Inference
     •  A user is more likely to follow another user who lives nearby
        •  e.g., a user in Chicago follows another user in Chicago
        •  [Backstrom et al., WWW '10], [Clodoveu et al., T-GIS '11], …
     •  A user is more likely to post about locations near their own
        •  e.g., a user in Houston posts about rockets
        •  [Cheng et al., CIKM '10], [Chandra et al., SocialCom '11], [Kinsella et al., SMUC '11], …
  4. Challenges
     •  On Twitter, the following network and tweets provide valuable signals for profiling users' home locations
     •  But there are two challenges:
        •  Scarce signals
           •  126 friends on average, but only 16% of them provide locations
           •  6 location-related terms in every 100 tweets
        •  Noisy signals
           •  A user may follow another user who lives in a distant location
           •  A user may post about distant locations
  5. Ideas in this paper
     •  The authors propose a unified discriminative influence model (UDI), which has the two features below
        •  Unified signals (for the scarce-signal challenge)
           •  Integrates the social network and user-centric data (i.e., tweets) in a probabilistic framework, which is viewed as a heterogeneous graph
        •  Discriminative influence (for the noisy-signal challenge)
           •  Users and locations have their own influence scopes
           •  e.g., Lady Gaga (with a broad influence scope) is more likely to be followed by a user far away → users with broad scopes do not provide strong signals for location inference
  6. Contributions
     •  Propose a unified discriminative influence model (UDI)
        •  Heterogeneous graph
        •  Influence scope
     •  Propose two location profiling methods using the above model (introduced later)
        •  Local prediction method
        •  Global prediction method
     •  Conduct extensive experiments using a Twitter dataset
        •  Their method can place 66% of users within a 100-mile error distance
  7. PROBLEM FORMULATION
  8. Heterogeneous Graph
     •  User nodes $u_i \in U$, venue nodes $v_j \in V$
     •  If $u_i$ posts about $v_j$, create an edge $\langle u_i, v_j \rangle$
     •  If $u_i$ follows $u_j$, create an edge $\langle u_i, u_j \rangle$
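The two edge rules above can be sketched as a small container. This is a minimal illustration, not the authors' data structure: the class name `TwitterGraph` and its methods are hypothetical, and edges are stored as plain sets of ID pairs.

```python
from dataclasses import dataclass, field


@dataclass
class TwitterGraph:
    """Hypothetical container for the heterogeneous graph (names are assumptions)."""
    follow_edges: set = field(default_factory=set)  # (follower, followee) pairs
    tweet_edges: set = field(default_factory=set)   # (user, venue) pairs

    def add_follow(self, ui, uj):
        # u_i follows u_j -> edge <u_i, u_j>
        self.follow_edges.add((ui, uj))

    def add_tweet(self, ui, vj):
        # u_i posts about venue v_j -> edge <u_i, v_j>
        self.tweet_edges.add((ui, vj))

    def followers(self, ui):
        return {a for a, b in self.follow_edges if b == ui}

    def followees(self, ui):
        return {b for a, b in self.follow_edges if a == ui}

    def venues(self, ui):
        return {v for u, v in self.tweet_edges if u == ui}


g = TwitterGraph()
g.add_follow("alice", "bob")
g.add_follow("carol", "alice")
g.add_tweet("alice", "Chicago")
```

The three lookup methods return exactly the neighbour sets of one user: followers, followees, and tweeted venues.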
  9. Location Profiling Problem
     •  Given a Twitter graph $G$, estimate a location $\hat{L}_{u_i}$ for each user $u_i$ so as to make $\hat{L}_{u_i}$ close to $u_i$'s true location $L_{u_i}$
  10. INFLUENCE MODEL
  11. Motivation 1/2
      •  Nearby users (venues) are more likely to be followed (tweeted about) by other users
  12. Motivation 2/2
      •  Each user (venue) has an influence scope of a different size
      •  [Figure: influence scope of an influential user vs. a regular user]
  13. Basic Ideas for the Influence Model
      •  A geographically influential user has a broad influence scope
         •  e.g., worldwide celebrities such as Lady Gaga
      •  The fact that a user follows a geographically influential user does NOT provide a valuable signal for location inference
         •  NOT VALUABLE: a user follows Lady Gaga
         •  VALUABLE: a user follows a regular user in Chicago
  14. Model Formulation
      •  The authors adopt a Gaussian distribution $N(L_{n_i}, \Sigma_{n_i})$ to model the above characteristics
      •  [Figure: probability to follow (tweet) over latitude and longitude; the spread of the Gaussian is node $n_i$'s influence scope]
  15. Influence scope: users
      •  [Figure: probability to follow over latitude and longitude, $N(L_{u_i}, \Sigma_{u_i})$, centered at user $u_i$'s home location; the spread is user $u_i$'s influence scope. High probability to follow $u_i$ near the center, low probability far from it]
  16. Influence scope: venues
      •  [Figure: probability to tweet over latitude and longitude, $N(L_{v_i}, \Sigma_{v_i})$, centered at venue $v_i$'s location; the spread is venue $v_i$'s influence scope. High probability to tweet near the center, low probability far from it]
  17. Different scope size: users
      •  [Figure: regular user vs. geographically influential user (high influence)]
      •  A geographically influential user is more likely to be followed by distant users
  18. Different scope size: venues
      •  [Figure: regular venue vs. geographically influential venue (high influence)]
      •  A geographically influential venue is more likely to be tweeted about by distant users
  19. Model Parameters
      •  Mean and variance for each Gaussian $N(L_{n_i}, \Sigma_{n_i})$
      •  The mean $L_{n_i}$ is the location of node $n_i$
      •  The variance $\Sigma_{n_i}$ decides the size of each influence scope:
         $\Sigma_{n_i} = \begin{pmatrix} \sigma_{n_i} & 0 \\ 0 & \sigma_{n_i} \end{pmatrix}$
      •  The number of parameters is $2(|U| + |V|)$
  20. LOCATION PROFILING METHODS
      •  Local prediction method
      •  Global prediction method
  21. Basic Ideas for Location Profiling
      •  Estimate the model parameters that maximize the likelihood of obtaining the given Twitter graph
      •  Parameters: $L_{n_i}$ and $\Sigma_{n_i}$ for each node $n_i$
  22. Local Prediction Method
      •  This method only considers the ego-network
      •  Maximize the likelihood of this network
      •  [Figure: ego-network of an unlabeled user, connected to labeled users by follow edges and to venues by tweet edges]
         •  Labeled user: location is known
         •  Unlabeled user: location is unknown
  23. Likelihood Function of Local Method
      $$P(\text{ego-network of } u_i \mid \text{parameters}) = \prod_{u_j \in \mathrm{Followers}(u_i)} P(u_j \text{ follows } u_i \mid L_{u_j}, L_{u_i}, \Sigma_{u_i}) \times \prod_{u_j \in \mathrm{Followees}(u_i)} P(u_i \text{ follows } u_j \mid L_{u_i}, L_{u_j}, \Sigma_{u_j}) \times \prod_{v_j \in \mathrm{Venues}(u_i)} P(u_i \text{ tweets } v_j \mid L_{u_i}, L_{v_j}, \Sigma_{v_j})$$
      •  Each factor is a Gaussian
      •  Maximize this function
  24. Each Gaussian
      $$P(u_j \text{ follows } u_i \mid L_{u_j}, L_{u_i}, \Sigma_{u_i}) = \frac{1}{2\pi\sigma_{u_i}^2} \exp\left( \frac{(X_{u_i} - X_{u_j})^2 + (Y_{u_i} - Y_{u_j})^2}{-2\sigma_{u_i}^2} \right)$$
      •  High probability if $u_i$ and $u_j$ are close
      •  High probability if $u_i$ has a broad influence scope
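The density on this slide can be evaluated directly. The sketch below is only an illustration: the function name `follow_probability` and the use of planar (x, y) coordinates are assumptions, and the returned value is a density, not a normalized probability over users.

```python
import math


def follow_probability(follower_xy, followee_xy, sigma):
    """Isotropic 2-D Gaussian density of the followee's influence at the follower.

    follower_xy, followee_xy: (x, y) coordinates (assumed planar for simplicity).
    sigma: the followee's influence scope (sigma_{u_i} on the slide).
    """
    dx = followee_xy[0] - follower_xy[0]
    dy = followee_xy[1] - follower_xy[1]
    squared_distance = dx * dx + dy * dy
    return math.exp(-squared_distance / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


# A regular user (small scope) versus a broad-scope user, seen from near and far.
near_regular = follow_probability((0.0, 0.0), (1.0, 0.0), sigma=1.0)
far_regular = follow_probability((0.0, 0.0), (10.0, 0.0), sigma=1.0)
far_broad = follow_probability((0.0, 0.0), (10.0, 0.0), sigma=10.0)
```

For a fixed scope the density falls with distance, while at a large distance a broad scope yields a higher density than a narrow one. This is the sense in which following a broad-scope user says little about where the follower lives.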
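  25. Solution of Local Method
      $$X_{u_i} = \frac{\sum_{u_j \in \mathrm{followers}(u_i)} \frac{X_{u_j}}{\sigma_{u_i}} + \sum_{u_j \in \mathrm{followees}(u_i)} \frac{X_{u_j}}{\sigma_{u_j}} + \sum_{v_j \in \mathrm{venues}(u_i)} \frac{X_{v_j}}{\sigma_{v_j}}}{\sum_{u_j \in \mathrm{followers}(u_i)} \frac{1}{\sigma_{u_i}} + \sum_{u_j \in \mathrm{followees}(u_i)} \frac{1}{\sigma_{u_j}} + \sum_{v_j \in \mathrm{venues}(u_i)} \frac{1}{\sigma_{v_j}}}$$
      $$\sigma_{u_i}^2 = \frac{\sum_{u_j \in \mathrm{followers}(u_i)} \left[ (X_{u_i} - X_{u_j})^2 + (Y_{u_i} - Y_{u_j})^2 \right]}{2\,|\mathrm{followers}(u_i)|}$$
      •  Obtained in closed form by substituting the second equation into the first (no need to memorize)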
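The two formulas on this slide can be sketched as two small functions. This is an illustration under stated assumptions: the function names are hypothetical, coordinates are treated as planar, the weights are 1/σ exactly as they appear on the slide, and the Y coordinate is assumed to use the same weighted average as X.

```python
def estimate_location(neighbours):
    """Weighted average of neighbour coordinates.

    neighbours: list of ((x, y), sigma) pairs. For a follower the sigma is the
    target user's own scope; for a followee or venue it is that node's scope.
    Weights 1/sigma follow the slide as extracted (an assumption).
    """
    weight_sum = sum(1.0 / sigma for _, sigma in neighbours)
    x = sum(xy[0] / sigma for xy, sigma in neighbours) / weight_sum
    y = sum(xy[1] / sigma for xy, sigma in neighbours) / weight_sum
    return (x, y)


def estimate_sigma_squared(location, follower_locations):
    """Scope estimate: summed squared follower distances over 2 * |followers|."""
    total = sum((location[0] - fx) ** 2 + (location[1] - fy) ** 2
                for fx, fy in follower_locations)
    return total / (2.0 * len(follower_locations))


equal_scopes = estimate_location([((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0)])
mixed_scopes = estimate_location([((0.0, 0.0), 1.0), ((10.0, 0.0), 3.0)])
scope = estimate_sigma_squared((0.0, 0.0), [(3.0, 4.0), (0.0, 0.0)])
```

With equal scopes the estimate is the midpoint of the neighbours. When one neighbour has a broader scope, the estimate moves toward the narrow-scope neighbour, which is the discriminative effect the model aims for.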
  26. Global Prediction Method
      •  This method maximizes the likelihood of the whole network
      •  Predicts the locations of all unlabeled users simultaneously
  27. Likelihood Function of Global Method
      $$P(\text{whole network} \mid \text{parameters}) = \prod_{\langle u_i, u_j \rangle \in \mathrm{FollowEdges}} P(u_i \text{ follows } u_j \mid L_{u_i}, L_{u_j}, \Sigma_{u_j}) \times \prod_{\langle u_i, v_j \rangle \in \mathrm{TweetEdges}} P(u_i \text{ tweets } v_j \mid L_{u_i}, L_{v_j}, \Sigma_{v_j})$$
      •  Each factor is a Gaussian
      •  Maximize this function
  28. Iterative Algorithm for Global Method
      •  The global method has no closed-form solution → iterative algorithm
      1. Initialize locations $L_u$ for all unlabeled users
      2. $k \leftarrow 1$
      3. repeat
         1. update $\sigma_n^k$ for all nodes using $L_u$
         2. repeat
            1. update $L_u^k$ for all unlabeled users using $\sigma_n^k$
         3. until converged
         4. $L_u \leftarrow L_u^k$
         5. $k \leftarrow k + 1$
      4. until converged
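The alternating structure of this algorithm can be sketched as below. This is a simplified illustration, not the authors' implementation: the function name `global_inference`, the planar coordinates, the 1/σ weights, and the `min_sigma` floor are all assumptions, and the inner "repeat until converged" loop is collapsed into a single sweep per outer iteration.

```python
import math


def global_inference(edges, locations, unlabeled, max_iters=50, tol=1e-6, min_sigma=1e-3):
    """Alternate between scope updates and location updates (simplified sketch).

    edges: list of (source_user, target_node); the target is a followed user or
           a tweeted venue, and the target's scope governs the edge.
    locations: dict node -> (x, y); entries for unlabeled users are initial guesses.
    unlabeled: set of users whose locations are re-estimated.
    """
    loc = dict(locations)
    sigma = {}
    for _ in range(max_iters):
        # Update the scope of every target node from the current locations.
        squared, count = {}, {}
        for src, dst in edges:
            d2 = (loc[src][0] - loc[dst][0]) ** 2 + (loc[src][1] - loc[dst][1]) ** 2
            squared[dst] = squared.get(dst, 0.0) + d2
            count[dst] = count.get(dst, 0) + 1
        sigma = {n: max(math.sqrt(squared[n] / (2.0 * count[n])), min_sigma) for n in squared}

        # Update each unlabeled user as a 1/sigma-weighted average of neighbours.
        new_loc, moved = dict(loc), 0.0
        for u in unlabeled:
            wx = wy = w = 0.0
            for src, dst in edges:
                if src == u:
                    other, s = dst, sigma[dst]
                elif dst == u:
                    other, s = src, sigma[u]
                else:
                    continue
                wx += loc[other][0] / s
                wy += loc[other][1] / s
                w += 1.0 / s
            if w > 0.0:
                new_loc[u] = (wx / w, wy / w)
                moved = max(moved, math.dist(loc[u], new_loc[u]))
        loc = new_loc
        if moved < tol:
            break
    return loc, sigma


start = {"a": (0.0, 0.0), "b": (10.0, 0.0), "u": (3.0, 3.0)}
final, scopes = global_inference([("u", "a"), ("u", "b")], start, {"u"})
```

In this toy run the unlabeled user `u` follows two labeled users on the x-axis, so its estimate settles on the segment between them while the labeled locations stay fixed.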
  29. EXPERIMENTS
  30. Dataset
      •  Twitter dataset
         •  Crawled profiles, followers, and followees of 3,980,061 users
         •  Geocoded their location profiles into coordinates based on the U.S. Gazetteer
            •  630,187 users are correctly geocoded ← labeled users
         •  158,220 of the labeled users have at least one labeled neighbor
            •  Neighbor: follower or followee
         •  Crawled at most 600 tweets for each labeled user, and obtained 139,180 users' tweets
            •  The other users are protected users
      •  Using this dataset, the authors conducted five-fold cross-validation
         •  80% of the 139,180 users form the training set, 20% the test set
         •  Repeated for 5 runs
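The five-fold protocol described on this slide can be sketched as follows. The function name `five_fold_splits`, the shuffling seed, and the strided fold assignment are assumptions for illustration; the slides do not say how the authors partitioned the users.

```python
import random


def five_fold_splits(user_ids, seed=0):
    """Yield (train, test) lists: each run holds out one fifth of the users."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)          # fixed seed for reproducibility (assumption)
    folds = [ids[i::5] for i in range(5)]     # five near-equal, disjoint folds
    for held_out in range(5):
        test = folds[held_out]
        train = [u for j, fold in enumerate(folds) if j != held_out for u in fold]
        yield train, test


splits = list(five_fold_splits(range(100)))
```

Each user appears in exactly one test fold, so every labeled user is predicted once across the 5 runs.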
  31. Methods
      •  Compared 6 methods
         •  BaseU: Backstrom et al.'s method [1], using only the social graph (no influence model)
         •  BaseC: Cheng et al.'s method [2], using only tweets (no influence model)
         •  UDIU: local prediction method, but only uses user nodes
         •  UDIC: local prediction method, but only uses venue nodes
         •  UDII: local prediction method
         •  UDIG: global prediction method
      [1] Backstrom et al., "Find me if you can: improving geographical prediction with social and spatial proximity", WWW '10
      [2] Cheng et al., "You are where you tweet: a content-based approach to geo-locating twitter users", CIKM '10
  32. Results: Prediction results
      •  ACC: ratio of users whose predicted location is within 100 miles of the true location
      •  AED@k%: average error distance of the top k% of users
      •  The influence model is effective for predicting locations
         •  Comparing BaseU and UDIU (BaseC and UDIC)
      •  Integrating both signals is effective for predicting locations
         •  Comparing UDIU and UDII (UDIC and UDII)
      •  The global method improves on the local one by only 1.5%
         •  Comparing UDIG and UDII
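The two metrics named on this slide can be sketched as below. The function names are hypothetical, and two details are assumptions: the great-circle (haversine) distance is used as the error distance, and "top k%" is read as the k% of users with the smallest errors.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def acc(errors, threshold_miles=100.0):
    """Fraction of users whose error distance is within the threshold."""
    return sum(e <= threshold_miles for e in errors) / len(errors)


def aed_at(errors, k_percent):
    """Average error distance over the k% of users with the smallest errors."""
    ranked = sorted(errors)
    n = max(1, math.ceil(len(ranked) * k_percent / 100.0))
    return sum(ranked[:n]) / n


errors = [10.0, 50.0, 150.0, 400.0]  # made-up error distances in miles
```

On these made-up errors, `acc(errors)` is 0.5 and `aed_at(errors, 50)` is 30.0, since only the two smallest errors enter the average.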
  33. Results: Global and Local
      •  Setting: 20% training users and 80% test users
      •  +9% in ACC
      •  When most users are unlabeled, the global method improves on the local one substantially
  34. Results: Influence scope
      •  Users with a large number of followers do not always have a large σ
         •  e.g., MythBusters Official has a larger σ than Lady Gaga but a smaller number of followers
  35. CONCLUSION
  36. Conclusion
      •  Proposed
         •  Unified discriminative influence model (UDI)
         •  Two location prediction methods based on the influence model
            •  Global and local
      •  Conducted experiments using a large Twitter dataset
         •  The proposed methods significantly outperform existing methods
      •  NO future work
