
Talk of the City: Londoners and Social Media

talk of the city
http://tinyurl.com/cctxbzo

tracking emotions in the city
http://tinyurl.com/7uvjasy

Published in: Technology, Business

Talk of the City: Londoners and Social Media

  1. 1. Londoners and Social Media: Track Community “Happiness” + Target Ads @danielequercia
  2. 2. <who am i>
  3. 3. daniele quercia
  4. 4. offline & online
  5. 5. <goal>
  6. 6. social media language personality social media
  7. 7. social media <why>
  8. 8. social media
  9. 9. Pop press pundits (Archbishop England & Wales) on social media: “Social-networking sites “dehumanize” community life”
  10. 10. social media
  11. 11. social media 1 Q&A
  12. 12. social media 2 Q&A
  13. 13. social media 3 Q&A
  14. 14. CS Researchers on social media: “Twitter is NOT a social network but a news media”
  15. 15. Pop press pundits (Archbishop England & Wales) on social media: “Social-networking sites “dehumanize” community life” CS Researchers: “Twitter is NOT a social network but a news media”
  16. 16. Pop press pundits (Archbishop England & Wales) on social media: “Social-networking sites “dehumanize” community life” CS Researchers: “Twitter is NOT a social network but a news media” “I beg to differ” ;-)
  17. 17. social media language personality social media
  18. 18. community deprivation  well-being  use of words ?
  19. 19. community deprivation  well-being  use of words
  20. 20. community deprivation  well-being  use of words
  21. 21. Goal: community deprivation  well-being  use of words. 1) collect profiles & geo-reference them; 2) classify sentiment of profiles; 3) match sentiment with (census) deprivation
  22. 22. 1 collect profiles & geo-reference them: 3 seeds (newspaper accounts); 250K profiles in London (31.5M tweets); 1,323 in London neighborhoods; 573 in 51 neighborhoods
  23. 23. 2 classify sentiment of profiles: Word Count vs. Maximum Entropy
  24. 24. Word Count
  25. 25. social media language personality
  26. 26. social media language personality
  27. 27. social media language personality
  28. 28. Max Entropy. Training? Upon 300K tweets with smiley and frowny faces
  29. 29. Word Count vs. Max Entropy
  30. 30. Word Count vs. Max Entropy
  31. 31. 3 match sentiment with (census) deprivation: Index of Multiple Deprivation
  32. 32. predicting socioeconomic well-being with Twitter: r=.350 (word count), r=.365 (MaxEnt)
  33. 33. [CSCW’12] Tracking Gross Community Happiness from Tweets
  34. 34. Going beyond sentiment … Look at the subject matter of tweets!
  35. 35. Extract topics from tweets. Easiest way? Matching Keywords
  36. 36. Extract topics from tweets. Easiest way? Matching Keywords
  37. 37. Dictionary of keywords? A machine learning model? Training?
  38. 38. Use machine learning model (no training required)
  39. 39. Latent Dirichlet Allocation (LDA)
  40. 40. read profiles & define topics
      create virtual bins (latent topics)
      assign words to a bin (@ random)
      for each bin:
        select pair of words
        if co-occur more than chance:
          keep them in the bin
        else:
          put them into another bin (@ random)
  41. 41. same steps as slide 40, with example words: Facebook, Twitter
  42. 42. same steps as slide 40, with example words: Facebook, Twitter, social, econometrics
  43. 43. same steps as slide 40, with example words: Facebook, Twitter, social, econometrics
  44. 44. Latent Dirichlet Allocation (LDA)
  45. 45. Latent Dirichlet Allocation (LDA)
  46. 46. social media, environment, sports, health, wedding parties, Spanish/Portuguese, celebrity gossips
  47. 47. Support Vector Regression: IMD <- SVR(topics); accuracy: 8.14 in [13.12, 46.88]
  48. 48. Some areas have very few profiles! residents +
  49. 49. Some areas have very few profiles! residents + visitors
  50. 50. Analyze geo-referenced tweets(not only residents but also visitors)
  51. 51. Linear Regression: R² = .49 (49% of IMD variability explained)
  52. 52. So what?
  53. 53. Theoretical Implications
  54. 54. Practical Implications
  55. 55. Ads and the City: Considering Geographic Distance Goes a Long Way
  56. 56. Problem Statement: Given a venue (new bar/restaurant), suggest guests
  57. 57. Problem Statement: Given a venue (new bar/restaurant), suggest guests
  58. 58. Problem Statement: Given a venue (new bar/restaurant), suggest guests
  59. 59. Web ≠ people move!
  60. 60. Web ≠ people move!
  61. 61. On people mobility (from the literature): 1) likes might matter 2) distance matters 3) “power users” are special
  62. 62. On people mobility (from the literature): 1) likes might matter 2) distance matters 3) “power users” are special
  63. 63. On people mobility (from the literature): 1) likes might matter 2) distance matters 3) “power users” are special
  64. 64. The extent one is a power user ;)
  65. 65. HIGH α → travel farther
  66. 66. HIGH α → travel farther
  67. 67. 1) Naïve Bayesian 2) Bayesian 3) Linear Regression (learn weights)
  68. 68. (2)
  69. 69. (2)
  70. 70. (2)
  71. 71. (2)
  72. 72. (2)
  73. 73. (2)
  74. 74. (2)
  75. 75. Future (well, current & you could help)
  76. 76. 1 complex buildings
  77. 77. “Who talks to whom”
  78. 78. Network
  79. 79. 2 tools for topical & sentiment analysis
  80. 80. 3
  81. 81. 3
  82. 82. 1 Complex Buildings; 2 Tools for topical & sentiment analysis; 3 urbanopticon.org
  83. 83. @danielequercia
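
Slides 23 to 32 describe scoring each profile's sentiment by word count and then correlating neighborhood sentiment with the Index of Multiple Deprivation. The sketch below shows one way this could look; the tiny lexicon, the normalization by total words, and the function names are assumptions for illustration, since the transcript does not give the dictionary or the exact scoring formula used in the talk.

```python
import math

# Hypothetical mini-lexicon (assumption): the talk's real dictionary is not
# given in the transcript.
POSITIVE = {"happy", "love", "great", "good"}
NEGATIVE = {"sad", "hate", "awful", "bad"}


def word_count_sentiment(tweets):
    """Score a profile as (positive words - negative words) / total words."""
    pos = neg = total = 0
    for tweet in tweets:
        for raw in tweet.lower().split():
            word = raw.strip(".,!?;:\"'")
            if not word:
                continue
            total += 1
            pos += word in POSITIVE
            neg += word in NEGATIVE
    return (pos - neg) / total if total else 0.0


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for constant input; 0.0 is a convention here
    return cov / (sd_x * sd_y)
```

With one sentiment score per neighborhood (for example, the mean of its profiles' scores) and the matching census deprivation values, `pearson_r(sentiment_scores, imd_scores)` would give a figure comparable in kind to the r values on slide 32.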

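
Slides 39 to 45 give an informal picture of Latent Dirichlet Allocation: words start in random bins and are moved until words that co-occur end up together. A common way to fit LDA is collapsed Gibbs sampling, sketched below in plain Python. This is a standard textbook procedure rather than the slide's simplified description, and nothing in the transcript says which inference method or parameters the talk used; `lda_gibbs`, `alpha`, `beta`, and the defaults are assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists (e.g. one list per profile).
    Returns (assignments, topic_word): the topic of every token, and
    per-topic word counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by assigning every token to a random topic ("bin").
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample: favor topics common in this document and
                # topics in which this word is already frequent.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return assignments, topic_word
```

Sorting each topic's word counts in descending order gives its most characteristic words, which is how topic labels like those on slide 46 are typically read off. No training labels are needed, matching slide 38.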