
Talk at MIT HCI Seminar


Machine learning approaches for understanding social interactions on Twitter


  1. 1. Machine learning approaches for understanding social interactions on Twitter May 6, 2014 Alice Oh alice.oh@kaist.edu aoh@seas.harvard.edu http://uilab.kaist.ac.kr/members/aliceoh/
  2. 2. Our Research
      • Topic Modeling
          • ICML 2014: Hierarchical Dirichlet scaling process
          • IJCAI 2013: Context-dependent conceptualization
          • NIPS Big Learning Workshop 2012: Distributed online learning for latent Dirichlet allocation
          • CIKM 2012: Recursive Chinese restaurant processes for modeling topic hierarchies
          • ICML 2012: Dirichlet processes with mixed random measures
      • Social Media Analysis
          • ACL 2014 Workshop: Self-disclosure topic model
          • WWW 2014: Computational analysis of agenda setting theory
          • AAAI 2013: Hierarchical aspect-sentiment model
          • ICWSM 2012: Social aspects of emotions in Twitter conversations
          • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
          • WSDM 2011: Aspect sentiment unification model for online review analysis
  3. 3. Contact Information
      • At Harvard until end of July, 2014 and open for
          • Collaborations: writing papers, sharing data, etc.
          • Discussions about topic modeling and computational social science
      • Going back to KAIST in August
      • http://uilab.kaist.ac.kr
      • alice.oh@kaist.edu
      • Can recommend students for intern, postdoc, and researcher positions
      • Please consider attending
          • ICWSM (program co-chair), Ann Arbor, MI
          • ACL Workshop on Social Dynamics and Personal Attributes (co-organizer), Baltimore, MD
  4. 4. What is topic modeling?
  5. 5. Blei, Communications of the ACM, 2012
  6. 6. Motivation
  7. 7. Motivation
      • What are the topics discussed in the article?
      • Is the article related to
          • household finances?
          • price of gasoline?
          • price of Apple stock?
      • How would you build an automatic system for answering these questions?
  8. 8. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp
      nascar, races, track, raceway, race, cars, fuel, auto, racing
      economic, slowdown, sales, recession, costs, spending, save
      fans, spectators, sports, leagues, teams, competition
  9. 9. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
      Topic Distributions
      nascar, races, track, raceway, race, cars, fuel, auto, racing
      economic, slowdown, sales, recession, costs, spending, save
      fans, spectators, sports, leagues, teams, competition
      Topics: multinomial over words
  10. 10. Input to LDA
      http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
  11. 11. Topics Discovered by LDA
      Topics: multinomial over vocabulary (one column per topic)

      nascar  0.12      spending   0.09      sports   0.12
      races   0.1       economic   0.07      team     0.11
      cars    0.1       recession  0.06      game     0.1
      racing  0.09      save       0.05      player   0.1
      track   0.08      money      0.05      athlete  0.09
      speed   0.06      cut        0.04      win      0.07
      ...               ...                  ...
      money   0.002     speed      0.003     nascar   0.001
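The idea that each topic is a multinomial over the vocabulary can be illustrated with a small sketch. This is not LDA inference: it only scores a bag of words against each topic's word probabilities from the slide. The topic names (`racing`, `economy`, `sports`) and the `FLOOR` probability for words not listed on the slide are assumptions made for illustration.

```python
import math

# Word probabilities copied from the slide. The topic names are hypothetical
# labels; the slide does not name the topics.
TOPICS = {
    "racing": {"nascar": 0.12, "races": 0.10, "cars": 0.10, "racing": 0.09,
               "track": 0.08, "speed": 0.06, "money": 0.002},
    "economy": {"spending": 0.09, "economic": 0.07, "recession": 0.06,
                "save": 0.05, "money": 0.05, "cut": 0.04, "speed": 0.003},
    "sports": {"sports": 0.12, "team": 0.11, "game": 0.10, "player": 0.10,
               "athlete": 0.09, "win": 0.07, "nascar": 0.001},
}

# Assumed small probability for words the slide does not list for a topic.
FLOOR = 1e-4


def log_likelihood(words, topic):
    """Sum of log word probabilities under one topic."""
    return sum(math.log(topic.get(w, FLOOR)) for w in words)


def best_topic(words):
    """Name of the topic under which the words are most probable."""
    return max(TOPICS, key=lambda name: log_likelihood(words, TOPICS[name]))


print(best_topic(["nascar", "races", "track", "cars"]))
```

A full topic model would instead infer a mixture of topics per document; picking a single best topic is a simplification.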
  12. 12. http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
      Topic Distributions
      nascar, races, track, raceway, race, cars, fuel, auto, racing
      economic, slowdown, sales, recession, costs, spending, save
      fans, spectators, sports, leagues, teams, competition
      Topics: multinomial over words
  13. 13. Graphical Representation of LDA
      Topic Distributions
      nascar, races, track, raceway, race, cars, fuel, auto, racing
      economic, slowdown, sales, recession, costs, spending, save
      fans, spectators, sports, leagues, teams, competition
      Topics: multinomial over words
      Topics: sales xxx slowdown recession cars races spending xxx save costs fuel
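The standard LDA generative story behind the graphical representation can be sketched roughly as follows: draw a per-document topic distribution from a Dirichlet, then for each word draw a topic and then a word from that topic. The toy topics, their probabilities, and the `alpha` values below are assumptions for illustration, not values from the talk.

```python
import random

# Hypothetical toy topics; the probabilities are made up and sum to 1 per topic.
TOY_TOPICS = [
    {"nascar": 0.4, "races": 0.3, "cars": 0.3},
    {"economic": 0.4, "recession": 0.3, "spending": 0.3},
    {"sports": 0.4, "teams": 0.3, "fans": 0.3},
]


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Return (theta, [(topic_index, word), ...]) for one synthetic document."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    tokens = []
    for _ in range(n_words):
        # Pick a topic for this word position, then a word from that topic.
        z = rng.choices(range(len(topics)), weights=theta, k=1)[0]
        vocab = list(topics[z])
        probs = [topics[z][w] for w in vocab]
        word = rng.choices(vocab, weights=probs, k=1)[0]
        tokens.append((z, word))
    return theta, tokens


rng = random.Random(0)
theta, tokens = generate_document(TOY_TOPICS, [0.5, 0.5, 0.5], 10, rng)
print(theta)
print(tokens)
```

Inference runs this story in reverse: given only the words, it estimates the topics and the per-document proportions.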
  14. 14. Do you feel what I feel? Social Aspects of Emotions in Twitter Conversations Suin Kim, JinYeong Bak, Alice Oh ICWSM 2012 14
  15. 15. Twitter conversation data
      • Twitter conversation data: approx. 220k dyads who “reply” to each other, 1,670k conversational chains (we now have about 5x this amount)
  16. 16. Emotion Cycles 16
  17. 17. Emotion cycles We propose that organizational dyads and groups inhabit emotion cycles: Emotions of an individual influence the emotions, thoughts and behaviors of others; others’ reactions can then influence their future interactions with the individual expressing the original emotion, as well as that individual’s future emotions and behaviors. People can mimic the emotions of others, thereby extending the social presence of a specific emotion, but can also respond to others’ emotions, extending the range of emotions present. 17
  18. 18. Topic model with a twist (DF-LDA)
      • Dirichlet forest prior (Andrzejewski et al.)
          • Mixture of Dirichlet tree distributions
          • Dirichlet tree: generalization of the Dirichlet distribution
      • Knowledge is expressed using Must-link and Cannot-link primitives
          • Must-link(love, sweetheart)
          • Cannot-link(exciting, bored)
  19. 19. Domain knowledge in Dirichlet forest prior
      Seed Words (stemmed), by emotion class:
      • anticipation: hope wait await inspir excit bore readi expect nervou calm motiv prepar certain anxiou optimist forese
      • joy: awesom amaz wonder excit glad fine beauti high lucki super perfect complet special bless safe proud
      • anger: shit bitch ass mean damn mad jealou piss annoi angri upset moron rage screw stuck irrit
      • surprise: amaz wow wonder weird lucki differ awkward confus holi strang shock odd embarrass overwhelm astound astonish
      • fear: scare stress horror nervou terror alarm behind panic fear afraid desper threaten tens terrifi fright anxiou
      • sadness: sorri bad aw sad wrong hurt blue dead lost crush weak depress wors low terribl lone
      • disgust: sick wrong evil fat ugli horribl gross terribl selfish miser pathet disgust worthless aw asham fuck
      • acceptance: okai ok same alright safe lazi relax peac content normal secur complet numb fulfil comfort defeat
      Must-link within a class
      Cannot-link between classes
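The rule "Must-link within a class, Cannot-link between classes" can be sketched as a way of turning seed-word classes into constraint pairs. The snippet below only builds the pairs; it does not implement the Dirichlet forest prior itself. The two-class `SEEDS` subset and the function name `build_constraints` are assumptions for illustration, and how the authors treated seed words that appear in more than one class is not stated in the slides.

```python
from itertools import combinations, product

# Small hypothetical subset of the seed words, two classes only.
SEEDS = {
    "joy": ["awesom", "glad", "proud"],
    "sadness": ["sorri", "sad", "lone"],
}


def build_constraints(seeds):
    """Return (must_link, cannot_link) as sets of alphabetically ordered pairs."""
    must_link = set()
    for words in seeds.values():
        # Every pair of words inside one class is must-linked.
        for a, b in combinations(sorted(set(words)), 2):
            must_link.add((a, b))

    cannot_link = set()
    for (_, words_1), (_, words_2) in combinations(seeds.items(), 2):
        # Every pair of words drawn from two different classes is cannot-linked.
        for a, b in product(set(words_1), set(words_2)):
            if a != b:  # a word shared by both classes is skipped, not resolved
                cannot_link.add(tuple(sorted((a, b))))
    return must_link, cannot_link


must, cannot = build_constraints(SEEDS)
print(len(must), len(cannot))
```

With the full seed lists some words (for example `excit`, `amaz`) occur in two classes, so a real pipeline would need an explicit policy for those overlaps.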
  20. 20. Emotion Topics How do we express emotions? JoyAnticipation Anger Topic 114 omg love haha thank really Topic 107 love thank follow wow Topic 159 good day hope morning thank Topic 158 love thank miss hug Topic 125 hope better feel thank soon Topic 26 good thank hope miss Topic 146 come wait week day june Topic 146 good day time work Topic 131 lmao fuck ass bitch shit Topic 4 ass yo lmao nigga Topic 19 lmao shit damn fuck oh Topic 13 shit nigga smh yea Fear Topic 48 omg oh lmao shit scare Topic 78 happen heart attack hospital Topic 27 don’t come night sleep outside Topic 140 time got work day Surprise Topic 172 yeag know think true funny Topic 89 know don’t think look Topic 15 think don’t know make really Topic 94 haha dont think really 29 70 21 14 5 Sadness Disgust Topic 6 oh sorry haha know didnt Topic 59 hurt got good bad Topic 106 tweet reply didn’t read sorry Topic 155 oh really make feel Topic 116 oh fuck don’t ye ew Topic 116 look haha oh know Topic 22 don’t oh think yeah lmao Topic 174 don’t think say people Acceptance Topic 43 ok oh thank cool okay Topic 102 know try let ok Topic 199 xx thank good okay follow Topic 8 night love good sleep 17 7 18 Neutral Topic 180 com www http check youtube Topic 156 twitter facebook people account Topic 184 account google app work email Topic 67 food chicken cook rt 19 20
  21. 21. Emotion Topics: How do we express emotions?
      • Joy
          • Topic 114: omg love haha thank really
          • Topic 107: love thank follow wow
      • Anticipation
          • Topic 125: hope better feel thank soon
          • Topic 26: good thank hope miss
      • Sadness
          • Topic 6: oh sorry haha know didnt
          • Topic 59: hurt got good bad
      • Neutral
          • Topic 180: com www http check youtube
          • Topic 156: twitter facebook people account
      Labels on the slide: Greeting, Caring, Sympathy, IT/Tech
  22. 22. Emotion-tagged conversations

      A (Love): @amithpr @dhempe @OperaIndia - Would you have any update on @mrunmaiy's health - hope she is recovering well?
      B (neut): @labnol @dhempe she is recovering but slow. The injury is on the spine therefore worrisome. Still in icu.
      A (Sadness): @amithpr thanks for the update.. extremely said to hear that news..
      B (neut): @labnol #prayformrun She is a fighter and will come out of this

      B (neut): @AyeItsMeiMei just tell ur followers to report her for spam. then she'll be kicked off twitter
      A (Anger): @Jakeosaurous dude I didn't even do shit to her I'm just here tweeting & she calls me a ugly bitch? I was like oh wow thanks?
      B (neut): @AyeItsMeiMei yeah clearly shes so ugly she cant even use her real pic:P so dont feel bad
      A (Love): @Jakeosaurous haha. I don't care. She's getting spammed with hate. Hahaha. (": thanks though.
      B (neut): @AyeItsMeiMei np
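  23. 23. Emotion Transitions (figure: Plutchik’s Wheel of Emotions)
      Node labels as extracted (emotion, percentage, adjacent value):
      • Joy 39.7%, 0.51
      • Acceptance 10.4%, 0.23
      • Fear 2.6%, 0.11
      • Surprise 7.4%, 0.17
      • Anticipation 15.1%, 0.26
      • Disgust 2.9%, 0.11
      • Sadness 9.1%, 0.19
      • Anger 12.8%, 0.37
      Remaining edge values, not attributable from the extracted text: 0.31, 0.33, 0.32, 0.31, 0.33, 0.21, 0.34, 0.15, 0.14, 0.13, 0.15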
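One way to estimate emotion transition probabilities like those in the figure is to count how often one emotion label follows another in consecutive tweets of a conversation. The sketch below assumes each conversation is already a list of emotion labels in time order; the toy conversations and the function name `transition_probabilities` are hypothetical, and the slides do not say exactly how the authors computed their values.

```python
from collections import Counter, defaultdict


def transition_probabilities(conversations):
    """Estimate P(next emotion | previous emotion) from label sequences."""
    counts = defaultdict(Counter)
    for turns in conversations:
        # Pair each tweet's emotion with the emotion of the tweet that follows.
        for prev, nxt in zip(turns, turns[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


# Hypothetical emotion-tagged conversations.
toy_conversations = [
    ["sadness", "joy", "joy"],
    ["anger", "anger", "joy"],
    ["joy", "joy", "acceptance"],
]

for prev, row in transition_probabilities(toy_conversations).items():
    print(prev, row)
```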
  24. 24. Defining “Influence” emotion influencing tweet User A User B Having a tough day today. RIP Harrison. I’ll miss you a ton :/ Just pray about it. God will help you. Not really religious, but thanks man. :) If you need talk you know I’m here. Time (Sadness) (Acceptance) (Anticipation) 24
  25. 25. Topic 117 tweet people don’t read post Topic 59 hurt got bad pain feel Emotion Influences What can you say to make your partner feel better? Joy → SadnessSadness → Joy Topic 18 wear look think love black Topic 24 love thank great new look Anticipation → Surprise Topic 96 music listen play song good Topic 178 follow tweet people twitter thank Acceptance → Anger Topic 31 i’m got lmax shit da Topic 13 lmao shit nigga smh yea Disgust → Joy Topic 61 watch new live tv tonight Topic 63 watch good think know look Suggesting Greeting Sympathy
 Swear words Complaining 25
  26. 26. Emotion Influence (two bar charts; categories: Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral)
      • Emotion Influence: Joy to Anger. Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.
      • Emotion Influence: Sadness to Joy. Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
      Bar values as extracted (axis 0 to 0.3; category order not guaranteed): 0.041, 0.071, 0.082, 0.053, 0.265, 0.061, 0.081, 0.042, 0.051
      Bar values as extracted (axis 0 to 0.36; category order not guaranteed): 0.211, 0.23, 0.214, 0.209, 0.191, 0.237, 0.253, 0.358, 0.273
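One plausible reading of statements such as "expressing Joy has a 35.8% chance of changing the partner's emotion from Sadness to Joy" is a conditional frequency over three consecutive turns in a dyad: the partner's emotion before, the emotion expressed in reply, and the partner's emotion after. The sketch below implements that reading only. The definition, the toy data, and the function name `influence` are assumptions, not the authors' stated method.

```python
from collections import Counter


def influence(conversations, source, target):
    """For each expressed emotion, estimate the share of cases where the
    partner moved from `source` to `target` after seeing it.

    Assumes turns alternate between the two users, so turns[i] and
    turns[i + 2] come from the same person.
    """
    seen = Counter()
    changed = Counter()
    for turns in conversations:
        for before, expressed, after in zip(turns, turns[1:], turns[2:]):
            if before == source:
                seen[expressed] += 1
                if after == target:
                    changed[expressed] += 1
    return {emotion: changed[emotion] / n for emotion, n in seen.items()}


# Hypothetical emotion-tagged dyad conversations.
toy_conversations = [
    ["sadness", "joy", "joy"],
    ["sadness", "joy", "sadness"],
    ["sadness", "anger", "sadness"],
]

print(influence(toy_conversations, "sadness", "joy"))
```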
  27. 27. Self-disclosure topic model JinYeong Bak, Chin-Yew Lin, and Alice Oh ACL 2014 Workshop on Social Dynamics and Personal Attributes 27
  28. 28. Self-disclosure Research using Twitter
      • People disclose personal and secretive information
          • to build and maintain interpersonal relationships
          • to get social support
      • Twitter is a great source for naturally occurring, large-scale, longitudinal data on self-disclosure behavior
      • We develop a topic model for classifying self-disclosure behavior into three categories: G (general, no disclosure), M (medium disclosure), H (high disclosure)
      • We look at the correlation between self-disclosure behavior and frequency of Twitter conversations in longitudinal data
  29. 29. Self-disclosure in Twitter conversations

      Conversation 1:
      So my brother is going to Roskilde Festival and my mother and sister is going to England.. That leaves me, my dad and my dog.
      @cccc why aren't you going to england?
      @dddd because my sister is going with 3 of her friends and my mom's just there... to be there. And my sister didn't want me to come :(

      Conversation 2:
      I'm moving out.
      @xxxx ??? What's going on bb?
      @yyyy Mother. Done with her. I am planning to get out now. There's nothing I can do, we dont get along
      @xxxx I'm.sorry hunn. That's rough. Where are you going to go though?
      @yyyy Probably stay at a friends place in the timebeing until I find a place to live!
      @xxxx :/ well I'm glad your getting out if she is being horrible to you

      Conversation 3:
      Oh, prepregnancy pants, you are so uncomfortable.
      @eeee You can put them on? Jealous.
      @ffff they are cutting into my flesh and are giving me a ridiculous muffin top. It isn't pretty. But we have company coming over.
      @eeee Yea, I tried yesterday. I got one pair of shorts to button painfully and my jeans just laughed at me.
  30. 30. Data
      • Full data
          • 88k users, 51k dyads
          • 1.3M conversations
          • 10.5M tweets
          • Longitudinal data from August 2007 to July 2013
      • Labeled data (gold standard for self-disclosure level)
          • 101 conversations
          • 673 tweets
  31. 31. Graphical Representation of SDTM
      3 sets of topics, one for each of the G, M, and H levels
      By using a topic model, we can
      • classify the levels of disclosure
      • discover topics associated with each level
      • generalize to other social media sites using the same set of seed words
  32. 32. Seed Words
      • Medium level: frequent trigrams for personally identifiable information
      • High level: automatically extracted from the sixbillionsecrets website
  33. 33. Classification Results
      • Direct Classification using the Models
      • Classification with SVM using Features Learned from Models
  34. 34. Self-disclosure topics 34
  35. 35. SD level & conversation frequency 35
  36. 36. Sociolinguistic Analysis of Twitter in Multilingual Societies Suin Kim, Ingmar Weber, Li Wei, and Alice Oh Under Review 36
  37. 37. Data
  38. 38. Data
  39. 39. Visualization of the network
  40. 40. How are they connected? • English monolinguals and X-EN bilinguals bridge the network
  41. 41. Closer look at Bilinguals: Which language do they choose?
  42. 42. Closer look at Bilinguals: Hashtag usage
  43. 43. Closer look at Bilinguals: Topics (Results of LDA)
  44. 44. Closer look at Bilinguals: Topics (Results of LDA)
  45. 45. Future directions
      • Develop model for prediction of language choice in bilinguals
      • Look at how English is used throughout the world
      • Cognitive studies of first- and second-language
      • Self-disclosure and relationship building
      • Email me for data sharing, collaborating, discussing, …
      • alice.oh@kaist.edu
