Ibm cog institutetalk_diab


Cognitive Systems Institute Speaker Series talk by Mona Diab from George Washington University on May 14, 2015 "Towards Building Effective Computational Sociopragmatics Models of Human Cognition."


  1. 1. Towards building effective computational sociopragmatics models of human cognition • Mona Diab • George Washington University
  2. 2. Acknowledgement • Many collaborators: Dragomir Radev, Amjad Abu Jbara, Pradeep Dasigi, Weiwei Guo, Owen Rambow, Julia Hirschberg, Kathy Mckeown, Mustafa Mughazy, Heba Elfardy, Vinod Prabhakaran, Greg Werner, Muhammad Abdulmageed • Research supported by IARPA SCIL program and DARPA DEFT & BOLT programs and Google Faculty award • Slides adapted from several publication presentations
  3. 3. What is sociopragmatics? • The aspect of language use that relates to everyday social practices (http://www.wordsense.eu dictionary) – What are social practices? Well … from our language-focused prism :) • Interactions, expressions of emotions/beliefs/opinions, etc.
  4. 4. Text and Social Relations • We can use linguistic analysis techniques to understand the implicit relations that develop in on-line communities • Image source: clair.si.umich.edu
  5. 5. Overarching Agenda • Goal: Attempt to mine social media text for clues and cues on understanding human interactions • How: Identify interesting sociolinguistic behaviors, correlate them with quantifiable linguistic devices, and build effective models in the process • Compare these devices cross-linguistically
  6. 6. Many Different Forms of Social Media • Communication • Collaboration • Multimedia • Reviews & opinions
  7. 7. Social Media Explosion (source: www.internetworldstats.com) • >3 billion Internet users worldwide • >42.3% population penetration (>48% in the MENA region) • 75% of them used “Social Media”
  8. 8. Text in Social Media • Some social media applications are all about text
  9. 9. Text in Social Media • Even the ones based on photos, videos, etc. generate a lot of discussions
  10. 10. Text in Social Media • Huge amount of text exchanged in discussions
  11. 11. Do you still need convincing that text is important? Yeah, I thought not! Just checking :)
  12. 12. Interesting Sociolinguistic Phenomena: Social Constructs • Multiple Viewpoints (Subgroups) • Influencers • Pursuit of Power • Disputed Topics
  13. 13. Approach to processing social construct phenomena (Directive from the IARPA SCIL Program) • Identify language uses (LU) pertinent to the different social constructs (SC) • Correlate the LUs with Linguistic Constructions/Constituents (LC)
  14. 14. Social Construct: Influencer (Inf) • Language Uses – Attempt to Persuade – Agreement/Disagreement – Level of Committed Belief
  15. 15. Social Construct: Pursuit of Power (PoP) • Language Uses – Attempt to Persuade – Agreement/Disagreement – Level of Committed Belief – Negative/Positive Attitude – Who is talking about whom – Dialog Patterns (non linguistic)
  16. 16. Social Construct: Subgroup (Sub) • Language Uses – Agreement/Disagreement – Negative/Positive Attitude – Sarcasm – Level of Committed Belief – Signed Network (non linguistic)
  17. 17. LUs in our approach • Rely on linguistic analysis: – Attempt to Persuade (Inf, PoP) – Agreement/Disagreement (Inf, PoP, Sub) – Level of Committed Belief (Inf, PoP) – Negative/positive attitude (Sub, PoP) – Sarcasm (Sub) – Who is talking about whom (PoP) • Do not depend on linguistic analysis: – Dialog Patterns (PoP) – Signed Network (Sub)
  18. 18. Cross-language comparison: Generalizations • In general, similar LU-level devices cross-linguistically • Attempt to persuade – Claim: grounding in experience, commonly respected sources – Argumentation: evidence and support from other discussants • Agreement/Disagreement – Shared opinion (explicit expression), shared perspective (implicit attitude) • Level of Committed Belief – Committed: The sun will rise tomorrow – Non-committed: John may believe that the moon is made of cheese
  19. 19. Generalizations • -ve/+ve attitude – Negative language – Sentiment/word polarity • Who is talking about whom – Use of mentions and their frequency
  20. 20. But how do they differ in their linguistic expression? • Arabic vs. English social media use different linguistic constituents (LC) to exhibit language use
  21. 21. Focus of this talk • Influencers • Pursuit of Power • Disputed Topics • Multiple Viewpoints (Subgroups)
  22. 22. Subgroup Detection Problem (figure labels: Discussion Thread, Discussant, Subgroups)
  23. 23. Example • Peter: The new immigration law is good. Illegal immigration is bad. • Mary: I totally disagree with you. This law is blatant racism. • John: Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped. • Alexander: You are clueless, Peter. Stop supporting racism. • Support the new law: Peter, John • Against the new law: Mary, Alexander
  24. 24. Sample thread
  25. 25. Subgroup Detection System Overview • Input: Discussion Thread (Discussants) • Thread Parsing (yields the Reply Structure) • Opinion Expressions Identification (… disagree … like … bad …) • Candidate Target Identification (… you … conservative ideologues … immigration law …) • Opinion-Target Pairing (disagree → you; like → conservative ideologues; bad → immigration law) • Discussant Attitude Profiles (DAPs) • Clustering • Output: Subgroups
  27. 27. 1 - Thread Parsing • Identify posts, discussants, and the reply structure of the discussion thread • P1 (D1, Peter): The new immigration law is good. Illegal immigration is bad. • P2 (D2, Mary): I totally disagree with you. This law is blatant racism. • P3 (D3, John): Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped. • P4 (D4, Alexander): You are clueless, Peter. Stop supporting racism.
  29. 29. 2 - Identify Opinion Words* • P1 (D1, Peter): The new immigration law is good+. Illegal immigration is bad-. • P2 (D2, Mary): I totally disagree- with you. This law is blatant- racism-. • P3 (D3, John): Have you read all what Peter wrote? He is correct+. Illegal immigration is bad- and must be stopped. • P4 (D4, Alexander): You are clueless-, Peter. Stop supporting racism. • *Identifying opinion words using Opinion Finder with an extended lexicon (implemented using random walks – Hassan & Radev, 2011)
  31. 31. 3 - Identify Candidate Targets of Opinion • Target: – Discussant (e.g. you, Peter) – Topic/Entity (e.g. the new immigration law, illegal immigration)
  32. 32. 3 - Identify Candidate Targets of Opinion • All discussants (D1 to D4) are candidate targets
  33. 33. 3 - Identify Candidate Targets of Opinion • Identify discussant mentions (second-person pronoun, 2pp, or name) in the discussion
  34. 34. 3 - Identify Candidate Targets of Opinion • Identify anaphoric mentions of discussants (e.g. “He” in P3 refers to Peter, D1)
  35. 35. 3 - Identify Candidate Targets of Opinion • Topical targets in the example thread: Topic 1 (the new immigration law), Topic 2 (illegal immigration)
  36. 36. 3 - Identify Candidate Targets of Opinion • Techniques used to identify topical targets – Named Entity Recognition – Noun phrase chunking
  38. 38. 4 - Opinion-Target Pairing • Mary (P2): I totally disagree- with you (D1). The new immigration law (Topic1) is blatant- racism-. • Dependencies, sentence 1: nsubj(disagree-3, I-1), advmod(disagree-3, totally-2), root(ROOT-0, disagree-3), prep_with(disagree-3, you-5); a pairing rule links disagree- to you (D1) • Dependencies, sentence 2: nsubj(racism-4, Topic1-1), cop(racism-4, is-2), amod(racism-4, blatant-3), root(ROOT-0, racism-4); a pairing rule links racism- to Topic1
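
The pairing step on this slide can be sketched as rules applied to dependency triples. This is a minimal illustration, not the system's actual rule set: the two rules, the toy lexicon `OPINION_WORDS`, the `TARGETS` table, and the function name `pair_opinions` are all assumptions introduced here, and the triples are typed by hand rather than produced by a parser.

```python
# Hypothetical sketch of rule-based opinion-target pairing.
# Triples are (relation, head, dependent), mirroring the relations on the slide.

OPINION_WORDS = {"disagree": "-", "blatant": "-", "racism": "-"}  # toy lexicon (assumed)
TARGETS = {"you": "D1", "Topic1": "Topic1"}  # candidate targets found in step 3

def pair_opinions(dependencies):
    """Return (target, polarity) pairs licensed by two illustrative rules."""
    pairs = []
    for relation, head, dependent in dependencies:
        if head not in OPINION_WORDS or dependent not in TARGETS:
            continue
        # Rule A: opinion word + "with" + target, e.g. "disagree with you".
        # Rule B: target is the subject of an opinion predicate, e.g. "Topic1 is racism".
        if relation in ("prep_with", "nsubj"):
            pairs.append((TARGETS[dependent], OPINION_WORDS[head]))
    return pairs

sentence_1 = [("nsubj", "disagree", "I"),
              ("advmod", "disagree", "totally"),
              ("prep_with", "disagree", "you")]
sentence_2 = [("nsubj", "racism", "Topic1"),
              ("cop", "racism", "is"),
              ("amod", "racism", "blatant")]

print(pair_opinions(sentence_1))
print(pair_opinions(sentence_2))
```

In this sketch the modifier "blatant" is not paired on its own; a fuller rule set would presumably propagate it to the subject as well, which is how Topic1 could receive two negative counts for Mary.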
  39. 39. Named entity rules
  40. 40. 4 - Opinion-Target Pairing (example thread annotated with opinion words and the candidate targets D1 to D4, Topic 1, and Topic 2)
  41. 41. 4 - Opinion-Target Pairing • Language Uses (LUs) present in this step: – Targeted sentiment toward other discussants (2nd person), e.g. “I totally disagree- with you.” – Targeted sentiment toward topic mentions (3rd person), e.g. “This law is blatant- racism-.”
  42. 42. 4 - Opinion-Target Pairing • LU details – Rule-based detection of sentiment targets (we’ve also been experimenting with supervised target detection methods) – Discussant targets are identified by 2nd person pronouns (you, your, yourself, etc.) and by username mentions (casper3912, etc.)
  44. 44. 5 - Discussant Attitude Profile • Each profile (DAP1, DAP2, …) holds three values for every target (Target1 … Targetn): +, -, and # IA • # IA is the number of interactions
  45. 45. 5 - Discussant Attitude Profile • Discussants: Peter, Mary, John, Alexander • Targets: Peter, Mary, John, Alexander, Topic 1, Topic 2 • Cell values: 0 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 1 1 0 0 0 0 0 0 0 1 1 1 0 1 0 2 2 0 0 0 0 0 0 1 0 1 1 0 2 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
  46. 46. 5 - Discussant Attitude Profile • Each discussant is implicitly positive toward himself
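
A profile of this shape can be assembled from the opinion-target pairs of step 4. The following is a rough sketch under stated assumptions: `build_profiles` is a hypothetical helper, # IA is taken to be the count of paired opinion expressions, and self-positivity is encoded as a single positive interaction. The pairs are read off the example thread by hand, so the resulting numbers are illustrative and are not the values printed on the slide.

```python
from collections import defaultdict

def build_profiles(pairs, discussants, targets):
    """Build one attitude vector per discussant.

    pairs: iterable of (discussant, target, polarity), polarity '+' or '-'.
    Returns {discussant: [+, -, #IA, +, -, #IA, ...]} in the order of `targets`.
    """
    counts = defaultdict(lambda: [0, 0, 0])
    for discussant, target, polarity in pairs:
        cell = counts[(discussant, target)]
        cell[0 if polarity == "+" else 1] += 1
        cell[2] += 1  # assumption: each paired opinion is one interaction

    profiles = {}
    for d in discussants:
        vector = []
        for t in targets:
            if t == d:
                vector.extend([1, 0, 1])  # assumed encoding of implicit self-positivity
            else:
                vector.extend(counts.get((d, t), [0, 0, 0]))
        profiles[d] = vector
    return profiles

discussants = ["Peter", "Mary", "John", "Alexander"]
targets = discussants + ["Topic1", "Topic2"]
pairs = [
    ("Peter", "Topic1", "+"), ("Peter", "Topic2", "-"),
    ("Mary", "Peter", "-"), ("Mary", "Topic1", "-"), ("Mary", "Topic1", "-"),
    ("John", "Peter", "+"), ("John", "Topic2", "-"),
    ("Alexander", "Peter", "-"),
]
profiles = build_profiles(pairs, discussants, targets)
for name, vector in profiles.items():
    print(name, vector)
```

Each vector has three entries per target, so four discussants and two topics give 18 dimensions, which is the input the clustering step would then consume.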
  48. 48. Clustering • One subgroup: Peter (Topic1 +, Topic2 -), John (Peter +, Topic2 -) • The other subgroup: Mary (Peter -, Topic1 -), Alexander (Peter -)
  49. 49. Evaluation (Abu-Jbara et al., ACL 2012) (Abu-Jbara et al., ACL 2013)
  50. 50. English Data • 117 discussions: short threads, short posts, human annotation, more formal • 12 polls + discussions: long threads, long and short posts, data self-labeled, less formal • 30 debates: long threads, long and short posts, data self-labeled, less formal
  51. 51. English Evaluation Datasets
  52. 52. Arabic Data • Forum for two-sided, self-labeled political debates: www.naqeshny.com • 36 debates comprising 711 posts corresponding to 326 users • The average number of posts per discussion is 19.75 and the average number of discussants per topic is 13.08
  53. 53. Evaluation Metrics • 1. Purity • Source: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
  54. 54. Evaluation Metrics • 2. Entropy • 3. F-Measure • where P(i, j) is the probability of finding an element from the category i in the cluster j, n_j is the number of items in cluster j, and n is the total number of items in the distribution.
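
The formulas themselves did not survive in this transcript, so the sketch below uses the standard definitions, which is an assumption about what the slides showed: purity is the fraction of items that belong to the majority category of their cluster, and entropy is the sum over clusters j of (n_j / n) times the entropy of the category distribution P(i, j) inside cluster j. The base-2 logarithm and the function names are choices made here for illustration.

```python
import math
from collections import Counter

def _category_counts(clusters, labels):
    """Group gold categories by cluster id."""
    by_cluster = {}
    for cluster, label in zip(clusters, labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    return by_cluster

def purity(clusters, labels):
    """Fraction of items that fall in the majority category of their cluster."""
    by_cluster = _category_counts(clusters, labels)
    return sum(max(c.values()) for c in by_cluster.values()) / len(labels)

def entropy(clusters, labels):
    """Size-weighted average of per-cluster category entropy (lower is better)."""
    n = len(labels)
    total = 0.0
    for counts in _category_counts(clusters, labels).values():
        n_j = sum(counts.values())
        e_j = -sum((c / n_j) * math.log2(c / n_j) for c in counts.values())
        total += (n_j / n) * e_j
    return total

# Toy check: Peter and John land in cluster 0; Mary and Alexander in cluster 1,
# but suppose Alexander's gold side were "support" (a deliberately impure cluster).
clusters = [0, 0, 1, 1]
labels = ["support", "support", "against", "support"]
print(purity(clusters, labels))   # 0.75
print(entropy(clusters, labels))  # 0.5
```

Higher purity and lower entropy both indicate clusters that agree better with the gold subgroups, which is how the result tables later in the talk should be read.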
  55. 55. Baselines • Interaction Graph Clustering (GC) – Nodes: Participants – Edges: interactions (connect two participants if they exchange posts) • Text Classification (TC) – Build TF-IDF vectors for each participant (using all his/her posts) – Cluster the vector space
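
The text baseline can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: whitespace tokenization, the plain tf × log(N/df) weighting, and the helper names `tfidf_vectors` and `cosine`. The clustering itself is left out; the cosine similarities are what a clustering algorithm would operate on.

```python
import math
from collections import Counter

def tfidf_vectors(posts_by_participant):
    """Map each participant to a {term: weight} vector built from all their posts."""
    docs = {p: Counter(" ".join(posts).lower().split())
            for p, posts in posts_by_participant.items()}
    n_docs = len(docs)
    # Document frequency: number of participants who use each term.
    doc_freq = Counter(term for counts in docs.values() for term in counts)
    return {p: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
            for p, counts in docs.items()}

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

posts = {
    "Peter": ["the law is good", "illegal immigration is bad"],
    "Mary": ["the law is racism"],
    "John": ["illegal immigration is bad"],
}
vectors = tfidf_vectors(posts)
print(cosine(vectors["Peter"], vectors["John"]))
print(cosine(vectors["Peter"], vectors["Mary"]))
```

On this toy input Peter's vector is closer to John's than to Mary's, which matches the gold sides, but surface word overlap alone can just as easily group opponents who discuss the same topic.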
  56. 56. English Clustering Algorithm • K-means • Expectation Maximization (EM) • Farthest First (FF)
  58. 58. Arabic Clustering Algorithm • K-means • Expectation Maximization (EM) • Farthest First (FF)
  59. 59. Arabic Clustering Algorithm • K-means • Expectation Maximization (EM): Purity 0.67, Entropy 0.72 (best results) • Farthest First (FF)
  60. 60. Comparison to baselines • English Results (chart, with “Our System” marked) • Arabic Results (Method: P, E) – Signed Network: 0.71, 0.68 – Our System: 0.67, 0.72
  61. 61. English Results
       Metric      Wikipedia  Political Forum  Create Debate
       Purity      0.66       0.61             0.64
       Entropy     0.55       0.80             0.68
       F-measure   0.61       0.56             0.60
  62. 62. English Results • Best performing
  63. 63. English Results • Best performing & worst performing
  64. 64. Component Evaluation (chart comparing configurations) • Our System • No Topical Targets • No Discussant Targets • No Sentiment • No Interaction • No Anaphora Resolution • No Named Entity Recog. • No NP Chunking
  65. 65. Component Evaluation • Not really a linguistic feature
  66. 66. Component Evaluation • More of a linguistic feature!
  67. 67. Deeper look at Agreement/Disagreement • So far we employed shared/divergent opinion in the form of explicit polarity indicators – Sentiment polarity towards other discussants • A: So, no matter how much faith you have, one of you MUST be wrong! (negative) • B: You are a scientist?! May I ask in which field? (negative) – Sentiment polarity towards an entity • A: Here is an excellent verse from the Bible.. (positive) • B: The Bible rightly says that... (positive)
  68. 68. Implicit Opinion/Perspective • Observation: People sharing similar beliefs/perspective tend to use the same evidence to support their point – Believers: faith, peace, love, citing verses from the Bible... – Atheists: reason, science, attack on the “logical” flaws in Bible... • However it is not always explicit (using similar words and similar attitudes) • Peter: God is the creator of mankind • Mary: The belief in an ultimate divine being has sustained me over the years – Not necessarily positive/negative – High dimensional similarity between both sentences is low! – BUT we know Mary and Peter share the same perspective and will tend to be in agreement with each other
  69. 69. Modeling of implicit agreement/disagreement • Implicit agreement or disagreement (perspective) – using text similarity to help identify subgroups • Perspective modeling is used to complement explicit attitude • Perspective granularity has to be collected on the level of a thread rather than a single post • Hence we summarize all the posts in the thread.
  70. 70. Our Model • Explicit high dimensional attitude toward other discussants and entities • Modeling shared perspective among discussants over threads using textual similarity on the post level in the latent space
  71. 71. Extracting implicit perspective • Run Latent Dirichlet Allocation (LDA) on the thread • Extract the topic distribution of each post • Aggregate the distributions of all posts between each pair of discussants
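
The aggregation step can be sketched once per-post topic distributions are available. The slide does not spell out how distributions are aggregated or compared, so this sketch makes two assumptions: each discussant's posts are averaged into one distribution, and pairs of discussants are compared by cosine similarity. The topic distributions below are made-up numbers standing in for the output of any LDA implementation.

```python
import math

def average(distributions):
    """Element-wise mean of a list of equal-length topic distributions."""
    k = len(distributions[0])
    return [sum(d[i] for d in distributions) / len(distributions) for i in range(k)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical per-post topic distributions over two latent topics.
post_topics = {
    "Peter": [[0.8, 0.2], [0.6, 0.4]],
    "Mary": [[0.1, 0.9]],
    "John": [[0.7, 0.3]],
}
perspective = {d: average(posts) for d, posts in post_topics.items()}

names = sorted(perspective)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(cosine(perspective[a], perspective[b]), 3))
```

The pairwise similarities could then be appended to each discussant's attitude vector as implicit-agreement features, in the spirit of the feature representation described on the following slides.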
  72. 72. FEATURE REPRESENTATION: ATTITUDE PROFILES • Vector Representation • Explicit attitude towards other discussants and entities • Columns: A, B, C, E1, E2 • Row A: A = (0, 0, 0), B = (1, 1, 2), C = (0, 1, 1), E1 = (1, 0, 1), E2 = (0, 0, 0) • Rows B and C: …
  73. 73. FEATURE REPRESENTATION: ATTITUDE PROFILES • Vector Representation • Implicit agreement with other discussants • Columns: A, B, C, E1, E2 (explicit), followed by A, B, C (implicit) • Row A, explicit: A = (0, 0, 0), B = (1, 1, 2), C = (0, 1, 1), E1 = (1, 0, 1), E2 = (0, 0, 0) • Row A, implicit values as shown: 1 1 1 1 0 0.5 0.5 0 0 • Rows B and C: …
  74. 74. Data • English – Create Debate (CD) • www.createdebate.com • Debating on a certain topic • Sides are explicitly indicated by discussants in a poll • Informal language – Wikipedia Discussion Forum (WIKI) • en.wikipedia.org • Group labels are manually annotated • Formal language, not much negative polarity • Arabic – www.naqeshny.com – Self-labeled political debates
  75. 75. Experimental Conditions • Clustering algorithm – S-Link – # of clusters by rule of thumb = √n/2 • Evaluation Metrics – Purity, Entropy, F-measure • Baseline – RAND-BASE: Assign discussants to clusters randomly – SWD-BASE: Calculate surface word distribution, as a simpler form of perspective
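
The rule of thumb for the number of clusters can be written as a one-line helper. The transcript's "√n/2" is ambiguous; this sketch reads it as the common heuristic k ≈ √(n/2), which is an assumption, and the rounding and the floor of one cluster are likewise choices made here.

```python
import math

def rule_of_thumb_clusters(n_discussants):
    """Heuristic cluster count, reading the slide's rule as sqrt(n / 2)."""
    return max(1, round(math.sqrt(n_discussants / 2)))

for n in (1, 8, 13, 50):
    print(n, rule_of_thumb_clusters(n))
```

For a thread with about 13 discussants, the average reported for the Arabic data, this reading gives three clusters.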
  76. 76. English Results
       Condition     Wiki                          CD
                     Purity  Entropy  F-measure    Purity  Entropy  F-measure
       RAND-BASE     0.675   0.563    0.652        0.399   0.966    0.41
       SWD-BASE      0.772   0.475    0.646        0.452   0.932    0.432
       SD            0.834   0.360    0.667        0.824   0.394    0.596
       SE            0.827   0.383    0.655        0.793   0.422    0.582
       SD+SE         0.835   0.362    0.665        0.82    0.385    0.604
       PERS          0.853   0.321    0.699        0.787   0.399    0.589
       SD+PERS       0.853   0.320    0.698        0.849   0.333    0.615
       SE+PERS       0.853   0.321    0.702        0.789   0.399    0.591
       SD+SE+PERS    0.857   0.310    0.703        0.861   0.315    0.625
  77. 77. Observations • Best performance is when we combine explicit attitude (SD: sentiment toward other discussants, SE: sentiment toward entities) with implicit perspective (PERS), regardless of genre
  78. 78. Observations • Wiki seems to gain more from implicit perspective compared to CD • Explicit attitude is a better feature for CD: people express their sentiments openly, while in Wiki people are more constrained and subtle in their expressions
  79. 79. Observations • Better results than the previous results on the same data sets: Wiki (P 0.66, E 0.55), CD (P 0.64, E 0.68)
  80. 80. Arabic Results
       Using EM                   Purity  Entropy  F-measure
       Signed Network BASELINE    0.71    0.68     0.67
       Explicit Attitude          0.67    0.72     0.65
       Implicit/Perspective       0.64    0.74     0.65
       Our System (combined)      0.77    0.50     0.76
  81. 81. Arabic Results • Significant improvement over baseline
  82. 82. Arabic Results • Significant improvement over baseline • Complementarity between explicit attitude and perspective
  83. 83. Conclusions • We can successfully model sociopragmatic phenomena – Golden rule of computer science (divide and conquer): form subgroups :) • There is significant room for improvement • It takes a large team of computer scientists and significant collaboration with the humanities to get this program going
  84. 84. Where are we now? • Extensive work on Sentiment and Emotion Intensity characterization/detection • Work on Rumor Detection • Work on Level of Committed Belief Tagging (check us out at *SEM 2015, and EXPROM 2015) • Work on Ideological Perspective Detection (check us out at *SEM 2015)
  85. 85. Thank you • Questions?
