Computational Modeling of
Sociopragmatic Language Use in
Arabic and English Social Media
	
  
Mona Diab
The George Washington University
Acknowledgement
•  Joint work on subgroup detection with Dragomir Radev and Amjad Abu Jbara
•  My students: Muhammad AbdulMageed, Pradeep Dasigi, Weiwei Guo
•  Collaborative work with Owen Rambow and Kathy McKeown, and their respective groups
•  Collaborative sociolinguistic observations with Mustafa Mughazy
•  Work funded by IARPA SCIL program
•  Several slides are adapted from presentations of the papers published on this work
Our Overarching Research Interest
•  Goal: Attempt to mine social media text for clues and cues toward building an understanding of human interactions
•  How: Identify interesting sociolinguistic behaviors and correlate them with linguistic usage that is quantifiable and explicitly characterizable as a diagnostic device
•  Compare these devices cross-linguistically
 	
Text and Social Relations
We can use linguistic analysis techniques to understand the implicit relations that develop in on-line communities
Image source: clair.si.umich.edu
 	
Many Different Forms of Social Media
•  Communication
•  Collaboration
•  Multimedia
•  Reviews & opinions
Social Media Explosion
source: www.internetworldstats.com
1.73 billion Internet users worldwide.
75% of them used “Social Media”
 	
Text in Social Media
•  Some social media applications are all about text
•  Even the ones based on photos, videos, etc. have a lot of discussions
•  Huge amount of text exchanged in discussions
•  A significant treasure trove
Interesting Sociolinguistic Phenomena: Social Constructs
•  Multiple Viewpoints (Subgroups)
•  Influencers
•  Pursuit of Power
•  Disputed Topics
Approach to processing social construct phenomena
•  Like any good scientist (or imperialist): divide and conquer
– Identify language uses (LU) pertinent to the different social constructs (SC)
– Correlate and map these LUs with Linguistic Constructions/Constituents (LC)
Granularity Level: Thread
Discover relevant LUs
•  Attempt to persuade
•  Agreement/disagreement
•  Negative/positive attitude
•  Who is talking about whom
•  Dialog patterns
•  Signed network
Dialog patterns and the signed network do not depend on linguistic analysis (modeling); the other LUs rely on it.
LU: Attempt to Persuade
•  An expression of opinion (a claim) followed by explicit justification of the claim (an argumentation)
–  Persuade to believe, not persuade to act
–  Claim: grounding in experience, commonly respected sources
–  Argumentation: evidence and support from other discussants
CLAIM: There seems to be a much better list at the National Cancer Institute than the one we’ve got.
ARGUMENTATION: It ties much better to the actual publication (the same 11 sections, in the same order).
LU: Agreement and Disagreement
•  Examine pairs of phrases to model others’ acceptance of the participant’s ideas
P1 by Arcadian: There seems to be a much better list at the National Cancer Institute than the one we’ve got. It ties much better to the actual publication (the same 11 sections, in the same order). I’d like to replace that section in this article. Any objections?
P2 by JFW: Not a problem. Perhaps we can also insert the relative incidence as published in this month’s wiki Blood journal
Example of Agreement
•  Shared opinion (explicit expression), shared perspective (implicit attitude)
•  Using word similarity and overlap
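The slide names word similarity and overlap as the device but does not give a formula. A minimal sketch, assuming Jaccard overlap over lowercased word sets as the measure (the function names and the two example posts are hypothetical, not taken from the system):

```python
import re

def tokens(text):
    """Lowercase a post and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard_overlap(post_a, post_b):
    """Shared words divided by all distinct words in the two posts."""
    a, b = tokens(post_a), tokens(post_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical pair of posts, loosely modeled on the P1/P2 exchange.
p1 = "I'd like to replace that section in this article."
p2 = "Perhaps we can also insert the relative incidence in this article."
score = jaccard_overlap(p1, p2)
```

A higher score would be read as weak evidence of a shared opinion; on its own it cannot capture the implicit attitude the slide also mentions.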
  
LU: -ve/+ve Attitude
•  The attitude of a discussant/participant in a
conversation toward another participant or topic or
entity mentioned in the thread
•  Characterize –ve and +ve sentences
•  Positive: praise, express liking, etc.
•  You are great
•  Simply elegant and beautiful
•  Negative: insult, dislike, disagreement, sarcasm, etc.
•  You're a liar.
•  You know, you're a pretty absurd individual even by Usenet standards.
•  You're just pathetic.
LU: Attitude towards another person
(2) PER2: No it hasn't that's a bold faced lie. A definate majority of Americans support the public option. The only people who are against it are the insurance companies and moron social conservatives like you who don't even understand what socialism is.
Using negative and insulting language. Sentiment and word polarity are the devices used
LU: Who is talking about whom
How often a person refers to, or is referred to by, other
discourse participants
Use of mentions and their frequencies
IsMyNameUsedByOthers
HaveIUsedOthersName
%OfUsersReferencedByMe
%OfUsersReferencedMe
%OfReferencesByMe
%OfReferencesToMe
ReferencesByMeToWordsRatio: number of user references made by me / total number of words I wrote
ReferencesToMeToWordsRatio: number of references to me / total number of words written by others
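The mention features above can be sketched as simple counts over a thread. This is an illustration under stated assumptions: a mention is an exact match of a participant's name, the percentage features are taken over the other participants, and the function name `mention_features` and the post format are hypothetical.

```python
def mention_features(me, posts):
    """posts: list of (author, text). Returns some mention features for `me`."""
    others = {author for author, _ in posts} - {me}
    refs_by_me = refs_to_me = my_words = others_words = 0
    users_i_referenced, users_referencing_me = set(), set()
    for author, text in posts:
        words = [w.strip(".,!?").lower() for w in text.split()]
        if author == me:
            my_words += len(words)
            for user in others:
                count = words.count(user.lower())
                refs_by_me += count
                if count:
                    users_i_referenced.add(user)
        else:
            others_words += len(words)
            count = words.count(me.lower())
            refs_to_me += count
            if count:
                users_referencing_me.add(author)
    n_others = len(others) or 1  # guard against a single-participant thread
    return {
        "IsMyNameUsedByOthers": refs_to_me > 0,
        "HaveIUsedOthersName": refs_by_me > 0,
        "%OfUsersReferencedByMe": 100 * len(users_i_referenced) / n_others,
        "%OfUsersReferencedMe": 100 * len(users_referencing_me) / n_others,
        "ReferencesByMeToWordsRatio": refs_by_me / (my_words or 1),
        "ReferencesToMeToWordsRatio": refs_to_me / (others_words or 1),
    }

thread = [
    ("Peter", "The new immigration law is good. Illegal immigration is bad."),
    ("Mary", "I totally disagree with you. This law is blatant racism."),
    ("John", "Have you read all what Peter wrote? He is correct. "
             "Illegal immigration is bad and must be stopped."),
    ("Alexander", "You are clueless, Peter. Stop supporting racism."),
]
features = mention_features("Peter", thread)
```

Pronoun mentions such as "you" are deliberately ignored here; resolving them needs the reply structure.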
 	
LU: Signed Network
Partition the social medium network into positive and negative links based on polarity of words used
What is the public opinion on the health care reform?
•  2841 posts
•  More than 300K words
[Figure: network of participants and their interactions, with positive and negative interaction links]
Very Hot Topic (high percentage of negative links)
Against Reform (55%)
Pro Reform (45%)
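A minimal sketch of the partition step, assuming a toy polarity lexicon and that each reply contributes to one directed link between its author and its addressee. The lexicon, the function names and the example replies are illustrative assumptions, not the system's actual resources.

```python
# Toy polarity lexicon (assumption; a real system would use a much larger one).
POSITIVE = {"agree", "correct", "good", "great"}
NEGATIVE = {"disagree", "clueless", "liar", "bad"}

def polarity(text):
    """Number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def signed_network(replies):
    """replies: list of (author, addressee, text). Returns (positive, negative) link sets."""
    scores = {}
    for author, addressee, text in replies:
        edge = (author, addressee)
        scores[edge] = scores.get(edge, 0) + polarity(text)
    positive = {edge for edge, s in scores.items() if s > 0}
    negative = {edge for edge, s in scores.items() if s < 0}
    return positive, negative

replies = [
    ("Mary", "Peter", "I totally disagree with you."),
    ("John", "Peter", "He is correct."),
    ("Alexander", "Peter", "You are clueless, Peter."),
]
positive, negative = signed_network(replies)
# Share of negative links: a rough indicator of how "hot" the topic is.
negative_share = len(negative) / (len(positive) + len(negative))
```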
  
LU: Dialog Patterns
•  Dialog Patterns are based on metadata (e.g., the thread structure), not the text
–  Initiative: who started the thread
–  Investment: share of participation
–  Irrelevance: how often ignored by others
–  Interjection: at what point joined conversation
–  Incitation: how long are branches started
–  Inquisitiveness: the number of question marks
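The six dialog patterns can be sketched from thread metadata alone. The concrete definitions below (for example, irrelevance as the fraction of a participant's posts that received no reply) are assumptions made for illustration; the slide gives only the one-line glosses. The post format and the reply structure of the example are hypothetical.

```python
def dialog_patterns(me, posts):
    """posts: chronological list of dicts with 'id', 'author', 'parent', 'text'.
    Assumes `me` wrote at least one post."""
    children = {}
    for post in posts:
        children.setdefault(post["parent"], []).append(post["id"])

    def depth_below(post_id):
        # Length of the longest reply chain started under this post.
        return max((1 + depth_below(c) for c in children.get(post_id, [])), default=0)

    mine = [p for p in posts if p["author"] == me]
    first_index = next(i for i, p in enumerate(posts) if p["author"] == me)
    return {
        "initiative": posts[0]["author"] == me,
        "investment": len(mine) / len(posts),
        "irrelevance": sum(p["id"] not in children for p in mine) / len(mine),
        "interjection": first_index / len(posts),
        "incitation": max(depth_below(p["id"]) for p in mine),
        "inquisitiveness": sum(p["text"].count("?") for p in mine),
    }

thread = [
    {"id": 1, "author": "Peter", "parent": None, "text": "The new immigration law is good."},
    {"id": 2, "author": "Mary", "parent": 1, "text": "I totally disagree with you."},
    {"id": 3, "author": "John", "parent": 2, "text": "Have you read all what Peter wrote?"},
    {"id": 4, "author": "Alexander", "parent": 1, "text": "You are clueless, Peter."},
]
peter = dialog_patterns("Peter", thread)
john = dialog_patterns("John", thread)
```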
  
Who is an Influencer?
•  Someone whose opinions/ideas profoundly affect the conversation
•  An influencer may have the following characteristics (Katz and Lazarsfeld, 1955)
–  alter the opinions of their audience
–  resolve disagreements where no one else can
–  be recognized by others as one who makes important contributions
–  often continue to influence a group even when not present
–  have other conversational participants adopt their ideas and even the words they use to express their ideas
•  More formally, an influencer:
–  Has credibility in the group
–  Persists in attempting to convince others, even if some disagreement occurs
–  Introduces topics/ideas that others pick up on or support
Social Construct: Influencer (inf)
•  Language Uses
–  Attempt to Persuade
–  Agreement/disagreement
What is Pursuit of Power?
•  Individual makes repeated efforts to gain power within the
group.
•  The individual attempts to control the actions or goals of the
group.
•  Individual’s behavior causes tension within the group
Social Construct: Pursuit of Power (PoP)
•  Language Uses
–  Attempt to Persuade
–  Agreement/disagreement
–  Negative/positive attitude
–  Who is talking about whom
–  Dialog patterns (non linguistic)
Social Construct: Subgroup Detection
[Figure: the discussants of a discussion thread are partitioned into subgroups]
Social Construct: Subgroup (Sub)
•  Language Uses
–  Agreement/disagreement
–  Negative/positive attitude
–  Signed Network (non linguistic)
Cross Linguistic Comparison
•  The SCs in both languages use the same LUs
•  But do Arabic and English social media use different linguistic constituents to show language use?
•  A qualitative view:
Attempt to persuade
•  Claims
–  A lot more grounding using religious references
–  Religion plays a significant role in Arabic discourse structure and is therefore used to establish credibility and, accordingly, influence and power differentials
•  Easily detected using simple devices such as explicit diacritization
–  Less subjective language (less usage of “I”, more of “we”, or expletives such as “there, it”)
‫ﺗﺘﻔﻬﻢ‬ ‫أن‬ ‫ ﺣﺎول‬ –‫ﻧﺤﺎول‬ ‫أﻧﻨﺎ‬‫إﺷﻜﺎﻟﻴﺔ‬ ‫ﺛﺎﻧﻴﺎ‬ .. ‫ﻣﻌﺎﺻﺮة‬ ‫ﺑﻠﻐﺔ‬ ‫ﻣﻮﺳﻮﻋﺔ‬ ‫ﺑﻨﺎء‬ ‫ﻫﻨﺎ‬
	
  ‫ص‬‫وﻗﺎ‬ ‫ﺑﻦ‬ ‫ﻋﻠﻘﻤﺔ‬ ‫ﺣﻴﺎة‬ ‫ﻓﻲ‬ ‫ﳑﻴﺰ‬ ‫ﺣﺪث‬ ‫ﻋﻦ‬ ‫ﺗﺨﺒﺮﻧﻲ‬ ‫أن‬ ‫ﳝﻜﻨﻚ‬ ‫ﻫﻞ‬ ... ‫اﳌﻠﺤﻮﻇﻴﺔ‬
Agreement/Disagreement
•  Sharing the same opinion regarding a topic
–  Explicit agreement
•  “I agree with you about …”
‫ﻫﺬا‬ ‫ﲟﺜﻞ‬ ‫ﺻﻴﺎﻏﺘﻬﺎ‬ ‫ﻋﻠﻰ‬ ‫أﺷﻜﺮك‬ ،ً‫ﺎ‬‫ﲤﺎﻣ‬ ‫ﻓﻴﻬﺎ‬ ‫أواﻓﻘﻚ‬ ‫ﻟﻠﻐﺎﻳﺔ‬ ‫ﻫﺎﻣﺔ‬ ‫ﻧﻘﻄﺔ‬ ‫ﻫﺬه‬
	
  ‫ح‬‫اﻟﻮﺿﻮ‬
‫أﻧﺎ‬‫أواﻓﻘﻚ‬	
  ‫ء‬‫اﻟﺒﻨﺎ‬ ‫ﻃﻮر‬ ‫ﻓﻲ‬ ‫ﻣﻮﺳﻮﻋﺔ‬ ‫أﻧﻨﺎ‬
–  Implicit similar attitude toward a topic
•  Challenge
•  Pervasive sarcasm
•  Pervasive use of MWE and references to cultural knowledge
Detecting (dis)agreements/attitudes?
•  The role of idiom/metaphor/sarcasm in Arabic seems to be more pervasive
–  Tongue twisters, Witty language, Puns
	
‫ﺲ‬‫ﻠ‬h‫ا‬ ‫ﻓﻲ‬ ‫واﻻدﻗﻦ‬ ‫ﺷﻌﺮ‬ ‫ ﺣﻤﺰاوي‬ •
•  MP Hamzawy, being liberal, has long hair compared to the MB candidates who have beards, so the bet is on whether he will grow his hair longer or grow a beard
	
  ‫ﺔ‬‫ﺑﻄﻴﺨ‬ ‫اﷲ‬ ‫ﺷﺎء‬ ‫ﻣﺎ‬ ‫اﻟﺮاﺟﻞ‬ ‫ﻗﻠﺐ‬ ‫وﻟﻜﻦ‬ ‫واﺣﺪة‬ ‫ﺑﺬرة‬ ‫ﻳﺴﺎع‬ ‫ﻣﺎﳒﻪ‬ ‫اﳌﺮأة‬ ‫ ﻗﻠﺐ‬ •
•  The heart of a woman is like a mango that can only hold one seed, but a man’s heart is, “God Bless”, a melon
–  Sarcasm
	
!‫ﺑﺴﻴﻄﻪ‬ ‫ ﻳﺎﻻ‬ •
•  no problem, it is easy! (We are screwed regardless!)
Negative/positive Attitude
•  Very flowery language compared to English
•  Strong condescending language to show negative attitude
•  Code switching into dialectal Arabic expressions to show support
–  Manipulate different registers for code switching depending on context: CA with MSA/DA code switching to reflect influence
•  Ben Ali, Tunisian President vs. Mubarak, Egyptian president in ouster speech
•  Mubarak – Ex-Egyptian President on visit to factories/ouster from position in last revolution
•  Mubarak vs. Nasser vs. Sadat
–  Balance between familiarity and distance
Negative/positive Attitude
•  Plural first person pronouns allow the speaker to reduce his/her power to establish rapport and show positive attitude,
–  e.g., إحنا جالنا الشرف vs. أنا جالي الشرف
–  We are honored vs. I am the honored one
•  English plural pronouns in such contexts sound patronizing (the textbook “we”), whereas the “royal we” is disused.
Negative/Positive attitude
•  Humor is commonly used in Arabic as a strategy that levels power relations, but that would be inappropriate in English.
•  Slightly offensive expressions are used in Arabic to maintain power balance and solidarity, e.g.,
•  اسكت، مش محمد نجح
•  والنبي نقطنا بسكاتك.
•  Only very few such expressions are acceptable in English and in very close contexts, e.g., shut up and get out of here.
Talking about whom and to whom
•  More manipulation of power differential
–  MSA terms of address add formality, and therefore power to the speaker, whereas colloquial terms of address establish informal/equal levels of power.
•  Compare يا سيدي العزيز to يا خويا.
•  English does not have such a rich continuum of formality/informality expressions.
•  Usage of expressions such as
–  Mona: Mona could not dare refuse a request from Ali
–  Considered strange self reference in English but it is used as means of showing modesty and familiarity
Focus of this talk
•  Influencers
•  Pursuit of Power
•  Disputed Topics
•  Multiple Viewpoints (Subgroups)
Sample thread
Peter: The new immigration law is good. Illegal immigration is bad.
Mary: I totally disagree with you. This law is blatant racism.
John: Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.
Alexander: You are clueless, Peter. Stop supporting racism.
Support the new law: Peter, John
Against the new law: Mary, Alexander
Subgroup Detection System Overview
[Figure: system pipeline. Discussion Thread (Discussants) → Thread Parsing → Opinion Expressions Identification (e.g., “disagree”, “like”, “bad”) → Candidate Target Identification (e.g., “you”, “conservative ideologues”, “immigration law”) → Opinion-Target Pairing (disagree/You, like/Conservative Ideologues, bad/Immigration law) → Discussant Attitude Profiles (DAPs) → Clustering → Subgroups. The Reply Structure is a further input.]
1 - Thread Parsing
P1 (D1, Peter): The new immigration law is good. Illegal immigration is bad.
P2 (D2, Mary): I totally disagree with you. This law is blatant racism.
P3 (D3, John): Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.
P4 (D4, Alexander): You are clueless, Peter. Stop supporting racism.
Identify Posts, Discussants, and the reply structure of the discussion thread
2 - Identify Opinion Words*
P1 (D1, Peter): The new immigration law is good+. Illegal immigration is bad-.
P2 (D2, Mary): I totally disagree- with you. This law is blatant- racism-.
P3 (D3, John): Have you read all what Peter wrote? He is correct+. Illegal immigration is bad- and must be stopped.
P4 (D4, Alexander): You are clueless-, Peter. Stop supporting racism.
*Identifying opinion words using Opinion Finder with an extended lexicon (implemented using random walks – Hassan & Radev, 2011)
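The +/- marking in this step can be illustrated with a plain lexicon lookup. This is only a sketch: the tiny `LEXICON` below is a hypothetical stand-in for Opinion Finder and the extended lexicon, and a flat lookup marks every occurrence of a word, which is cruder than the tagging shown on the slide.

```python
import re

# Hypothetical toy polarity lexicon (assumption, not the extended lexicon itself).
LEXICON = {
    "good": "+", "correct": "+",
    "bad": "-", "disagree": "-", "blatant": "-", "racism": "-", "clueless": "-",
}

def tag_opinion_words(text):
    """Append '+' or '-' to every word found in the lexicon."""
    def mark(match):
        word = match.group(0)
        return word + LEXICON.get(word.lower(), "")
    return re.sub(r"[A-Za-z']+", mark, text)

tagged_p1 = tag_opinion_words("The new immigration law is good. Illegal immigration is bad.")
tagged_p2 = tag_opinion_words("I totally disagree with you. This law is blatant racism.")
```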
  
3 - Identify Candidate Targets of Opinion
Target
•  Discussant (e.g. you, Peter)
•  Topic/Entity (e.g. The new immigration law, Illegal immigration)
Steps, shown on the sample thread (posts P1 to P4 by discussants D1 to D4):
•  All discussants are candidate targets
•  Identify discussant mentions (second person pronoun or name) in the discussion, e.g., “you” and “Peter” referring to D1
•  Identify anaphoric mentions of discussants, e.g., “He” in P3 referring to Peter
•  Identify topical targets: Topic 1 (the new immigration law) and Topic 2 (illegal immigration)
•  Techniques used to identify topical targets:
–  Named Entity Recognition
–  Noun phrase chunking
4 - Opinion-Target Pairing
P2 (Mary): I totally disagree- with you. The new immigration law is blatant- racism-.
Candidate targets: you = D1, The new immigration law = Topic1

Dependency relations for “I totally disagree with you.”, to which a pairing rule is applied:
nsubj(disagree-3, I-1)
advmod(disagree-3, totally-2)
root(ROOT-0, disagree-3)
prep_with(disagree-3, you-5)

Dependency relations for “Topic1 is blatant racism.”, to which a pairing rule is applied:
nsubj(racism-4, Topic1-1)
cop(racism-4, is-2)
amod(racism-4, blatant-3)
root(ROOT-0, racism-4)

Named entity rules

•  Language Uses (LUs) present in this step:
–  Targeted sentiment toward other discussants (2nd person), e.g., “I totally disagree- with you.”
–  Targeted sentiment toward topic mentions (3rd person), e.g., “This law is blatant- racism-.”
•  LU details
–  Rule-based detection of sentiment targets (we’ve also been experimenting with supervised target detection methods)
–  Discussant targets are identified by 2nd person pronouns (you, your, yourself, etc.) and by username mentions (casper3912, etc.)
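Discussant target detection, as described in the last bullet, can be sketched as follows. The sketch assumes that a second person pronoun refers to the author of the post being replied to, which is a simplification, and it does not implement the dependency-based pairing rules. The function name and the pronoun list are illustrative.

```python
import re

SECOND_PERSON = {"you", "your", "yours", "yourself"}

def discussant_targets(text, reply_to, usernames):
    """Return the discussants mentioned in `text`, in order of first mention.

    Assumption: second person pronouns resolve to `reply_to`, the author of the
    parent post; usernames are matched case-insensitively as whole tokens.
    """
    by_lower = {name.lower(): name for name in usernames}
    targets = []
    for word in re.findall(r"[A-Za-z0-9_']+", text):
        lowered = word.lower()
        if lowered in SECOND_PERSON and reply_to is not None:
            target = reply_to
        elif lowered in by_lower:
            target = by_lower[lowered]
        else:
            continue
        if target not in targets:
            targets.append(target)
    return targets

users = ["Peter", "Mary", "John", "Alexander"]
p4_targets = discussant_targets("You are clueless, Peter. Stop supporting racism.", "Peter", users)
p3_targets = discussant_targets("Have you read all what Peter wrote?", "Mary", users)
```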
  
5 - Discussant Attitude Profile
Each discussant has one profile (DAP1, DAP2, ...) holding three entries (+, -, int) for every target (Target 1 … Target n).
Example with four discussants (one row each) and six targets:

Targets:  Peter    Mary     John     Alexander  Topic 1  Topic 2
          + - int  + - int  + - int  + - int    + - int  + - int
Row 1     0 0 0    0 0 0    1 0 1    0 0 0      1 0 1    0 1 1
Row 2     0 0 0    0 0 0    0 1 1    1 0 1      0 2 2    0 0 0
Row 3     0 0 0    1 0 1    1 0 2    0 0 0      0 0 0    0 1 1
Row 4     1 0 1    0 0 0    0 1 1    0 0 0      0 0 0    0 0 0
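A sketch of how such profiles could be assembled from opinion-target pairs. It assumes that "int" counts interactions (here, replies) with a target and that opinions arrive as (discussant, target, sign) triples; both are assumptions, and the example inputs are illustrative rather than the values in the table above.

```python
def build_daps(discussants, targets, opinions, interactions):
    """Build one flat vector per discussant with [+, -, int] slots per target.

    opinions: list of (discussant, target, sign) with sign '+' or '-'.
    interactions: list of (discussant, target) pairs (assumed meaning of 'int').
    """
    offset = {target: 3 * i for i, target in enumerate(targets)}
    daps = {d: [0] * (3 * len(targets)) for d in discussants}
    for discussant, target, sign in opinions:
        daps[discussant][offset[target] + (0 if sign == "+" else 1)] += 1
    for discussant, target in interactions:
        daps[discussant][offset[target] + 2] += 1
    return daps

discussants = ["Peter", "Mary", "John", "Alexander"]
targets = discussants + ["Topic1", "Topic2"]
opinions = [
    ("Peter", "Topic1", "+"), ("Peter", "Topic2", "-"),
    ("Mary", "Peter", "-"), ("Mary", "Topic1", "-"),
    ("John", "Peter", "+"), ("John", "Topic2", "-"),
    ("Alexander", "Peter", "-"),
]
interactions = [("Mary", "Peter"), ("John", "Mary"), ("Alexander", "Peter")]
daps = build_daps(discussants, targets, opinions, interactions)
```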
  
Clustering
Subgroup 1: Mary (Peter-, Topic1-), Alexander (Peter-)
Subgroup 2: Peter (Topic1+, Topic 2-), John (Peter+, Topic 2-)
Evaluation dataset
•  117 Discussions: short threads, short posts, human annotation, more formal
•  12 Polls + Discussions: long threads, long and short posts, data self-labeled, less formal
•  30 debates: long threads, long and short posts, data self-labeled, less formal
Evaluation Metrics
1.  Purity
Source: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
2.  Entropy
3.  F-Measure
where P(i, j) is the probability of finding an element from the category i in the cluster j, n_j is the number of items in cluster j, and n the total number of items in the distribution.
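The formulas themselves did not survive on the slides. The sketch below uses the standard textbook definitions, which fit the symbols described above: purity is the fraction of items that fall in the majority category of their cluster, and entropy is the per-cluster entropy of P(i, j), weighted by n_j / n. The base-2 logarithm is an assumption, and the F-measure is omitted.

```python
import math
from collections import Counter

def purity(clusters, labels):
    """clusters, labels: parallel lists of cluster ids and gold categories."""
    majority_total = 0
    for cluster in set(clusters):
        members = [lab for c, lab in zip(clusters, labels) if c == cluster]
        majority_total += Counter(members).most_common(1)[0][1]
    return majority_total / len(labels)

def entropy(clusters, labels):
    """Sum over clusters j of (n_j / n) * -(sum over i of P(i, j) * log2 P(i, j))."""
    n = len(labels)
    total = 0.0
    for cluster in set(clusters):
        members = [lab for c, lab in zip(clusters, labels) if c == cluster]
        n_j = len(members)
        cluster_entropy = -sum(
            (count / n_j) * math.log2(count / n_j)
            for count in Counter(members).values()
        )
        total += (n_j / n) * cluster_entropy
    return total

clusters = [1, 1, 1, 2, 2, 2]
labels = ["pro", "pro", "con", "con", "con", "con"]
example_purity = purity(clusters, labels)
example_entropy = entropy(clusters, labels)
```

Higher purity and lower entropy both indicate clusters that match the gold subgroups more closely.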
  
English Results
            Wikipedia   Political Forum   Create debate
Purity      0.66        0.61              0.64
Entropy     0.55        0.80              0.68
F-measure   0.61        0.56              0.60
Baselines
•  Interaction Graph Clustering (GC)
–  Nodes: Participants
–  Edges: interactions (connect two participants if they exchange posts)
•  Text Classification (TC)
–  Build TF-IDF vectors for each participant (using all his/her posts)
–  Cluster the vector space
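For the TC baseline, each participant's posts are pooled into one document before weighting. A minimal sketch, assuming relative term frequency and an unsmoothed logarithmic IDF (the slide does not say which TF-IDF variant was used); the function name and the input format are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(posts_by_participant):
    """posts_by_participant: {participant: [post, ...]}; one document per participant."""
    docs = {
        participant: Counter(
            word.strip(".,!?").lower() for post in posts for word in post.split()
        )
        for participant, posts in posts_by_participant.items()
    }
    n_docs = len(docs)
    # Document frequency: in how many participants' documents each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)
    vectors = {}
    for participant, counts in docs.items():
        total = sum(counts.values())
        vectors[participant] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return vectors

vectors = tfidf_vectors({"Peter": ["law good"], "Mary": ["law racism"]})
```

Terms used by every participant get weight zero, so clustering the resulting vectors relies on the vocabulary that sets participants apart.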
  
Comparison to baselines
[Figure: chart comparing Our System with the baselines]
Choice of Clustering Algorithm
•  K-means
•  Expectation Maximization (EM)
•  Farthest First (FF)
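Of the three algorithms, farthest first is the simplest to sketch: pick centers that are as far apart as possible, then assign every item to its nearest center. The version below seeds with the first item so that the run is deterministic, which is an assumption, and the toy vectors are a reduced illustration of attitude profiles over [Peter+, Peter-, Topic1+, Topic1-, Topic2+, Topic2-].

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(vectors, k):
    """vectors: {name: list of numbers}. Returns {name: cluster index}."""
    names = list(vectors)
    centers = [names[0]]  # assumption: seed with the first item
    while len(centers) < k:
        # Next center: the item farthest from its nearest existing center.
        centers.append(max(
            names,
            key=lambda n: min(squared_distance(vectors[n], vectors[c]) for c in centers),
        ))
    return {
        n: min(range(k), key=lambda i: squared_distance(vectors[n], vectors[centers[i]]))
        for n in names
    }

profiles = {
    "Peter": [0, 0, 1, 0, 0, 1],      # Topic1+, Topic2-
    "Mary": [0, 1, 0, 1, 0, 0],       # Peter-, Topic1-
    "John": [1, 0, 0, 0, 0, 1],       # Peter+, Topic2-
    "Alexander": [0, 1, 0, 0, 0, 0],  # Peter-
}
assignment = farthest_first(profiles, k=2)
```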
  
Component Evaluation
[Figure: results of Our System compared with variants that each leave out one component]
•  Our System
•  No Topical Targets
•  No Discussant Targets
•  No Sentiment
•  No Interaction
•  No Anaphora Resolution
•  No Named Entity Recog.
•  No NP Chunking
Slide annotations: “Not really a linguistic feature”, “More of a linguistic feature!”
Deeper look at Agreement/Disagreement and Attitude
•  So far we employed shared/divergent opinion in the form of explicit polarity indicators
   – Sentiment polarity towards other discussants
      •  A: So, no matter how much faith you have, one of you MUST be wrong! (negative)
      •  B: You are a scientist?! May I ask in which field? (negative)
   – Sentiment polarity towards an entity
      •  A: Here is an excellent verse from the Bible.. (positive)
      •  B: The Bible rightly says that... (positive)
  
Implicit Opinion/Perspective
•  Observation: People sharing similar beliefs/perspective tend to use the same evidence to support their point
   –  Believers: faith, peace, love, citing verses from the Bible...
   –  Atheists: reason, science, attack on perceived logical flaws in Bible...
•  However it is not always explicit (using similar words and similar attitudes)
   •  Peter: God is the creator of mankind
   •  Mary: The belief in an ultimate divine being has sustained me over the years
   –  Not necessarily positive/negative
   –  High dimensional similarity (looking at the surface words) between both sentences is low!
   –  BUT we know Mary and Peter share the same perspective and will tend to be in agreement with each other
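The point about low surface similarity can be checked directly on the two example sentences. This is a minimal sketch using bag-of-words cosine similarity; the small stopword list is an assumption made for illustration and is not part of the slides.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "is", "of", "in", "an", "has", "me", "over", "a", "and"}


def bag_of_words(sentence):
    """Count the content words of a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


peter = bag_of_words("God is the creator of mankind")
mary = bag_of_words(
    "The belief in an ultimate divine being has sustained me over the years"
)
similarity = cosine(peter, mary)
```

The two sentences share no content words at all, so a surface-word model sees them as unrelated even though they express the same perspective. That gap is what motivates moving to a latent space.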
  
Modeling of implicit agreement/disagreement
•  Implicit agreement or disagreement (perspective) – using text similarity to help identify subgroups
•  Perspective modeling is used to complement explicit attitude
•  Perspective granularity has to be collected on the level of a thread rather than a single post
   – Hence we summarize all the posts in the thread
  
	
  
Our Model
•  Explicit high dimensional attitude toward other discussants
•  Explicit high dimensional attitude toward named entities
•  Model shared perspective among discussants over threads using textual similarity on the post level in the latent space
  
	
  
Extracting explicit attitude toward other discussants
•  Identify polarity of each sentence
•  Use the thread structure of the discussion to identify the target discussant
•  If the sentence has second person pronouns (Hassan et al., 2010), then the polarity is assumed to be towards the target of the sentence
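The three steps above can be sketched as follows. This is not the authors' system: the tiny polarity lexicon is a hypothetical stand-in for a real sentence-level polarity classifier, and taking the author of the post being replied to as the target is an assumption about how the thread structure is used.

```python
import re

# Toy lexicon: a hypothetical stand-in for a real polarity classifier.
POSITIVE = {"excellent", "agree", "great", "right"}
NEGATIVE = {"wrong", "nonsense", "ridiculous", "terrible"}
SECOND_PERSON = {"you", "your", "yours", "yourself"}


def tokens(sentence):
    return re.findall(r"[a-z']+", sentence.lower())


def sentence_polarity(sentence):
    """Step 1: label a sentence positive, negative, or neutral."""
    words = tokens(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def attitude_toward_discussant(sentence, author, parent_author):
    """Steps 2 and 3: return (author, target, polarity) or None.

    Assumption: the target is the author of the post being replied to.
    """
    polarity = sentence_polarity(sentence)
    if polarity == "neutral" or parent_author is None:
        return None
    if SECOND_PERSON & set(tokens(sentence)):
        return (author, parent_author, polarity)
    return None


directed = attitude_toward_discussant(
    "So, no matter how much faith you have, one of you MUST be wrong!", "A", "B"
)
undirected = attitude_toward_discussant(
    "Here is an excellent verse from the Bible.", "A", "B"
)
```

The second sentence is polar but contains no second person pronoun, so it is not attributed to the other discussant; it would instead be a candidate for the entity-directed attitude described next.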
  
Extracting explicit attitude toward named entities
•  Identify polarity of each sentence
•  Run Stanford Named Entity Tagger on sentences
•  If the sentence has Named Entities, then the polarity is assumed to be towards those entities
  
Extracting implicit perspective
•  Run Latent Dirichlet Allocation (LDA) on the thread
•  Extract the topic distribution of each post
•  Aggregate the distributions of all posts between each pair of discussants
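The slide does not spell out how the post-level distributions are aggregated, so the sketch below shows one possible reading only: average each discussant's post distributions and compare discussants by cosine similarity. The topic distributions are made-up numbers standing in for LDA output, and the function names are illustrative.

```python
import math
from collections import defaultdict
from itertools import combinations


def mean_distribution(distributions):
    """Average several topic distributions of equal length."""
    k = len(distributions[0])
    return [sum(d[i] for d in distributions) / len(distributions) for i in range(k)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pairwise_perspective(posts):
    """posts: list of (author, topic_distribution) for one thread.

    Returns {(author_a, author_b): similarity} for every pair of discussants.
    Assumption: aggregation = mean per discussant, then cosine per pair.
    """
    by_author = defaultdict(list)
    for author, dist in posts:
        by_author[author].append(dist)
    profiles = {a: mean_distribution(ds) for a, ds in by_author.items()}
    return {
        (a, b): cosine(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical two-topic distributions standing in for LDA output.
thread = [
    ("Mary", [0.9, 0.1]),
    ("Peter", [0.8, 0.2]),
    ("Sam", [0.1, 0.9]),
    ("Mary", [0.7, 0.3]),
]
sims = pairwise_perspective(thread)
```

In this toy thread Mary and Peter end up close in topic space while Sam is far from both, which is the kind of signal the perspective features are meant to add.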
  
Feature Representation: Attitude Profiles
•  Vector Representation
•  Explicit attitude toward other discussants
       A        B        C
A    0 1 1    1 1 2    0 0 0
B    …
C    --
  
Feature Representation: Attitude Profiles
•  Vector Representation
•  Explicit attitude toward Entities
       A        B        C        E1       E2
A    0 1 1    1 1 2    0 0 0    1 1 2    1 0 1
B    …
C    --
  
Feature Representation: Attitude Profiles
•  Vector Representation
•  Implicit attitude toward other discussants
       A        B        C        E1       E2       A        B          C
A    0 1 1    1 1 2    0 0 0    1 1 2    1 0 1    1 1 1    1 0 0.5    0.5 0 0
B    …
C    --                                           1 1 1
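A minimal sketch of how the explicit part of such a profile could be assembled. It assumes each triple holds (positive count, negative count, total count) toward one target; the slides show the numbers but do not label them, so that reading is an assumption. The discussant "X" and the observations are invented for illustration.

```python
def attitude_profile(source, targets, observations):
    """Flatten per-target (positive, negative, total) counts into one vector.

    observations: iterable of (source, target, polarity) tuples, where
    polarity is "positive" or "negative".
    Assumption: each target contributes a (pos, neg, total) triple.
    """
    counts = {target: [0, 0, 0] for target in targets}
    for src, tgt, polarity in observations:
        if src != source or tgt not in counts:
            continue
        if polarity == "positive":
            counts[tgt][0] += 1
        elif polarity == "negative":
            counts[tgt][1] += 1
        counts[tgt][2] += 1
    return [value for target in targets for value in counts[target]]


# Hypothetical observations for a discussant "X".
observations = [
    ("X", "A", "negative"),
    ("X", "B", "positive"),
    ("X", "B", "negative"),
    ("Y", "A", "positive"),  # belongs to another discussant, ignored for X
]
profile = attitude_profile("X", ["A", "B", "C"], observations)
```

Entity targets (E1, E2) can be appended to the `targets` list in the same way, and the implicit perspective scores would be concatenated after the explicit counts.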
  	
  
Data
•  Create Debate (CD)
   –  www.createdebate.com
   –  Debating on a certain topic
   –  Sides are explicitly indicated by discussants in a poll
   –  Informal language
•  Wikipedia Discussion Forum (WIKI)
   –  en.wikipedia.org
   –  Group labels are manually annotated
   –  Formal language, not much negative polarity
  
Experimental Conditions
•  Clustering algorithm
   –  S-Link
      # of clusters by rule of thumb = √(n/2)
•  Evaluation Metrics
   –  Purity, Entropy, F-measure
•  Baseline
   –  RAND-BASE: Assign discussants to clusters randomly
   –  SWD-BASE: Calculate surface word distribution, as a simpler form of perspective
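A minimal sketch of the clustering setup, assuming S-Link refers to single-link agglomerative clustering and that the rule of thumb is k ≈ √(n/2) rounded to the nearest integer. Euclidean distance over toy 2-D points is used only for illustration; the real inputs would be the attitude profile vectors.

```python
import math


def rule_of_thumb_k(n):
    """Number of clusters from the sqrt(n / 2) rule of thumb (at least 1)."""
    return max(1, round(math.sqrt(n / 2)))


def single_link(points, k):
    """Merge the two closest clusters until k remain.

    Cluster distance is the distance between the closest pair of members
    (single linkage). Returns a list of clusters as lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        _, x, y = min(
            (gap(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x].extend(clusters[y])
        del clusters[y]
    return clusters


# Eight toy "discussants" as 2-D points.
points = [(0, 0), (0, 1), (0, 2), (9, 9), (9, 10), (9, 11), (0, 3), (9, 12)]
k = rule_of_thumb_k(len(points))
groups = sorted(sorted(c) for c in single_link(points, k))
```

This naive version recomputes all cluster gaps on every merge, which is fine for the small number of discussants in a single thread but would need a proper SLINK implementation for large inputs.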
  
Results
                        Wiki                           CD
Condition     Purity  Entropy  Fmeasure     Purity  Entropy  Fmeasure
RAND-BASE     0.675   0.563    0.652        0.399   0.966    0.41
SWD-BASE      0.772   0.475    0.646        0.452   0.932    0.432
SD            0.834   0.360    0.667        0.824   0.394    0.596
SE            0.827   0.383    0.655        0.793   0.422    0.582
SD+SE         0.835   0.362    0.665        0.82    0.385    0.604
PERS          0.853   0.321    0.699        0.787   0.399    0.589
SD+PERS       0.853   0.320    0.698        0.849   0.333    0.615
SE+PERS       0.853   0.321    0.702        0.789   0.399    0.591
SD+SE+PERS    0.857   0.310    0.703        0.861   0.315    0.625
  
Observations
  
  
Best Performance is when we combine explicit attitude (SD Sentiment toward other discussants, SE Sentiment toward Entities) with implicit perspective (PERS), regardless of genre
  
  
WIKI seems to gain more from implicit perspective compared to CD
Explicit Attitude is a better feature for CD: people express their sentiments openly, while in WIKI people are more constrained and subtle in their expressions
  
  
Better results than the previous ones on the same data sets: WIKI (P 0.66, E 0.55), CD (P 0.64, E 0.68)
  
Our Social Constructs
Multiple Viewpoints (Subgroups)
Influencers
Pursuit of Power
  
The LUs used in Final System
•  Attempt to persuade (Inf)
•  Agreement/disagreement (Inf, Sub)
•  -ve/+ve attitude without perspective (Sub)
•  Who is talking about whom (PoP)
•  Dialog patterns (PoP)
•  Signed network (Sub)
Do not depend on linguistic analysis
Rely on linguistic analysis
  	
  
	
  
	
  
LUs and SCs
LU/SC    Influencer    Pursuit of Power    Subgroup
Attempt to Persuade    ✔
Agreement/Disagreement    ✔    ✔
-ve/+ve attitude    ✔    ✔
Who is talking about whom    ✔
Dialogue Patterns    ✔
Signed Networks    ✔
  
Challenges with processing Arabic Social media
•  Genre
   – Wikipedia
      •  MSA with dialectal style and multiword expressions/lexical items
   – Blogs from BOLT mostly dialectal with pervasive code switching and semantic faux amis
•  Implications for preprocessing
   – Our tools are trained on formal MSA genres
      •  Hence degradation in basic NLP processing, for example POS tagging in MSA is 97% accuracy, in Blog data we are at 94% (on a good day!)
  
Formal Gov. Evaluation (nDCG%)
09/2012
                                  En-WIKI   En-Fora   Ar-WIKI   Ar-Fora
Subgroup (without perspective)    48.2      50.6      57.4      37.5
Influencer                        82.8      78.3      85.1      84.9
Pursuit of Power                  87.8      77.7      91.6      74.6
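For readers unfamiliar with the metric, here is a minimal sketch of normalized discounted cumulative gain (nDCG) in its common form with linear gain and a log2 rank discount. The slides do not say which gain or discount variant the government evaluation used, so this choice is an assumption.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


perfect = ndcg([3, 2, 1])   # system ranking already in ideal order
reversed_ = ndcg([1, 2, 3])  # same items, worst-first
```

Multiplying the result by 100 gives a percentage on the scale reported in the table.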
  
  
In general, Subgroup is the hardest
  	
  
	
  
  	
  
	
  
Pursuit of power relies mostly on shallow linguistic features (mentions) and dialog structure
  	
  
  
Fora are harder to deal with than WIKI genre
  
  
Arabic WIKI did better than English WIKI
  
  
Arabic Influencer significantly impacted by simple diacritization detection for claims (grounding)
  
Conclusions
•  We can successfully computationally model sociopragmatic phenomena
   – There is significant room for improvement
•  Still discovering how to model the phenomena in a more language specific manner
   – We are just scratching the surface of understanding the sociopragmatic linguistic features
•  NOW more than ever collaborations are necessary
  
Any takers!
ودمتم (best wishes)
Thank you
Questions?
  

Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media

  • 1. Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media. Mona Diab, The George Washington University
  • 2. Acknowledgement
      • Joint work on subgroup detection with Dragomir Radev and Amjad Abu Jbara
      • My students: Muhammad AbdulMageed, Pradeep Dasigi, Weiwei Guo
      • Collaborative work with Owen Rambow and Kathy McKeown, and their respective groups
      • Collaborative sociolinguistic observations with Mustafa Mughazy
      • Work funded by the IARPA SCIL program
      • Several slides are adapted from presentations of the papers published on this work
  • 3. Our Overarching Research Interest
      • Goal: attempt to mine social media text for clues and cues toward building an understanding of human interactions
      • How: identify interesting sociolinguistic behaviors and correlate them with linguistic usage that is quantifiable and explicitly characterizable as a diagnostic device
      • Compare these devices cross-linguistically
  • 4. Text and Social Relations
      We can use linguistic analysis techniques to understand the implicit relations that develop in online communities.
      Image source: clair.si.umich.edu
  • 5. Many Different Forms of Social Media
      • Communication
      • Collaboration
      • Multimedia
      • Reviews & opinions
  • 6. Social Media Explosion
      1.73 billion Internet users worldwide. 75% of them used "Social Media".
      Source: www.internetworldstats.com
  • 7. Text in Social Media
      Some social media applications are all about text
  • 8. Text in Social Media
      Even the ones based on photos, videos, etc. have a lot of discussions
  • 9. Text in Social Media
      Huge amount of text exchanged in discussions. A significant treasure trove.
  • 10. Interesting Sociolinguistic Phenomena: Social Constructs
      • Multiple Viewpoints (Subgroups)
      • Influencers
      • Pursuit of Power
      • Disputed Topics
  • 11. Approach to processing social construct phenomena
      • Like any good scientist (or imperialist): divide and conquer
          – Identify language uses (LU) pertinent to the different social constructs (SC)
          – Correlate and map these LUs with Linguistic Constructions/Constituents (LC)
  • 13. Discover relevant LUs
      • Attempt to persuade
      • Agreement/disagreement
      • Negative/positive attitude
      • Who is talking about whom
      • Dialog patterns
      • Signed network
      Slide labels grouping the LUs: "Rely on linguistic analysis" and "Do not depend on linguistic analysis"
  • 14. Discover relevant LUs
      • Attempt to persuade
      • Agreement/disagreement
      • Negative/positive attitude
      • Who is talking about whom
      • Dialog patterns
      • Signed network
      Slide labels grouping the LUs: "Rely on linguistic modeling" and "Do not depend on linguistic modeling"
  • 15. LU: Attempt to Persuade
      • An expression of opinion (a claim) followed by explicit justification of the claim (an argumentation)
          – Persuade to believe, not persuade to act
          – Claim: grounding in experience, commonly respected sources
          – Argumentation: evidence and support from other discussants
      CLAIM: There seems to be a much better list at the National Cancer Institute than the one we've got.
      ARGUMENTATION: It ties much better to the actual publication (the same 11 sections, in the same order).
  • 16. LU: Agreement and Disagreement
      • Examine pairs of phrases to model others' acceptance of the participant's ideas
      Example of agreement:
          P1 by Arcadian: There seems to be a much better list at the National Cancer Institute than the one we've got. It ties much better to the actual publication (the same 11 sections, in the same order). I'd like to replace that section in this article. Any objections?
          P2 by JFW: Not a problem. Perhaps we can also insert the relative incidence as published in this month's wiki Blood journal
      • Shared opinion (explicit expression), shared perspective (implicit attitude)
      • Using word similarity and overlap
  • 17. LU: -ve/+ve Attitude
      • The attitude of a discussant/participant in a conversation toward another participant, topic, or entity mentioned in the thread
      • Characterize -ve and +ve sentences
      • Positive: praise, express liking, etc.
          – You are great
          – Simply elegant and beautiful
      • Negative: insult, dislike, disagreement, sarcasm, etc.
          – You're a liar.
          – You know, you're a pretty absurd individual even by Usenet standards.
          – You're just pathetic.
  • 19. LU: Attitude towards another person
      (2) PER2: No it hasn't that's a bold faced lie. A definate majority of Americans support the public option. The only people who are against it are the insurance companies and moron social conservatives like you who don't even understand what socialism is.
      Using negative and insulting language. Sentiment and word polarity are the devices used.
  • 20. LU: Who is talking about whom
      How often a person refers to, or is referred to by, other discourse participants. Use of mentions and their frequencies:
      • IsMyNameUsedByOthers
      • HaveIUsedOthersName
      • %OfUsersReferencedByMe
      • %OfUsersReferencedMe
      • %OfReferencesByMe
      • %OfReferencesToMe
      • ReferencesByMeToWordsRatio: number of references made by me / total number of words I wrote
      • ReferencesToMeToWordsRatio: number of references to me / total number of words written by others
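The mention features listed on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `mention_features`, the `(author, text)` post format, and matching mentions by exact user-name tokens are all hypothetical choices, not the system described in the talk. The two `%OfReferences…` features are left out because they need thread-wide reference totals, and the "%" features are returned as fractions.

```python
import re

def mention_features(posts, me):
    """Compute name-mention features for one participant.

    posts: list of (author, text) pairs; me: the participant's user name.
    Assumption: a mention is a token equal to a participant's user name.
    """
    others = {author for author, _ in posts} - {me}
    other_by_token = {name.lower(): name for name in others}
    refs_by_me = refs_to_me = words_by_me = words_by_others = 0
    referenced_by_me, referencing_me = set(), set()
    for author, text in posts:
        tokens = re.findall(r"\w+", text.lower())
        if author == me:
            words_by_me += len(tokens)
            for tok in tokens:
                if tok in other_by_token:
                    refs_by_me += 1
                    referenced_by_me.add(other_by_token[tok])
        else:
            words_by_others += len(tokens)
            hits = tokens.count(me.lower())
            if hits:
                refs_to_me += hits
                referencing_me.add(author)
    n_others = len(others) or 1  # avoid division by zero in one-person threads
    return {
        "IsMyNameUsedByOthers": refs_to_me > 0,
        "HaveIUsedOthersName": refs_by_me > 0,
        "%OfUsersReferencedByMe": len(referenced_by_me) / n_others,
        "%OfUsersReferencedMe": len(referencing_me) / n_others,
        "ReferencesByMeToWordsRatio": refs_by_me / words_by_me if words_by_me else 0.0,
        "ReferencesToMeToWordsRatio": refs_to_me / words_by_others if words_by_others else 0.0,
    }
```

Calling `mention_features(posts, "Peter")` on a thread returns one feature dictionary per participant of interest; a real system would also need to handle nicknames and quoted text.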
  • 21. LU: Signed Network
      Partition the social medium network into positive and negative links based on the polarity of the words used.
      Example: What is the public opinion on the health care reform? 2841 posts, more than 300K words.
  • 22. LU: Signed Network
      Figure legend: Participants; Interactions
  • 23. LU: Signed Network
      Figure legend: Participants; Negative Interaction; Positive Interaction
      Very hot topic (high percentage of negative links)
  • 24. LU: Signed Network
      Against Reform (55%); Pro Reform (45%)
  • 25. LU: Dialog Patterns
      • Dialog patterns are based on metadata (e.g., the thread structure), not the text
          – Initiative: who started the thread
          – Investment: share of participation
          – Irrelevance: how often ignored by others
          – Interjection: at what point joined the conversation
          – Incitation: how long are the branches started
          – Inquisitiveness: the number of question marks
  • 27. Who is an Influencer?
      • Someone whose opinions/ideas profoundly affect the conversation
      • An influencer may have the following characteristics (Katz and Lazarsfeld, 1955):
          – alter the opinions of their audience
          – resolve disagreements where no one else can
          – be recognized by others as one who makes important contributions
          – often continue to influence a group even when not present
          – have other conversational participants adopt their ideas and even the words they use to express their ideas
      • More formally, an influencer:
          – Has credibility in the group
          – Persists in attempting to convince others, even if some disagreement occurs
          – Introduces topics/ideas that others pick up on or support
  • 28. Social Construct: Influencer (inf)
      • Language Uses
          – Attempt to Persuade
          – Agreement/disagreement
  • 29. What is Pursuit of Power?
      • Individual makes repeated efforts to gain power within the group.
      • The individual attempts to control the actions or goals of the group.
      • Individual's behavior causes tension within the group.
  • 30. Social Construct: Pursuit of Power (PoP)
      • Language Uses
          – Attempt to Persuade
          – Agreement/disagreement
          – Negative/positive attitude
          – Who is talking about whom
          – Dialog patterns (non-linguistic)
  • 31. Social Construct: Subgroup Detection
      Figure labels: Discussion Thread; Discussant; Subgroups
  • 32. Social Construct: Subgroup (Sub)
      • Language Uses
          – Agreement/disagreement
          – Negative/positive attitude
          – Signed Network (non-linguistic)
  • 33. Cross-Linguistic Comparison
      • The SCs in both languages use the same LUs
      • But do Arabic and English social media use different linguistic constituents to show language use?
      • A qualitative view:
  • 34. Attempt to persuade
      • Claims
          – A lot more grounding using religious references
          – Religion plays a significant role in Arabic discourse structure and is therefore used to establish credibility and, accordingly, influence and power differentials
              • Easily detected using simple devices such as explicit diacritization
          – Less subjective language (less usage of "I", more of "we", or expletives such as "there, it")
              حاول أن تتفهم أننا نحاول هنا بناء موسوعة بلغة معاصرة .. ثانيا إشكالية الملحوظية ... هل يمكنك أن تخبرني عن حدث مميز في حياة علقمة بن وقاص
              (Gloss: Try to understand that we are trying here to build an encyclopedia in contemporary language .. Second, the notability issue ... Can you tell me about a notable event in the life of Alqama ibn Waqqas?)
  • 35. Agreement/Disagreement
      • Sharing the same opinion regarding a topic
          – Explicit agreement
              • "I agree with you about …"
              هذه نقطة هامة للغاية أوافقك فيها تماماً، أشكرك على صياغتها بمثل هذا الوضوح
              (Gloss: This is a very important point on which I completely agree with you; thank you for phrasing it so clearly)
              أنا أوافقك أننا موسوعة في طور البناء
              (Gloss: I agree with you that we are an encyclopedia under construction)
          – Implicit similar attitude toward a topic
              • Challenge
              • Pervasive sarcasm
              • Pervasive use of MWE and references to cultural knowledge
  • 36. Detecting (dis)agreements/attitudes?
      • The role of idiom/metaphor/sarcasm in Arabic seems to be more pervasive
          – Tongue twisters, witty language, puns
              • حمزاوي شعر والادقن في المجلس
              • MP Hamzawy, being liberal, has long hair, whereas the MB candidates have beards, so the bet is on whether he will grow his hair longer or grow a beard
              • قلب المرأة مانجه يساع بذرة واحدة ولكن قلب الراجل ما شاء الله بطيخة
              • The heart of a woman is like a mango, it can only hold one seed, but a man's heart is, "God bless", a melon
          – Sarcasm
              • يالا بسيطه!
              • No problem, it is easy! (We are screwed regardless!)
  • 37. Negative/positive Attitude
      • Very flowery language compared to English
      • Strong condescending language to show negative attitude
      • Code switching into dialectal Arabic expressions to show support
          – Manipulate different registers for code switching depending on context: CA with MSA/DA code switching to reflect influence
              • Ben Ali, Tunisian President, vs. Mubarak, Egyptian President, in ouster speech
              • Mubarak, ex-Egyptian President, on visit to factories vs. ouster from position in the last revolution
              • Mubarak vs. Nasser vs. Sadat
          – Balance between familiarity and distance
  • 38. Negative/positive Attitude
      • Plural first person pronouns allow the speaker to reduce his/her power to establish rapport and show positive attitude
          – e.g., إحنا جالنا الشرف vs. أنا جالي الشرف
          – We are honored vs. I am the honored one
      • English plural pronouns in such contexts sound patronizing (the textbook "we"), whereas the "royal we" is disused.
  • 39. Negative/Positive attitude
      • Humor is commonly used in Arabic as a strategy that levels power relations, but that would be inappropriate in English.
      • Slightly offensive expressions are used in Arabic to maintain power balance and solidarity, e.g.,
          – اسكت، مش محمد نجح (roughly: "Shut up, didn't Mohamed pass!")
          – والنبي نقطنا بسكاتك (roughly: "By the Prophet, grace us with your silence")
      • Only very few such expressions are acceptable in English, and only in very close contexts, e.g., "shut up" and "get out of here".
  • 40. Talking about whom and to whom
      • More manipulation of power differential
          – MSA terms of address add formality, and therefore power, to the speaker, whereas colloquial terms of address establish informal/equal levels of power.
              • Compare يا سيدي العزيز to يا خويا.
              • English does not have such a rich continuum of formality/informality expressions.
      • Usage of expressions such as
          – Mona: Mona could not dare refuse a request from Ali
          – Considered a strange self-reference in English, but it is used as a means of showing modesty and familiarity
  • 41. Focus of this talk
      • Influencers
      • Pursuit of Power
      • Disputed Topics
      • Multiple Viewpoints (Subgroups)
  • 42. Focus of this talk
      • Peter: The new immigration law is good. Illegal immigration is bad.
      • Mary: I totally disagree with you. This law is blatant racism.
      • John: Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.
      • Alexander: You are clueless, Peter. Stop supporting racism.
      Support the new law: Peter, John
      Against the new law: Mary, Alexander
  • 44. Subgroup Detection System Overview (pipeline diagram)
      • Input: discussion thread (discussants and reply structure)
      • Thread Parsing
      • Opinion Expressions Identification (e.g., "disagree", "like", "bad")
      • Candidate Target Identification (e.g., "you", "conservative ideologues", "immigration law")
      • Opinion-Target Pairing (disagree → you; like → conservative ideologues; bad → immigration law)
      • Discussant Attitude Profiles (DAPs)
      • Clustering
      • Output: subgroups
  • 46. 1 - Thread Parsing
      • P1, D1 (Peter): The new immigration law is good. Illegal immigration is bad.
      • P2, D2 (Mary): I totally disagree with you. This law is blatant racism.
      • P3, D3 (John): Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.
      • P4, D4 (Alexander): You are clueless, Peter. Stop supporting racism.
      Identify posts (P), discussants (D), and the reply structure of the discussion thread
  • 48. 2 - Identify Opinion Words*
      • P1, D1 (Peter): The new immigration law is good+. Illegal immigration is bad-.
      • P2, D2 (Mary): I totally disagree- with you. This law is blatant- racism-.
      • P3, D3 (John): Have you read all what Peter wrote? He is correct+. Illegal immigration is bad- and must be stopped.
      • P4, D4 (Alexander): You are clueless-, Peter. Stop supporting racism.
      *Identifying opinion words using Opinion Finder with an extended lexicon (implemented using random walks; Hassan & Radev, 2011)
  • 50. 3 - Identify Candidate Targets of Opinion
      A target is either:
      • a discussant (e.g., you, Peter)
      • a topic/entity (e.g., the new immigration law, illegal immigration)
  • 51. 3 - Identify Candidate Targets of Opinion
      • P1, D1 (Peter): The new immigration law is good+. Illegal immigration is bad-.
      • P2, D2 (Mary): I totally disagree- with you. This law is blatant- racism-.
      • P3, D3 (John): Have you read all what Peter wrote? He is correct+. Illegal immigration is bad- and must be stopped.
      • P4, D4 (Alexander): You are clueless-, Peter. Stop supporting racism.
      All discussants are candidate targets
  • 52. 3 - Identify Candidate Targets of Opinion
      Same example thread. Identify discussant mentions (second-person pronouns or names) in the discussion; on the slide these mentions are tagged D1 (three times) and D2 (once).
  • 53. 3 - Identify Candidate Targets of Opinion
      Same example thread. Identify anaphoric mentions of discussants: "He" in P3 is resolved to Peter (D1).
  • 54. 3 - Identify Candidate Targets of Opinion
      Same example thread. Topical targets are marked: Topic 1 = "the new immigration law" (P1) and "this law" (P2); Topic 2 = "illegal immigration" (P1 and P3).
  • 55. 3 - Identify Candidate Targets of Opinion
      • Techniques used to identify topical targets:
          – Named Entity Recognition
          – Noun phrase chunking
  • 57. 4 - Opinion-Target Pairing
      P2 (Mary): I totally disagree- with you [D1]. The new immigration law [Topic1] is blatant- racism-.
      Dependency relations for the first sentence, to which a pairing rule is applied:
          nsubj(disagree-3, I-1)
          advmod(disagree-3, totally-2)
          root(ROOT-0, disagree-3)
          prep_with(disagree-3, you-5)
      Dependency relations for the second sentence, to which a pairing rule is applied:
          nsubj(racism-4, Topic1-1)
          cop(racism-4, is-2)
          amod(racism-4, blatant-3)
          root(ROOT-0, racism-4)
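A rule of the kind suggested by these dependency parses can be sketched as follows. The slide does not spell out the actual rules, so this is an assumption: an opinion word is paired with a candidate target only when a single dependency relation links them directly. The function name `pair_opinions`, the triple format, and the default relation list are hypothetical.

```python
def pair_opinions(triples, polarity, targets,
                  relations=("nsubj", "prep_with", "dobj")):
    """Pair opinion words with candidate targets using direct dependencies.

    triples:  list of (relation, head, dependent) tuples from a parser
    polarity: dict mapping opinion words to "+" or "-"
    targets:  set of candidate target strings (discussants, topics)
    Assumed rule: head is an opinion word, dependent is a target, and
    the relation is one of `relations`.
    """
    pairs = []
    for rel, head, dep in triples:
        if rel in relations and head in polarity and dep in targets:
            pairs.append((head, polarity[head], dep))
    return pairs


# The two parses shown on the slide, written as (relation, head, dependent).
sentence_1 = [("nsubj", "disagree", "I"),
              ("advmod", "disagree", "totally"),
              ("prep_with", "disagree", "you")]
sentence_2 = [("nsubj", "racism", "Topic1"),
              ("cop", "racism", "is"),
              ("amod", "racism", "blatant")]
```

With this direct-link rule, "blatant" is not paired with Topic1 because it only modifies "racism"; a fuller rule set would need to follow such modifier chains.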
  • 59. 4 - Opinion-Target Pairing
      • P1, D1 (Peter): The new immigration law [Topic 1] is good+. Illegal immigration [Topic 2] is bad-.
      • P2, D2 (Mary): I totally disagree- with you [D1]. This law [Topic 1] is blatant- racism-.
      • P3, D3 (John): Read all what Peter [D1] wrote. He [D1, Peter] is correct+. Illegal immigration [Topic 2] is bad- and must be stopped.
      • P4, D4 (Alexander): You [D1] are clueless-, Peter [D1]. Stop supporting racism.
  • 60. 4 - Opinion-Target Pairing
      • Language Uses (LUs) present in this step:
          – Targeted sentiment toward other discussants (2nd person): "I totally disagree- with you."
          – Targeted sentiment toward topic mentions (3rd person): "This law is blatant- racism-."
  • 61. 4 - Opinion-Target Pairing
      • LU details
          – Rule-based detection of sentiment targets (we've also been experimenting with supervised target detection methods)
          – Discussant targets are identified by 2nd person pronouns (you, your, yourself, etc.) and by username mentions (casper3912, etc.)
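  • 63. 5 - Discussant Attitude Profile
      Each profile (DAP1, DAP2, …) holds three values for every target (Target 1 … Target n): +, -, and int.
  • 64. 5 - Discussant Attitude Profile
      Rows (discussants): Peter, Mary, John, Alexander
      Columns (targets): Peter, Mary, John, Alexander, Topic 1, Topic 2
      Matrix values as extracted from the slide (cell layout not recoverable):
      0 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 1 1 0 0 0 0 0 0 0 1 1 1 0 1 0 2 2 0 0 0 0 0 0 1 0 1 1 0 2 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0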
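One way to build such profiles from the opinion-target pairs is sketched below. It rests on assumptions not stated on the slides: "int" is read as the number of interactions with a target, and every opinion expression counts as one interaction. The function name `build_daps` and the input format are hypothetical.

```python
def build_daps(pairs, discussants, targets):
    """Build one attitude vector per discussant.

    pairs: iterable of (discussant, target, polarity), polarity "+" or "-".
    Each target contributes three slots: positive count, negative count,
    and interaction count (assumed here to be one per opinion expression).
    """
    index = {t: i for i, t in enumerate(targets)}
    daps = {d: [0] * (3 * len(targets)) for d in discussants}
    for who, target, polarity in pairs:
        base = 3 * index[target]
        daps[who][base + (0 if polarity == "+" else 1)] += 1
        daps[who][base + 2] += 1  # assumed: every opinion is one interaction
    return daps
```

For the example thread, Mary's two negative words about the law and one about Peter would give her the triples (0, 1, 1) for Peter and (0, 2, 2) for Topic 1 under these assumptions.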
  • 66. Clustering
      • Peter: (Topic 1 +, Topic 2 -)
      • John: (Peter +, Topic 2 -)
      • Mary: (Peter -, Topic 1 -)
      • Alexander: (Peter -)
      Two subgroups result: {Peter, John} and {Mary, Alexander}
  • 68. Data
      • Wikipedia: 117 discussions
          – Short threads, short posts
          – Human annotation
          – More formal
      • Political Forum: 12 polls + discussions
          – Long threads, long and short posts
          – Data self-labeled
          – Less formal
      • Create Debate: 30 debates
          – Long threads, long and short posts
          – Data self-labeled
          – Less formal
  • 70. Evaluation Metrics
      1. Purity
      Source: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
  • 71. Evaluation Metrics
      2. Entropy
      3. F-Measure
      where P(i, j) is the probability of finding an element from category i in cluster j, n_j is the number of items in cluster j, and n is the total number of items in the distribution.
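The formulas themselves did not survive on these slides, so the following is a sketch of the standard definitions rather than a confirmed copy of what the talk used: purity is the fraction of items that fall in the majority category of their cluster, and entropy is the per-cluster category entropy weighted by n_j / n. The base-2 logarithm and the function names are assumptions.

```python
import math
from collections import Counter

def purity(clusters, gold):
    """clusters: list of non-empty lists of items; gold: item -> category."""
    n = sum(len(c) for c in clusters)
    majority = sum(max(Counter(gold[x] for x in c).values()) for c in clusters)
    return majority / n

def entropy(clusters, gold):
    """Size-weighted average of the category entropy inside each cluster."""
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        counts = Counter(gold[x] for x in c)
        # P(i, j) is estimated as the share of category i within cluster j.
        h = -sum((k / len(c)) * math.log2(k / len(c)) for k in counts.values())
        total += (len(c) / n) * h
    return total
```

Higher purity and lower entropy both indicate clusters that line up better with the gold subgroups.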
  • 72. English Results
      Metric    | Wikipedia | Political Forum | Create Debate
      Purity    | 0.66      | 0.61            | 0.64
      Entropy   | 0.55      | 0.80            | 0.68
      F-measure | 0.61      | 0.56            | 0.60
  • 73. Baselines
      • Interaction Graph Clustering (GC)
          – Nodes: participants
          – Edges: interactions (connect two participants if they exchange posts)
      • Text Classification (TC)
          – Build TF-IDF vectors for each participant (using all his/her posts)
          – Cluster the vector space
  • 75. Choice of Clustering Algorithm
      • K-means
      • Expectation Maximization (EM)
      • Farthest First (FF)
  • 77. Component Evaluation
      Chart legend: Our System; No Topical Targets; No Discussant Targets; No Sentiment; No Interaction; No Anaphora Resolution; No Named Entity Recog.; No NP Chunking
  • 78. Component Evaluation
      Same chart, with the annotation: "Not really a linguistic feature"
  • 79. Component Evaluation
      Same chart, with the annotation: "More of a linguistic feature!"
  • 80. Deeper look at Agreement/Disagreement and Attitude
      • So far we employed shared/divergent opinion in the form of explicit polarity indicators
          – Sentiment polarity towards other discussants
              • A: So, no matter how much faith you have, one of you MUST be wrong! (negative)
              • B: You are a scientist?! May I ask in which field? (negative)
          – Sentiment polarity towards an entity
              • A: Here is an excellent verse from the Bible.. (positive)
              • B: The Bible rightly says that... (positive)
  • 81. Implicit Opinion/Perspective
      • Observation: people sharing similar beliefs/perspectives tend to use the same evidence to support their point
          – Believers: faith, peace, love, citing verses from the Bible...
          – Atheists: reason, science, attack on perceived logical flaws in the Bible...
      • However, it is not always explicit (using similar words and similar attitudes)
          • Peter: God is the creator of mankind
          • Mary: The belief in an ultimate divine being has sustained me over the years
          – Not necessarily positive/negative
          – High dimensional similarity (looking at the surface words) between both sentences is low!
          – BUT we know Mary and Peter share the same perspective and will tend to be in agreement with each other
  • 82. Modeling of implicit agreement/disagreement
      • Implicit agreement or disagreement (perspective): using text similarity to help identify subgroups
      • Perspective modeling is used to complement explicit attitude
      • Perspective granularity has to be collected on the level of a thread rather than a single post
          – Hence we summarize all the posts in the thread
  • 83. Our Model
      • Explicit high dimensional attitude toward other discussants
      • Explicit high dimensional attitude toward named entities
      • Model shared perspective among discussants over threads using textual similarity on the post level in the latent space
  • 84. Extracting explicit attitude toward other discussants
      • Identify polarity of each sentence
      • Use the thread structure of the discussion to identify the target discussant
      • If the sentence has second person pronouns (Hassan et al., 2010), then the polarity is assumed to be towards the target of the sentence
  • 85. Extracting explicit attitude toward named entities
      • Identify polarity of each sentence
      • Run Stanford Named Entity Tagger on sentences
      • If the sentence has Named Entities, then the polarity is assumed to be towards those entities
  • 86. Extracting implicit perspective
      • Run Latent Dirichlet Allocation (LDA) on the thread
      • Extract the topic distribution of each post
      • Aggregate the distributions of all posts between each pair of discussants
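The aggregation step can be illustrated once per-post topic distributions are available (for example from any LDA implementation). The slide does not say how the distributions are aggregated, so this sketch makes two assumptions: each discussant's posts are averaged into one topic profile, and pairs of discussants are compared by cosine similarity. All function names are hypothetical.

```python
import math
from itertools import combinations

def mean_distribution(dists):
    """Average a list of equal-length topic distributions."""
    k = len(dists[0])
    return [sum(d[i] for d in dists) / len(dists) for i in range(k)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def perspective_similarity(post_topics):
    """post_topics: discussant -> list of per-post topic distributions.

    Returns a similarity score for every unordered pair of discussants,
    keyed by the alphabetically sorted pair of names.
    """
    profiles = {d: mean_distribution(ds) for d, ds in post_topics.items()}
    return {(a, b): cosine(profiles[a], profiles[b])
            for a, b in combinations(sorted(profiles), 2)}
```

Discussants whose posts load on the same latent topics get a score near 1 even when they share few surface words, which is the property the perspective feature is meant to capture.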
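  • 87. Feature Representation: Attitude Profiles
      • Vector representation
      • Explicit attitude toward other discussants
      Example row for discussant A: A = (0, 1, 1), B = (1, 1, 2), C = (0, 0, 0); rows for B and C are elided on the slide
  • 88. Feature Representation: Attitude Profiles
      • Vector representation
      • Explicit attitude toward entities
      Example row for discussant A: A = (0, 1, 1), B = (1, 1, 2), C = (0, 0, 0), E1 = (1, 1, 2), E2 = (1, 0, 1); rows for B and C are elided on the slide
  • 89. Feature Representation: Attitude Profiles
      • Vector representation
      • Implicit attitude toward other discussants
      Example row for discussant A: the explicit values above, followed by implicit values for A, B, C, extracted as: 1 1 1, 1 0 0.5, 0.5 0 0; rows for B and C are elided on the slide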
  • 90. Data
      • Create Debate (CD)
          – www.createdebate.com
          – Debating on a certain topic
          – Sides are explicitly indicated by discussants in a poll
          – Informal language
      • Wikipedia Discussion Forum (WIKI)
          – en.wikipedia.org
          – Group labels are manually annotated
          – Formal language, not much negative polarity
  • 91. Experimental Conditions
      • Clustering algorithm
          – S-Link
          – Number of clusters by rule of thumb = √n/2
      • Evaluation metrics
          – Purity, Entropy, F-measure
      • Baselines
          – RAND-BASE: assign discussants to clusters randomly
          – SWD-BASE: calculate the surface word distribution, as a simpler form of perspective
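The cluster-count rule and the random baseline can be sketched as below. The slide's "√n/2" is ambiguous; this sketch assumes the common reading k ≈ √(n/2), rounded to the nearest integer and never below 1. The function names, the rounding, and the fixed seed are all illustrative choices.

```python
import math
import random

def rule_of_thumb_k(n):
    """Assumed reading of the rule of thumb: k = sqrt(n / 2), at least 1."""
    return max(1, round(math.sqrt(n / 2)))

def random_baseline(discussants, k, seed=0):
    """RAND-BASE-style assignment: each discussant gets a random cluster id."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    return {d: rng.randrange(k) for d in discussants}
```

For a thread with 8 discussants this gives 2 clusters, and for 50 discussants it gives 5.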
  • 92. Results

      Condition      Wiki                            CD
                     Purity   Entropy   F-measure    Purity   Entropy   F-measure
      RAND-BASE      0.675    0.563     0.652        0.399    0.966     0.41
      SWD-BASE       0.772    0.475     0.646        0.452    0.932     0.432
      SD             0.834    0.360     0.667        0.824    0.394     0.596
      SE             0.827    0.383     0.655        0.793    0.422     0.582
      SD+SE          0.835    0.362     0.665        0.82     0.385     0.604
      PERS           0.853    0.321     0.699        0.787    0.399     0.589
      SD+PERS        0.853    0.320     0.698        0.849    0.333     0.615
      SE+PERS        0.853    0.321     0.702        0.789    0.399     0.591
      SD+SE+PERS     0.857    0.310     0.703        0.861    0.315     0.625
  • 93. Observations
      Results table: as on slide 92.
      Best performance is obtained when we combine explicit attitude (SD: sentiment toward other discussants; SE: sentiment toward entities) with implicit perspective (PERS), regardless of genre.
  • 94. Observations
      Results table: as on slide 92.
      WIKI seems to gain more from implicit perspective than CD does. Explicit attitude is a better feature for CD: people express their sentiments openly, while in WIKI people are more constrained and subtle in their expressions.
  • 95. Observations
      Results table: as on slide 92.
      These results are better than the previous results on the same data sets: WIKI (P 0.66, E 0.55), CD (P 0.64, E 0.68).
  • 96. Our Social Constructs
      Multiple Viewpoints (Subgroups)
      Influencers
      Pursuit of Power
  • 97. The LUs used in Final System
      •  Attempt to persuade (Inf)
      •  Agreement/disagreement (Inf, Sub)
      •  -ve/+ve attitude without perspective (Sub)
      •  Who is talking about whom (PoP)
      •  Dialog patterns (PoP)
      •  Signed network (Sub)
      Do not depend on linguistic analysis
      Rely on linguistic analysis
  • 98. LUs and SCs
      LU/SC   Influencer   Pursuit of Power   Subgroup
      Attempt to Persuade   ✔
      Agreement/Disagreement   ✔   ✔
      -ve/+ve attitude   ✔   ✔
      Who is talking about whom   ✔
      Dialogue Patterns   ✔
      Signed Networks   ✔
  • 99. Challenges with processing Arabic social media
      •  Genre
          –  Wikipedia
              •  MSA with dialectal style and multiword expressions/lexical items
          –  Blogs from BOLT: mostly dialectal, with pervasive code switching and semantic faux amis
      •  Implications for preprocessing
          –  Our tools are trained on formal MSA genres
              •  Hence degradation in basic NLP processing: for example, POS tagging accuracy is 97% on MSA, but 94% on blog data (on a good day!)
  • 100. Formal Gov. Evaluation (nDCG%)
      09/2012

                                       En-WIKI   En-Fora   Ar-WIKI   Ar-Fora
      Subgroup (without perspective)   48.2      50.6      57.4      37.5
      Influencer                       82.8      78.3      85.1      84.9
      Pursuit of Power                 87.8      77.7      91.6      74.6
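For readers unfamiliar with the metric, the standard definition of nDCG (normalized discounted cumulative gain) can be sketched as below. This is the textbook formulation with a log2 rank discount; the exact variant and relevance scale used in the formal evaluation are not given on the slides, so treat it as an assumption.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevance of a system's ranked output.
perfect = ndcg([3, 2, 1])   # already in ideal order
reversed_ = ndcg([1, 2, 3]) # worst ordering of the same items
```

Multiplying the result by 100 gives a percentage, as in the table heading.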
  • 101. Formal Gov. Evaluation (nDCG%)
      09/2012
      Results table: as on slide 100.
      In general, Subgroup is the hardest.
  • 102. Formal Gov. Evaluation (nDCG%)
      09/2012
      Results table: as on slide 100.
      In general, Subgroup is the hardest.
      Pursuit of Power relies mostly on shallow linguistic features (mentions) and dialog structure.
  • 103. Formal Gov. Evaluation (nDCG%)
      09/2012
      Results table: as on slide 100.
      Fora are harder to deal with than the WIKI genre.
  • 104. Formal Gov. Evaluation (nDCG%)
      09/2012
      Results table: as on slide 100.
      Arabic WIKI did better than English WIKI.
  • 105. Formal Gov. Evaluation (nDCG%)
      09/2012
      Results table: as on slide 100.
      Arabic Influencer is significantly impacted by simple diacritization detection for claims (grounding).
  • 106. Conclusions
      •  We can successfully computationally model sociopragmatic phenomena
          –  There is significant room for improvement
      •  Still discovering how to model the phenomena in a more language-specific manner
          –  We are just scratching the surface of understanding the sociopragmatic linguistic features
      •  NOW more than ever, collaborations are necessary
  • 107. Any takers!
      ودمتم
      Thank you
      Questions?