Social Media and AI: Don’t forget the users
Mounia Lalmas
Social Media and AI: Don’t forget the users
AI
(Algorithms)
Social Media
User
Engagement
USERS
[Title-slide diagram: AI (Algorithms), Social Media, User Engagement, USERS, with the labels “Source of information” and “User perception”]
Outline
•  User engagement
•  Engaging through diversity: serendipity
•  Engaging through diversity: awareness
We develop and deploy algorithms to “engage” users.
Use cases: diversity, viewed through the source of information and through user perception.
Why is it important to engage users?
▪  In today’s wired world, users have enhanced expectations about their interactions with technology … resulting in increased competition amongst the purveyors and designers of interactive systems.
▪  In addition to utilitarian factors, such as usability, we must consider the hedonic and experiential factors of interacting with technology, such as fun, fulfillment, play, and user engagement.
(O’Brien, Lalmas & Yom-Tov, 2014)
What is user engagement?
User engagement is a quality of the user experience that emphasizes the positive aspects of interaction – in particular the fact of being captivated by the technology (Attfield et al., 2011). It is the emotional, cognitive and behavioural connection that exists, at any point in time and over time, between a user and a technological resource:
•  user feelings: happy, sad, excited, …
•  user mental states: flow, presence, immersion, …
•  user interactions: click, read, comment, buy, …
(O’Brien, Lalmas & Yom-Tov, 2014)
Patterns of user engagement
Online sites differ in how users engage with them:
•  Games – users spend much time per visit
•  Search – users come frequently and do not stay long
•  Social media – users come frequently and stay long
•  Niche – users come on average once a week, e.g. for a weekly post
•  News – users come periodically, e.g. morning and evening
•  Service – users visit the site when needed, e.g. to renew a subscription
(Lehmann et al., 2012)
Why is it important to measure and interpret user engagement well?
[Chart: CTR for a new ranking algorithm]
Characteristics of user engagement
Focused attention (Webster & Ho, 1997; O’Brien, 2008)
•  Users must be focused to be engaged
•  Distortions in the subjective perception of time are used to measure it
Positive affect (O’Brien & Toms, 2008)
•  Emotions experienced by the user are intrinsically motivating
•  An initial affective “hook” can induce a desire for exploration, active discovery or participation
Aesthetics (Jacques et al., 1995; O’Brien, 2008)
•  Sensory, visual appeal of the interface stimulates the user and promotes focused attention
•  Linked to design principles (e.g. symmetry, balance, saliency)
Endurability (Read, MacFarlane & Casey, 2002; O’Brien, 2008)
•  People remember enjoyable, useful, engaging experiences and want to repeat them
•  Reflected in, e.g., the propensity of users to recommend an experience, a site or a product
Characteristics of user engagement (continued)
Novelty (Webster & Ho, 1997; O’Brien, 2008)
•  Novelty, surprise, unfamiliarity and the unexpected
•  Appeal to users’ curiosity; encourage inquisitive behavior and promote repeated engagement
Richness and control (Jacques et al., 1995; Webster & Ho, 1997)
•  Richness captures the growth potential of an activity
•  Control captures the extent to which a person is able to achieve this growth potential
Reputation, trust and expectation (Attfield et al., 2011)
•  Trust is a necessary condition for user engagement
•  An implicit contract among people and entities which is more than technological
Motivation, interests, incentives and benefits (Jacques et al., 1995; O’Brien & Toms, 2008)
•  Difficult to set up “laboratory”-style experiments
•  Why should users engage?
Outline
•  User engagement
•  Engaging through diversity: serendipity (same algorithm, different sources)
•  Engaging through diversity: awareness
We develop and deploy algorithms to “engage” users. This part focuses on the source of information.
(Bordino, Mejova & Lalmas, 2013)
Use case I: Serendipity in entity search
Engaging through serendipity: which source?
Serendipity: finding something good or useful while not specifically looking for it. Serendipitous search systems provide relevant and interesting results.
Entity search: build an entity-driven serendipitous search system based on entity networks extracted from Wikipedia and Yahoo! Answers.
Yahoo! Answers – community-driven question & answer portal
•  67,336,144 questions and 261,770,047 answers
•  January 1, 2010 – December 31, 2011
•  English-language
•  minimally curated; opinions, gossip, personal info; variety of points of view
Wikipedia – community-driven encyclopedia
•  3,795,865 articles as of end of December 2011
•  English Wikipedia
•  curated; high-quality knowledge; variety of niche topics
Wikipedia vs. Yahoo! Answers: the exact same algorithm is run on both sources – retrieve the entities most related to a query entity using a random walk over the extracted entity network. A minimal sketch of this retrieval step follows.
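The sketch below illustrates a random walk with restart over a small entity graph; the graph construction, restart probability, weighting and top-k cut-off are illustrative assumptions, not the exact settings of Bordino, Mejova & Lalmas (2013).

```python
# Minimal sketch: random walk with restart over an entity co-occurrence graph.
# Restart probability, tolerance and the toy graph are illustrative assumptions.

def related_entities(graph, query, restart=0.15, tol=1e-8, max_iter=100, k=5):
    """graph: dict entity -> dict(neighbour -> edge weight).
    Returns the top-k entities most related to `query`."""
    nodes = set(graph) | {n for edges in graph.values() for n in edges}
    scores = {n: 0.0 for n in nodes}
    scores[query] = 1.0
    for _ in range(max_iter):
        new_scores = {n: 0.0 for n in nodes}
        for node, out_edges in graph.items():
            total = sum(out_edges.values())
            if total == 0:
                continue  # nodes without outgoing edges are dead ends in this sketch
            for neigh, weight in out_edges.items():
                new_scores[neigh] += (1 - restart) * scores[node] * weight / total
        new_scores[query] += restart  # restart mass returns to the query entity
        delta = sum(abs(new_scores[n] - scores[n]) for n in nodes)
        scores = new_scores
        if delta < tol:
            break
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [(entity, round(score, 3)) for entity, score in ranked if entity != query][:k]

# Toy usage on a tiny entity network
toy_graph = {
    "Egypt": {"Cairo": 3.0, "Ptolemaic Kingdom": 1.0},
    "Cairo": {"Egypt": 3.0},
    "Ptolemaic Kingdom": {"Egypt": 1.0},
}
print(related_entities(toy_graph, "Egypt", k=2))
```

In practice the same walk is simply pointed at two different entity networks (one from Wikipedia, one from Yahoo! Answers), which is what makes the comparison of sources meaningful.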
Serendipity: “making fortunate discoveries by accident”
Serendipity = unexpectedness + relevance
“Expected” result baselines are taken from web search. Two ratios are reported (a worked computation is sketched after the table):
•  | relevant & unexpected | / | unexpected | – the number of serendipitous results out of all of the unexpected results retrieved
•  | relevant & unexpected | / | retrieved | – serendipitous results out of all retrieved results

Baseline                                                      Data    Score
Top: 5 entities that occur most frequently in the top-5       WP      0.63 (0.58)
search results from Bing and Google                           YA      0.69 (0.63)
Top – WP: same as above, but excluding the Wikipedia          WP      0.63 (0.58)
page from the results                                         YA      0.70 (0.64)
Rel: top 5 entities in the related-query suggestions          WP      0.64 (0.61)
provided by Bing and Google                                   YA      0.70 (0.65)
Rel + Top: union of Top and Rel                               WP      0.61 (0.54)
                                                              YA      0.68 (0.57)
(WP = Wikipedia, YA = Yahoo! Answers)
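As a worked illustration of the two ratios defined above, the following sketch computes them from sets of retrieved, relevant and expected (web-search baseline) entities; the example entities and judgements are made up for illustration.

```python
# Minimal sketch of the two serendipity ratios above.
# Entities and judgements below are made-up illustrations, not data from the paper.

def serendipity_scores(retrieved, relevant, expected):
    """retrieved: results for a query entity; relevant: entities judged relevant;
    expected: baseline entities from web search."""
    retrieved, relevant, expected = set(retrieved), set(relevant), set(expected)
    unexpected = retrieved - expected
    serendipitous = unexpected & relevant          # relevant AND unexpected
    per_unexpected = len(serendipitous) / len(unexpected) if unexpected else 0.0
    per_retrieved = len(serendipitous) / len(retrieved) if retrieved else 0.0
    return per_unexpected, per_retrieved

retrieved = ["Cairo", "Ptolemaic Kingdom", "Nile", "Pyramids", "Sphinx"]
relevant  = ["Cairo", "Ptolemaic Kingdom", "Nile", "Pyramids"]
expected  = ["Cairo", "Pyramids", "Sphinx"]        # what web search already surfaces
print(serendipity_scores(retrieved, relevant, expected))  # -> (1.0, 0.4)
```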
Serendipitous ≠ relevant: some results are more serendipitous than relevant, others more relevant than serendipitous. Example entity pairs (WP = Wikipedia, YA = Yahoo! Answers):
•  Oil Spill → Penguins in Sweaters (WP)
•  Robert Pattinson → Water for Elephants (WP)
•  Lady Gaga → Britney Spears (WP)
•  Egypt → Cairo (WP)
•  Netflix → Blu-ray Disc (YA)
•  Egypt → Ptolemaic Kingdom (WP & YA) – novelty
(see “common” knowledge in Weikum, 2017 WWW TempWeb keynote)
Engaging through serendipity: which source?
•  Engagement in search means viewing search activities as part of the user’s current overall task, including tasks of a leisurely or exploratory nature
•  Not all social media sources provide a serendipitous search experience
(slides based on Bordino’s presentation at CIKM 2016)
Outline
•  User engagement
•  Engaging through diversity: serendipity
•  Engaging through diversity: awareness (effective algorithm, perception)
We develop and deploy algorithms to “engage” users. This part focuses on user perception.
(Graells-Garrido, Lalmas & Baeza-Yates, 2016)
Use case II: Geographical bias on social media
Twitter is global (Leetaru et al., 2013)
•  But there are cognitive and systemic biases that shape user behaviour
•  Can we do something about it?
Context: Chile, a centralized country
•  Economic, political and media power is concentrated in Santiago (the capital); Región Metropolitana (RM) is the capital region
•  Twitter activity is centralized – RM receives more tweets from other locations than expected given the population distribution
[Chart: flow of tweet activity between administrative regions]
Algorithms to overcome geographical bias
Create a geographically diverse timeline:
●  Proposed method “PM”: information entropy + sidelines (enforce location)
●  Baseline “DIV”: information entropy only
●  Baseline “POP”: most popular tweets (mostly tweets from Santiago/RM)
After reading two timelines side by side, participants rated which one is more diverse, interesting and informative, using a Likert scale from -3 to 3. The entropy idea is sketched below.
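To make the entropy idea concrete, here is a minimal sketch of a greedy selection that adds, at each step, the candidate tweet whose location most increases the Shannon entropy of the timeline; the scoring, the tie-breaking and the omission of the “sidelines” location-enforcement step are simplifying assumptions, not the exact DIV/PM procedures.

```python
# Minimal sketch: greedily build a timeline that maximizes the Shannon entropy
# of its location distribution. This illustrates the entropy idea behind DIV/PM;
# candidate scoring, tie-breaking and the "sidelines" step are simplified assumptions.
import math
from collections import Counter

def location_entropy(locations):
    counts = Counter(locations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def diverse_timeline(candidates, size=10):
    """candidates: list of (tweet_id, region, popularity). Returns a timeline of `size` tweets."""
    remaining = sorted(candidates, key=lambda t: -t[2])   # consider popular tweets first
    timeline = []
    while remaining and len(timeline) < size:
        regions = [t[1] for t in timeline]
        # pick the candidate that yields the highest entropy once added; break ties by popularity
        best = max(remaining, key=lambda t: (location_entropy(regions + [t[1]]), t[2]))
        timeline.append(best)
        remaining.remove(best)
    return timeline

# Toy usage: mostly-RM candidates plus a few from other regions
candidates = [("t1", "RM", 90), ("t2", "RM", 80), ("t3", "Valparaíso", 40),
              ("t4", "Biobío", 35), ("t5", "RM", 70), ("t6", "Antofagasta", 20)]
for tweet in diverse_timeline(candidates, size=4):
    print(tweet)
```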
Main result
Being from a central or peripheral location makes a difference:
•  There is a statistical interaction between location and algorithm (POP vs. PM), as sketched below
•  RM participants find PM more diverse than POP; NOT-RM participants do not
•  For peripheral (NOT-RM) users, the diversity present by design in both algorithms (DIV and PM) was not perceived
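To illustrate what a statistical interaction between location and algorithm means in practice, here is a minimal sketch of a two-way ANOVA with an interaction term on simulated Likert-style ratings; the column names, the simulated data and the use of OLS/ANOVA are assumptions for illustration, not the study’s actual analysis.

```python
# Minimal sketch: test a location x algorithm interaction on Likert-style
# diversity ratings. Column names and the simulated data are illustrative
# assumptions, not the study's data or its exact statistical model.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "location": rng.choice(["RM", "NOT_RM"], size=n),
    "algorithm": rng.choice(["POP", "PM"], size=n),
})
# Simulate ratings where only RM participants rate PM as more diverse
effect = ((df["location"] == "RM") & (df["algorithm"] == "PM")).astype(float) * 1.5
df["diversity_rating"] = np.clip(np.round(effect + rng.normal(0, 1, size=n)), -3, 3)

model = smf.ols("diversity_rating ~ C(location) * C(algorithm)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # the interaction row captures the effect
```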
Algorithms are not enough to overcome bias
•  Users do not see the diversity in the timelines because they cannot identify themselves (in the location sense), even though diversity was present
•  There is a diversity and representation awareness problem
•  How can we make users aware of their representation in the timeline, as well as of the diversity inherent in it?
Previous work on news aggregators indicates that clustered representations help users become aware of diversity. Two interface conditions were compared: clustered tweets by location vs. standalone tweets.
•  Inspired by newsmap.jp, use treemaps to depict differences in a tweet’s geographical origin, while giving every location a balanced amount of exposure (see the sketch below)
•  Allow users to filter locations by selecting a specific region
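As one possible reading of “a balanced amount of exposure”, the sketch below blends each location’s raw tweet share with an equal share before sizing treemap cells; the blending scheme and the alpha parameter are hypothetical, not the interface’s actual implementation.

```python
# Minimal sketch: give every location a balanced amount of exposure by blending
# its raw tweet share with an equal share. The blend factor (alpha) is an
# illustrative assumption; the resulting weights would then size treemap cells.
from collections import Counter

def balanced_exposure(tweets, alpha=0.5):
    """tweets: list of (tweet_id, region). Returns region -> cell weight (sums to 1)."""
    counts = Counter(region for _, region in tweets)
    total = sum(counts.values())
    equal_share = 1.0 / len(counts)
    return {
        region: alpha * (count / total) + (1 - alpha) * equal_share
        for region, count in counts.items()
    }

tweets = [("t1", "RM"), ("t2", "RM"), ("t3", "RM"), ("t4", "RM"),
          ("t5", "Valparaíso"), ("t6", "Biobío")]
print(balanced_exposure(tweets))
# RM dominates the raw counts (4/6) but gets a dampened cell, while
# Valparaíso and Biobío get more visible cells than their raw share would give.
```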
“In the wild” study
Purpose: evaluate user involvement with the application as a proxy for diversity and representation awareness.
•  Diversity – do users click on content related to different locations?
•  Representation – do users choose to see only their own location using the filters?
•  Interestingness – how many interactions with content do users make?
A social bot (@todocl) generates timelines every hour and broadcasts them, mentioning featured users and retweeting their tweets.
Experimental setup
Between-subjects design; N = 321 (RM = 193, NOT-RM = 128)
Main results
•  The treemap increases the number of interaction events, the number of locations interacted with, and the likelihood of using the filter: users interacted with more content, from more locations, and also filtered locations (diversity)
•  Being from RM increases the number of locations interacted with and decreases the likelihood of filtering; NOT-RM users show increased representation awareness – they find themselves!
Engaging through awareness: perception
•  Bias (centralization) affects information perception and user behavior
•  Algorithms are not enough! We need to find ways of “showing” information to users
•  Not necessarily new algorithms – but something new that we do not yet have a name for
(slides based on Graells-Garrido’s presentation at IUI 2016)
Final message
Not every culture has the same notion of relevance and importance. Even within a country there are differences! We need algorithms that do not forget about these differences.

References
•  E. Graells-Garrido, M. Lalmas and R. Baeza-Yates. Encouraging Diversity- and Representation-Awareness in Geographically Centralized Content. ACM IUI, 2016.
•  I. Bordino, Y. Mejova and M. Lalmas. Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content. ACM CIKM, 2013.
•  M. Lalmas, H. O'Brien and E. Yom-Tov. Measuring User Engagement. Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, 2014.