Engagement, Metrics &
Personalisation at Scale
Mounia Lalmas
Spotify’s mission is to
unlock the potential of
human creativity —
by giving a million
creative artists the
opportunity to live off
their art and billions of
fans the opportunity to
enjoy and be inspired
by it.
Personalisation at scale
Focus on music listening
Qualitative & quantitative research
KPIs & business metrics
Algorithms
Training & Datasets
Optimisation metrics
Offline & online evaluation
Explicit & implicit feedback
Features
(items)
Features
(users)
Features
(others) Bias
Personalisation at scale
“HCOMP is the home of the human computation and
crowdsourcing community. It’s the premier venue for
presenting latest findings from research and practice into
frameworks, methods and systems that bring together
people and machine intelligence to achieve better results.”
humancomputation.com
The role of “human in the loop”
when personalising at scale.
Part 1: About putting humans in the loop
Part 2: How Spotify puts these ideas into practice
PZN Offsite 2019
About
Part I
User engagement
Metrics
Human in the loop
What is user engagement?
“User engagement is the quality of the user
experience that emphasizes the positive aspects
of interaction – in particular the fact of wanting
to use the technology longer and often.”
S Attfield, G Kazai, M Lalmas & B Piwowarski. Towards a science of user engagement (Position Paper).
WSDM Workshop on User Modelling for Web Applications, 2011.
The engagement life cycle
Point of engagement
● How engagement starts (acquisition & activation)
● Aesthetics & novelty in sync with user interests & contexts
Period of engagement
● Ability to maintain user attention and interests
● Often the focus of personalisation algorithms and the focus of this talk
Disengagement
● Loss of interest leads to passive usage & even stopping usage
● Identifying users that are likely to churn is often undertaken
Re-engagement
● Engage again after becoming disengaged
● Triggered by relevance, novelty, convenience, remembering past positive
experience, sometimes as a result of campaign strategy
The engagement life cycle
[Diagram labels: New Users, Acquisition, Activation, Active Users, Disengagement, Dormant Users, Re-engagement, Churn]
Period of engagement
relates to the quality of the user
experience with the product
during a session and across
sessions
The engagement life cycle
[Diagram labels: New Users, Acquisition, Activation, Active Users, Disengagement, Dormant Users, Re-engagement, Churn]
Period of engagement
relates to user behaviour
with the product during a
session and across sessions.
Quality of the user experience during and
across sessions
We need metrics to quantify the quality of the
user experience
PZN Offsite 2019
About
Part I
User engagement
Metrics
Human in the loop
Three levels of metrics
1. Business metrics: KPIs
2. Behavioral metrics: Online metrics
3. Optimisation metrics: Objective metrics to train personalisation algorithms
● follow
● post
● percentage completion
● dwell time
● abandonment rate
● click
● impression to click rate
● save
Optimisation metrics mostly quantify users’ feedback signals
within and across sessions and act as proxy of engagement
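To make a few of these feedback signals concrete, the sketch below computes impression to click rate, mean dwell time and abandonment rate from a small in-memory log. The `Impression` record, its field names and the 10% abandonment cut-off are assumptions made for illustration, not definitions from the talk.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One item shown to a user (hypothetical log record)."""
    clicked: bool               # did the user click the shown item?
    dwell_seconds: float        # time spent after the click (0 if no click)
    completed_fraction: float   # share of the item consumed, 0.0 to 1.0


def impression_to_click_rate(impressions):
    """Share of impressions that led to a click."""
    if not impressions:
        return 0.0
    return sum(i.clicked for i in impressions) / len(impressions)


def mean_dwell_time(impressions):
    """Average dwell time over clicked impressions only."""
    dwell = [i.dwell_seconds for i in impressions if i.clicked]
    return sum(dwell) / len(dwell) if dwell else 0.0


def abandonment_rate(impressions, min_fraction=0.1):
    """Share of clicked items dropped before min_fraction was consumed.

    The 0.1 default is an arbitrary illustrative cut-off.
    """
    clicked = [i for i in impressions if i.clicked]
    if not clicked:
        return 0.0
    abandoned = sum(i.completed_fraction < min_fraction for i in clicked)
    return abandoned / len(clicked)


log = [
    Impression(True, 120.0, 0.8),
    Impression(False, 0.0, 0.0),
    Impression(True, 5.0, 0.05),
    Impression(False, 0.0, 0.0),
]
print(impression_to_click_rate(log), mean_dwell_time(log), abandonment_rate(log))
```

Each function reads the same log differently, which is one reason a single number rarely captures engagement on its own.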
Why several metrics?
● Games: Users visit infrequently but stay a long time
● Search: Users visit frequently but do not stay for long
● Social media: Users visit frequently and stay a long time
● Niche: Users visit once a week
● News: Users visit periodically, e.g. morning and evening
● Service: Users visit site when needed
Playlists differ in
listening patterns
Search has a particular
engagement pattern
Engagement varies by
media type and freshness
Home has its own
“star” engagement pattern
Why several metrics for Spotify?
[Diagram labels: Leaning back, Leaning in, Active, Occupied]
Playlist types: Pure discovery sets, Trending tracks, Fresh Finds
Playlist metrics: Downstreams, Artist discoveries, # or % of tracks sampled
Playlist types: Sleep, Chill at home, Ambient sounds
Playlist metrics: Session time
Playlist types: Workout, Study, Gaming
Playlist metrics: Session time, Skip rate
Playlist types: Hits flagships, Decades, Moods
Playlist metrics: Skip rate, Downstreams
Why several metrics for Spotify playlists?
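One way to read the slide above is as a lookup from playlist type to the metrics that make sense for it. The sketch below encodes that reading; the family names ("discovery", "sleep", "workout", "hits") and the snake_case metric identifiers are hypothetical labels introduced here, only the playlist types and metric names come from the slide.

```python
# Metrics per playlist family, following the four groups on the slide.
# The family keys are hypothetical names chosen for this sketch.
PLAYLIST_METRICS = {
    "discovery": ["downstreams", "artist_discoveries", "tracks_sampled"],
    "sleep": ["session_time"],
    "workout": ["session_time", "skip_rate"],
    "hits": ["skip_rate", "downstreams"],
}

# Playlist types as listed on the slide, mapped to their family.
PLAYLIST_FAMILY = {
    "Pure discovery sets": "discovery",
    "Trending tracks": "discovery",
    "Fresh Finds": "discovery",
    "Sleep": "sleep",
    "Chill at home": "sleep",
    "Ambient sounds": "sleep",
    "Workout": "workout",
    "Study": "workout",
    "Gaming": "workout",
    "Hits flagships": "hits",
    "Decades": "hits",
    "Moods": "hits",
}


def metrics_for(playlist_type):
    """Return the metric names to track for a given playlist type."""
    family = PLAYLIST_FAMILY[playlist_type]  # KeyError for unknown types
    return PLAYLIST_METRICS[family]


print(metrics_for("Sleep"))
print(metrics_for("Decades"))
```

A skip on a discovery playlist and a skip on a sleep playlist then end up being judged against different metrics.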
PZN Offsite 2019
About
Part I
User engagement
Metrics
Human in the loop
What is human in the loop?
For the purpose of this talk
“Human-in-the-loop or HITL is defined as a model that requires human interaction.”
Wikipedia
“Human-in-the-loop (HITL) is a branch of artificial intelligence that leverages both human and
machine intelligence to create machine learning models.”
Appen, January 15, 2019
Some thinking around designing AI systems with human in the loop
● L von Ahn & L Dabbish. Designing Games with a Purpose. Communications of the ACM 2008.
● S Amershi, D Weld, M Vorvoreanu, A Fourney, B Nushi, P Collisson, J Suh, S Iqbal, PN Bennett, K Inkpen, J Teevan, R Kikin-Gil & E Horvitz. Guidelines for Human-AI Interaction. CHI 2019.
● G Bansal, B Nushi, E Kamar, WS Lasecki, DS Weld & E Horvitz. Beyond accuracy: The role of mental models in human-AI team performance. HCOMP 2019.
How to incorporate human in
the loop to personalise at scale?
[Diagram labels: Implicit feedback, Explicit feedback, quantitative proxy, qualitative proxy, Human in the loop, Personalised results]
How to incorporate human in the loop
to personalise at scale?
Some answers through the lens of our research at Spotify
1. Understanding intents
● What do users want on Home?
● What do users want from Search?
2. Optimising for the right metric
● How are users satisfied with Search?
● How are users satisfied with Playlists?
3. Acting on segmentation
● What music do users listen to?
● How do users listen to music?
4. Thinking about diversity
● How to consider diversity in user satisfaction?
● How to consider diversity in content?
PZN Offsite 2019
How
Part II
Understanding
intents
Optimising for
the right metric
Acting on
segmentation
Thinking
about diversity
Understanding
intents
[1] R Mehrotra, M Lalmas, D Kenney, T Lim-Meng & G
Hashemian. Jointly Leveraging Intent and Interaction
Signals to Predict User Satisfaction with Slate
Recommendations. WWW 2019.
What do users
want on Home?
Knowing user intent on Home helps interpret
users' implicit feedback
Passively Listening
- quickly access playlists or saved music (2)
- play music matching mood or activity (4)
- find music to play in background (6)
Other
Home is default screen (1)
Actively Engaging
- discover new music to listen to now (3)
- to find X (5)
- save new music or follow new playlists for later (7)
- explore artists or albums more deeply (8)
Considering intent
and learning across
intents improves
ability to infer user
satisfaction by 20%
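A toy sketch of why intent matters when reading implicit feedback: the same session can look like a success under one intent and a failure under another. The two intent labels follow the groups above, but the signals (`seconds_to_first_play`, `items_explored`) and both thresholds are hypothetical, and this rule-based function is not the model from [1].

```python
PASSIVE = "passively_listening"
ACTIVE = "actively_engaging"


def looks_satisfied(intent, seconds_to_first_play, items_explored):
    """Interpret the same implicit signals differently depending on intent.

    Thresholds are illustrative assumptions, not values from the talk.
    """
    if intent == PASSIVE:
        # Quick access is the goal: a fast start suggests success.
        return seconds_to_first_play <= 10
    if intent == ACTIVE:
        # Browsing is expected: exploring several items is the positive signal.
        return items_explored >= 3
    raise ValueError(f"unknown intent: {intent}")


# The same session, read under two different intents.
print(looks_satisfied(PASSIVE, seconds_to_first_play=45, items_explored=4))
print(looks_satisfied(ACTIVE, seconds_to_first_play=45, items_explored=4))
```

A long time before the first play counts against a passive session here, while the same delay is harmless in an active one.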
What do users want from Search?
FOCUSED
One specific thing in mind
● Find it or not
● Quickest/easiest path to results
is important
OPEN
A seed of an idea in mind
● From nothing good enough,
good enough to better than
good enough
● Willing to try things out
● But still want to fulfil their intent
EXPLORATORY
A path to explore
● Difficult for users to assess
how it went
● May be able to answer in
relative terms
● Users expect to be active when
in an exploratory mindset
● Effort is expected
How users think about
results relates to how they
use Spotify tabs
[2] C Hosey, L Vujović, B St. Thomas, J Garcia-Gathright
& J Thom. Just Give Me What I Want: How People Use
and Evaluate Music Search. CHI 2019.
[3] A Li, J Thom, P Ravichandran, C Hosey, B St.
Thomas & J Garcia-Gathright. Search Mindsets:
Understanding Focused and Non-Focused
Information Seeking in Music Search. WWW 2019.
Important to consider user intent to predict
satisfaction, define optimisation metric or interpret
a metric.
K Shu, S Mukherjee, G Zheng, A Hassan Awadallah, M Shokouhi & S Dumais. Learning with Weak Supervision for Email Intent Detection. SIGIR 2020.
J Thom, A Nazarian, R Brillman, H Cramer & S Mennicken. "Play Music": User Motivations and Expectations for Non-Specific Voice Queries. ISMIR 2020.
N Martelaro, S Mennicken, J Thom, H Cramer & W Ju. Using Remote Controlled Speech Agents to Explore Music Experience in Context. DIS 2020.
N Su, J He, Y Liu, M Zhang & S Ma. User Intent, Behaviour, and Perceived Satisfaction in Product Search. WSDM 2018.
J Cheng, C Lo & J Leskovec. Predicting Intent Using Activity Logs: How Goal Specificity and Temporal Range Affect User Behavior. WWW 2017.
Understanding intent is hard
Optimising for the
right metric
[4] P Chandar, J Garcia-Gathright, C Hosey, B St
Thomas & J Thom. Developing Evaluation Metrics for
Instant Search Using Mixed Methods. SIGIR 2019.
How are users satisfied with Search?
Users evaluate their search experience in
terms of effort and success
TYPE: User communicates with us
CONSIDER: User evaluates what we show them
DECIDE: User ends the search session
EFFORT: Depends on a user mindset (focused, open, exploratory)
SUCCESS: Depends on user goal (listen, organize, share)
Success rate, a composite metric of all
success-related behaviors, is more sensitive
than click-through rate
[Chart labels: Success, Click-through]
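The difference between click-through rate and a composite success rate can be sketched on a handful of search sessions. The session representation (a list of action names per session) and the exact set of success actions are assumptions for illustration; the three goals (listen, organize, share) are taken from the slide, and [4] defines the actual metric.

```python
# Actions counted as success, one per user goal named on the slide.
SUCCESS_ACTIONS = {"listen", "organize", "share"}


def click_through_rate(sessions):
    """Share of sessions containing at least one click."""
    if not sessions:
        return 0.0
    return sum("click" in s for s in sessions) / len(sessions)


def success_rate(sessions):
    """Share of sessions containing at least one success-related action."""
    if not sessions:
        return 0.0
    return sum(bool(SUCCESS_ACTIONS & set(s)) for s in sessions) / len(sessions)


sessions = [
    ["click", "listen"],  # clicked a result and streamed it
    ["click"],            # clicked but did nothing further
    ["share"],            # shared straight from the result list
    ["organize"],         # added a result to a playlist without clicking
]
print(click_through_rate(sessions), success_rate(sessions))
```

In this toy log the click-only view misses the two sessions that ended in sharing or organising, and counts one click that led nowhere.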
[5] P Dragone, R Mehrotra & M Lalmas. Deriving User-
and Content-specific Rewards for Contextual
Bandits. WWW 2019.
How are users satisfied with playlists?
Using playlist consumption time to inform the metric
to optimise for playlist satisfaction on Home
Optimizing for mean consumption time led to +22.24% in predicted
stream rate. Defining the metric per user group x playlist type cluster
led to a further +13%.
[Diagram labels: mean of consumption time; co-clustering user group x playlist type]
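A minimal sketch of the idea of a cluster-specific reward: compute the mean consumption time per (user group, playlist type) cell and treat a stream as satisfying when it reaches that cell's mean. The binary thresholding, the function names and the assumption that groups and types are already assigned are illustrative; [5] describes the actual co-clustering and reward derivation.

```python
from collections import defaultdict
from statistics import mean


def cluster_thresholds(logs):
    """Mean consumption time per (user_group, playlist_type) cell.

    logs: iterable of (user_group, playlist_type, consumption_seconds).
    """
    by_cell = defaultdict(list)
    for user_group, playlist_type, seconds in logs:
        by_cell[(user_group, playlist_type)].append(seconds)
    return {cell: mean(values) for cell, values in by_cell.items()}


def reward(user_group, playlist_type, seconds, thresholds, fallback):
    """1 if consumption reaches the cell's mean, else 0.

    Falls back to a global threshold for cells never seen in the logs.
    """
    threshold = thresholds.get((user_group, playlist_type), fallback)
    return 1 if seconds >= threshold else 0


logs = [
    ("g1", "sleep", 3000), ("g1", "sleep", 5000),
    ("g1", "workout", 600), ("g1", "workout", 1000),
]
thresholds = cluster_thresholds(logs)
global_mean = mean(seconds for _, _, seconds in logs)

# A 900 second workout stream is above its cell mean (800)
# but far below the global mean (2400).
print(reward("g1", "workout", 900, thresholds, global_mean))
print(reward("g1", "workout", 900, {}, global_mean))
```

With a single global threshold, long sleep sessions would dominate and most workout streams would look unsatisfying, which is the kind of distortion a per-cluster definition is meant to avoid.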
Personalisation algorithms will be very good at
optimising for the chosen metric.
L Hong & M Lalmas. Tutorial on Online User Engagement: Metrics and Optimization. WWW 2019 & KDD 2020.
H Hohnhold, D O’Brien & D Tang. Focusing on the Long-term: It’s Good for Users and Business. KDD 2015.
G Dupret & M Lalmas. Absence time and user engagement: Evaluating Ranking Functions. WSDM 2013.
X Yi, L Hong, E Zhong, N Nan Liu & S Rajan. Beyond clicks: dwell time for personalization. RecSys 2014.
M Lalmas, H O’Brien & E Yom-Tov. Measuring user engagement. Morgan & Claypool Publishers, 2014.
J Lehmann, M Lalmas, E Yom-Tov and G Dupret. Models of User Engagement. UMAP 2012.
Choosing metric is important
Acting on
segmentation
[6] S Way, J Garcia-Gathright, and H Cramer. Local
Trends in Global Music Streaming. ICWSM 2020.
Despite access to a global catalog, countries
are increasingly streaming their own, local music
Global music trade is
strongly shaped by
language and geography
Used Gravity Modeling
to study how these
relationships are changing
over time, around the world
Local music is on the rise
What music
users listen to?
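For readers unfamiliar with gravity modelling, the sketch below shows the classic form of the model, where the expected flow between two places grows with their sizes and shrinks with the distance between them. This is the textbook formulation only; the exact specification, the size and distance measures, and any language terms used in [6] are not reproduced here, and all parameter names are assumptions.

```python
def gravity_flow(mass_origin, mass_dest, distance, g=1.0, beta=1.0):
    """Classic gravity model: flow = g * m_i * m_j / distance ** beta.

    mass_origin, mass_dest: size of each country (e.g. a market size proxy).
    distance: geographic or cultural distance, must be positive.
    g, beta: constants that would normally be fitted to observed flows.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return g * mass_origin * mass_dest / distance ** beta


# Doubling the distance exponent sharply reduces the expected flow.
print(gravity_flow(10, 20, 4))
print(gravity_flow(10, 20, 4, beta=2))
```

Fitting such a model on successive time windows is one way to see whether distance matters more or less over time.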
[7] A Anderson, L Maystre, R Mehrotra, I Anderson & M
Lalmas. Algorithmic Effects on the Diversity of
Consumption on Spotify. WWW 2020.
How do users listen to music?
Generalists and specialists exhibit different
retention and conversion behaviors
Generalist-Specialist Score (GS)
[Chart axis labels: Specialist, Generalist]
Generalists churn less and convert more than specialists
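A sketch of how a generalist-specialist score could be computed, assuming it is the play-count-weighted mean cosine similarity between a user's track embeddings and their weighted centroid; under that assumption a score near 1 indicates a specialist and lower scores indicate a generalist. The function name and the toy two-dimensional embeddings are hypothetical, and [7] remains the reference for the exact definition.

```python
import math


def _norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def gs_score(track_vectors, play_counts):
    """Weighted mean cosine similarity of tracks to the user's centroid.

    track_vectors: non-zero embedding vectors, one per distinct track.
    play_counts: how often the user played each track (used as weights).
    """
    total = sum(play_counts)
    dim = len(track_vectors[0])
    centroid = [
        sum(w * v[d] for v, w in zip(track_vectors, play_counts)) / total
        for d in range(dim)
    ]
    centroid_norm = _norm(centroid)
    similarities = [
        sum(a * b for a, b in zip(v, centroid)) / (_norm(v) * centroid_norm)
        for v in track_vectors
    ]
    return sum(w * s for w, s in zip(play_counts, similarities)) / total


specialist = gs_score([[1.0, 0.0], [1.0, 0.0]], [1, 1])   # all tracks alike
generalist = gs_score([[1.0, 0.0], [0.0, 1.0]], [1, 1])   # tracks far apart
print(specialist, generalist)
```

Users can then be bucketed along this score before comparing retention and conversion across buckets.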
Segmentation helps personalisation algorithms
to perform for users and contents across the
spectrum.
C Hansen, C Hansen, L Maystre, R Mehrotra, B Brost, F Tomasi & M Lalmas. Contextual and Sequential User Embeddings for Large-Scale Music Recommendation. RecSys 2020.
A Epps-Darling, R Takeo, H Cramer. Female creator representation in music streaming. ISMIR 2020.
J Yan, W Chu & R White. Cohort modeling for enhanced personalized search. SIGIR 2014.
S Goel, A Broder, E Gabrilovich & B Pang. Anatomy of the long tail: ordinary people with extraordinary tastes. WSDM 2010.
R White, S Dumais & J Teevan. Characterizing the influence of domain expertise on web search behavior. WSDM 2009.
Optimizing for segmentation
Thinking about
diversity
[8] R Mehrotra, N Xue & M Lalmas. Bandit based
Optimization of Multiple Objectives on a Music
Streaming Platform. KDD 2020.
How to consider diversity in user satisfaction?
Optimizing for multiple satisfaction objectives
together performs better than single metric
optimisation
Optimising for multiple satisfaction
metrics performs better for each metric
than directly optimising that metric
● clicks
● streaming time
● total number of tracks played
[Chart labels: Single objective models, Multi-objective model]
Learning more relevant patterns of user
satisfaction with more optimisation metrics
● positive correlation between objectives
● holistic view of user experience
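To turn several satisfaction objectives into one number a bandit can optimise, some aggregation is needed. The sketch below uses the generalized Gini index, which pairs the largest weight with the worst-performing objective and so favours balanced outcomes. Treating this as the aggregation behind the result above is an assumption; the function name, weights and normalised objective values are illustrative, and [8] gives the actual method.

```python
def generalized_gini(objective_values, weights):
    """Generalized Gini index of a vector of objective values.

    Values are sorted ascending and weights descending, so the worst
    objective receives the largest weight.
    """
    if len(objective_values) != len(weights):
        raise ValueError("need one weight per objective")
    ordered_values = sorted(objective_values)          # worst objective first
    ordered_weights = sorted(weights, reverse=True)    # largest weight first
    return sum(w * v for w, v in zip(ordered_weights, ordered_values))


weights = [1.0, 0.5, 0.25]

# Normalised (clicks, streaming time, tracks played) for two candidates
# with the same total across objectives.
balanced = generalized_gini([0.5, 0.5, 0.5], weights)
lopsided = generalized_gini([0.9, 0.1, 0.5], weights)
print(balanced, lopsided)
```

The balanced candidate scores higher even though both have the same sum, which is the behaviour one wants when no single metric should be sacrificed for another.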
[9] C Hansen, R Mehrotra, C Hansen, B Brost, L Maystre
& M Lalmas. Shifting Consumption towards Diverse
Content on Music Streaming Platforms. WSDM 2021.
Personalisation algorithms need to explicitly
optimise for content diversity
As personalisation algorithms increase
in complexity, they improve satisfaction
but at the cost of content diversity
Choice of personalisation algorithm
is more important when considering
diversity compared to considering
satisfaction only
How to
consider
diversity of
content?
When thinking about diversity, personalisation
algorithms become informed about what and
who they serve.
M Abdool, M Haldar, P Ramanathan, T Sax, L Zhang, A Manasawala, S Yang, B Turnbull, Q Zhang & T Legrand. Managing Diversity in Airbnb Search. KDD 2020.
H Cramer, J Wortman-Vaughan, K Holstein, H Wallach, H Daume, M Dudík, S Reddy & J Garcia-Gathright. Algorithmic bias in practice. FAT* Industry Translation Tutorial, 2019.
H Steck. Calibrated recommendations. RecSys 2019.
P Shah, A Soni & T Chevalier. Online Ranking with Constraints: A Primal-Dual Algorithm and Applications to Web Traffic-Shaping. KDD 2017.
D Agarwal & S Chatterjee. Constrained optimization for homepage relevance. WWW 2015.
Understanding diversity
Let us recap
... to incorporate human in the loop
when personalising at scale
PZN Offsite 2019
Recap
1. Understanding intents
2. Optimising for the right metric
3. Acting on segmentation
4. Thinking about diversity
User intents help inform metric
optimisation & interpretation.
Segmentation helps adapt
personalisation algorithms.
Intent, segmentation & diversity
help bring the human in the loop
in our personalisation algorithms.
Qualitative & quantitative research
KPIs & business metrics
Algorithms
Training & Datasets
Optimization metrics
Offline & online evaluation
Explicit & implicit feedback
Features
(items)
Features
(users)
Features
(others) Bias
Incorporating human in the loop to personalise
at scale
Understanding intents [1,2,3]
Acting on segmentation [6,7]
Optimising for the right metric [4,5]
Thinking about diversity [8,9]
Thank you
