Opening the black box of user profiles in content-based recommender systems
Fair and Transparent Machine Learning @ ICAI Meetup
April 12, 2019
David Graus | david.graus@fdmediagroep.nl | @dvdgrs
FD Mediagroup
The leading information provider in the financial economic domain in the Netherlands
Context: AI @ FD Mediagroup
Speciaal voor U (“Especially for You”)
Why fair and transparent?
User studies have shown:
• Our users want personalized content
• Our users care for transparency
FD:
• Verifiability is one of the core values for FD Mediagroup
• Transparency for verifiability
The Context
Diagram: User data, Rec engine, Inferred data, Recommendations
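The pipeline in this diagram can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not FD's actual system: the topic labels, the `build_profile` and `recommend` helpers, and the scoring rule (share of past reads per topic) are all hypothetical.

```python
from collections import Counter


def build_profile(read_topics):
    """Inferred data: share of a user's reads per topic (hypothetical profile)."""
    counts = Counter(read_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def recommend(profile, candidates, k=2):
    """Rank candidate articles by the profile weight of their topic."""
    ranked = sorted(
        candidates,
        key=lambda article: profile.get(article["topic"], 0.0),
        reverse=True,
    )
    return [article["title"] for article in ranked[:k]]


# User data: topics of the articles a made-up user has read.
reads = ["Banks", "Banks", "Housing_Market", "Energy"]
profile = build_profile(reads)

# Made-up candidate articles to rank against the profile.
candidates = [
    {"title": "A", "topic": "Sport"},
    {"title": "B", "topic": "Banks"},
    {"title": "C", "topic": "Energy"},
]
top = recommend(profile, candidates)
```

In a sketch like this the profile is an explicit topic distribution, so it can be shown to the user directly. That is the input side of the recommender, which is where the explanations in this talk are aimed.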
STEP 1: DISCOURSE. Understanding the problem
STEP 2: FRAMEWORK. Systematic layering of explanations
STEP 3: EVALUATION. Data exploration and evaluation
Discourse
Who is the target audience of the explanations?
What is the goal?
What purpose do the explanations serve?
Framework
• I want to be an expert
• I want to stay informed
• I want to broaden my horizon
• I want to discover the unexplored
Values
Broadness, diversity, autonomy, objectivity, match with the user needs, controllability
Data exploration
User study
System
Evaluation
Your goal is to Broaden your Horizons. There may be topics you do not normally read about, but you may actually find interesting. Exploring this helps to build a broad perspective on the issues that matter to you.

Your goal is to Discover the Unexplored. There may be topics that you haven’t explored before that may actually become new interests. Exploring new topics can promote creativity and objectivity.
User study
Aim: Study whether being offered a particular goal would influence the user’s intended reading behavior
Objective
• Pick a persona from four data-driven profiles (Persona 1 to Persona 4)
• Randomly assign goal order (Goal A, Goal B)
• Explain the goals: Broaden Horizons, Discover the Unexplored
• Show visualization
• Questionnaire
Similarity
Familiarity
Hypotheses
1. Goal Framework:
Topics chosen under Broaden Horizon are more similar and more familiar than topics chosen under Discover the Unexplored.
2. Broaden Horizon:
Users choose topics that have high similarity and high familiarity compared to the non-selected topics.
3. Discover the Unexplored:
Users choose topics that have low similarity and low familiarity compared to the non-selected topics.
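The comparisons behind these three hypotheses can be sketched as follows. This is a minimal illustration, not the analysis used in the study: the per-topic familiarity scores, the `mean_gap` helper, and the example picks are hypothetical, and no significance test is included. The same comparison would apply to similarity scores.

```python
from statistics import mean


def mean_gap(scores, selected):
    """Mean score of selected topics minus mean score of non-selected topics."""
    chosen = [score for topic, score in scores.items() if topic in selected]
    rest = [score for topic, score in scores.items() if topic not in selected]
    return mean(chosen) - mean(rest)


# Hypothetical familiarity scores for one persona (made-up share of past reads).
familiarity = {"Banks": 0.5, "Housing_Market": 0.25, "Energy": 0.25, "Sport": 0.0}

# Hypothetical topic picks under each goal.
picks_broaden = {"Banks", "Energy"}
picks_discover = {"Sport"}

# Hypothesis 2 expects a positive gap, hypothesis 3 a negative one.
gap_broaden = mean_gap(familiarity, picks_broaden)
gap_discover = mean_gap(familiarity, picks_discover)

# Hypothesis 1 compares the chosen topics across the two goals directly.
h1_gap = mean(familiarity[t] for t in picks_broaden) - mean(
    familiarity[t] for t in picks_discover
)
```

With real study data, each gap would be computed per participant and then tested across participants; the sketch only shows the direction of each comparison.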
Results
(Executive summary)
Going Forward
Summary of achievements
• Novel, scalable and generalizable framework of user-profile explanations
• Exploration of FD reader data; domain-knowledge and data-driven ontology
• Interface mockup
• User study to evaluate the value-driven explanations
Future
• Further design and implementation of our framework
• Formalize the epistemic goals together with editors
• Extend user studies/focus groups to all value-driven goals
Thank you!
Pre-print: graus.nu/publications/reading-news-with-a-purpose/
Viz demo: info.ilab.sztaki.hu/~kdomokos/ict/
@dvdgrs
david.graus@fdmediagroep.nl

Editor's Notes

  • #3 According to our corporate website
  • #4 How do we provide information in the financial domain? In many ways; we are probably best known for our all-day news radio station.
  • #5 But in this project I’ll talk mostly about our daily financial newspaper, aka het FD, the European newspaper of the year.
  • #6 Getting the right information to the right person at the right time, through summarization, personalization, and contextualization of our journalism.
  • #7 Getting the right information to the right person at the right time, through summarization, personalization, and contextualization of our journalism.
  • #8 More specifically, for FD we’re building a recommender system (as a first step).
  • #9 The “Them and Us slide”
  • #10 We monitor reading behavior, infer a “model” on top of that, and generate recommendations.
  • #11 We focus our explanations on the input (usually explanations are on the output).
  • #12 That was the idea. We submitted a project proposal at ICT with Industry and sat together with a team of academics from different disciplines, ranging from philosophy and political communication science to UX and computer science. An academic hackathon in a week.
  • #13 Enough context, on to content. These steps roughly correspond to the process we took to finish this project.
  • #15 Provide understandability of reading behavior for users (not the publisher). Purpose: provide users with a framework to expand the utility of the platform and achieve their epistemic (knowledge) goals.
  • #16 With that purpose in mind we started looking at different types of explanations
  • #17 Level 1, dashboards, overviews, patterns
  • #18 Level 2: Context; how are you ‘unique’, what do you do more/better than typical users, etc.
  • #19 Final level, where we combine insights to help users achieve specific goals. Epistemic: knowledge goals. “Transactional”: get commitment by giving something back.
  • #20 How do you formulate these goals? Take something we want, and something we think our users would want.
  • #21 How to measure whether/how we can do this
  • #22 Answer what to explain
  • #23 Many visualizations; which one to pick? We need data analysis to find the overall structure.
  • #24 Started looking into our data at the user behavior level
  • #25 Authors
  • #26 Many visualizations; which one to pick? We need data analysis to find the overall structure.
  • #27 At the content level;
  • #28 Fast-forward to our user-study.
  • #29 Related goals: both are about diversity in content. A suitable test case for whether the specific goal leads the user to explore different degrees of diversity.
  • #30 Whether a particular goal with a visualization would influence intended reading behavior, in terms of topics, using a data visualization to represent reading behavior.
  • #31 Not too much detail, but we set up a Mechanical Turk (MTurk) experiment with 40ish users. Users picked “personas” that reflected reading behavior, represented as “topic word clouds”. Users were presented a goal and a visualization, and were asked to pick which topics they would read next.
  • #32 This is the visualization we presented. It shows the topics a user has read over X amount of time. Real_Estate & Housing_Market are highly similar, as are Energy & Environment and Foods & Retail (we’re at Ahold). Care & Banks are very dissimilar (Sport & Govt).
  • #33 Hypotheses: (1) comparison of selected topics between the two goals; (2) Broaden Horizon will select MORE SIMILAR/FAMILIAR topics; (3) Discover the Unexplored selects LESS SIMILAR/FAMILIAR topics.
  • #36 We did not find evidence that people select more similar topics in Broaden Horizon than in Discover the Unexplored.
  • #37 People select more familiar topics in Broaden Horizon than in Discover the Unexplored. H1: partial support.
  • #38 The second hypothesis is rejected: people don’t select more similar and familiar topics.
  • #39 Hypotheses
  • #40 Partial evidence for the third hypothesis: people select topics that are less familiar in Discover the Unexplored.
  • #46 High similarity: Politics & Foreign Countries
  • #48 Markets & macro economy
  • #49 Real Etate, Banks, Housing Market