Data	Fusion	for	Dealing	
with	the	Recommendation	
Problem
Denis	Parra,	PUC	Chile
Keynote	for	IFUP
Workshop	on	Multi-dimensional	Information	Fusion	
for	User	Modeling	and	Personalization
UMAP	2016,	Halifax,	Canada
In	this	talk
• Recommendation	of	articles	with	user-controlled	
fusion
• Fusing	data	in	the	music	domain
• Fusion	for	e-marketplaces	in	virtual	worlds
• How	to	integrate	time	into	collaborative	filtering?
7/17/16 D.	Parra,	IFUP	keynote,	UMAP	2016 2
Part	1:	Recommendation	of	Articles	
with	User-Controlled	Fusion
Recommendation	of	Articles
• Problem:	a)	Traditional	user	feedback	is	(was?)	difficult	
to	obtain,	b)	Sparsity
• There	are	several	potential	sources	of	
recommendation,	but	mostly	from	the	items:
• Content
• Co-citations,	co-authorship
• Etc.
• Our	approach:	give	users	control	over	what	to	fuse.
• Would	it	work?
• How much data combination is optimal?
• Does	visual	representation	affect	the	behavior/accuracy?
References
• Verbert, K., Parra, D., Brusilovsky, P., & Duval, E. (2013).
Visualizing recommendations to support exploration,
transparency and controllability. In Proceedings of the 2013
International Conference on Intelligent User Interfaces
(pp. 351-362). ACM.
• Parra, D., Brusilovsky, P., & Trattner, C. (2014). See what you
want to see: visual user-driven approach for hybrid
recommendation. In Proceedings of the 19th International
Conference on Intelligent User Interfaces (pp. 235-240).
ACM.
• Verbert, K., Parra, D., & Brusilovsky, P. (2014). The effect of
different set-based visualizations on user exploration of
recommendations. In Proceedings of the Joint Workshop on
Interfaces and Human Decision Making in Recommender
Systems (pp. 37-44).
TalkExplorer
• Implemented	initially	for	a	user	study	in	ACM	
Hypertext	2012	for	Conference	Navigator.
• Main	question	to	address:	Do	users	consider	the	
fusion	of	several	sources	of	data	when	choosing	
relevant	items?
Recap	– Conference	Navigator
[Screenshots: Program, Proceedings, Author List, Recommendations]
TalkExplorer Interface
TalkExplorer - Entities
• Entities: tags, recommender agents, users
TalkExplorer – Central	Canvas
• Canvas Area: Intersections of Different Entities
[Figure labels: User; Recommender; cluster (of talks) associated with only one entity; cluster with intersection of entities]
TalkExplorer - Articles
• Items: talks explored by the user
TalkExplorer Studies	I	&	II
• Study	I
• Controlled	Experiment:	Users	were	asked	to	discover	
relevant	talks	by	exploring	the	three	types	of	entities:	
tags,	recommender	agents	and	users.
• Conducted	at	Hypertext	and	UMAP	2012	(21	users)
• Subjects	familiar	with	Visualizations	and	Recsys
• Study	II
• Field	Study:	Users	were	left	free	to	explore	the	
interface.
• Conducted	at	LAK	2012	and	ECTEL	2013	(18	users)	
• Subjects	familiar	with	visualizations,	but	not	much	with	
RecSys
Evaluation:	Intersections	&	Effectiveness
• What	do	we	call	an	“Intersection”?
• We	used	#explorations	on	intersections	and	their	
effectiveness,	defined	as:
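The formula that followed on this slide was an image and is not reproduced here. As a rough sketch only, assuming effectiveness means the fraction of explorations of a given intersection size after which the user bookmarked at least one item (an assumption, not the slide's exact definition), it could be computed as follows. The function name and the log format are hypothetical.

```python
from collections import defaultdict


def effectiveness_by_intersection(explorations):
    """Compute an assumed effectiveness measure per intersection size.

    explorations: iterable of (n_entities, bookmarked) pairs, where
    n_entities is how many entities intersect in the explored cluster and
    bookmarked is True if that exploration led to at least one bookmark.
    Returns {n_entities: fraction of explorations that led to a bookmark}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for n_entities, bookmarked in explorations:
        totals[n_entities] += 1
        if bookmarked:
            hits[n_entities] += 1
    return {n: hits[n] / totals[n] for n in totals}


# Made-up interaction log, for illustration only (not study data).
log = [(1, False), (1, True), (1, False), (1, False),
       (2, True), (2, False), (3, True)]
result = effectiveness_by_intersection(log)
```

Grouping by the number of intersecting entities is what allows a comparison such as "effectiveness increases with intersections of more entities".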
Results	of	Studies	I	&	II
• Effectiveness	increases	
with	intersections	of	more	
entities
• Effectiveness	wasn’t	
affected	in	the	field	study	
(study	2)
• …	but	exploration	
distribution	was	affected
SetFusion
• Main	motivation	was	investigating	a	simpler	way	to	
visualize	recommendations	from	several	sources.	
Would	that	improve	“effectiveness”	?
• 3	studies	were	conducted
• Field study at CSCW 2013
• Controlled user study with iConference series
• Field study at UMAP 2013
SetFusion
SetFusion I
• Traditional Ranked List: papers sorted by relevance.
• It combines 3 recommendation approaches.
SetFusion - II
• Sliders: allow the user to control the importance of
each data source or recommendation method
• Interactive Venn Diagram: allows the user to inspect and to filter
the recommended papers. Actions available:
- Filter item list by clicking on an area
- Highlight a paper by mouse-over on a circle
- Scroll to paper by clicking on a circle
- Indicate bookmarked papers
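One way to read the sliders is as weights in a linear fusion of the per-method scores. The sketch below assumes each method outputs a score in [0, 1] per paper and that slider values are normalized to sum to 1. The method names ("content", "tags", "popularity") and the function name are hypothetical, and this is not necessarily the scoring formula SetFusion uses.

```python
def fuse_scores(method_scores, weights):
    """Weighted linear fusion of per-method recommendation scores.

    method_scores: {method: {paper_id: score in [0, 1]}}
    weights: {method: slider value >= 0}
    Returns a list of (paper_id, fused_score), best first.
    """
    total = sum(weights.get(m, 0.0) for m in method_scores)
    if total == 0:
        return []  # all sliders at zero: nothing to rank
    fused = {}
    for method, scores in method_scores.items():
        w = weights.get(method, 0.0) / total  # normalize slider values
        for paper, score in scores.items():
            fused[paper] = fused.get(paper, 0.0) + w * score
    # Sort by descending score, breaking ties by paper id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores, for illustration only.
scores = {
    "content": {"p1": 0.9, "p2": 0.4},
    "tags": {"p2": 0.8, "p3": 0.6},
    "popularity": {"p1": 0.2, "p3": 1.0},
}
equal = fuse_scores(scores, {"content": 1, "tags": 1, "popularity": 1})
content_only = fuse_scores(scores, {"content": 1, "tags": 0, "popularity": 0})
```

With equal sliders, papers supported by several methods rise to the top; moving a single slider to its maximum reduces the list to that method's ranking.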
SetFusion Controlled	Study
• 40	users,	within-subjects	study,	simulated	
iConference attendance
Controlled	Study	Main	Results
• Controlling	and	fusing	sources	of	relevancy	
produces	more	bookmarks:
• 58.44%	of	bookmarks	after	using	sliders
• 28.08%		of	bookmarks	after	using	Venn	diagram
Controlled	Study	Main	Results
• Users	prefer	articles	recommended	by	a	fusion	of	
methods,	in	both	conditions,	but	the	effect	is	
stronger	with	the	visualization
SetFusion – UMAP	2013
• Field	Study:	let	users	freely	explore	the	interface
- ~50% (50 users) tried the SetFusion recommender
- 28% (14 users) bookmarked at least one paper
- On average, users explored 14.9 talks and bookmarked 7.36 talks
TalkExplorer Vs.	SetFusion
[Figure: Clustermap (TalkExplorer) vs. Venn diagram (SetFusion)]
TalkExplorer vs.	SetFusion
• Comparing	distributions	of	explorations
In studies 1 and 2 over
TalkExplorer we observed an
important change in the
distribution of explorations.
TalkExplorer vs.	SetFusion
• Comparing	distributions	of	explorations
Comparing the field studies:
- In TalkExplorer, 84% of the explorations over intersections were
performed over clusters of 1 item
- In SetFusion, this was only 52%, compared to 48% (18% + 30%) for
multiple intersections; the difference is not statistically significant
Take-aways
• We	showed	that	intersections	of	several	contexts	of	
relevance	help	to	discover	relevant	items.
• The visual paradigm used can have a strong effect
on user behavior: we need to keep working on
visual representations that promote exploration
without increasing the cognitive load on users.
Part	2:	Fusing	Data	in	the	Music	
Domain
References
Parra-Santander,	D.,	&	Amatriain,	X.	(2011).	Walk	the	
Talk:	Analyzing	the	relation	between	implicit	and	
explicit	feedback	for	preference	elicitation.	
Proceedings	of	UMAP	2011,	Girona,	Spain
Parra,	D.,	Karatzoglou,	A.,	Amatriain,	X.,	&	Yavuz,	I.	
(2011).	Implicit	feedback	recommendation	via	
implicit-to-explicit	ordinal	logistic	regression	
mapping.	Proceedings	of	the	CARS	Workshop,	RecSys
Chicago,	IL,	USA,	2011.
Introduction	(back	in	2011)
• Most recommender system approaches rely on
explicit information from users, but…
• Explicit feedback: scarce (people are not especially
eager to rate or to provide personal info)
• Implicit feedback: less scarce, but (Hu et al., 2008):
- There's no negative feedback … and if you watch a TV program just once or twice?
- Noisy … but explicit feedback is also noisy (Amatriain et al., 2009)
- Preference & Confidence … we aim to map the I.F. to preference (our main goal)
- Lack of evaluation metrics … if we can map I.F. and E.F., we can have a comparable evaluation
Introduction	(Today)
• Is	it	possible	to	map	implicit	behavior	to	explicit	
preference	(ratings)?		These	data	can	eventually	be	
fused	into	a	single	compact	model.
• OUR	APPROACH:	Study	with	Last.fm	users	
• Part	I:	Ask	users	to	rate	100	albums	(how	to	sample)
• Part	II:	Build	a	model	to	map	collected	implicit	feedback	
and	context	to	explicit	feedback
Walk	the	Talk	(2011)
• Albums they listened to during the last: 7 days, 3 months, 6 months,
year, overall
• For each album in the list we obtained: # of user plays (in each
period), # of global listeners and # of global plays
Walk	the	Talk	- 2
• Requirements: at least 18 years old, scrobbles > 5000
Quantization	of	Data	for	Sampling
• What items should they rate? Item (album) sampling:
• Implicit Feedback (IF): playcount for a user on a given album.
Changed to scale [1-3]; 3 means listened to more.
• Global Popularity (GP): global playcount for all users on a
given album. Changed to scale [1-3]; 3 means listened to more.
• Recentness (R): time elapsed since the user played a given
album. Changed to scale [1-3]; 3 means listened to
more recently.
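The slide does not say how the raw values were cut into the three levels. As a sketch under the assumption of tertile (equal-frequency) binning, the quantization could look like this. The function name and the sample values are hypothetical.

```python
def to_three_levels(values):
    """Map raw values to levels 1..3 using tertile cut points.

    Tertile binning is an assumption; the original study may have used
    different cut points.
    """
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[n // 3]          # roughly the 33rd percentile
    high_cut = ordered[(2 * n) // 3]   # roughly the 67th percentile
    levels = []
    for v in values:
        if v < low_cut:
            levels.append(1)
        elif v < high_cut:
            levels.append(2)
        else:
            levels.append(3)
    return levels


# Made-up values, for illustration only.
playcounts = [2, 5, 9, 14, 30, 75, 120, 300, 950]
if_levels = to_three_levels(playcounts)

# Recentness is inverted: fewer days since the last play means a higher level.
days_since_last_play = [1, 3, 7, 20, 45, 90, 200, 400, 700]
recentness_levels = [4 - lvl for lvl in to_three_levels(days_since_last_play)]
```

Sampling albums across the resulting 3 x 3 x 3 grid of (IF, GP, R) levels would give raters a mix of heavily and rarely played, popular and obscure, recent and old albums.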
Regression	Analysis
• Including recentness increases R² by more than 10% [M1 -> M2]
• Including GP increases R², but not by much compared to RE + IF [M1 -> M3]
• Not including GP, but including the interaction between IF and RE,
improves the variance of the DV explained by the regression model [M2 -> M4]
Models:
- M1: implicit feedback
- M2: implicit feedback & recentness
- M3: implicit feedback, recentness, global popularity
- M4: interaction of implicit feedback & recentness
Regression	Analysis
• We	tested	conclusions	of	regression	analysis	by	
predicting	the	score,	checking	RMSE	in	10-fold	
cross	validation.
• Results	of	regression	analysis	are	supported.
Model                                                        RMSE1    RMSE2
User average                                                 1.5308   1.1051
M1: Implicit feedback                                        1.4206   1.0402
M2: Implicit feedback + recentness                           1.4136   1.034
M3: Implicit feedback + recentness + global popularity       1.4130   1.0338
M4: Interaction of implicit feedback * recentness            1.4127   1.0332
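To make the evaluation protocol concrete, here is a minimal sketch of 10-fold cross-validated RMSE for the "User average" baseline row. The fold assignment, the fallback to the global mean for unseen users, and the synthetic ratings are assumptions for illustration; the regression models M1 to M4 would replace the prediction step.

```python
import math
import random


def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((y - p) ** 2 for y, p in pairs) / len(pairs))


def user_average_cv_rmse(ratings, k=10, seed=0):
    """Mean RMSE over k folds, predicting each rating by the user's
    average rating in the training folds.

    ratings: list of (user, rating) with at least k entries.
    """
    data = ratings[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    fold_scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        sums, counts = {}, {}
        for user, rating in train:
            sums[user] = sums.get(user, 0.0) + rating
            counts[user] = counts.get(user, 0) + 1
        global_mean = sum(r for _, r in train) / len(train)

        def predict(user):
            # Fall back to the global mean for users unseen in training.
            if user in counts:
                return sums[user] / counts[user]
            return global_mean

        fold_scores.append(rmse([(rating, predict(user)) for user, rating in test]))
    return sum(fold_scores) / k


# Synthetic ratings on a 0-5 scale, for illustration only (not study data).
ratings = [(f"u{u}", (u + j) % 6) for u in range(5) for j in range(20)]
score = user_average_cv_rmse(ratings)
```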
Part	II:	Extension	of	Walk	the	Talk
• Implicit	Feedback	Recommendation	via	Implicit-to-
Explicit	OLR	Mapping	(Recsys 2011,	CARS	
Workshop)
• Consider	ratings	as	ordinal	variables
• Use	mixed-models	to	account	for	non-independence	of	
observations
• Compare	with	state-of-the-art	implicit	feedback	
algorithm
Recalling	the	1st study	(5/5)
• Prediction of rating by multiple linear regression,
evaluated with RMSE.
• Results showed that implicit feedback (play count
of the album by a specific user) and recentness
(how recently an album was listened to) were
important factors, while global popularity had a weaker
effect.
• Results also showed that listening style (whether the user
preferred to listen to single tracks, CDs, or either)
was an important factor, while the other user
variables examined were not.
...	but
• Linear	Regression	didn’t	account	for	the	nested	
nature	of	ratings
• And	ratings were	treated	as	continuous,	when	they	
are	actually	ordinal.
User 1: 1 3 5 3 0 4 5 2 2 1 5 4 3 2
. . .
User n: 3 2 1 0 4 5 2 5 4 3 2 1 3 5
So,	Ordinal	Logistic	Regression!
• Actually	Mixed-Effects	Ordinal	Multinomial	Logistic	
Regression
• Mixed-effects:	Nested	nature	of	ratings	
• We obtain a distribution over ratings (ordinal
multinomial) for each (user, item) pair -> we
predict the rating using the expected value.
• … And we can compare the inferred ratings with a
method that directly uses implicit information
(playcounts) to recommend (Hu et al., 2008)
Ordinal	Regression	for	Mapping
• Model
• Predicted	value
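The model and predicted-value formulas on this slide were images and are not reproduced here. As a hedged sketch, a standard cumulative-logit (proportional odds) model gives P(rating <= k) = sigmoid(theta_k - eta), where eta would be the linear predictor for a (user, item) pair, for example fixed effects for IF, RE and GP plus a per-user random intercept. The predicted rating is then the expected value over the resulting distribution. The thresholds and rating levels below are hypothetical, not fitted values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rating_distribution(eta, thresholds):
    """P(rating = level) under a cumulative-logit model.

    eta: linear predictor for one (user, item) pair.
    thresholds: increasing cut points theta_1 < ... < theta_{K-1}.
    """
    cumulative = [sigmoid(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(r = k) = P(r <= k) - P(r <= k-1)
        previous = c
    return probs


def expected_rating(eta, thresholds, levels):
    """Predicted rating: expected value over the ordinal distribution."""
    probs = rating_distribution(eta, thresholds)
    return sum(level * p for level, p in zip(levels, probs))


levels = [0, 1, 2, 3, 4, 5]                 # hypothetical rating scale
thresholds = [-2.0, -1.0, 0.0, 1.0, 2.0]    # hypothetical cut points
low = expected_rating(-1.0, thresholds, levels)
high = expected_rating(1.5, thresholds, levels)
```

A larger eta shifts probability mass toward higher ratings, so the expected rating grows with the linear predictor.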
Datasets
• D1:	users,	albums,	if,	re,	gp,	ratings,	
demographics/consumption
• D2:	users,	albums,	if,	re,	gp,	NO	RATINGS.
Results
Conclusions	(after	5	years)
• Fusion	of	Implicit	feedback	(scrobbles)	and	recency
can	help	to	make	more	precise	recommendations
• Models	like	the	one	by	Gurbanov and	Ricci	presented	
this	year	at	UMAP	offer	a	more	compact	way	to	work	
with	these	data:
“Modeling and Predicting User Actions in Recommender Systems”
by Tural Gurbanov, Francesco Ricci, Meinhard Ploner
• Evaluation	is	still	a	challenge!
Part	3:	Data	Fusion	for	Virtual	Worlds
References
Lacic,	E.,	Kowald,	D.,	Eberhard,	L.,	Trattner,	C.,	Parra,	D.,	&	
Marinho,	L.	B.	(2015).		Utilizing	online	social	network	and	
location-based	data	to	recommend	products	and	
categories	in	online	marketplaces.		In Mining,	Modeling,	
and	Recommending	'Things'	in	Social	Media (pp.	96-115).	
Springer	International	Publishing.
Trattner,	C.,	Parra,	D.,	Eberhard,	L.,	&	Wen,	X.	(2014,	
April).	Who	will	trade	with	whom?:	Predicting	buyer-
seller	interactions	in	online	trading	platforms	through	
social	networks.	In Proceedings	of	the	23rd	International	
Conference	on	World	Wide	Web (pp.	387-388).	ACM.
Second	Life
Dataset (Task: Item recommendation)
[Figure: Second Life data sources: Social Network, Marketplace, Virtual World. Credit: Christoph Trattner, Know-Center, Graz, Austria]
Recommendation	Approaches
• User-based	Collaborative	Filtering,	where
• Hybrid	approaches	(combine	features)
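The formulas on this slide were images and are not reproduced here. The sketch below shows one common form of user-based CF on binary purchase data, with a hybrid similarity built as a weighted sum of per-feature similarities. Jaccard similarity, the weighted-sum hybrid, the feature names ("purchases", "groups") and all function names are assumptions for illustration, not the definitions used in the cited papers.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_similarity(u, v, features, weights):
    """Weighted sum of Jaccard similarities over several feature sets.

    features: {feature_name: {user: set_of_values}}
    weights: {feature_name: weight}
    """
    return sum(
        w * jaccard(features[name].get(u, set()), features[name].get(v, set()))
        for name, w in weights.items()
    )


def recommend(user, purchases, features, weights, k=2, top_n=3):
    """Score unseen items by the summed similarity of the k nearest
    neighbors who purchased them."""
    sims = [(hybrid_similarity(user, v, features, weights), v)
            for v in purchases if v != user]
    neighbors = sorted(sims, key=lambda sv: (-sv[0], sv[1]))[:k]
    owned = purchases.get(user, set())
    scores = {}
    for sim, v in neighbors:
        if sim <= 0:
            continue  # ignore neighbors with no measurable similarity
        for item in purchases[v] - owned:
            scores[item] = scores.get(item, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up data, for illustration only.
purchases = {"ann": {"hat"}, "bob": {"hat", "boots"},
             "cy": {"sword"}, "dee": {"boots", "cape"}}
features = {
    "purchases": purchases,
    "groups": {"ann": {"g1", "g2"}, "bob": {"g1"},
               "cy": {"g3"}, "dee": {"g1", "g2"}},
}
cf_only = recommend("ann", purchases, features, {"purchases": 1.0, "groups": 0.0})
hybrid = recommend("ann", purchases, features, {"purchases": 0.5, "groups": 0.5})
```

In this toy data, adding the social feature surfaces an item from a user who shares groups with "ann" but has no purchases in common with her.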
Similarity	Features	- I
Similarity	Features	II
Hybrids
Different	Task:	Predict	Buyer-Seller
Predict	Buyer-Sellers:	AUC	Results
Summary
• These	studies	show	that	social	network	data	is	very	
important	for	certain	types	of	recommendations.
• Due to the lack of available cross-service data in the
real world, data from Second Life could serve as a
proxy for building models for the real world.
Part	4:	Fusion	of	Time	into	CF
References
Larrain,	S.,	Trattner,	C.,	Parra,	D.,	Graells-Garrido,	E.,	
&	Nørvåg,	K.	(2015).	
Good	Times	Bad	Times:	A	Study	on	Recency Effects	in	
Collaborative	Filtering	for	Social	Tagging.	
In Proceedings	of	the	9th	ACM	Conference	on	
Recommender	Systems (pp.	269-272).	ACM.
Time-Aware	Collaborative	Filtering
• Collaborative	Filtering	(User	and	Item-based)	
considers	all	transactions	equally	important
• But transactions which happened too long ago
might be less important in shaping the user model…
Two Concepts for Time-Aware CF
[Figure: ratings of the active user, User_1 and User_2 on Item 1 and Item 2, with items annotated "consumed 2 years ago" and "consumed 1 month ago"]
• Items consumed recently might be more important
than items consumed a long time ago.
• When and how to incorporate time in user-
and item-based collaborative filtering?
When	and	How	in	UB-CF
[Figure: sparse user-item rating matrix, User 1 … User n by Item 1 … Item m]
Step	1:	Find	similar	users.	Weight	transactions	
based	on	recency difference
When	and	How	in	UB-CF	
[Figure: sparse user-item rating matrix, User 1 … User n by Item 1 … Item m]
Step	2:	Similar	users	found.	Recommend	items	
with	high	ratings	and	consumed	recently.
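Step 2 can be sketched as a recency-weighted neighbor vote: each neighbor's rating is scaled by the neighbor's similarity and by a decay of how long ago that neighbor consumed the item. The scoring formula, the function name and the power-law decay with exponent 0.5 are assumptions for illustration, not the fitted setup from the study.

```python
def recommend_recent(active, neighbor_sims, ratings, ages_days, decay, top_n=3):
    """Rank items unseen by the active user.

    neighbor_sims: {neighbor: similarity to the active user}
    ratings: {user: {item: rating}}
    ages_days: {user: {item: days since that user consumed the item}}
    decay: function mapping an age in days to a weight in (0, 1]
    """
    seen = set(ratings.get(active, {}))
    scores = {}
    for neighbor, sim in neighbor_sims.items():
        for item, rating in ratings[neighbor].items():
            if item in seen:
                continue
            weight = decay(ages_days[neighbor][item])
            scores[item] = scores.get(item, 0.0) + sim * rating * weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up data, for illustration only.
ratings = {"me": {"i1": 5}, "u1": {"i1": 5, "old": 5, "new": 4}}
ages_days = {"u1": {"i1": 10, "old": 730, "new": 30}}
neighbor_sims = {"u1": 0.9}

no_decay = recommend_recent("me", neighbor_sims, ratings, ages_days,
                            decay=lambda age: 1.0)
# Hypothetical power-law decay: weight = (1 + age)^(-0.5)
with_decay = recommend_recent("me", neighbor_sims, ratings, ages_days,
                              decay=lambda age: (1.0 + age) ** -0.5)
```

Without decay the highly rated but two-year-old item ranks first; with decay the item consumed a month ago overtakes it.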
When	and	How	in	IB-CF
[Figure: sparse user-item rating matrix, User 1 … User n by Item 1 … Item m, with two items annotated "Consumed 1 week ago" and "Consumed 1 year ago"]
Step 1: Find similar items sim(items(user i)).
Weight items based on recency.
When	and	How	in	IB-CF
[Figure: sparse user-item rating matrix, User 1 … User n by Item 1 … Item m]
Step 2: Find items similar to Item 1. Weight items
based on recency difference.
Decay	functions
• Exponential
• Power
• Linear
• Logistic
• BLL
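The formulas for these functions were images on the slide and are not reproduced here. The sketch below uses common textbook forms; all parameter names and default values are hypothetical rather than the fitted values from the study. BLL is read here as the base-level learning equation from ACT-R, which is an assumption about the abbreviation.

```python
import math


def exponential_decay(age, rate=0.01):
    """w = exp(-rate * age)"""
    return math.exp(-rate * age)


def power_decay(age, alpha=0.5):
    """w = (1 + age)^(-alpha); the +1 keeps the weight finite at age 0."""
    return (1.0 + age) ** -alpha


def linear_decay(age, horizon=365.0):
    """w falls linearly from 1 at age 0 to 0 at the horizon."""
    return max(0.0, 1.0 - age / horizon)


def logistic_decay(age, midpoint=50.0, steepness=0.1):
    """w = 1 / (1 + exp(steepness * (age - midpoint))).

    Written with tanh, which is algebraically equivalent and avoids
    overflow for very large ages.
    """
    z = steepness * (age - midpoint)
    return 0.5 * (1.0 - math.tanh(z / 2.0))


def bll_activation(ages, d=0.5):
    """Base-level learning: ln(sum_j age_j^(-d)) over all past uses.

    Unlike the functions above, this aggregates several timestamps and
    returns a log-scale activation; all ages must be > 0.
    """
    return math.log(sum(age ** -d for age in ages))
```

For example, `power_decay(30)` and `exponential_decay(30)` give the weight of a transaction that is 30 time units old under two different forgetting assumptions.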
Parameters	and	fitting
[Figure: distribution of days from bookmark; median = 50 days]
Evaluation:	Datasets
Evaluation:	Results	I
Evaluation:	Results	II
Summary
• Best results: post-filtering combined with power
decay.
• Pre- and post-filtering produce a strong effect, but
UB-CF is more susceptible than IB-CF to the effect
of filtering, especially pre-filtering.
• The hybridization of UB and IB makes the
recommendation more robust.
• Future work: fit parameters on a per-user basis rather
than a per-dataset basis.
Wrapping	up
• Visual	approaches	for	user-controllable	data	fusion	can	
work,	but	there’s	room	to	find	effective	visual-
interactive	combinations.
• In	the	music	domain	and	other	domains,	time	and	
recency can	work	very	well	for	recommendation.
• …but using time requires adequate modeling of the
decay functions.
• Information from virtual worlds could be used as a
proxy to build models and use them for transfer
learning.
Promising	works	in	this	UMAP	2016
• Using Semantic Information: Extend the work of
Musto et al. (UMAP 2016) to support better and
more explainable models.
• Combine taxonomies with implicit/explicit feedback
using compact graphical models (co-authored by G.
Guo)
• Extend models with time and other sources of
feedback (Gurbanov et al.)
Ideas	for	Data	Fusion
• Combining multimodal information within the same
embedding using deep learning has given great
results in visual processing + NLP fields:
• Visual	Q&A
• Automatic	Captioning	of	Pictures
Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Lawrence Zitnick, C., & Parikh,
D. (2015). VQA: Visual question answering. In Proceedings of the IEEE International
Conference on Computer Vision (pp. 2425-2433).
Thanks!
dparras@uc.cl
http://dparra.sitios.ing.uc.cl/
