Building your first recommender
Jettro Coenradie
AMSTERDAM | MAY 8-9, 2018
Jettro Coenradie
Fellow at Luminis Amsterdam
specialised in (Elastic) search
experimenting with Machine Learning
@jettroCoenradie
https://www.linkedin.com/in/jettro/
https://github.com/jettro
https://nl.wikipedia.org/wiki/Free_Record_Shop
The Goal
To increase sales and happiness
Advertorial: not based on knowledge we have about you, no context.
Recommendation: based on what we know about you and about the items we have.
Having posters on the wall
Advertorial
Regular customer wants to buy something new
Recommendation
Play music in the store
Advertorial
Play music in the store on Saturday morning
Advertorial, Recommendation
Risk of Recommendations
• Would you recommend something that is out of stock?
• Would you advise different items during the seasons?
• Does personal taste influence your recommendations?
Who buys music these days?
Advertorial
Advertorial
Recommendation
Goal:
Have users listen to many different songs
Make users stick around
Analytics Dashboard to support our Goals
• Track returning users
• Track the number of songs a user listens to
• Track the number of recommended songs that are listened to
• ????
Assumption
Recommendations empower our application
Type of recommendations
• Collaborative filtering
• Content / Knowledge based filtering
Collaborative filtering
Find similar users using the behaviour of the users
Explicit Ratings
• User rates a song with a number of stars
Implicit Ratings
• User clicks on a song
• User listens to the complete song (or stops after half a minute)
Content/Knowledge based filtering
Find similar music using properties of the music
• The genre of the music
• The artist
• Producer
Create recommender using Collaborative filtering
Obtain all rating data
Find similar users
Predict missing ratings
Present recommendations
Item
album	1 album	2 album	3 album	4
user	1 4 1 3
user	2 4 3 4 2
user	3 1 5 1 4
user	4 1 1 4
User
User-item matrix
Obtain all rating data
[Diagram: user i and user j are similar (user based); item m and item n are similar (item based); the users rated the items.]
Find similar users
Find similar items
Similarity
How to determine if users or items are similar?
Definition of similarity
$\mathrm{sim}(item_i, item_i) = 1$
$\mathrm{sim}(item_i, item_j) = 0$
Similarity based on distance
$\mathrm{similarity}(item_i, item_j) = 1 - \frac{\mathrm{distance}}{\mathrm{maximum\_distance}}$
User similarity examples
• Jaccard Distance - Compares the items liked by both users
• L1-Norm - Compares ratings given by users the Manhattan way
• L2-Norm - Compares ratings the Pythagoras way
• Cosine - Compares ratings by the angle between the rating vectors
Jaccard Distance
[Table: albums 1 to 4 liked by user 1 and user 2]
$\mathrm{similarity}(user_1, user_2) = \frac{\#albums\_liked\_by\_both}{\#albums\_liked\_by\_any} = \frac{1}{3}$, so $\mathrm{distance}(user_1, user_2) = 1 - \frac{1}{3} = \frac{2}{3}$
[Table: opinions of user 1 and user 2 about albums 1 to 4]
$\mathrm{similarity}(user_1, user_2) = \frac{\#albums\_same\_opinion}{\#albums\_with\_opinion} = \frac{2}{3}$, so $\mathrm{distance}(user_1, user_2) = 1 - \frac{2}{3} = \frac{1}{3}$
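The ratio on the slide (albums liked by both users over albums liked by any of them) is the Jaccard index; its complement is the Jaccard distance. A minimal Java sketch, assuming each user's liked albums are available as a set of album ids. The class name, method names and the example likes are hypothetical; the likes are chosen so that one album is liked by both users and three albums by at least one of them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class JaccardExample {

    /** Jaccard index: |liked by both| / |liked by any|; 0 when nobody liked anything. */
    static double similarity(Set<Integer> likedByA, Set<Integer> likedByB) {
        Set<Integer> both = new HashSet<>(likedByA);
        both.retainAll(likedByB);
        Set<Integer> any = new HashSet<>(likedByA);
        any.addAll(likedByB);
        return any.isEmpty() ? 0.0 : (double) both.size() / any.size();
    }

    /** Jaccard distance is the complement of the index. */
    static double distance(Set<Integer> likedByA, Set<Integer> likedByB) {
        return 1.0 - similarity(likedByA, likedByB);
    }

    public static void main(String[] args) {
        // Hypothetical likes, expressed as album ids
        Set<Integer> user1 = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> user2 = new HashSet<>(Arrays.asList(2, 3));
        System.out.println("similarity = " + similarity(user1, user2));
        System.out.println("distance   = " + distance(user1, user2));
    }
}
```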
Simplify to 2D space
          album 1   album 2
user 1       4         1
user 2       4         3
user 3       1         5
L1 Norm
Mean Absolute Error
[Figure: users 1 to 3 plotted by their ratings for album 1 and album 2, with the distances D1 and D2 marked]
$\mathrm{distance}(user_2, user_3) = \frac{1}{n} \sum_{i=1}^{n} \left| rating_{user_2, album_i} - rating_{user_3, album_i} \right|$
$= \frac{1}{2} \left( |4 - 1| + |3 - 5| \right) = 2.5$
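The mean absolute error for user 2 and user 3 can be sketched in Java as follows. The class and method names are hypothetical. The maximum distance of 4 used for the similarity is an assumption: it is the largest possible difference on a 1 to 5 rating scale.

```java
final class L1NormExample {

    /** Mean absolute difference between two rating vectors of equal length. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum / a.length;
    }

    /** similarity = 1 - distance / maximum_distance */
    static double similarity(double[] a, double[] b, double maximumDistance) {
        return 1.0 - distance(a, b) / maximumDistance;
    }

    public static void main(String[] args) {
        double[] user2 = {4, 3}; // ratings for album 1 and album 2
        double[] user3 = {1, 5};
        System.out.println(distance(user2, user3));        // prints 2.5
        System.out.println(similarity(user2, user3, 4.0)); // prints 0.375
    }
}
```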
L2 Norm
Root Mean Squared Error
[Figure: users 1 to 3 plotted by their ratings for album 1 and album 2, with the distance D1 marked]
$\mathrm{distance}(user_2, user_3) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( rating_{user_2, album_i} - rating_{user_3, album_i} \right)^2}$
$= \sqrt{\frac{1}{2} \left( (4 - 1)^2 + (3 - 5)^2 \right)} = \sqrt{\frac{13}{2}} \approx 2.55$
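A small Java sketch of the root mean squared error, taking the square root of the mean of the squared differences (the usual definition). For user 2 and user 3 this gives about 2.55, and with an assumed maximum distance of 4 a similarity of about 0.36. The class and method names are hypothetical.

```java
final class L2NormExample {

    /** Root of the mean squared difference between two rating vectors of equal length. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double difference = a[i] - b[i];
            sumOfSquares += difference * difference;
        }
        return Math.sqrt(sumOfSquares / a.length);
    }

    public static void main(String[] args) {
        double[] user2 = {4, 3}; // ratings for album 1 and album 2
        double[] user3 = {1, 5};
        double distance = distance(user2, user3);
        System.out.printf("distance   = %.2f%n", distance);
        // Assumption: 4 is the maximum distance on a 1 to 5 rating scale
        System.out.printf("similarity = %.2f%n", 1.0 - distance / 4.0);
    }
}
```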
Cosine
[Figure: the rating vectors of users 1 to 3, with the angles ∠12 and ∠23 between them]
$\mathrm{sim}(user_2, user_3) = \frac{\sum_{i=1}^{n} rating_{user_2, album_i} \cdot rating_{user_3, album_i}}{\sqrt{\sum_{i=1}^{n} rating_{user_2, album_i}^2} \cdot \sqrt{\sum_{i=1}^{n} rating_{user_3, album_i}^2}}$
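The cosine formula translates almost directly into Java. This is a sketch with hypothetical class and method names; returning 0 for an all-zero vector is an assumption made here, because the cosine is undefined in that case.

```java
final class CosineExample {

    /** Cosine of the angle between two rating vectors of equal length. */
    static double similarity(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double dot = 0.0;
        double squaredNormA = 0.0;
        double squaredNormB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            squaredNormA += a[i] * a[i];
            squaredNormB += b[i] * b[i];
        }
        if (squaredNormA == 0.0 || squaredNormB == 0.0) {
            return 0.0; // assumption: treat an all-zero vector as "not similar"
        }
        return dot / (Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB));
    }

    public static void main(String[] args) {
        double[] user1 = {4, 1}; // ratings for album 1 and album 2
        double[] user2 = {4, 3};
        double[] user3 = {1, 5};
        System.out.printf("user 1, user 2: %.2f%n", similarity(user1, user2));
        System.out.printf("user 1, user 3: %.2f%n", similarity(user1, user3));
        System.out.printf("user 2, user 3: %.2f%n", similarity(user2, user3));
    }
}
```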
Similarity per pair of users
                  L1 norm   L2 norm   Cosine
user 1, user 2     0.75      0.65      0.92
user 1, user 3     0.13      0.12      0.43
user 2, user 3     0.38      0.36      0.75
Normalising
• Use average of ratings for each album
• Baseline ratings using the average rating
[Chart: the ratings (1 to 5) of users u1 to u5, with the average rating of 2.6 drawn as a baseline]
Normalised rating: u3 = 2 - 2.6 = -0.6
$\mathrm{sim}(user_2, user_3) = \frac{\sum_{i=1}^{n} \left( rating_{user_2, album_i} - \overline{rating_{user_2}} \right) \cdot \left( rating_{user_3, album_i} - \overline{rating_{user_3}} \right)}{\sqrt{\sum_{i=1}^{n} \left( rating_{user_2, album_i} - \overline{rating_{user_2}} \right)^2} \cdot \sqrt{\sum_{i=1}^{n} \left( rating_{user_3, album_i} - \overline{rating_{user_3}} \right)^2}}$
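A sketch of the normalised similarity in Java, assuming both users rated every item in the vectors and that each user's own mean rating is the baseline that is subtracted. The class and method names are hypothetical. The example uses the complete rows of user 2 and user 3 from the user-item matrix; after normalising, their similarity turns negative, which reflects their opposite taste.

```java
final class MeanCentredCosineExample {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum / values.length;
    }

    /** Cosine similarity of two rating vectors after subtracting each user's mean rating. */
    static double similarity(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double meanA = mean(a);
        double meanB = mean(b);
        double dot = 0.0;
        double squaredNormA = 0.0;
        double squaredNormB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double centredA = a[i] - meanA;
            double centredB = b[i] - meanB;
            dot += centredA * centredB;
            squaredNormA += centredA * centredA;
            squaredNormB += centredB * centredB;
        }
        if (squaredNormA == 0.0 || squaredNormB == 0.0) {
            return 0.0; // assumption: a user who rates everything the same carries no signal
        }
        return dot / (Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB));
    }

    public static void main(String[] args) {
        double[] user2 = {4, 3, 4, 2}; // ratings for album 1 to album 4
        double[] user3 = {1, 5, 1, 4};
        System.out.printf("normalised similarity: %.2f%n", similarity(user2, user3));
    }
}
```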
Find Neighbourhood
• Find most similar users
• Users that have rated at least one item I have rated
• Score indicating how similar the user is to me
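The three points above can be sketched as a small neighbourhood search in Java. This is an illustration under assumptions, not the implementation used in the demo: ratings are kept in a map per user, the score is the cosine similarity over the items both users rated, and all names are hypothetical. "user 5" is invented to show that a user without shared items is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NeighbourhoodExample {

    /** A neighbour and the score indicating how similar that user is to the target user. */
    static final class Neighbour {
        final String userId;
        final double score;
        Neighbour(String userId, double score) { this.userId = userId; this.score = score; }
    }

    /** Cosine similarity over the items both users rated; null when they share no items. */
    static Double cosineOnSharedItems(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0, squaredNormA = 0.0, squaredNormB = 0.0;
        int shared = 0;
        for (Map.Entry<Integer, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other == null) continue; // item not rated by the other user
            shared++;
            dot += entry.getValue() * other;
            squaredNormA += entry.getValue() * entry.getValue();
            squaredNormB += other * other;
        }
        if (shared == 0 || squaredNormA == 0.0 || squaredNormB == 0.0) return null;
        return dot / (Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB));
    }

    /** Returns at most k users with at least one shared item, most similar first. */
    static List<Neighbour> findNeighbours(String target, Map<String, Map<Integer, Double>> ratings, int k) {
        List<Neighbour> result = new ArrayList<>();
        Map<Integer, Double> mine = ratings.get(target);
        for (Map.Entry<String, Map<Integer, Double>> entry : ratings.entrySet()) {
            if (entry.getKey().equals(target)) continue;
            Double score = cosineOnSharedItems(mine, entry.getValue());
            if (score != null) result.add(new Neighbour(entry.getKey(), score));
        }
        result.sort((x, y) -> Double.compare(y.score, x.score));
        return result.size() > k ? new ArrayList<>(result.subList(0, k)) : result;
    }

    /** Builds a rating map where the first value belongs to the given first album id. */
    static Map<Integer, Double> ratingsFrom(int firstAlbumId, double... values) {
        Map<Integer, Double> ratings = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) ratings.put(firstAlbumId + i, values[i]);
        return ratings;
    }

    static Map<String, Map<Integer, Double>> sampleData() {
        Map<String, Map<Integer, Double>> ratings = new LinkedHashMap<>();
        ratings.put("user 1", ratingsFrom(1, 4, 1)); // album 1 and album 2
        ratings.put("user 2", ratingsFrom(1, 4, 3));
        ratings.put("user 3", ratingsFrom(1, 1, 5));
        ratings.put("user 5", ratingsFrom(9, 5));    // hypothetical user, no shared items
        return ratings;
    }

    public static void main(String[] args) {
        for (Neighbour neighbour : findNeighbours("user 1", sampleData(), 2)) {
            System.out.printf("%s: %.2f%n", neighbour.userId, neighbour.score);
        }
    }
}
```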
Predicting missing ratings
Predict the ratings a user would give
Predictions - Classification way
Predicted rating for user 1 and album 3
The other users rated album 3 as follows:
1 x rating 4
2 x rating 1
so prediction becomes 1, the most frequent rating
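The classification way picks the rating that occurs most often among the other users. A minimal Java sketch with hypothetical names; resolving ties to the lowest rating is an arbitrary choice made for this example.

```java
import java.util.Map;
import java.util.TreeMap;

final class ClassificationPredictionExample {

    /** Returns the most frequent rating; ties resolve to the lowest rating. */
    static int predict(int[] neighbourRatings) {
        if (neighbourRatings.length == 0) {
            throw new IllegalArgumentException("at least one rating is required");
        }
        // TreeMap iterates in ascending rating order, which makes the tie-break deterministic
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int rating : neighbourRatings) {
            counts.merge(rating, 1, Integer::sum);
        }
        int best = -1;
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] ratingsForAlbum3 = {4, 1, 1}; // ratings by user 2, user 3 and user 4
        System.out.println(predict(ratingsForAlbum3)); // prints 1
    }
}
```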
Predictions - Regression way
Predicted rating for user 1 and album 3
$\frac{4 + 1 + 1}{3} = 2$
so prediction becomes 2
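The regression way is the plain average of the ratings the other users gave. A minimal Java sketch with hypothetical names:

```java
final class RegressionPredictionExample {

    /** Returns the average of the ratings given by the other users. */
    static double predict(double[] neighbourRatings) {
        if (neighbourRatings.length == 0) {
            throw new IllegalArgumentException("at least one rating is required");
        }
        double sum = 0.0;
        for (double rating : neighbourRatings) {
            sum += rating;
        }
        return sum / neighbourRatings.length;
    }

    public static void main(String[] args) {
        double[] ratingsForAlbum3 = {4, 1, 1}; // ratings by user 2, user 3 and user 4
        System.out.println(predict(ratingsForAlbum3)); // prints 2.0
    }
}
```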
Predictions
• Normalise ratings by users
• Give more similar users higher impact
• Only use ratings when more than ‘x’ persons have rated
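These three refinements can be combined into one prediction. The Java sketch below assumes a similarity-weighted average of mean-centred ratings, which is a common formulation and not necessarily what the demo uses. All names are hypothetical. The example predicts album 3 for user 1 from user 2 and user 3, using the cosine similarities 0.92 and 0.43 and the mean ratings computed from the user-item matrix.

```java
import java.util.OptionalDouble;

final class WeightedPredictionExample {

    /** The rating of one neighbour, with that neighbour's similarity and mean rating. */
    static final class NeighbourRating {
        final double similarity;
        final double rating;
        final double userMean;
        NeighbourRating(double similarity, double rating, double userMean) {
            this.similarity = similarity;
            this.rating = rating;
            this.userMean = userMean;
        }
    }

    /**
     * Predicts targetMean + sum(sim * (rating - neighbourMean)) / sum(|sim|).
     * Returns empty unless more than x neighbours rated the item.
     */
    static OptionalDouble predict(double targetMean, NeighbourRating[] neighbours, int x) {
        if (neighbours.length <= x) {
            return OptionalDouble.empty();
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (NeighbourRating neighbour : neighbours) {
            weightedSum += neighbour.similarity * (neighbour.rating - neighbour.userMean);
            totalWeight += Math.abs(neighbour.similarity);
        }
        if (totalWeight == 0.0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(targetMean + weightedSum / totalWeight);
    }

    public static void main(String[] args) {
        double user1Mean = (4 + 1 + 3) / 3.0;
        NeighbourRating[] ratingsForAlbum3 = {
            new NeighbourRating(0.92, 4, 3.25), // user 2
            new NeighbourRating(0.43, 1, 2.75)  // user 3
        };
        System.out.println(predict(user1Mean, ratingsForAlbum3, 1));
    }
}
```

With both neighbours included the prediction stays close to the mean rating of user 1, because the positive offset of user 2 and the negative offset of user 3 nearly cancel out.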
The story so far
[Recap figure: the user-item matrix, the similarity comparison between users or items, and the Predict step]
Demo
https://rolling500.luminis.amsterdam
[Architecture diagram: the Rolling 500 and Recommendation components, with the data flows Logs, Queries, Clicks, Ratings and Recommendations]
The Data
{
  "album": "War",
  "image": "rs-137178-883f5fe955b2745cd539.jpg",
  "information": "<p>U2 were on the cusp of becoming one of the Eighties&apos; most important groups when their third album came out. It&apos;s the band&apos;s most overtly political album, with songs about Poland&apos;s Solidarity movement (&quot;New Year&apos;s Day&quot;) and Irish unrest (&quot;Sunday Bloody Sunday&quot;) charged with explosive, passionate guitar rock.</p>",
  "sequence": 223,
  "order": 279,
  "label": "Island",
  "artist": "U2",
  "year": "1983",
  "id": 350640
}
{
  "user_id": "39cf6ada-6a79-4d10-9537-0fe65179b7d2",
  "ratings": [0,0,0,4,0,0,0,1,0,0,0,0,5, … ,0,0,0,3,5,1,0,0]
}
Lenskit - http://lenskit.org
Open-Source Tools for Recommender Systems
[Class diagram: Application uses ItemRecommender and RatingPredictor; TopNItemRecommender implements ItemRecommender; SimpleRatingPredictor implements RatingPredictor]
[Class diagram: TopNItemRecommender implements ItemRecommender; UserUserItemScorer implements ItemScorer; UserMeanItemScorer implements BaselineScorer; CosineVectorSimilarity implements UserVectorSimilarity; EventCollectionDao implements EventDao; two "uses" arrows connect these components]
private LenskitConfiguration configureUserSimilarity(
        List<Rating> foundRatings, Map<String, Long> userIdMapping) {
    LenskitConfiguration config = new LenskitConfiguration();
    // Score items with user-user collaborative filtering
    config.bind(ItemScorer.class).to(UserUserItemScorer.class);
    // Baseline scorer based on the mean rating of the user
    config.bind(BaselineScorer.class, ItemScorer.class).to(UserMeanItemScorer.class);
    // Do not filter neighbours on a minimum similarity
    config.bind(UserSimilarityThreshold.class, Threshold.class).to(NoThreshold.class);
    // The user mean is computed on top of the item mean rating
    config.bind(UserMeanBaseline.class, ItemScorer.class).to(ItemMeanRatingItemScorer.class);
    // Normalise the rating vector of a user by subtracting the baseline
    config.bind(UserVectorNormalizer.class).to(BaselineSubtractingUserVectorNormalizer.class);
    // Compare users with the cosine similarity
    config.within(UserVectorSimilarity.class).bind(VectorSimilarity.class).to(CosineVectorSimilarity.class);
    // Provide the converted ratings from an in-memory collection
    List<MutableRating> ratings = convertRatings(foundRatings, userIdMapping);
    config.bind(EventDAO.class).to(EventCollectionDAO.create(ratings));
    return config;
}
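// Rating document of the application: all ratings of one user
public class Rating {
    private String userId;
    private int[] ratings;
}

// Rating event as used by LensKit: one rating of one user for one item
// (Rating here is the LensKit interface of the same name)
public class MutableRating implements Rating {
    private long uid;
    private long iid;
    private double value;
}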
private List<MutableRating> convertRatings(List<Rating> foundRatings, Map<String, Long> userIdMapping) {
    List<MutableRating> ratings = new ArrayList<>();
    CurrentUserId userId = new CurrentUserId();
    foundRatings.forEach(rating -> {
        // Map the string user id to a numeric id, reusing an existing mapping when present
        String userIdString = rating.getUserId();
        long curUser = (userIdMapping.containsKey(userIdString)) ? userIdMapping.get(userIdString) : userId.getUserId();
        userIdMapping.put(userIdString, curUser);
        // The array index is used as item id; a value of 0 is skipped as "not rated"
        for (int i = 0; i < rating.getRatings().length; i++) {
            if (rating.getRatings()[i] > 0) {
                ratings.add(createRating(i, rating.getRatings()[i], curUser));
            }
        }
    });
    return ratings;
}

private MutableRating createRating(long item, long rating, long user) {
    MutableRating mutableRating = new MutableRating();
    mutableRating.setItemId(item);
    mutableRating.setRating(rating);
    mutableRating.setUserId(user);
    return mutableRating;
}
Implementing the story
Rolling 500
Next chapters of the story
Segmentation - Selecting the Neighbourhood
Use clustering techniques to pre-select a group
K-means clustering
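To make the clustering step concrete, here is a compact K-means sketch in Java (Lloyd's algorithm). It rests on assumptions: users are represented by their ratings for album 1 and album 2, the squared Euclidean distance is used, and the initial centroids are picked by hand, whereas real implementations usually choose them randomly. All names are hypothetical.

```java
import java.util.Arrays;

final class KMeansExample {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return sum;
    }

    /** Runs K-means and returns the index of the cluster assigned to each point. */
    static int[] cluster(double[][] points, double[][] initialCentroids, int maxIterations) {
        int k = initialCentroids.length;
        int dimensions = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initialCentroids[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Assignment step: attach every point to its nearest centroid
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[p], centroids[c]) < squaredDistance(points[p], centroids[best])) {
                        best = c;
                    }
                }
                if (assignment[p] != best) {
                    assignment[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move every centroid to the mean of its points
            double[][] sums = new double[k][dimensions];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dimensions; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // keep the centroid of an empty cluster
                for (int d = 0; d < dimensions; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] users = {{4, 1}, {4, 3}, {1, 5}};  // user 1, user 2, user 3
        double[][] initialCentroids = {{4, 1}, {1, 5}}; // hand-picked starting points
        System.out.println(Arrays.toString(cluster(users, initialCentroids, 10))); // prints [0, 0, 1]
    }
}
```

The neighbourhood search can then be limited to the users in the same cluster as the target user.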
Cold cases
Users or items without ratings
Cold users
They do not have ratings yet, so how to find similar users?
• Can we use information we have from them?
• Ask them to rate a few items first
• Provide them with the top selling items
Cold items
Nobody rated them yet
• Ask users to give them a rating
• Add a new items list to the screen
• Use Content/knowledge based recommendations
Content based Filtering
• Use properties of products like: artist, label, year, singer, guitarist, etc.
• Tags or categories:
  • Contains a lengthy guitar solo
  • Singer / songwriter
  • Contains violins
  • Female singer
Find “more like this” items
• Simple approach: use a query in Elasticsearch with TF/IDF or BM25
• Good to use things like stop words, synonyms, phrase matches, stemming, etc.
• Use Entity Extraction to find names, places, dates, etc.
Nirvana
Nevermind
The overnight-success story of the 1990s, Nirvana's second album and its totemic first single, "Smells Like Teen Spirit," shot up from the nascent grunge scene in Seattle to kick Michael Jackson off the top of the Billboard album chart and blow hair metal off the map. No album in recent history had such an overpowering impact on a generation – a nation of teens suddenly turned punk – and such a catastrophic effect on its main creator. The weight of fame led already troubled singer-guitarist Kurt Cobain to take his own life in 1994. But his slashing riffs, corrosive singing and deviously oblique writing, rammed home by the Pixies-via-Zeppelin might of bassist Krist Novoselic and drummer Dave Grohl, put the warrior purity back in rock & roll. Lyrically, Cobain raged in code – shorthand grenades of inner tumult and self-loathing. His genius, though, in songs like "Lithium," "Breed" and "Teen Spirit" was the soft-loud tension he created between verse and chorus, restraint and assault. Cobain was a pop lover at heart – and a Beatlemaniac: Nevermind producer Butch Vig remembers hearing Cobain play John Lennon's "Julia" at sessions. Cobain also fought to maintain his underground honor. Ultimately, it was a losing battle, but it is part of this album's enduring power. Vig recalls when Cobain was forced to overdub the guitar intro to "Teen Spirit" because he couldn't nail it live with the band: "That pissed him off. He wanted to play [the song] live all the way through."
@jettroCoenradie
https://www.linkedin.com/in/jettro/
https://github.com/jettro
https://rolling500.luminis.amsterdam
https://github.com/jettro/rolling500
http://lenskit.org
http://elasticsearch-learning-to-rank.readthedocs.io

Building your first recommender - Jettro Coenradie - Codemotion Amsterdam 2018