Robson Motta | robson@chaordic.com.br
Machine Learning and Information Visualization
for the Optimization of Recommender Systems
312,000,000,000
(312 billion)
recommendations
in 2014
Get to know
our solutions
How to present the
best
recommendation
for each client/context?
data → preprocessing → processing → postprocessing → recommendations
Data:
● products
● pageviews
● clicks
● buy orders
● etc.
Machine Learning
“All models are wrong,
but some are useful”
(George E. P. Box)
Collaborative
Filtering
1
Customers Who Bought This Item Also Bought, PaulsHealthBlog.com, 11.04.2014
user-based
10 5 7 0 2 3 4 1
...
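A minimal sketch of the user-based idea, assuming each user is represented by a vector of per-item interaction counts like the one above. The cosine similarity, the `recommend_user_based` helper, and the second and third user vectors are illustrative assumptions; the slides do not say which similarity measure is used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend_user_based(target, others, top_n=3):
    """Rank items the target user has not touched by similarity-weighted counts."""
    scores = [0.0] * len(target)
    for other in others:
        sim = cosine(target, other)
        for item, count in enumerate(other):
            if target[item] == 0:          # only score unseen items
                scores[item] += sim * count
    ranked = sorted((i for i, s in enumerate(scores) if s > 0),
                    key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]

# Rows are users, columns are items. Only the first vector comes from the slide;
# the other two are hypothetical.
target_user = [10, 5, 7, 0, 2, 3, 4, 1]
other_users = [
    [9, 4, 6, 8, 1, 2, 5, 0],   # similar taste, has interacted with item 3
    [0, 0, 1, 2, 9, 0, 0, 7],   # different taste
]
print(recommend_user_based(target_user, other_users))
```

Item 3 is the only item the target user has not interacted with, so it is the only candidate in this toy example.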
item-based
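The item-based variant can be sketched with simple co-occurrence counts, in the spirit of "customers who bought this item also bought". The baskets, the `co_occurrence` and `also_bought` helpers, and the choice of raw counts as the similarity are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def co_occurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def also_bought(item, counts, top_n=3):
    """Items most often bought together with `item` (ties broken alphabetically)."""
    neighbours = counts.get(item, {})
    return sorted(neighbours, key=lambda other: (-neighbours[other], other))[:top_n]

# Hypothetical purchase baskets.
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger", "headphones"],
    ["laptop", "mouse"],
    ["phone", "case", "headphones"],
]
counts = co_occurrence(baskets)
print(also_bought("phone", counts))   # ['case', 'charger', 'headphones']
```

Raw counts favour popular items, which is one of the challenges the deck lists; a normalised similarity would be a natural next step.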
Challenges
● popular items
● outliers
● incompatible
● principal-accessory
● new items
How do we
guarantee quality
for our clients?
● subjective evaluation: Visualization
● objective evaluation: Quality measures
● online evaluation: A/B test
● online optimization: Bandit
Multidimensional Projection
(t-SNE technique)
Stability, purity and coverage measures
Content-based
Filtering
2
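$w_{n,d} = f_{n,d} \cdot \mathrm{idf}_n$
where $f_{n,d}$ is the frequency of term n in document d, $\mathrm{idf}_n$ is the IDF factor of term n, and $w_{n,d}$ is the weight of term n within document d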
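The term weighting above can be sketched in a few lines. This assumes the common definition idf = log(N / df), where N is the number of documents and df is the number of documents containing the term; the slides do not state which IDF variant is used, and the product descriptions are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))           # count each term once per document
    weights = []
    for tokens in documents:
        term_freq = Counter(tokens)
        weights.append({
            term: freq * math.log(n_docs / doc_freq[term])   # assumed IDF variant
            for term, freq in term_freq.items()
        })
    return weights

# Hypothetical product descriptions, already tokenised.
docs = [
    "red running shoes".split(),
    "blue running shoes".split(),
    "red leather wallet".split(),
]
weights = tf_idf(docs)
for doc, w in zip(docs, weights):
    print(doc, {term: round(value, 3) for term, value in w.items()})
```

Terms that appear in a single description ("blue", "leather", "wallet") get the highest weight, which is what makes them useful for telling products apart.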
Clustering
3
… main issues
the number
of clusters
… main issues
● false positives (pairs of products wrongly assigned to the same cluster)
● false negatives (pairs of products wrongly assigned to different clusters)
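The two error types above can be counted pair by pair when a reference grouping is available. The sketch below assumes such a reference exists (here a hypothetical product category); the `pairwise_errors` helper and the sample data are illustrative, not taken from the talk.

```python
from itertools import combinations

def pairwise_errors(predicted, reference):
    """Count pairs wrongly merged (false positives) and wrongly split (false negatives)."""
    false_positives = false_negatives = 0
    for a, b in combinations(sorted(predicted), 2):
        same_cluster = predicted[a] == predicted[b]
        same_reference = reference[a] == reference[b]
        if same_cluster and not same_reference:
            false_positives += 1      # pair wrongly assigned to the same cluster
        elif same_reference and not same_cluster:
            false_negatives += 1      # pair wrongly assigned to different clusters
    return false_positives, false_negatives

# Hypothetical cluster ids and reference categories.
predicted = {"tv": 0, "soundbar": 0, "blender": 0, "toaster": 1}
reference = {"tv": "electronics", "soundbar": "electronics",
             "blender": "kitchen", "toaster": "kitchen"}
print(pairwise_errors(predicted, reference))   # (2, 1)
```

Here the blender is merged with two electronics products (two false positives) and separated from the toaster (one false negative).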
Classification
4
… main issues
unbalanced classes
unlabeled areas
Challenges
● popular items
● outliers
● incompatible
● principal-accessory
● new items
Circular connected chart: alternatives
Circular connected chart: complements
Tabular information
Circular connected chart: complements
A/B tests
+16%
clicks
final result:
10 days
95% significance
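One common way to check a result like this at the 95% level is a two-proportion z-test. The sketch below is an assumption about how such a check could be done, not the test actually used in the talk, and the click and view counts are hypothetical numbers chosen to give a 16% lift.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (relative lift of B over A, z statistic, two-sided p-value)."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    lift = (p_b - p_a) / p_a
    return lift, z, p_value

# Hypothetical counts: control (A) versus new recommender (B).
lift, z, p_value = two_proportion_z_test(1000, 20000, 1160, 20000)
print(f"lift={lift:.1%}  z={z:.2f}  p={p_value:.4f}")
print("significant at 95%" if p_value < 0.05 else "not significant at 95%")
```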
Multi-armed
Bandit
5
Exploration-Exploitation
trade-off
… case 1
algorithm 1
algorithm 2
…
algorithm N
… case 2
order 1
order 2
…
chance to be picked
user feedback: click
Bandit - Beta Distribution
http://www.distributome.org/js/sim/BetaSimulation.html
0 successes and 10 attempts
0 successes and 0 attempts
5 successes and 10 attempts
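The three cases above can be turned into Beta distributions and compared numerically. The sketch assumes a uniform Beta(1, 1) prior, so that s successes in a attempts give Beta(1 + s, 1 + a - s); the prior and the `beta_posterior` helper are assumptions, since the slides only show the resulting curves.

```python
import math

def beta_posterior(successes, attempts, prior_alpha=1.0, prior_beta=1.0):
    """Return (alpha, beta, mean, standard deviation) of the Beta posterior."""
    alpha = prior_alpha + successes
    beta = prior_beta + (attempts - successes)
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return alpha, beta, mean, math.sqrt(variance)

# The three (successes, attempts) cases shown on the slide.
for successes, attempts in [(0, 10), (0, 0), (5, 10)]:
    alpha, beta, mean, std = beta_posterior(successes, attempts)
    print(f"{successes}/{attempts}: Beta({alpha:g}, {beta:g})  "
          f"mean={mean:.3f}  std={std:.3f}")
```

With no attempts the distribution is flat (mean 0.5, widest spread); 5 successes in 10 attempts keeps the mean at 0.5 but narrows the curve, while 0 successes in 10 attempts pushes the mass toward zero.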
Bandit - Thompson Sampling
http://www.distributome.org/js/sim/BetaSimulation.html
successes and attempts: [(0, 10), (0, 7), (0, 7), (0, 6), (0, 4), (0, 3), (0, 4), (0, 3), (0, 0), (0, 0), ...
successes and attempts: [(1, 44), (10, 398), (0, 66), (1, 57), (2, 25), (14, 324), (0, 3), (1, 46), ...
successes and attempts: [(103, 1183), (64, 1138), (48, 900), (25, 524), (56, 527), (37, 546), (11, 216), …
successes and attempts: [(143, 2227), (8, 299), (119, 1706), (28, 889), (146, 1288), (86, 1646), (63, 1272) ...
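A minimal sketch of Thompson sampling over (successes, attempts) pairs like the ones listed above. Each arm could be a recommendation algorithm or an ordering; the click rates, the uniform Beta(1, 1) prior, and the `thompson_pick` and `simulate` helpers are assumptions made for this simulation, not the production setup.

```python
import random

def thompson_pick(stats, rng):
    """Sample a click rate per arm from its Beta posterior and pick the largest."""
    samples = [rng.betavariate(1 + s, 1 + (a - s)) for s, a in stats]
    return max(range(len(stats)), key=lambda i: samples[i])

def simulate(true_rates, rounds, seed=0):
    """Run the bandit against simulated users; return (successes, attempts) per arm."""
    rng = random.Random(seed)
    stats = [(0, 0)] * len(true_rates)
    for _ in range(rounds):
        arm = thompson_pick(stats, rng)
        clicked = rng.random() < true_rates[arm]     # simulated user feedback: click
        successes, attempts = stats[arm]
        stats[arm] = (successes + int(clicked), attempts + 1)
    return stats

# Hypothetical true click rates for three arms.
stats = simulate([0.02, 0.05, 0.20], rounds=5000)
print("successes and attempts:", stats)
```

Arms with few attempts still have wide distributions and so keep a chance to be picked (exploration), while the arm with the best observed rate is picked most of the time (exploitation).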
Bandit convergence
A/B tests
+3.5%
purchases
final result:
25 days
95% significance