Recommender Systems, Matrices and Graphs
Roelof Pieters
roelof@vionlabs.com
14 May 2014 @ KTH
About me
Interests in:
• IR, RecSys, Big Data, ML, NLP, SNA,
Graphs, CV, Data Visualization, Discourse
Analysis
History:
•...
Say Hello!
S:t Eriksgatan 63
112 33 Stockholm - Sweden
Email: hello@vionlabs.com
Tech company here in Stockholm with Geeks...
WE LOVE MOVIES….
Outline
•Recommender Systems
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Outline
•Recommender Systems
•Taxonomy
•History
•Evaluating Recommenders
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Information Retrieval
• Recommender
Systems as part of
Information Retrieval
(diagram: a USER sends a Query; Retrieval returns Document(s) from a collection)
• Information Retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources.
IR: Measure Success
• Recall: success in retrieving all correct documents
• Precision: success in retrieving the most relevant documents
• Given a set of terms and a set of document terms, select only the most relevant documents (precision), and preferably all the relevant ones (recall)
“generate meaningful recommendations to a
(collection of) user(s) for items or products that
might interest them”
Recommender Systems
Where can RS be found?
• Movie recommendation (Netflix)
• Related product recommendation (Amazon)
• Web page ranking (Google)
• Social recommendation (Facebook)
• News content recommendation (Yahoo)
• Priority inbox & spam filtering (Google)
• Online dating (OK Cupid)
• Computational Advertising (Yahoo)
Outline
•Recommender Systems
•Taxonomy
•History
•Evaluating Recommenders
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Taxonomy of RS
• Collaborative Filtering (CF)
• Content Based Filtering (CBF)
• Knowledge Based Filtering (KBF)
• Hybrid
Taxonomy of RS
• Collaborative Filtering (CF)!
• Content Based Filtering (CBF)
• Knowledge Based Filtering (KBF)
• Hybrid
Collaborative Filtering:
• relies on past user behavior
• Implicit feedback
• Explicit feedback
• requires no gathering of external data
• sparse data
• domain free
• cold start problem
Collaborative
(Dietmar et al., IJCAI 2011)
User based Collaborative Filtering
User based Collaborative Filtering
Taxonomy of RS
• Collaborative Filtering (CF)
• Content Based Filtering (CBF)!
• Knowledge Based Filtering (KBF)
• Hybrid
Content Filtering
• creates profile for user/movie
• requires gathering external data
• dense data
• domain-bounded
• no cold start problem
Content based
(Dietmar et al., IJCAI 2013)
Item based Collaborative Filtering
Item based Collaborative Filtering
Taxonomy of RS
• Collaborative Filtering (CF)
• Content Based Filtering (CBF)
• Knowledge Based Filtering (KBF)!
• Hybrid
Knowledge based
(Dietmar et al., IJCAI 2013)
Knowledge based Content Filtering
Knowledge based Content Filtering
Knowledge based Content Filtering
Taxonomy of RS
• Collaborative Filtering (CF)
• Content Based Filtering (CBF)
• Knowledge Based Filtering (KBF)
• Hybrid
Hybrid
(Dietmar et al., IJCAI 2013)
Outline
•Recommender Systems
•Taxonomy
•History
•Evaluating Recommenders
•Algorithms*
•Graphs
(* math magicians better pay attention here)
History
• 1992-1995: Manual Collaborative Filtering
• 1994-2000: Automatic Collaborative Filtering +
Content
• 2000+: Commercialization
TQL:
Tapestry (1992)
(Goldberg et al. 1992)
Grouplens (1994)
(Resnick et al. 1994)
2000+: Commercial CF’s
• 2001: Amazon starts using item based collaborative
filtering (Patent filed at 1998)
• 2000: Pandora starts the Music Genome Project, where each song “is analyzed using up to 450 distinct musical characteristics by a trained music analyst.”
• 2006-2009: Netflix Prize: 2 of many algorithms put in use by Netflix replacing “Cinematch”: Matrix Factorization (SVD) and Restricted Boltzmann Machines (RBM)
(http://www.pandora.com/about/mgp) (http://www.netflixprize.com)
Annual Conferences
• RecSys (since 2007) http://recsys.acm.org
• SIGIR (since 1978) http://sigir.org/
• KDD (official since 1998) http://www.kdd.org/
• KDD Cup
Ongoing Discussion
• Evaluation
• Scalability
• Similarity versus Diversity
• Cold start (items + users)
• Fraud
• Imbalanced dataset or Sparsity
• Personalization
• Filter Bubbles
• Privacy
• Data Collection
Outline
•Recommender Systems
•Taxonomy
•History
•Evaluating Recommenders
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Evaluating Recommenders
• Least mean squares prediction error
• RMSE:
rmse(S) = √( (1/|S|) ∑ over (i,u)∈S of (r̂ui − rui)² )
• Similarity measure enough?
Outline
•Recommender Systems
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Outline
•Recommender Systems
•Algorithms*
•Content based Algorithms *
•Collaborative Algorithms *
•Classification
•Rating/Ranking *
•Graphs
(* math magicians better pay attention here)
• content is exploited (item to item filtering)
• content model:
• keywords (ie TF-IDF)
• similarity/distance measures:
• Euclidean distance: L1 and L2-norm
• Jaccard distance
• (adjusted) Cosine distance
• Edit distance
• Hamming distance
• similarity/distance measures:
• Euclidean distance
• Jaccard distance
• Cosine distance
Content-based Filtering
ie: for x = [1,2,−1] and y = [2,1,1]:
dot product x·y = 1 × 2 + 2 × 1 + (−1) × 1 = 3
L2-norm of each: √(1² + 2² + (−1)²) = √6
cosine of angle: 3/(√6 · √6) = 1/2; a cosine of 1/2 corresponds to an angle of 60 degrees
Examples
• Item to Query
• Item to Item
• Item to User
Examples
• Item to Query!
• Item to Item
• Item to User
Example: Item to Query
Title Price Genre Rating
The Avengers 5 Action 3.7
Spiderman II 10 Action 4.5
user query q: “price(6) AND genre(Adventure) AND rating(4)”
weights of features: 0.22 (price), 0.33 (genre), 0.45 (rating)
Weighted Sum: Sim(q, “The Avengers”) = 0.22 × (1 − 1/25) + 0.33 × 0 + 0.45 × (1 − 0.3/5) = 0.6342
(price diff of 1 on the 1-25 price range; genre: no match; rating diff of 0.3 on the 0-5 rating range)
Sim(q, “Spiderman II”) = 0.5898 (0.6348 if we count rating 4.5 > 4 as a match)
Examples
• Item to Query
• Item to Item!
• Item to User
Example: Item to Item Similarity
Title ReleaseTime Genres Actors Rating
TA 90s, start 90s, 1993 Action, Comedy, Romance X,Y,Z 3.7
S2 90s, start 90s, 1991 Action W,X,Z 4.5
(numeric features, arrays of booleans, and sets of hierarchically related symbols)
Sim(X,Y) = 1 − d(X,Y) or Sim(X,Y) = exp(−d(X,Y)), where 0 ≤ wi ≤ 1 and i = 1..n (number of features)
TA: X1 = (90s, S90s, 1993), X2 = (1,1,1), X3 = (0,1,1,1), X4 = 3.7
S2: X1 = (90s, S90s, 1991), X2 = (1,0,0), X3 = (1,1,0,1), X4 = 4.5
W = (0.5, 0.3, 0.2): weights of the features all the same; weights of categories within “ReleaseTime” differ
(content factors)
Examples
• Item to Query
• Item to Item
• Item to User
Example: Item to User
Title Roelof Klas Mo Max x(Action) x(…)
The Avengers 5 1 2 5 0.8 0.1
Spiderman II ? 2 1 ? 0.9 0.2
American Pie 2 5 ? 1 0.05 0.9
For each user u, learn a parameter vector θ(u) ∈ ℝ(n+1); predict user u’s rating of movie i as (θ(u))ᵀ x(i).
e.g. predict Mo’s (θ(3)) rating of American Pie (x(3)): with x(3) = (1, 0.05, 0.9) and θ(3) = (0, 0, 5), the dot product ≈ 4.5.
Filling in the predictions: Spiderman II ≈4 2 1 ≈4 0.9 0.2; American Pie 2 5 4.5 1 0.05 0.9.
How do we learn these user factor parameters?
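The prediction step here is just a dot product; a tiny sketch using the example’s values for Mo (parameters (0, 0, 5), American Pie features (1, 0.05, 0.9)):

```python
def predict(theta, x):
    """Predicted rating: dot product of user parameters theta and movie features x."""
    return sum(t * f for t, f in zip(theta, x))

theta_mo = [0.0, 0.0, 5.0]         # learned parameters for Mo (from the example)
x_american_pie = [1.0, 0.05, 0.9]  # bias term plus the two content features
print(predict(theta_mo, x_american_pie))  # 4.5
```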
problem formulation:
• r(i,u) = 1 if user u has rated movie i, otherwise 0
• y(i,u) = rating by user u on movie i (if defined)
• θ(u) = parameter vector for user u
• x(i) = feature vector for movie i
• for user u, movie i, the predicted rating is (θ(u))ᵀ x(i)
• learning θ(u) (squared error + regularization):
min over θ(u) of (1/2) ∑ over {i: r(i,u)=1} of ((θ(u))ᵀ x(i) − y(i,u))² + (λ/2) ∑ over {k=1..n} of (θk(u))²
• learning θ(1), θ(2), …, θ(nu) for all users:
min over θ(1),…,θ(nu) of (1/2) ∑ over {u=1..nu} ∑ over {i: r(i,u)=1} of ((θ(u))ᵀ x(i) − y(i,u))² + (λ/2) ∑ over {u=1..nu} ∑ over {k=1..n} of (θk(u))²
(A. Ng. 2013)
Outline
•Recommender Systems
•Algorithms*
•Content based Algorithms *
•Collaborative Algorithms *
•Classification
•Rating/Ranking *
•Graphs
(* math magicians better pay attention here)
Collaborative Filtering:
• User-based approach!
• Find a set of users Si who rated item j, that are most similar to
ui
• compute predicted Vij score as a function of the ratings of item j given by Si (usually a weighted linear combination)
• Item-based approach!
• Find a set of most similar items Sj to the item j which were rated by ui
• compute predicted Vij score as a function of ui’s ratings for Sj
Collaborative Filtering:
• Two primary models:
• Neighborhood models!
• focus on relationships between movies or users
• Latent Factor models
• focus on factors inferred from (rating) patterns
• computerized alternative to naive content creation
• predicts rating by dot product of user and movie locations on known dimensions
(Sarwar, B. et al. 2001)
Neighborhood (user oriented)
69
(pic from Koren et al. 2009)
Neighbourhood Methods
• Problems:
• Ratings biased per user
• Ratings biased towards certain items
• Ratings change over time
• Ratings can rapidly change through real time events (Oscar nomination, etc)
• Bias correction needed
Latent Factors
71
• latent factor models map users and items into a
latent feature space
• user’s feature vector denotes the user’s affinity to each of the features
• item’s feature vector represents how much the item itself is related to the features
• rating is approximated by the dot product of the user feature vector and the item feature vector
Latent Factors (users+movies)
72
(pic from Koren et al. 2009)
Latent Factors (x+y)
73
(http://xkcd.com/388/)
xkcd.com
Latent Factor models
• Matrix Factorization:
• characterizes items + users by vectors of
factors inferred from (ratings or other user-item related) patterns
• Given a list of users and items, and user-item interactions, predict user behavior
• can deal with sparse data (matrix)
• can incorporate additional information
Matrix Factorization
• Dimensionality reduction
• Principal Components Analysis, PCA
• Singular Value Decomposition, SVD
• Non Negative Matrix Factorization, NNMF
Matrix Factorization: SVD
SVD, Singular Value Decomposition
• transforms correlated variables into a set of
uncorrelated ones that better expose the various relationships among the original data items
• identifies and orders the dimensions along which data points exhibit the most variation
• allowing us to find the best approximation of the original data points using fewer dimensions
SVD: Matrix Decomposition
77
U: document-to-concept similarity
matrix !
V: term-to-concept similarity matrix !
λ: its diagonal elements: ‘strength’ of each concept
(pic by Xavier Amatriain 2013)
SVD for 

Collaborative Filtering
each item i associated with a vector qi ∈ ℝf
each user u associated with a vector pu ∈ ℝf
qi measures the extent to which the item possesses the factors; pu measures the user’s extent of interest in items high on those factors
user-item interactions are modeled as dot products in the factor space: user u’s rating of item i is approximated by r̂ui = qiᵀpu
SVD for 

Collaborative Filtering
• compute u,i mappings: qi,pu ∈ ℝ
f
• factor user, item matrix
• imputation (Sarwar et al. 2000)
• model only observed ratings + regularization (Funk 2006; Koren 2008)
• learn factor vectors qi and pu by minimizing the (regularized) squared error on the set of known ratings
SVD Visualized
regression line reducing two dimensional
space into one dimensional one
reducing three dimensional (multidimensional)
space into two dimensional plane
SVD Visualized
SVD: Code Away!
<Coding Time>
82
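Since the slide calls for coding time, here is a minimal numpy sketch of a truncated SVD on a toy rating matrix (the values are invented for illustration):

```python
import numpy as np

# toy user x movie rating matrix, invented for the sketch
R = np.array([[5., 1., 1., 4.],
              [4., 2., 1., 5.],
              [3., 5., 4., 1.],
              [1., 4., 5., 2.]])

# full SVD, then keep only the top-k singular values (rank-k approximation)
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.round(R_k, 2))  # low-rank reconstruction of R
```

Keeping only k singular values is exactly the dimensionality reduction the previous slides visualize.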
Stochastic Gradient Descent
• optimizable by Stochastic Gradient Descent
(SGD) (Funk 2006)
• incremental learning
• loops through the ratings and computes the prediction error for the predicted rating on rui
• modifies the parameters by a magnitude proportional to γ in the opposite direction of the gradient, giving the learning rule
Gradient Descent
<Coding Time>
84
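A minimal Funk-style SGD sketch; the ratings and hyperparameters (learning rate, regularization, epochs) are invented for this toy example:

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, f=2, lr=0.02, reg=0.02, epochs=2000, seed=0):
    """Matrix factorization trained by SGD over observed ratings only."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, f))  # user factors p_u
    Q = rng.normal(0, 0.1, (n_items, f))  # item factors q_i
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - Q[i] @ P[u]                   # e_ui = r_ui - q_i . p_u
            P[u] += lr * (err * Q[i] - reg * P[u])  # learning rule for p_u
            Q[i] += lr * (err * P[u] - reg * Q[i])  # learning rule for q_i
    return P, Q

# (user, item, rating) triples, invented for the sketch
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
P, Q = sgd_mf(ratings, n_users=2, n_items=2)
print([round(float(Q[i] @ P[u]), 1) for u, i, _ in ratings])
```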
Alternating Least Squares
• optimizable by Alternating Least Squares (ALS) (2006)
• both qi and pu unknown: the minimization is not convex, so it cannot be solved directly for a minimum
• ALS rotates between fixing the qi’s and the pu’s: when all pu’s are fixed, recompute the qi’s by solving a least squares problem (now quadratic), and vice versa
• qi and pu are computed independently of the other item/user factors: parallelization
• best for implicit data (dense matrix)
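A compact numpy sketch of the alternating scheme on a dense toy matrix; the matrix and regularization value are invented for illustration:

```python
import numpy as np

def als(R, f=2, reg=0.1, iters=20, seed=0):
    """Alternating least squares on a dense rating matrix R (users x items)."""
    n_users, n_items = R.shape
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, f))
    Q = rng.normal(0, 0.1, (n_items, f))
    I = np.eye(f)
    for _ in range(iters):
        # fix Q, solve a regularized least-squares problem for the user factors
        P = np.linalg.solve(Q.T @ Q + reg * I, Q.T @ R.T).T
        # fix P, solve for the item factors
        Q = np.linalg.solve(P.T @ P + reg * I, P.T @ R).T
    return P, Q

R = np.array([[5., 1., 1., 4.],
              [4., 2., 1., 5.],
              [3., 5., 4., 1.],
              [1., 4., 5., 2.]])
P, Q = als(R)
print(np.round(P @ Q.T, 1))  # approximate reconstruction of R
```

Each half-step is an ordinary least-squares solve, which is why the per-user and per-item solves parallelize.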
Develop Further…
• Add Biases
• Add Input Sources / Implicit Feedback: pu in r̂ui becomes (pu + …)
• Add Temporal Aspect / time-varying parameters
• Vary Confidence Levels of Inputs
(Salakhutdinov & Mnih 2008; Koren 2010) (pic: Lei Guo 2012)
Develop Further…
• Final Algorithm (with confidence, bias terms, and regularization):
(Paterek, A. 2007)
88
Develop Further…
• Final Algorithm with Temporal dimensions:
89
• So what if we don’t have any content factors
known?
• Probabilistic Matrix Factorization to the rescue!
• describe each user and each movie by a small set of attributes
Probabilistic Matrix
Factorization
• Imagine we have the following rating data:
Title Roelof Klas Mo Max
The Avengers 5 1 1 4
Spiderman II 4 2 1 5
American Pie 3 5 4 1
Shrek 1 4 5 2
we could say that Roelof and Klas like Action movies, but don’t like Comedies, while it’s the opposite for Mo and Max
Probabilistic Matrix
Factorization
• This could be represented by the PMF model by using three
dimensional vectors to desc...
<CODE TIME>
ratings
Probabilistic Matrix
Factorization
Outline
•Recommender Systems
•Algorithms*
•Content based Algorithms *
•Collaborative Algorithms *
•Classification
•Rating/Ranking *
•Graphs
(* math magicians better pay attention here)
Classification
• k-Nearest Neighbors (KNN)
• Decision Trees
• Rule-Based
• Bayesian
• Artificial Neural Networks
• Support Vector Machines (SVM)
Classification
• k-Nearest Neighbors (KNN)!
• Decision Trees
• Rule-Based
• Bayesian
• Artificial Neural Networks
• Support Vector Machines (SVM)
k-Nearest Neighbors
• non-parametric lazy learning algorithm
• data as feature space
• simple and fast
• classification by majority vote of the k nearest neighbours
kNN: Classification
• Classify: several features Xi are used to classify Y
• compare p = (X1p, X2p) and q = (X1q, X2q) by squared Euclidean distance: d(p,q) = (X1p − X1q)² + (X2p − X2q)²
kNN: Classification
• input: content-extracted emotional values of 561 movies (thanks: Johannes Östling :)
KNN
<CODE>
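The 561-movie dataset above isn’t reproduced here; as a stand-in, a tiny k-NN classifier on hypothetical 2-d “emotional” features (labels and points are invented):

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label); classify query by majority vote
    among the k nearest neighbours (Euclidean distance)."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# hypothetical 2-d features, e.g. (anger, love)
train = [((0.9, 0.1), "action"), ((0.8, 0.2), "action"),
         ((0.1, 0.9), "romance"), ((0.2, 0.8), "romance"),
         ((0.15, 0.85), "romance")]
print(knn_classify(train, (0.7, 0.3)))  # action
```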
k-Nearest Neighbors
emotional
dimension “Anger”
vs “Love”
k-Nearest Neighbors
Negative: afraid, confused, helpless, hurt, sad, angry, depressed
Positive: good, interested, love, …
Outline
•Recommender Systems
•Algorithms*
•Content based Algorithms *
•Collaborative Algorithms *
•Classification
•Rating/Ranking *
•Graphs
(* math magicians better pay attention here)
Rating predictions:
• Pos — Neg
• Average
• Bayesian (Weighted) Estimates
• Lower bound of Wilson score confidence interval...
Rating predictions:
• Pos — Neg!
• Average
• Bayesian (Weighted) Estimates
• Lower bound of Wilson score confidence interva...
P — N
• (Positive ratings) − (Negative ratings)
• Problematic:
(http://www.urbandictionary.com/define.php?term=m...)
Rating predictions:
• Pos — Neg
• Average!
• Bayesian (Weighted) Estimates
• Lower bound of Wilson score confidence interva...
Average
• (Positive ratings) / (Total ratings)
• Problematic:
(http://www.amazon.co.uk/gp/bestsellers/electroni...)
Rating predictions:
• Pos — Neg
• Average
• Bayesian (Weighted) Estimates!
• Lower bound of Wilson score confidence interva...
Ratings
• Top Ranking at IMDB (gives a Bayesian estimate):
• Weighted Rating (WR) = (v / (v+m)) × R + (m / (v+m)) × C
• where R = average rating for the movie, v = number of votes, m = minimum votes required, and C = the mean vote across the whole report
Bayesian (Weighted) Estimates
• weighted average on a per-item basis
(source(s): http://www.imdb.com/title/tt036889...)
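The weighted rating translates directly to code (the movie’s R, v, and C values here are hypothetical):

```python
def weighted_rating(R, v, m, C):
    """IMDB-style Bayesian estimate: blend an item's mean rating R (from v votes)
    with the global mean C, weighted by the vote threshold m."""
    return (v / (v + m)) * R + (m / (v + m)) * C

# hypothetical movie: mean rating 9.2 from only 50 votes, global mean 6.9, m = 1250
print(round(weighted_rating(R=9.2, v=50, m=1250, C=6.9), 2))      # ~6.99: pulled toward C
# with many votes the estimate approaches the raw mean:
print(round(weighted_rating(R=9.2, v=100000, m=1250, C=6.9), 2))  # ~9.17
```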
Bayesian (Weighted) Estimates @ IMDB
(chart: Bayesian weight as a function of number of votes, for m = 1250)
Rating predictions:
• Pos — Neg
• Average
• Bayesian (Weighted) Estimates
• Lower bound of Wilson score confidence
interval...
Wilson Score interval
• 1927 by Edwin B. Wilson
• Given the ratings I have, there is a 95% chance that the “real” fraction of positive ratings is at least what?
Wilson Score interval
• used by Reddit for comments ranking
• “rank the best comments highest regardless of their submission time”
Wilson Score interval
• Endpoints for the Wilson score interval:
(p̂ + z²/2n ± z·√[(p̂(1−p̂) + z²/4n)/n]) / (1 + z²/n)
• Reddit’s comment ranking function (the lower bound):
(phat + z*z/(2*n) − z*sqrt((phat*(1−phat) + z*z/(4*n))/n)) / (1 + z*z/n)
CODE
CODE
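The ranking function above translates directly to Python (z = 1.96 for a 95% interval; the vote counts are invented):

```python
import math

def wilson_lower_bound(pos, n, z=1.96):
    """Lower bound of the Wilson score interval (z = 1.96 for 95% confidence)."""
    if n == 0:
        return 0.0
    phat = pos / n
    return (phat + z*z/(2*n)
            - z * math.sqrt((phat*(1 - phat) + z*z/(4*n)) / n)) / (1 + z*z/n)

# 1 positive vote of 1 scores *lower* than 90 positives of 100:
print(round(wilson_lower_bound(1, 1), 3))    # ~0.207
print(round(wilson_lower_bound(90, 100), 3)) # ~0.826
```

This is why a single upvote cannot beat a comment with many mostly-positive votes.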
Bernoulli anyone?
* as the number of trials (N) = 2 (two throws of the dice), it’s actually not a real Bernoulli distribution
What’s next?
GRAPHS
Outline
•Recommender Systems
•Algorithms*
•Graphs
(* math magicians better pay attention here)
Graph Based Approaches
• Whats a Graph?!
• Why Graphs?
• Who uses Graphs?
• Talking with Graphs
• Graph example: Recommendations
What’s a Graph?
124
(diagram: (Movie)-[has_genre]->(Genre), (Movie)-[features_actor]->(Actor), (Movie)-[directed_by]->(Director), (User)-[likes|watches|rates]->(Movie), (User)-[likes_user]->(User))
Graph Based Approaches
• Whats a Graph?
• Why Graphs?!
• Who uses Graphs?
• Talking with Graphs
• Graph example: Recommendations
Why Graphs?
• more complex (social networks…)
• more connected (wikis, pingbacks, rdf, collaborative
tagging)
• more semi-structured data
Data Trend
“Every 2 days we
create as much
information as we did
up to 2003”

— Eric Schmidt, Google
Why Graphs?
Graphs vs Relational
128
relational
graph
graph
(pic by Michael Hunger, neo4j)
Why Graphs? It’s Fast!
Matrix based Calculations vs Graph based Calculations
(pic by Michael Hunger, neo4j)
Why Graphs? It’s White-Board Friendly!
(pic by Michael Hunger, neo4j)
Graph Based Approaches
• Whats a Graph?
• Why Graphs?
• Who uses Graphs?!
• Talking with Graphs
• Graph example: Recommendations
Who uses Graphs?
• Facebook: Open Graph (https://
developers.facebook.com/docs/opengraph)
• Google: Knowledge Graph (http:...
135
(pic by Michael Hunger, neo4j)
Graph Based Approaches
• Whats a Graph?
• Why Graphs?
• Who uses Graphs?
• Talking with Graphs!
• Graph example: Recommendations
Talking with Graphs
• Graphs can be queried!
• no unions for comparison, but traversals!
• many different graph traversal ...
graph traversal patterns
• traversals can be seen as a diffusion process over a graph!
• “Energy” moves over a graph and spreads along its edges
Energy Diffusion
(pic by Marko A. Rodriguez, 2011)
Energy Diffusion
(pic by Marko A. Rodriguez, 2011)
energy = 4
Energy Diffusion
(pic by Marko A. Rodriguez, 2011)
energy = 3
Energy Diffusion
(pic by Marko A. Rodriguez, 2011)
energy = 2
Energy Diffusion
(pic by Marko A. Rodriguez, 2011)
energy = 1
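The energy-spreading idea above can be sketched as a toy diffusion over an adjacency dict (the graph and energy values are invented):

```python
def diffuse(graph, energy, steps):
    """Spread "energy" from each vertex equally over its outgoing edges."""
    for _ in range(steps):
        nxt = {v: 0.0 for v in graph}
        for v, e in energy.items():
            for n in graph[v]:
                nxt[n] += e / len(graph[v])  # split energy over the out-edges
        energy = nxt
    return energy

# tiny invented graph: me -> two friends -> their friends
graph = {"me": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": [], "d": []}
start = {"me": 4.0, "a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
after_one = diffuse(graph, start, 1)  # energy splits over "a" and "b"
after_two = diffuse(graph, start, 2)  # energy has reached "c" and "d"
print(after_one)
print(after_two)
```

Vertices collecting the most energy after a few steps are the natural recommendation candidates.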
Graph Based Approaches
• Whats a Graph?
• Why Graphs?
• Who uses Graphs?
• Talking with Graphs
• Graph example: Recommendations
Diffusion Example:
Recommendations
• Energy diffusion is an easy algorithm for making recommendations!
• different paths ...
Friend
Recommendation
• Who are my friends’ friends that are not
me or my friends
(pic by Marko A. Rodriguez, 2011)
Friend
Recommendation
• Who are my friends’ friends



• Who are my friends’ friends that are not
me or my friends
G.V(‘me...
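The same friends-of-friends traversal can be sketched in Python over a plain adjacency dict rather than a Gremlin graph (the names are hypothetical):

```python
def friends_of_friends(graph, me):
    """Friends of my friends who are not me and not already my friends."""
    friends = set(graph[me])
    candidates = {fof for f in friends for fof in graph[f]}
    return candidates - friends - {me}

graph = {
    "me":    ["alice", "bob"],
    "alice": ["me", "carol"],
    "bob":   ["me", "carol", "dave"],
    "carol": ["alice", "bob"],
    "dave":  ["bob"],
}
print(sorted(friends_of_friends(graph, "me")))  # ['carol', 'dave']
```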
Product
Recommendation
• Who likes what I like —> of these things, what do they like which I don’t already like
(pic by Ma...
Product
Recommendation
• Who likes what I like

• Who likes what I like —> of these things, what do they like which I don’t already like
Recommendations at

with FoorSee
Graph Based Approaches
• Whats a Graph?
• Why Graphs?
• Who uses Graphs?
• Talking with Graphs
• Graph example: Recommenda...
154
Pulp Fiction
Graphs: Conclusion
• Fast!
• Scalable!
• Diversification!
• No Cold Start!
• Sparsity/Density not applicable
Graphs: Conclusion
• Natural, Visualizable!
• Feedback / Understandable!
• Connectable to the “web” / semantic web!
• Social...
WARNING
Graphs 

are 

Addictive!
Les Miserables
Facebook Network
References
• J. Dietmar, G. Friedrich and M. Zanker (2011) “Recommender Systems”, International Joint Conference on Artificial Intelligence (IJCAI 2011)
References
• Y. Koren (2008) “Factorization meets the Neighborhood: A Multifaceted Collaborative Filtering Model”, SIGKDD 2008
References
• P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl (1994), “GroupLens: An Open Architecture for Collaborative Filtering of Netnews”, CSCW 1994
Take Away Points
• Focus on the best Question, not just the Answer…!
• Best Match (most similar) vs Most Popular!
• Person...
Thanks for listening!
163
(xkcd)
Say What?
• So what other stuff do we do at Vionlabs?

• Some examples of data extraction which is fed into
our BAG (Big As...
Computer Vision
NLTK
167
Published on SlideShare: talk at KTH 14 May 2014 about matrix factorization, different latent and neighborhood models, graphs and energy diffusion for recommender systems, as well as what makes good/bad recommendations.


Transcript of "Recommender Systems, Matrices and Graphs"

  1. 1. Recommender Systems, Matrices and Graphs Roelof Pieters roelof@vionlabs.com 14 May 2014 @ KTH
  2. 2. About me Interests in: • IR, RecSys, Big Data, ML, NLP, SNA, Graphs, CV, Data Visualization, Discourse Analysis History: • 2002-2006: almost-BA Computer Science @ Amsterdam Tech Uni (dropped out in 2006) • 2006-2010: BA Cultural Anthropology @ Leiden & Amsterdam Uni’s • 2010-2012: MA Social Anthropology @ Stockholm Uni • 2011-Current: Working @ Vionlabs se.linkedin.com/in/roelofpieters/ roelof@vionlabs.com
  3. 3. Say Hello! St: Eriksgatan 63 112 33 Stockholm - Sweden Email: hello@vionlabs.com Tech company here in Stockholm with Geeks and Movie lovers… Since 2009: • Digital ecosystems for network operators, cable TV companies, and film distributor such as Tele2/Comviq, Cyberia, and Warner Bros • Various software and hardware hacks for different companies: Webbstory, Excito, Spotify, Samsung Focus since 2012: • Movie and TV recommendation 
 service FoorSee
  4. 4. WE LOVE MOVIES….
  5. 5. Outline •Recommender Systems •Algorithms* •Graphs (* math magicians better pay attention here)
  6. 6. Outline •Recommender Systems •Taxonomy •History •Evaluating Recommenders •Algorithms* •Graphs (* math magicians better pay attention here)
  7. 7. Information Retrieval • Recommender Systems as part of Information Retrieval Document(s)Document(s)Document(s)Document(s)Document(s) Retrieval USER Query • Information Retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources.
  8. 8. IR: Measure Success • Recall: success in retrieving all correct documents • Precision: success in retrieving the most relevant documents • Given a set of terms and a set of document terms select only the most relevant documents (precision), and preferably all the relevant ones (recall)
  9. 9. “generate meaningful recommendations to a (collection of) user(s) for items or products that might interest them” Recommender Systems
  10. 10. Where can RS be found? • Movie recommendation (Netflix) • Related product recommendation (Amazon) • Web page ranking (Google) • Social recommendation (Facebook) • News content recommendation (Yahoo) • Priority inbox & spam filtering (Google) • Online dating (OK Cupid) • Computational Advertising (Yahoo)
  11. 11. Outline •Recommender Systems •Taxonomy •History •Evaluating Recommenders •Algorithms* •Graphs (* math magicians better pay attention here)
  12. 12. Taxonomy of RS • Collaborative Filtering (CF) • Content Based Filtering (CBF) • Knowledge Based Filtering (KBF) • Hybrid
  13. 13. Taxonomy of RS • Collaborative Filtering (CF)! • Content Based Filtering (CBF) • Knowledge Based Filtering (KBF) • Hybrid
  14. 14. Collaborative Filtering: • relies on past user behavior • Implicit feedback • Explicit feedback • requires no gathering of external data • sparse data • domain free • cold start problem 16
  15. 15. Collaborative (Dietmar et. al. At ‘AI 2011) User based Collaborative Filtering
  16. 16. User based Collaborative Filtering
  17. 17. Taxonomy of RS • Collaborative Filtering (CF) • Content Based Filtering (CBF)! • Knowledge Based Filtering (KBF) • Hybrid
  18. 18. Content Filtering • creates profile for user/movie • requires gathering external data • dense data • domain-bounded • no cold start problem 20
  19. 19. Content based (Dietmar et. al. At ‘AI 2013) Item based Collaborative Filtering
  20. 20. Item based Collaborative Filtering
  21. 21. Taxonomy of RS • Collaborative Filtering (CF) • Content Based Filtering (CBF) • Knowledge Based Filtering (KBF)! • Hybrid
  22. 22. Knowledge based (Dietmar et. al. At ‘AI 2013) Knowledge based Content Filtering
  23. 23. Knowledge based Content Filtering
  24. 24. Knowledge based Content Filtering
  25. 25. Taxonomy of RS • Collaborative Filtering (CF) • Content Based Filtering (CBF) • Knowledge Based Filtering (KBF) • Hybrid
  26. 26. Hybrid (Dietmar et. al. At ‘AI 2013)
  27. 27. Outline •Recommender Systems •Taxonomy •History •Evaluating Recommenders •Algorithms* •Graphs (* math magicians better pay attention here)
  28. 28. History • 1992-1995: Manual Collaborative Filtering • 1994-2000: Automatic Collaborative Filtering + Content • 2000+: Commercialization…
  29. 29. TQL: Tapestry (1992) (Goldberg et al. 1992)
  30. 30. GroupLens (1994) (Resnick et al. 1994)
  31. 31. 2000+: Commercial CF’s • 2001: Amazon starts using item based collaborative filtering (Patent filed at 1998) • 2000: Pandora starts music genome
 project, where each song “is analyzed using up to 450 distinct musical characteristics by a trained music analyst.” • 2006-2009: Netflix Prize: 2 of many algorithms put in use by Netflix replacing “Cinematch”: Matrix Factorization (SVD) and Restricted Boltzmann Machines (RBM)
 (http://www.pandora.com/about/mgp) (http://www.netflixprize.com)
  32. 32. Annual Conferences • RecSys (since 2007) http://recsys.acm.org • SIGIR (since 1978) http://sigir.org/ • KDD (official since 1998) http://www.kdd.org/ • KDD Cup
  33. 33. Ongoing Discussion • Evaluation • Scalability • Similarity versus Diversity • Cold start (items + users) • Fraud • Imbalanced dataset or Sparsity • Personalization • Filter Bubbles • Privacy • Data Collection
  34. 34. Outline •Recommender Systems •Taxonomy •History •Evaluating Recommenders •Algorithms* •Graphs (* math magicians better pay attention here)
  35. 35. Evaluating Recommenders • Least mean squares prediction error • RMSE: rmse(S) = √( (1/|S|) ∑ over (i,u)∈S of (r̂ui − rui)² ) • Similarity measure enough?
  43. 43. Outline •Recommender Systems •Algorithms* •Graphs (* math magicians better pay attention here)
  44. 44. Outline •Recommender Systems •Algorithms* •Content based Algorithms * •Collaborative Algorithms * •Classification •Rating/Ranking * •Graphs (* math magicians better pay attention here)
  45. 45. • content is exploited (item to item filtering) • content model: • keywords (ie TF-IDF) • similarity/distance measures: • Euclidean distance: • L1 and L2-norm • Jaccard distance Content-based Filtering • (adjusted) Cosine distance • Edit distance • Hamming distance
  46. 46. • similarity/distance measures: • Euclidean distance • Jaccard distance • Cosine distance Content-based Filtering ie: for x = [1,2,−1] and y = [2,1,1]: dot product x·y is 1 × 2 + 2 × 1 + (−1) × 1 = 3; L2-norm = √(1² + 2² + (−1)²) = √6
  47. 47. • similarity/distance measures: • Euclidean distance • Jaccard distance • Cosine distance Content-based Filtering ie: for x = [1,2,−1] and y = [2,1,1]: dot product x·y is 3; the L2-norm of each is √6; cosine of angle: 3/(√6 · √6) = 1/2; a cosine of 1/2 means an angle of 60 degrees
  48. 48. Examples • Item to Query • Item to Item • Item to User
  49. 49. Examples • Item to Query! • Item to Item • Item to User
  50. 50. Example: Item to Query Title Price Genre Rating The Avengers 5 Action 3,7 Spiderman II 10 Action 4,5 user query q : 
 “price (6) AND genre(Adventure) AND rating (4)” weights of features: 0.22 0.450.33 Sim(q,”The Avengers”) = 
 0.22 x (1 - 1/25) + 0.33 x 0 + 0.45 x (1 - 0.3/5) = 0.6342 1-25 price range no matchdiff of 1 diff of 0.3 0-5 rating range Sim(q,”Spiderman II”) = 0.5898 
 (0.6348 if we count rating 4.5 > 4 as match) Weighted Sum:
  51. 51. Examples • Item to Query • Item to Item! • Item to User
  52. 52. Example: Item to Item Similarity Title ReleaseTime Genres Actors Rating TA 90s, start 90s, 1993 Action, Comedy, Romance X,Y,Z 3,7 S2 90s, start 90s, 1991 Action W,X,Z 4,5 numeric Array of Booleans Sim(X,Y) = 1 - d(X,Y) 
 or 
 Sim(X,Y) = exp(- d(X,Y)) where 0 ≤ wi ≤ 1, and i=1..n (number of features). Set of hierarchical related symbols
  53. 53. Title ReleaseTime Genres Actors Rating TA 90s, start 90s, 1993 Action, Comedy, Romance X,Y,Z 3,7 S2 90s, start 90s, 1991 Action W,X,Z 4,5 numeric Array of Booleans Set of hierarchical related symbols X1 = (90s,S90s,1993) X2 = (1,1,1) X3 = (0,1,1,1) X4 = 3.7 TA W 0.5 0.3 0.2 X1 = (90s,S90s,1991) X2 = (1,0,0) X3 = (1,1,0,1) X4 = 4.5 S2 weights of feature all the same weights of categories within “Release time” different Example: Item to Item Similarity
  54. 54. Example: Item to Item Similarity: TA: X1 = (90s,S90s,1993), X2 = (1,1,1), X3 = (0,1,1,1), X4 = 3.7; S2: X1 = (90s,S90s,1991), X2 = (1,0,0), X3 = (1,1,0,1), X4 = 4.5; W = (0.5, 0.3, 0.2). Sim(dest1, dest2) = exp(−(1/√4) √(d1(X1,Y1)² + … + d4(X4,Y4)²)) = exp(−(1/√4) √((1−(0.3+0.5))² + (1−1/3)² + (1−2/4)² + (1−0.8/5)²)) = exp(−(1/√4) √1.5745) = exp(−0.627) ≈ 0.534
  55. 55. (content factors)
  56. 56. Examples • Item to Query • Item to Item • Item to User
  57. 57. Example: Item to User: Title Roelof Klas Mo Max x(Action) x(…): The Avengers 5 1 2 5 0.8 0.1 | Spiderman II ? 2 1 ? 0.9 0.2 | American Pie 2 5 ? 1 0.05 0.9. x(1) = (1, 0.8, 0.1). For each user u, learn a parameter θ(u) ∈ ℝ(n+1); predict user u as rating movie i with (θ(u))ᵀ x(i).
  58. 58. Example: Item to User: Mo (θ(3)) and Klas (θ(2)); predict Mo’s (θ(3)) rating of American Pie (x(3)): x(3) = (1, 0.05, 0.9) and θ(3) = (0, 0, 5).
  59. 59. Example: Item to User: dot product of θ(3) = (0, 0, 5) and x(3) = (1, 0.05, 0.9) ≈ 4.5.
  60. 60. Example: Item to User: American Pie is now predicted at 4.5 for Mo: The Avengers 5 1 2 5 0.8 0.1 | Spiderman II ? 2 1 ? 0.9 0.2 | American Pie 2 5 4.5 1 0.05 0.9.
  61. 61. Example: Item to User: filling in the remaining predictions: Spiderman II ≈4 2 1 ≈4 0.9 0.2 | American Pie 2 5 4.5 1 0.05 0.9. How do we learn these user factor parameters θ(u)?
  62. 62. Example: Item to User problem formulation: • r(i,u) = 1 if user u has rated movie i, otherwise 0 • y(i,u) = rating by user u on movie i (if defined) • θ(u) = parameter vector for user u • x(i) = feature vector for movie i • for user u, movie i, predicted rating: (θ(u))ᵀ x(i) • m(u) = # of movies rated by user u • learning θ(u): min over θ(u) of (1/2m(u)) ∑ over {i: r(i,u)=1} of ((θ(u))ᵀ x(i) − y(i,u))² + (λ/2m(u)) ∑ over {k=1..n} of (θk(u))². Say what? (A. Ng. 2013)
  63. 63. Example: Item to User problem formulation: • learning θ(u): min over θ(u) of (1/2) ∑ over {i: r(i,u)=1} of ((θ(u))ᵀ x(i) − y(i,u))² (squared error term: predicted vs actual) + (λ/2) ∑ over {k=1..n} of (θk(u))² (regularization term) • learning θ(1), θ(2), …, θ(nu) for “all” users: min over θ(1),…,θ(nu) of (1/2) ∑ over {u=1..nu} ∑ over {i: r(i,u)=1} of ((θ(u))ᵀ x(i) − y(i,u))² + (λ/2) ∑ over {u=1..nu} ∑ over {k=1..n} of (θk(u))². Remember: y = rating, θ = parameter vector for a user, x = feature vector for a movie.
  64. 64. Outline •Recommender Systems •Algorithms* •Content based Algorithms * •Collaborative Algorithms * •Classification •Rating/Ranking * •Graphs (* math magicians better pay attention here)
  65. 65. Collaborative Filtering: • User-based approach! • Find a set of users Si who rated item j, that are most similar to ui • compute predicted Vij score as a function of ratings of item j given by Si (usually weighted linear combination) • Item-based approach! • Find a set of most similar items Sj to the item j which were rated by ui • compute predicted Vij score as a function of ui's ratings for Sj
  66. 66. Collaborative Filtering: • Two primary models: • Neighborhood models! • focus on relationships between movies or users • Latent Factor models • focus on factors inferred from (rating) patterns • computerized alternative to naive content creation • predicts rating by dot product of user and movie locations on known dimensions 68 (Sarwar, B. et al. 2001)
  67. 67. Neighborhood (user oriented) 69 (pic from Koren et al. 2009)
  68. 68. Neighbourhood Methods • Problems: • Ratings biased per user • Ratings biased towards certain items • Ratings change over time • Ratings can rapidly change through real time events (Oscar nomination, etc) • Bias correction needed
  69. 69. Latent Factors 71 • latent factor models map users and items into a latent feature space • user's feature vector denotes the user's affinity to each of the features • item's feature vector represents how much the item itself is related to the features. • rating is approximated by the dot product of the user feature vector and the item feature vector.
  70. 70. Latent Factors (users+movies) 72 (pic from Koren et al. 2009)
  71. 71. Latent Factors (x+y) 73 (http://xkcd.com/388/) xkcd.com
  72. 72. Latent Factor models • Matrix Factorization: • characterizes items + users by vectors of factors inferred from (ratings or other user- item related) patterns • Given a list of users and items, and user-item interactions, predict user behavior • can deal with sparse data (matrix) • can incorporate additional information 74
  73. 73. Matrix Factorization • Dimensionality reduction • Principal Components Analysis, PCA • Singular Value Decomposition, SVD • Non Negative Matrix Factorization, NNMF
  74. 74. Matrix Factorization: SVD SVD, Singular Value Decomposition • transforms correlated variables into a set of uncorrelated ones that better expose the various relationships among the original data items. • identifies and orders the dimensions along which data points exhibit the most variation. • allowing us to find the best approximation of the original data points using fewer dimensions.
  75. 75. SVD: Matrix Decomposition 77 U: document-to-concept similarity matrix ! V: term-to-concept similarity matrix ! ƛ : its diagonal elements: ‘strength’ of each concept ! (pic by Xavier Amatriain 2013)
  76. 76. SVD for 
 Collaborative Filtering each item i associated with vector qi ∈ ℝf 
 each user u associated with vector pu ∈ ℝf 
 qi measures extent to which item possesses factors
 pu measures extent of interest for user in items which possess high on factors
 user-item interactions modeled as dot products within the factor space, measured by qi T pu
 user u rating on item i approximates: rui = qi T pu 78 ^
  77. 77. SVD for 
 Collaborative Filtering • compute u,i mappings: qi,pu ∈ ℝ f • factor user, item matrix • imputation (Sarwar et.al. 2000) • model only observed ratings + regularization (Funk 2006; Koren 2008) • learn factor vectors qi and pu by minimizing (regularized) squared error on set of known ratings: approximate user u rating of item i, denoted by rui, leading to Learning Algorithm:
 
 79 ^
  78. 78. SVD Visualized regression line reducing two dimensional space into one dimensional one
  79. 79. reducing three dimensional (multidimensional) space into two dimensional plane SVD Visualized
80. 80. SVD: Code Away! <Coding Time>
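A minimal sketch of the decomposition on a toy rating matrix (the matrix values, the rank k = 2, and the use of numpy are assumptions for illustration, not the workshop's actual code): keeping only the strongest singular values yields the best low-rank approximation of the original data.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items) -- invented data.
R = np.array([[5.0, 4.0, 3.0, 1.0],
              [1.0, 2.0, 5.0, 4.0],
              [1.0, 1.0, 4.0, 5.0],
              [4.0, 5.0, 1.0, 2.0]])

# Full SVD: R = U * diag(s) * Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep only the k strongest "concepts" (largest singular values).
k = 2
R_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.round(R_approx, 1))  # rank-2 approximation of R
```

Truncating to k dimensions is exactly the "fewer dimensions" step the previous slides describe: the discarded singular values carry the least variation.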
81. 81. Stochastic Gradient Descent • optimizable by Stochastic Gradient Descent (SGD) (Funk 2006) • incremental learning • loops through the ratings and computes the prediction error for each predicted rating r̂ui:
eui = rui − qi^T pu
• modify the parameters by a magnitude proportional to the learning rate γ in the opposite direction of the gradient, giving the learning rules:
qi ← qi + γ · (eui · pu − λ · qi) and pu ← pu + γ · (eui · qi − λ · pu)
82. 82. Gradient Descent <Coding Time>
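A sketch of the SGD update rules from the previous slide on invented rating triples (the data, learning rate γ, and regularization λ are assumptions; the rules themselves follow Funk 2006 / Koren 2008):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed ratings as (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
           (1, 0, 4.0), (1, 1, 5.0), (1, 2, 2.0),
           (2, 0, 1.0), (2, 1, 2.0), (2, 2, 5.0),
           (3, 0, 1.0), (3, 2, 4.0)]

n_users, n_items, f = 4, 3, 2      # f = number of latent factors
gamma, lam = 0.02, 0.02            # learning rate gamma, regularization lambda

P = rng.normal(0, 0.1, (n_users, f))   # user factors p_u
Q = rng.normal(0, 0.1, (n_items, f))   # item factors q_i

for epoch in range(500):
    for u, i, r in ratings:
        e = r - Q[i] @ P[u]                      # prediction error e_ui
        qi = Q[i].copy()                         # keep old q_i for the p_u step
        Q[i] += gamma * (e * P[u] - lam * Q[i])  # q_i <- q_i + g(e*p_u - l*q_i)
        P[u] += gamma * (e * qi - lam * P[u])    # p_u <- p_u + g(e*q_i - l*p_u)

rmse = np.sqrt(np.mean([(r - Q[i] @ P[u]) ** 2 for u, i, r in ratings]))
print(round(rmse, 3))
```

Note the loop only touches observed ratings, which is what lets this handle a sparse matrix without imputation.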
83. 83. Alternating Least Squares • optimizable by Alternating Least Squares (ALS) (2006) • with both qi and pu unknown, the minimization function is not convex and cannot be solved directly for a minimum • ALS rotates between fixing the qi's and the pu's • fixing qi or pu makes the optimization problem quadratic, so the one not fixed can now be solved optimally • each qi and pu is computed independently of the other item/user factors: parallelization • best for implicit data (dense matrix)
84. 84. Alternating Least Squares • rotates between fixing the qi's and the pu's • when all pu's are fixed, recompute the qi's by solving a least squares problem (and vice versa) • fixing the user factor matrix P, each qi has the closed-form solution qi = (PᵀP + λI)⁻¹ Pᵀ r·i • fixing the item factor matrix Q similarly gives pu = (QᵀQ + λI)⁻¹ Qᵀ ru·
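The alternation above can be sketched in a few lines (the toy dense matrix, rank f, and λ are assumptions; the closed-form normal-equation solves are the standard ALS steps):

```python
import numpy as np

# Dense (implicit-style) user-item matrix, users x items -- invented data.
R = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 2.0, 5.0],
              [1.0, 1.0, 4.0]])

f, lam = 2, 0.1
rng = np.random.default_rng(1)
P = rng.normal(0, 0.1, (R.shape[0], f))   # user factors
Q = rng.normal(0, 0.1, (R.shape[1], f))   # item factors

for _ in range(20):
    # Fix Q: every p_u solves (Q^T Q + lam*I) p_u = Q^T r_u  (quadratic, closed form)
    P = np.linalg.solve(Q.T @ Q + lam * np.eye(f), Q.T @ R.T).T
    # Fix P: every q_i solves (P^T P + lam*I) q_i = P^T r_i
    Q = np.linalg.solve(P.T @ P + lam * np.eye(f), P.T @ R).T

err = np.linalg.norm(R - P @ Q.T) / np.linalg.norm(R)
print(round(err, 3))
```

Each half-step solves all users (or all items) in one batched call, and because every row's solve is independent, this is the part that parallelizes.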
85. 85. Develop Further… • Add Biases • Add Input Sources, e.g. implicit feedback: pu in r̂ui becomes (pu + |N(u)|^(-1/2) Σ i∈N(u) yi + (…)) • Add a Temporal Aspect / time-varying parameters • Vary the Confidence Levels of the Inputs (pic: Lei Guo 2012; Salakhutdinov & Mnih 2008; Koren 2010)
86. 86. Develop Further… • Final algorithm:
min over p*, q*, b* of Σ(u,i)∈K cui (rui − μ − bu − bi − qi^T pu)² + λ(‖pu‖² + ‖qi‖² + bu² + bi²)
with confidence cui, bias terms bu and bi, and regularization λ (Paterek, A. 2007)
87. 87. Develop Further… • Final algorithm with temporal dimensions:
r̂ui(t) = μ + bu(t) + bi(t) + qi^T pu(t)
88. 88. • So what if we don't have any known content factors? • Probabilistic Matrix Factorization to the rescue! • describe each user and each movie by a small set of attributes
89. 89. Probabilistic Matrix Factorization • Imagine we have the following rating data:

Title         Roelof  Klas  Mo  Max
The Avengers       5     1   1    4
Spiderman II       4     2   1    5
American Pie       3     5   4    1
Shrek              1     4   5    2

We could say that Roelof and Klas like action movies but don't like comedies, while it's the opposite for Mo and Max.
90. 90. Probabilistic Matrix Factorization • This could be represented in the PMF model by using two-dimensional latent vectors to describe each user and each movie. • example latent vectors: AV: [0, 0.3], SPII: [1, 0.3], AP: [1, 0.3], SH: [1, 0.3]; Roelof: [0, 3], Klas: [8, 3], Mo: [10, 3], Max: [10, 3] • predict a rating by the dot product of the user vector with the item vector • so predicting Klas' rating for Spiderman II = 8·1 + 3·0.3 = 8.9 • but the descriptions of users and movies are not known ahead of time • PMF discovers such latent characteristics
  91. 91. <CODE TIME> ratings Probabilistic Matrix Factorization
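A sketch of discovering the latent vectors from the rating table on the earlier slide. This is the MAP view of PMF: Gaussian likelihood plus Gaussian priors on the factors reduce to regularized squared error, minimized here by full-batch gradient descent (the rank, learning rate, and regularization are assumptions, not the workshop code):

```python
import numpy as np

# The rating table from the slide (rows: movies, cols: Roelof, Klas, Mo, Max).
R = np.array([[5, 1, 1, 4],
              [4, 2, 1, 5],
              [3, 5, 4, 1],
              [1, 4, 5, 2]], dtype=float)

rng = np.random.default_rng(42)
f, lam, gamma = 2, 0.01, 0.02
M = rng.normal(0, 0.1, (4, f))   # movie latent vectors
U = rng.normal(0, 0.1, (4, f))   # user latent vectors

# MAP estimate of the PMF model: gradient descent on the regularized
# squared error over all observed entries (the table here is complete).
for _ in range(2000):
    E = R - M @ U.T                 # error matrix
    M += gamma * (E @ U - lam * M)  # gradient step on movie factors
    U += gamma * (E.T @ M - lam * U)  # gradient step on user factors

print(np.round(M @ U.T, 1))  # recovered ratings, approximately the table
```

The learned two-dimensional vectors end up encoding the action-vs-comedy split the slide reads off informally.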
  92. 92. Outline •Recommender Systems •Algorithms* •Content based Algorithms * •Collaborative Algorithms * •Classification •Rating/Ranking * •Graphs (* math magicians better pay attention here)
  93. 93. Classification • k-Nearest Neighbors (KNN) • Decision Trees • Rule-Based • Bayesian • Artificial Neural Networks • Support Vector Machines
  94. 94. Classification • k-Nearest Neighbors (KNN)! • Decision Trees • Rule-Based • Bayesian • Artificial Neural Networks • Support Vector Machines
95. 95. k-Nearest Neighbors • non-parametric, lazy learning algorithm • data as a feature space • simple and fast • k-NN classification • k-NN regression: density estimation
96. 96. kNN: Classification • Classify: several Xi are used to classify Y • compare (X1p, X2p) and (X1q, X2q) by squared Euclidean distance: d²pq = (X1p − X1q)² + (X2p − X2q)² • find the k nearest neighbors
97. 97. kNN: Classification • input: content-extracted emotional values of 561 movies (thanks: Johannes Östling :) • e.g. the emotional dimensions of the movie “Hamlet”
  98. 98. KNN <CODE>
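A self-contained kNN classifier over a two-dimensional feature space like the emotional dimensions above (the movie points and genre labels are invented for illustration; distance is the squared Euclidean form from the previous slide):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of ((x1, x2), label) pairs; classify query by majority
    vote among the k nearest points under squared Euclidean distance."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - query[0]) ** 2 + (p[0][1] - query[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (anger, love) emotional values -> genre label.
movies = [((0.9, 0.1), "action"), ((0.8, 0.2), "action"),
          ((0.7, 0.1), "action"), ((0.1, 0.9), "romance"),
          ((0.2, 0.8), "romance"), ((0.1, 0.7), "romance")]

print(knn_classify(movies, (0.85, 0.15)))  # -> action
```

Being lazy, the algorithm does no training at all: every query re-scans the stored feature space.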
  99. 99. k-Nearest Neighbors emotional dimension “Anger” vs “Love”
100. 100. k-Nearest Neighbors Negative: afraid, confused, helpless, hurt, sad, angry, depressed. Positive: good, interested, love, positive, strong. Aggregate of positive and negative emotions
  101. 101. Outline •Recommender Systems •Algorithms* •Content based Algorithms * •Collaborative Algorithms * •Classification •Rating/Ranking * •Graphs (* math magicians better pay attention here)
  102. 102. Rating predictions: • Pos — Neg • Average • Bayesian (Weighted) Estimates • Lower bound of Wilson score confidence interval for a Bernoulli parameter
  103. 103. Rating predictions: • Pos — Neg! • Average • Bayesian (Weighted) Estimates • Lower bound of Wilson score confidence interval for a Bernoulli parameter
104. 104. P — N • (Positive ratings) − (Negative ratings) • Problematic: (http://www.urbandictionary.com/define.php?term=movies)
  105. 105. Rating predictions: • Pos — Neg • Average! • Bayesian (Weighted) Estimates • Lower bound of Wilson score confidence interval for a Bernoulli parameter
106. 106. Average • (Positive ratings) / (Total ratings) • Problematic: (http://www.amazon.co.uk/gp/bestsellers/electronics/)
  107. 107. Rating predictions: • Pos — Neg • Average • Bayesian (Weighted) Estimates! • Lower bound of Wilson score confidence interval for a Bernoulli parameter
108. 108. Ratings • Top Ranking at IMDB (gives a Bayesian estimate): • Weighted Rating (WR) = (v / (v+m)) × R + (m / (v+m)) × C • where: R = mean rating for the movie, v = number of votes for the movie, m = minimum votes required to be listed in the Top 250 (currently 25000), C = the mean vote across the whole report (currently 7.0)
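The IMDB formula above in runnable form (the example vote counts are invented; the formula, m, and C come from the slide):

```python
def weighted_rating(R, v, m=25000, C=7.0):
    """IMDB-style Bayesian estimate: blend the item's own mean R (over v
    votes) with the global mean C, weighted by the vote threshold m."""
    return (v / (v + m)) * R + (m / (v + m)) * C

# A film rated 9.0 by few voters is pulled strongly toward the global mean;
# with many voters its own average dominates.
print(round(weighted_rating(9.0, 500), 2))     # -> 7.04
print(round(weighted_rating(9.0, 500000), 2))  # -> 8.9
```

This is why a 9.0-rated film with 500 votes ranks far below one with half a million votes.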
109. 109. Bayesian (Weighted) Estimates • weighted average on a per-item basis (source: http://www.imdb.com/title/tt0368891/ratings)
110. 110. Bayesian (Weighted) Estimates @ IMDB • chart: Bayesian weights for m = 1250 (specific vs. global weight as the vote count grows from 0 to 5000) • the specific part varies per individual item • the global part is constant over all items • and can be precalculated
  111. 111. m=1250
  112. 112. Rating predictions: • Pos — Neg • Average • Bayesian (Weighted) Estimates • Lower bound of Wilson score confidence interval for a Bernoulli parameter
113. 113. Wilson Score interval • 1927, by Edwin B. Wilson • Given the ratings I have, there is a 95% chance that the "real" fraction of positive ratings is at least what?
  114. 114. Wilson Score interval • used by Reddit for comments ranking • “rank the best comments highest 
 regardless of their submission time” • algorithm introduced to Reddit by 
 Randall Munroe (the author of xkcd). • treats the vote count as a statistical sampling of a hypothetical full vote by everyone, much as in an opinion poll.
115. 115. Wilson Score interval • Endpoints of the Wilson score interval:
(p̂ + z²/2n ± z√[(p̂(1−p̂) + z²/4n) / n]) / (1 + z²/n)
• Reddit's comment ranking function (the lower endpoint):
(phat + z*z/(2*n) - z*sqrt((phat*(1-phat) + z*z/(4*n))/n)) / (1 + z*z/n)
  116. 116. CODE
  117. 117. CODE
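The ranking function from the previous slide as a complete Python function (the 60/100 vs. 600/1000 example is invented to show the behaviour; the formula is the slide's):

```python
from math import sqrt

def wilson_lower_bound(pos, n, z=1.96):
    """Lower bound of the Wilson score confidence interval for a Bernoulli
    parameter, at ~95% confidence (z = 1.96): Reddit's comment ranking score."""
    if n == 0:
        return 0.0
    phat = pos / n
    return ((phat + z * z / (2 * n)
             - z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n))
            / (1 + z * z / n))

# Same 60% positive ratio, but more evidence ranks higher:
print(wilson_lower_bound(60, 100) < wilson_lower_bound(600, 1000))  # -> True
```

Unlike the raw average, the score rewards sample size: it answers "how positive can we be sure this item really is", which is exactly the 95%-chance question on the earlier slide.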
118. 118. Bernoulli anyone? *as the number of trials N = 2 (2 throws of the dice), it's actually not a real Bernoulli distribution (a single trial) but a binomial one
  119. 119. What’s next? GRAPHS
  120. 120. Outline •Recommender Systems •Algorithms* •Graphs (* math magicians better pay attention here)
121. 121. Graph Based Approaches • What's a Graph?! • Why Graphs? • Who uses Graphs? • Talking with Graphs • Graph example: Recommendations • Graph example: Data Analysis
122. 122. What's a Graph? • Vertices (Nodes): Movie, Genre, Actor, Director, User, Comment, … • Edges (Relations): has_genre, features_actor, directed_by, likes, watches, rates, likes_user, friends, follows, comments_movie, likes_comment, likes_actor, has_X, etcetera • plus: locations! time! moods! keywords! …
123. 123. Graph Based Approaches • What's a Graph? • Why Graphs?! • Who uses Graphs? • Talking with Graphs • Graph example: Recommendations • Graph example: Data Analysis
124. 124. Why Graphs? It's the nature of today's data, which is getting: • more complex (social networks…) • more connected (wikis, pingbacks, rdf, collaborative tagging) • more semi-structured (wikis, rss) • more decentralized: democratization of content production (blogs, twitter*, social media*) • and just: MORE
  125. 125. Data Trend “Every 2 days we create as much information as we did up to 2003”
 — Eric Schmidt, Google Why Graphs?
126. 126. Graphs vs Relational (pic by Michael Hunger, neo4j) Why Graphs? It's Fast! Matrix-based calculations: exponential run-time (items × users × factors × …)
127. 127. Graphs vs Relational (pic by Michael Hunger, neo4j) Why Graphs? It's Fast! Graph-based calculations: linear/constant run-time (item of interest × relations)
128. 128. Why Graphs? It's Whiteboard-Friendly! (pic by Michael Hunger, neo4j)
129. 129. Why Graphs? It's Whiteboard-Friendly! (pic by Michael Hunger, neo4j)
130. 130. Why Graphs? It's Whiteboard-Friendly! (pic by Michael Hunger, neo4j)
131. 131. Graph Based Approaches • What's a Graph? • Why Graphs? • Who uses Graphs?! • Talking with Graphs • Graph example: Recommendations • Graph example: Data Analysis
132. 132. Who uses Graphs? • Facebook: Open Graph (https://developers.facebook.com/docs/opengraph) • Google: Knowledge Graph (http://www.google.com/insidesearch/features/search/knowledge.html) • Twitter: FlockDB (https://github.com/twitter/flockdb) • Mozilla: Pancake (https://wiki.mozilla.org/Pancake) • (…)
133. 133. (pic by Michael Hunger, neo4j)
134. 134. Graph Based Approaches • What's a Graph? • Why Graphs? • Who uses Graphs? • Talking with Graphs! • Graph example: Recommendations • Graph example: Data Analysis
  135. 135. Talking with Graphs • Graphs can be queried! • no unions for comparison, but traversals! • many different graph traversal patterns (xkcd)
136. 136. graph traversal patterns • traversals can be seen as a diffusion process over a graph! • “Energy” moves over a graph and spreads out through the network! • energy: (Ghahramani 2012)
  137. 137. Energy Diffusion (pic by Marko A. Rodriguez, 2011)
  138. 138. Energy Diffusion (pic by Marko A. Rodriguez, 2011) energy = 4
  139. 139. Energy Diffusion (pic by Marko A. Rodriguez, 2011) energy = 3
  140. 140. Energy Diffusion (pic by Marko A. Rodriguez, 2011) energy = 2
  141. 141. Energy Diffusion (pic by Marko A. Rodriguez, 2011) energy = 1
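The decreasing-energy pictures above can be sketched as a breadth-first spread: a vertex reached at depth d receives energy − d, and diffusion stops when the energy runs out. The graph shape and the decay-by-one rule are assumptions mirroring the slides, not Rodriguez's actual model:

```python
# Toy directed graph: vertex -> out-neighbors (invented structure).
graph = {"me": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": [], "d": []}

def diffuse(graph, start, energy):
    """Spread energy outward from start; each hop costs one unit, and
    vertices are only charged the first time they are reached."""
    received = {}
    frontier, level, seen = [start], 0, {start}
    while frontier and energy - level > 0:
        for v in frontier:
            received[v] = energy - level
        nxt = []
        for v in frontier:          # expand one hop, skipping visited vertices
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    nxt.append(w)
        frontier, level = nxt, level + 1
    return received

print(diffuse(graph, "me", 4))  # -> {'me': 4, 'a': 3, 'b': 3, 'c': 2, 'd': 2}
```

With energy 4, the spread reaches two hops out, matching the 4 → 3 → 2 → 1 sequence in the slides.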
142. 142. Graph Based Approaches • What's a Graph? • Why Graphs? • Who uses Graphs? • Talking with Graphs • Graph example: Recommendations! • Graph example: Data Analysis
143. 143. Diffusion Example: Recommendations • Energy diffusion is an easy algorithm for making recommendations! • different paths make different recommendations! • different paths for different problems can be solved on the same graph/domain! • recommendation = “jumps” through the data
  144. 144. Friend Recommendation • Who are my friends’ friends that are not me or my friends (pic by Marko A. Rodriguez, 2011)
145. 145. Friend Recommendation • Who are my friends' friends: G.V('me').outE[knows].inV.outE.inV • Who are my friends' friends that are not me or my friends: G.V('me').outE[knows].inV.aggregate(x).outE.inV{!x.contains(it)}
146. 146. Product Recommendation • Who likes what I like → of these things, what do they like which I don't already like (pic by Marko A. Rodriguez, 2011)
147. 147. Product Recommendation • Who likes what I like: G.V('me').outE[likes].inV.inE[likes].outV • Who likes what I like → what do they like: G.V('me').outE[likes].inV.inE[likes].outV.outE[likes].inV • Who likes what I like → of these things, what do they like which I don't already like: G.V('me').outE[likes].inV.aggregate(x).inE[likes].outV.outE[likes].inV{!x.contains(it)}
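The same "likes → co-likers → their likes, minus mine" traversal can be mimicked over a plain adjacency dict, without a graph database (the users and items here are invented, and this is a sketch of the pattern, not the Gremlin engine's semantics):

```python
# 'likes' edges as an adjacency map: user -> set of liked items (invented data).
likes = {"me":   {"matrix", "alien"},
         "klas": {"matrix", "alien", "blade runner"},
         "mo":   {"matrix", "shrek"}}

def recommend(likes, me):
    """Traverse likes -> co-likers -> their likes, excluding what I already like."""
    mine = likes[me]
    colikers = [u for u in likes if u != me and likes[u] & mine]
    return {item for u in colikers for item in likes[u]} - mine

print(recommend(likes, "me"))  # a set; order may vary
```

Here "klas" and "mo" both share a like with "me", so their remaining likes ("blade runner", "shrek") come back as recommendations, exactly the aggregate-and-exclude step in the Gremlin query.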
  148. 148. Recommendations at
 with FoorSee
149. 149. Graph Based Approaches • What's a Graph? • Why Graphs? • Who uses Graphs? • Talking with Graphs • Graph example: Recommendations • Graph example: Data Analysis
150. 150. Pulp Fiction
  151. 151. Graphs: Conclusion • Fast! • Scalable! • Diversification! • No Cold Start! • Sparsity/Density not applicable
152. 152. Graphs: Conclusion • Natural / Visualizable! • Feedback / Understandable! • Connectable to the “web” / semantic web! • Social Network Analysis! • Real Time Updates / Recommendations!
153. 153. WARNING: Graphs are Addictive!
  154. 154. Les Miserables
  155. 155. Facebook Network
156. 156. References • D. Jannach, G. Friedrich and M. Zanker (2011) “Recommender Systems”, International Joint Conference on Artificial Intelligence, Barcelona • Z. Ghahramani (2012) “Graph-based Semi-supervised Learning”, MLSS, La Palma • D. Goldberg, D. Nichols, B.M. Oki and D. Terry (1992) “Using collaborative filtering to weave an information tapestry”, Communications of the ACM 35 (12) • M. Hunger (2013) “Data Modeling with Neo4j”, http://www.slideshare.net/neo4j/data-modeling-with-neo4j-25767444 • S. Funk (2006) “Netflix Update: Try This at Home”, sifter.org/~simon/journal/20061211.html
157. 157. References • Y. Koren (2008) “Factorization meets the Neighborhood: A Multifaceted Collaborative Filtering Model”, SIGKDD, http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf • R. Bell & Y. Koren (2007) “Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights” • Y. Koren (2010) “Collaborative filtering with temporal dynamics” • A. Ng (2013) Machine Learning, ml-004 @ Coursera • A. Paterek (2007) “Improving Regularized Singular Value Decomposition for Collaborative Filtering”, KDD
158. 158. References • P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl (1994) “GroupLens: An Open Architecture for Collaborative Filtering of Netnews”, Proceedings of ACM • B.R. Sarwar et al. (2000) “Application of Dimensionality Reduction in Recommender System—A Case Study”, WebKDD • B. Sarwar, G. Karypis, J. Konstan, J. Riedl (2001) “Item-Based Collaborative Filtering Recommendation Algorithms” • R. Salakhutdinov & A. Mnih (2008) “Probabilistic Matrix Factorization” • xkcd.com
  159. 159. Take Away Points • Focus on the best Question, not just the Answer…! • Best Match (most similar) vs Most Popular! • Personalized vs Global Factors! • Less is More ?! • What is relevant?
160. 160. Thanks for listening! (xkcd)
161. 161. Say What? • So what other stuff do we do at Vionlabs? • Some examples of the data extraction that is fed into our BAG (Big Ass Graph)…
  162. 162. Computer Vision
163. 163. NLTK