Tag & Tag-based Recommenders
IBM Research – China
Presenter: Xiatian Zhang (张夏天)
Shiwan Zhao (赵石顽), Xiatian Zhang (张夏天), Quan Yuan (袁泉)
2000-2004, B.S. Math, Central South University
2004-2007, M.S. Computer Science, BUPT
2007-Present, Researcher, Working on Recommender Systems and
Social Tagging System and Its Features
A folksonomy is a system of classification derived from the practice
and method of collaboratively creating and managing tags to annotate
and categorize content; this practice is also known as collaborative
tagging, social classification, social indexing, and social tagging.
Folksonomy is a portmanteau of folk and taxonomy.
Social tagging has boomed since 2004, with the wave of Web 2.0.
– Dogear – An internal social bookmarking system at IBM
Some Insights of Tagging System
Shilad Sen et al., tagging, communities, vocabulary, evolution,
– Modeling vocabulary evolution
– Tagging system features
– Based on Movielens recommender system
– Personal tendency and community influence
– Tag displaying strategies and their effects
– Tag utility
How strongly do investment and habit affect personal tagging
– 1. Habit and investment influence users' tag applications.
– 2. The influence of habit and investment grows stronger as users apply more tags.
– 3. Habit and investment cannot be the only factors that contribute to vocabulary evolution.
How does the tagging
– 1. Community influence affects a user's personal
– 2. Community influence on a user's first tag is stronger for users who have seen more tags.
– Encourage users to tag more frequently, apply more tags to an
individual resource, reuse common tags
– Lead users to apply tags they had not previously considered
– Eliminate redundant tags
– Promote a core tag vocabulary, steering users toward adopting certain tags without imposing any strict rules
– Avoid ambiguous tags in favor of tags that offer greater information
Tag Recommender – Technologies
– Most Popular Tags on Resources
– Most Popular Tags on Users
– Most Popular Tags on Resources and Users
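The three "most popular" baselines above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: the function name `most_popular_tags`, the toy `assignments` data, and the choice to combine the user and resource counts by simple addition in the mixed case are all assumptions made here.

```python
from collections import Counter

def most_popular_tags(assignments, user=None, resource=None, k=3):
    """Rank tags by frequency among (user, resource, tag) triples.

    A triple counts once if it matches the given user and once more if it
    matches the given resource (additive mixing is an assumption).
    """
    counts = Counter()
    for u, r, t in assignments:
        if user is not None and u == user:
            counts[t] += 1
        if resource is not None and r == resource:
            counts[t] += 1
    return [tag for tag, _ in counts.most_common(k)]

# Hypothetical toy data: (user, resource, tag)
assignments = [
    ("alice", "page1", "python"),
    ("alice", "page2", "python"),
    ("alice", "page2", "web"),
    ("bob", "page1", "tutorial"),
    ("bob", "page1", "python"),
    ("carol", "page1", "tutorial"),
    ("dave", "page1", "tutorial"),
]

by_resource = most_popular_tags(assignments, resource="page1", k=1)
by_user = most_popular_tags(assignments, user="alice", k=2)
mixed = most_popular_tags(assignments, user="alice", resource="page1", k=2)
```

Passing only `resource` gives the most popular tags on a resource, passing only `user` gives the most popular tags of a user, and passing both gives one possible mix of the two.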
Classical Collaborative Filtering
Adapted KNN Methods
– Extend User-Item Matrix
– Degrade User-Item-Tag Relationship
– Tensor Factorization
$$PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \qquad (1)$$
$$PR(p_i) = (1-d)\, p[i] + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \qquad (2)$$
1. Compute global PageRank by (1)
2. Then for each <user, item> pair, compute personalized PageRank by (2)
– p[i] = 1 by default, but p[u] = 1 + |U| and p[r] = 1 + |R| for the given user u and resource r.
3. FolkRank = Personalized PageRank - PageRank
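The three steps above can be sketched as follows. This is a simplified illustration under several assumptions that are not in the slides: the graph is an undirected co-occurrence graph built from (user, resource, tag) triples, edges are weighted by co-occurrence count (with all weights equal to 1 the update reduces to the unweighted equations), the preference vector is normalized to sum to 1 so that the two rankings are on the same scale, and the names `build_graph`, `weighted_pagerank`, `folkrank` and the default `d=0.7` are chosen here for illustration.

```python
from collections import defaultdict

def build_graph(triples):
    """Undirected co-occurrence graph over user, resource and tag nodes."""
    weights = defaultdict(lambda: defaultdict(float))
    for user, resource, tag in triples:
        nodes = [("user", user), ("resource", resource), ("tag", tag)]
        for a in nodes:
            for b in nodes:
                if a != b:
                    weights[a][b] += 1.0
    return weights

def weighted_pagerank(weights, preference, d=0.7, iterations=100):
    """Iterate rank = (1 - d) * pref + d * (weighted share from neighbors)."""
    total = sum(preference.values())
    pref = {n: preference[n] / total for n in weights}  # normalization: assumption
    out = {n: sum(nbrs.values()) for n, nbrs in weights.items()}
    rank = dict(pref)
    for _ in range(iterations):
        rank = {
            n: (1 - d) * pref[n]
               + d * sum(rank[m] * w / out[m] for m, w in weights[n].items())
            for n in weights
        }
    return rank

def folkrank(triples, user, resource, d=0.7):
    """Rank tags by personalized PageRank minus global PageRank."""
    weights = build_graph(triples)
    n_users = sum(1 for n in weights if n[0] == "user")
    n_resources = sum(1 for n in weights if n[0] == "resource")
    uniform = {n: 1.0 for n in weights}
    boosted = dict(uniform)
    boosted[("user", user)] = 1.0 + n_users            # p[u] = 1 + |U|
    boosted[("resource", resource)] = 1.0 + n_resources  # p[r] = 1 + |R|
    base = weighted_pagerank(weights, uniform, d)
    personal = weighted_pagerank(weights, boosted, d)
    scores = {n[1]: personal[n] - base[n] for n in weights if n[0] == "tag"}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: two unrelated communities.
triples = [
    ("alice", "p1", "python"), ("alice", "p2", "python"),
    ("bob", "p3", "cooking"), ("bob", "p4", "cooking"),
]
ranking = folkrank(triples, user="alice", resource="p1")
```

On this toy data the boosted preference stays inside alice's component of the graph, so `python` ranks ahead of `cooking` for the pair (alice, p1).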
Explored and Exploring Methods
– Non-classical Tensor Fusion Factorization
– Multi-label Classification by Random Decision Trees, High Speed
– The performance of both methods is close to that of FolkRank
– Shiwan developed a simple graph model
– Best precision and recall on several datasets compared to other
– We are writing a paper targeting ACM RecSys 2010
– IUI 2008 Paper, Improved Recommendation based on Collaborative
– Explored Methods
– Tensor Factorization
– Non-classical Tensor and Matrix Fusion Factorization
– Shilad Sen, Jesse Vig, and John Riedl, Tagommenders: Connecting
Users to Items through Tags, WWW 2009
IUI 2008 Paper Overview
We propose a new collaborative filtering approach, TBCF (Tag-based Collaborative
Filtering), which uses the semantic distance among tags assigned by different users
to improve the effectiveness of neighbor selection.
That is, two users can be considered similar not only if they rate items
similarly, but also if they perceive those items in similar ways.
– Both Bob and Tom may rate the movie Avatar with 5 stars, which indicates that
both like this movie very much.
– Nevertheless, as a 3D fan, Bob appreciates this movie for its high-quality 3D
animation, while Tom may think that it is a wonderful action movie.
Tag-based Collaborative Filtering
Tag-based User-Item Matrix
User | Item1 | Item2 | Item3 | Item4
Alice | Art, photo | Home, Products | Writing, Design | Learning,
Daniel | Photo, Album, | Ø | Typewriter | Tutorial, Training
Sherry | Ø | Cleaning | Ø | Language, Study
Maggie | Photography | Ø | Ovens | Ø
1. Calculate the semantic similarity of tags based on WordNet (for tags not included in WordNet, use the edit distance instead)
2. Calculate the similarity between tag sets
3. Calculate the similarity between users u and v by summing the tag-set similarities on their common pages (pages tagged by both u and v)
4. Find the top-N nearest neighbors of the active user to make the prediction
5. Return the top-M predicted items to the active user
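Steps 3 to 5 above can be sketched as follows. This is only an illustration under stated assumptions: exact-match Jaccard overlap stands in for the WordNet-based tag-set similarity of steps 1 and 2, the scoring rule (sum of neighbor similarities over items the active user has not tagged) is a guess at a reasonable prediction step that the slides do not spell out, and the names `jaccard`, `user_similarity`, `recommend` and the lowercase toy data loosely based on the matrix shown earlier are introduced here.

```python
def jaccard(a, b):
    """Stand-in tag-set similarity (assumption): exact-match overlap."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def user_similarity(matrix, u, v, set_sim=jaccard):
    """Step 3: sum tag-set similarity over items tagged by both users."""
    common = matrix[u].keys() & matrix[v].keys()
    return sum(set_sim(matrix[u][i], matrix[v][i]) for i in common)

def recommend(matrix, active, n_neighbors=2, m_items=3, set_sim=jaccard):
    """Steps 4 and 5: pick top-N neighbors, return top-M unseen items."""
    sims = {v: user_similarity(matrix, active, v, set_sim)
            for v in matrix if v != active}
    neighbors = sorted((v for v in sims if sims[v] > 0),
                       key=sims.get, reverse=True)[:n_neighbors]
    scores = {}
    for v in neighbors:
        for item in matrix[v]:
            if item not in matrix[active]:
                # Scoring rule is an assumption: add the neighbor's similarity.
                scores[item] = scores.get(item, 0.0) + sims[v]
    return sorted(scores, key=scores.get, reverse=True)[:m_items]

# Toy tag-based user-item matrix (tags lowercased, untagged items omitted).
matrix = {
    "Alice": {"Item1": {"art", "photo"}, "Item2": {"home", "products"},
              "Item3": {"writing", "design"}, "Item4": {"learning"}},
    "Daniel": {"Item1": {"photo", "album"}, "Item3": {"typewriter"},
               "Item4": {"tutorial", "training"}},
    "Sherry": {"Item2": {"cleaning"}, "Item4": {"language", "study"}},
    "Maggie": {"Item1": {"photography"}, "Item3": {"ovens"}},
}

for_daniel = recommend(matrix, "Daniel")
for_maggie = recommend(matrix, "Maggie")
```

With the exact-match stand-in, Daniel's only neighbor is Alice (they share the tag "photo" on Item1), so Item2 is suggested to him. Maggie gets no neighbors at all, because "photography" and "photo" do not match literally; a semantic tag similarity is meant to close exactly this kind of gap.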
Tag Similarity Calculation
Tag set similarity
– Hungarian method
WordNet Concept Tree
Word similarity in WordNet
If x and y are contained in WordNet, dis(x,y) is the shortest path length between x and y.
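A small sketch of the tag and tag-set similarity described above, with several assumptions that are not in the slides: a tiny hand-made concept tree (`PARENT`) stands in for WordNet, a distance is converted to a similarity as 1 / (1 + dis), the tag-set score is normalized by the size of the larger set, and a brute-force search over assignments replaces the Hungarian method (it finds the same optimal matching, but is only practical for small tag sets).

```python
from itertools import permutations

# Hypothetical toy concept tree (child -> parent), standing in for WordNet.
PARENT = {"photo": "image", "picture": "image", "image": "artifact",
          "album": "artifact", "artifact": "entity"}
KNOWN = set(PARENT) | set(PARENT.values())

def tree_distance(x, y):
    """Shortest path length between x and y in the toy tree, else None."""
    if x not in KNOWN or y not in KNOWN:
        return None
    def ancestors(word):
        path = [word]
        while path[-1] in PARENT:
            path.append(PARENT[path[-1]])
        return path
    depth_from_y = {node: i for i, node in enumerate(ancestors(y))}
    for i, node in enumerate(ancestors(x)):
        if node in depth_from_y:
            return i + depth_from_y[node]
    return None

def edit_distance(a, b):
    """Levenshtein distance, used when a tag is missing from the tree."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def tag_similarity(x, y):
    x, y = x.lower(), y.lower()
    dis = tree_distance(x, y)
    if dis is None:
        dis = edit_distance(x, y)
    return 1.0 / (1.0 + dis)  # distance-to-similarity mapping is an assumption

def tag_set_similarity(a, b):
    """Best one-to-one matching between two tag lists (brute force)."""
    if not a or not b:
        return 0.0
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    best = max(sum(tag_similarity(s, l) for s, l in zip(small, perm))
               for perm in permutations(large, len(small)))
    return best / len(large)

example = tag_set_similarity(["Art", "photo"], ["Photo", "Album"])
```

For larger tag sets the brute-force search would be replaced by a proper assignment solver such as the Hungarian method named above.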
Extracted a total of 8000 users, 5315 pages, and 7670 tags from web logs.
Algorithm | Average Precision | Average Ranking
TBCF | 0.27 | 2.8
cosine | 0.13 | 1.5
Randomly generated subset | Average Precision | Average Precision
500 | 0.208 | 0.121
2000 | 0.182 | 0.118
4000 | 0.202 | 0.173
6000 | 0.209 | 0.180
Tagommenders: Connecting Users to Items through Tags