Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Social Tagging Recommender Systems

Published in: Technology, Education

  1. Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Social Tagging Recommender Systems
     2013-05-13
     Speaker: Yan Kai Huang, NTU Internet Research Lab
  2. Outline
     • Introduction
     • Related Works
     • Cold Start:
       – Random Walk and Probability Assignment
     • Algorithms
     • Temporal Decay Issues
     • Credibility & Accordance
     • Experiment Design
     • Discussion and Conclusion
  3. Introduction to Recommender Systems
     • Recommendation systems (RS) help to match users with items
       – Ease information overload
       – Sales assistance (guidance, advisory, persuasion, …)
     • Collaborative Filtering
       – Considers users with similar rating patterns
       – Aggregates the ratings of similar users
     • Social networks emerged recently
       – Independent source of information
     • Motivations of trust-based RS
       – Social influence: users adopt the behavior of their friends
  4. Motivation
     • User-generated data is obtained from a predefined website
       – instead of a random graph generator
       – e.g. ER model, BA model, WS model, etc.
       – Such generators are unable to generate uni-partite graphs, not to mention bipartite ones.
     • “Knowledge discovery”
       – What are the characteristics of user-action data?
       – What can be put to use in pragmatic applications?
       – Data-proven reliability.
  5. Preliminaries
     • A recommender system assumes:
       – A set of users, U = {u_1, u_2, …, u_n}
       – A set of items, I = {i_1, i_2, …, i_m}
       – Each user u performs actions on a set of items: I_u = {i_u1, i_u2, …, i_uk}
       – The action of user u on item i is denoted by A_{u,i}
  6. Preliminaries: Trust Network
     • Additionally, there is a trust network among users in a trust-based system:
       t_{u,v} ∈ T_u: a real number in [0,1] denoting how much u trusts v.
     • The trust network can be represented as a directed graph G = <U, T>
     • T = { (u, v) | u ∈ U, v ∈ T_u }
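The preliminaries above can be held in two plain containers. The following is a minimal sketch, assuming actions arrive as `<user, item, timestamp>` triples and trust values as a dictionary keyed by `(u, v)`; the toy data and the helper names `items_by_user` and `trusted_by` are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical toy data, for illustration only.
ACTIONS = [("u1", "i1", 5), ("u1", "i2", 9), ("u2", "i1", 7)]  # <user, item, timestamp>
TRUST = {("u1", "u2"): 0.8, ("u2", "u1"): 0.3}                 # t_{u,v} in [0, 1]


def items_by_user(actions):
    """Build I_u: the set of items each user has acted on."""
    result = defaultdict(set)
    for user, item, _timestamp in actions:
        result[user].add(item)
    return dict(result)


def trusted_by(trust):
    """Build the directed edges of G = <U, T>: for each u, the users v that u trusts."""
    result = defaultdict(set)
    for (u, v), weight in trust.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError("trust values must lie in [0, 1]")
        result[u].add(v)
    return dict(result)
```

Keeping the timestamp in every action triple matters later in the deck, where both the influence learning and the time-decay variants depend on it.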
  7. Network Model of User Trust and Actions
     [Figure: two-layer network with a User Layer and an Item Layer]
  8. Recommendation: Collaborative Filtering for Rating Value
     • Common task of recommendation:
       – Given a user u ∈ U and an item i ∈ I
       – For an unknown action, predict the action value (rating stars in [0,5]) for user u on item i.
     • Is “value prediction” what the user wants?
       – Tractable to compare and optimize
       – NOT practical or user-friendly
       – Serendipity
  9. Problem Definition: Top-N Item Recommendation
     • Given a target user u
     • Recommend a set of items Î_u where |Î_u| ≤ N and Î_u ∩ I_u = Ø
       – Once produced, the rank within the set does NOT matter anymore.
     • Verify whether the testing item î_u is contained in the resulting item set.
  10. Outline
      • Introduction
      • Related Works
        – Item-based CF
        – Random Walk Recommendation
        – TrustWalker
        – Influence Probabilities
      • Cold Start Problem
      • Random Walk and Probability Assignment
      • Algorithm
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  11. Related Work – Item-based CF
      • By similarity between items or users
      • Simply predict by a weighted sum over similar items (e.g. 5*0.2 + 4*0.3 + 3*0.5 = 3.7)
      • Take the n highest-rated items as the top-N
      [1] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (WWW 01).
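The weighted-sum example on this slide can be reproduced in a few lines. This is a sketch under one assumption not stated on the slide: the sum is divided by the total similarity, which changes nothing in the slide's example because its weights already add up to 1. The function names are illustrative.

```python
def predict_rating(neighbors):
    """Predict a rating from (rating, similarity) pairs of similar items.

    Assumption: the weighted sum is normalized by the total similarity.
    """
    total_sim = sum(sim for _rating, sim in neighbors)
    if total_sim == 0:
        return None  # no similar items, so no prediction
    return sum(rating * sim for rating, sim in neighbors) / total_sim


def top_n(scores, n):
    """Return the n items with the highest predicted score."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _score in ranked[:n]]


# The slide's example: 5*0.2 + 4*0.3 + 3*0.5 = 3.7
example = predict_rating([(5, 0.2), (4, 0.3), (3, 0.5)])
```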
  12. Related Works – Random Walk Recommendation
      [3] Yildirim, Hilmi, and Mukkai S. Krishnamoorthy. "A random walk method for alleviating the sparsity problem in collaborative filtering." Proceedings of the 2008 ACM conference on Recommender systems. ACM, 2008.
  13. Random Walk Recommendation
      – Three components:
        1. The first component builds the item graph, which captures the similarity of items to each other.
        2. The second component computes the rank values of items for each user by simulating a random walk.
        3. The last component interprets and scales the rank scores as ratings for each user-item pair.
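The second component, simulating the walk, can be sketched as a truncated sum over walk lengths in which each extra step is discounted by a continuation probability. This is a simplified illustration, not the cited paper's exact formulation; `random_walk_scores` and its parameters are hypothetical names, and every neighbor is assumed to also appear as a key of `sim`.

```python
def random_walk_scores(sim, start, continue_prob=0.85, steps=20):
    """Accumulate discounted visit probabilities of a walk on the item graph.

    sim:   dict item -> dict neighbor -> non-negative similarity
           (assumption: every neighbor is itself a key of sim)
    start: dict item -> initial probability, e.g. uniform over the user's items
    """
    items = list(sim)
    rank = {item: 0.0 for item in items}
    state = {item: start.get(item, 0.0) for item in items}
    weight = 1.0
    for _ in range(steps):
        nxt = {item: 0.0 for item in items}
        for item, prob in state.items():
            out = sum(sim[item].values())
            if prob == 0.0 or out == 0.0:
                continue
            # Move to each neighbor in proportion to its similarity.
            for neighbor, s in sim[item].items():
                nxt[neighbor] += prob * s / out
        state = nxt
        weight *= continue_prob  # longer walks count less
        for item in items:
            rank[item] += weight * state[item]
    return rank


# Items a and c share no users directly, but both are similar to b.
CHAIN = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"b": 1.0}}
scores = random_walk_scores(CHAIN, {"a": 1.0}, steps=2)
```

In the chain example, item c receives a positive score from a walk starting at a even though their direct similarity is zero, which is the transitive effect the method relies on.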
  14. Related Works – TrustWalker
      • Combines user-based and item-based recommendation, repeating random walks until the variance of the results converges.
      • Starts from source user u_0; at step k, at node u:
        – If u has rated i, return r_{u,i}
        – With probability Φ_{u,i,k}, the random walk stops
          • Randomly select an item j rated by u and return r_{u,j}.
        – With probability 1 - Φ_{u,i,k}, continue the random walk to a direct neighbor of u.
      • Three ways to stop:
        1. Reaching a node u_k who has expressed an action on item i
        2. Deciding to stay at user u_k and selecting one of the items rated by u_k
        3. Reaching max-depth = 6 (by “six degrees of separation”)
      [5] Mohsen Jamali and Martin Ester. 2009. "TrustWalker: a random walk model for combining trust-based and item-based recommendation." In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD 09).
      [4] Mohsen Jamali and Martin Ester. "Using a trust network to improve top-N recommendation." Proceedings of the third ACM conference on Recommender systems. ACM, 2009.
  15. Related Works – Influence Probabilities
      • Toward the Influence Maximization problem
        – To find the influence between each user pair.
        – Assume influence probabilities do NOT remain constant independently of time → exponential decay
      • Dataset difference
        – Yahoo! Flickr dataset
        – “Joining a group” (?!) is considered an action
        – User “James” joined “Whistler Mountains” at timestamp 5.
      [6] Goyal, Amit, Francesco Bonchi, and Laks VS Lakshmanan. "Learning influence probabilities in social networks." Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010.
  16. Learning Influence Probabilities: the Models
      • Parameters to learn:
        – Number of actions performed by each user: A_u
        – Number of actions propagated via each edge: A_{v2u}
        – Mean life time: τ_{v,u}
      • Example action log (user, action, time):
          P  a1   5
          Q  a1  10
          R  a1  15
          Q  a2  12
          R  a2  14
          R  a3   6
          P  a3  14
      [Figure: worked example filling in A_u and the per-edge counts for users P, Q, R, followed by the resulting influence model graph; the individual values are not recoverable from the extraction]
      [6] Goyal, Amit, Francesco Bonchi, and Laks VS Lakshmanan. "Learning influence probabilities in social networks." Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010.
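The counting behind A_u and A_{v2u} can be sketched on the action log shown on this slide. The social edges below are an assumption, since the slide's graph did not survive extraction, and the estimate p_{v,u} = A_{v2u} / A_v is the simple static (Bernoulli-style) estimate; the time-aware models in the cited paper additionally use the mean life time, which this sketch omits.

```python
from collections import defaultdict

# Action log from the slide: (user, action, timestamp).
LOG = [("P", "a1", 5), ("Q", "a1", 10), ("R", "a1", 15),
       ("Q", "a2", 12), ("R", "a2", 14), ("R", "a3", 6), ("P", "a3", 14)]
# Hypothetical edges (v, u) meaning "v can influence u"; not taken from the slide.
EDGES = {("P", "Q"), ("P", "R"), ("Q", "R"), ("R", "P")}


def learn_influence(log, edges):
    """Return (A_u, A_v2u, p) where p[(v, u)] = A_v2u / A_v."""
    a_u = defaultdict(int)
    by_action = defaultdict(list)
    for user, action, ts in log:
        a_u[user] += 1
        by_action[action].append((ts, user))

    a_v2u = defaultdict(int)
    for performers in by_action.values():
        performers.sort()  # earliest performer first
        for idx, (t_v, v) in enumerate(performers):
            for t_u, u in performers[idx + 1:]:
                # The action propagates from v to u if v acted strictly earlier
                # and the edge (v, u) exists.
                if t_v < t_u and (v, u) in edges:
                    a_v2u[(v, u)] += 1

    prob = {}
    for v, u in edges:
        if a_u.get(v, 0) > 0:
            prob[(v, u)] = a_v2u.get((v, u), 0) / a_u[v]
    return dict(a_u), dict(a_v2u), prob


A_U, A_V2U, PROB = learn_influence(LOG, EDGES)
```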
  17. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithm
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  18. Cold Start Problem
      • Similarity matrices are usually too sparse to capture actual dependencies between items.
        – An item i that has not been rated by any user who has rated item j gets a similarity score of 0.
        – However, these items can still be found close to each other if another item t is similar to both.
      • The Random Walk Recommender captures these transitive associations at various levels, proportional to the length of the random walk.
      • Parameterize the length of the walk according to the sparsity level of the rating matrix via the continuation probability (typically 0.8–0.85).
      • Cold start user
        – User with few actions and plenty of relations/friends
        – User with plenty of actions and few relations/friends
          • Not a cold user -> traditional CF works best!
        – Newcomer with few actions and few relations
          • Use a sigmoid function and α to balance the ratio between user similarity and social influence
  19. Consider the State-of-the-art Recommendation
      • The Matrix Factorization method [2] still dominates if you are only concerned with value accuracy:
        – Highly effective: learns from the training dataset
        – Low efficiency: high complexity and memory costs
        – No quality indicator or source explainability
        – “Latent” factors lack physical meaning
        – Centralized information is needed
      • Network method based on neighborhood similarity:
        – Distributed
        – Random walk with lower complexity
        – Feasible to update immediately
  20. Trust
  21. User Similarity vs. Node Distance
      • Uni-partite previous
      • Katz centrality with penalty beta
      • Similar to PageRank
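A Katz-style measure scores two nodes by counting the walks between them, with each walk of length l damped by beta^l. The sketch below truncates the sum at a maximum length, which is an assumption made to keep it simple; the slide does not say how the deck computes it, and `katz_similarity` is an illustrative name.

```python
def katz_similarity(adj, source, beta=0.1, max_len=4):
    """Truncated Katz score from `source` to every reachable node.

    adj: dict node -> set of neighbors
    score[v] = sum over l = 1..max_len of beta**l * (number of walks of length l)
    """
    walks = {source: 1.0}  # walks of the current length ending at each node
    score = {}
    factor = 1.0
    for _ in range(max_len):
        nxt = {}
        for node, count in walks.items():
            for neighbor in adj.get(node, ()):
                nxt[neighbor] = nxt.get(neighbor, 0.0) + count
        walks = nxt
        factor *= beta  # penalty grows with walk length
        for node, count in walks.items():
            score[node] = score.get(node, 0.0) + factor * count
    return score


PATH = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
katz = katz_similarity(PATH, "a", beta=0.5, max_len=2)
```

A smaller beta concentrates the score on close neighbors, while a larger beta lets distant nodes contribute more, much like the damping factor in PageRank.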
  22. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithms
        – Item-based Random Walk
        – User-based Random Walk
        – Influence-based Random Walk
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  23. Item-Based Random Walk (ItemRW)
      • Construct the item-based similarity matrix
        – By Jaccard index
      • The random walk process:
        – Denote by Y_{u,i} the random variable for selecting item j amongst the items rated by u, for an item similar to i.
        – Generalized by a sigmoid function, with the common-neighbor count in the exponent
      Liben-Nowell, David, and Jon Kleinberg. "The link-prediction problem for social networks." Journal of the American Society for Information Science and Technology 58.7 (2007): 1019-1031.
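The Jaccard index of two items is the number of users who acted on both, divided by the number of users who acted on either. A minimal sketch of building the item similarity matrix that way follows; the helper names and the sparse dictionary layout are illustrative choices, not the deck's.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections, 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def item_similarity(items_by_user):
    """Build a sparse item-item similarity dict from user -> set of items."""
    users_by_item = {}
    for user, items in items_by_user.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(user)

    sim = {}
    for i in users_by_item:
        for j in users_by_item:
            if i == j:
                continue
            s = jaccard(users_by_item[i], users_by_item[j])
            if s > 0:  # keep the matrix sparse
                sim.setdefault(i, {})[j] = s
    return sim
```

The pairwise loop is quadratic in the number of items, which is fine for a sketch; a real dataset would more likely iterate only over item pairs that share at least one user.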
  24. User-Based Random Walk (UserRW)
      • Construct the user-based similarity
      • The random walk process:
        – Denote by X_{u,i} the random variable for selecting user v amongst all v, for a user similar to u.
        – Pick a nearest neighbor and output the action set of v.
  25. Influence-Based Random Walk Algorithm
      1. Build the item and user graph with correlation
      2. Learn influence power by parsing the trust corpus
      3. Perform a random walk on the graph to get the rank list.
      • To perform a random walk, we can acquire the needed information per user request, in a distributed manner.
      • To validate the algorithm, we compute the expected value and sort the state probabilities of each item.
        – Most of them remain 0 -> no need to parse the full item vector I to perform matrix operations
      [Figure: pipeline: Build Graph → Learning Influence (updated) → Random Walk to produce rank list]
  26. Learning Influence – Graph
      [Figure: User Layer and Item Layer; actions {u1, i1, t1} and {u2, i1, t2}, where u1 takes action i1 at timestamp t1; Influence Power with Δt = (t2 - t1)]
      Goyal, Amit, Francesco Bonchi, and Laks VS Lakshmanan. "Learning influence probabilities in social networks." Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010.
  27. Algorithm Formulation
  28. Sigmoid Smoothing
      • Adjust the weight for items with fewer relations
      • A sigmoid function is a mathematical function having an "S" shape (sigmoid curve). Often, sigmoid function refers to the special case of the logistic function, defined by the formula S(t) = 1 / (1 + e^(-t)).
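One way to read sigmoid smoothing is as a damping factor that lowers a similarity when it rests on little evidence. The sketch below shows the logistic function and a hypothetical use of it; `smoothed_similarity`, its `midpoint` parameter, and the choice of the common-neighbor count as the input are assumptions, because the deck's exact formula is not in the transcript.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t)), written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def smoothed_similarity(raw_sim, common_count, midpoint=2.0):
    """Hypothetical smoothing: damp similarities backed by few common neighbors."""
    return raw_sim * sigmoid(common_count - midpoint)
```

With this shape, a similarity of 0.8 supported by a single common neighbor is weighted well below the same similarity supported by five.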
  29. Influence-Based Random Walk
      [Figure: users u1, u2, u, u4, u5 linked to item i with weights Φ_{u1,i}, Φ_{u2,i}, Φ_{u4,i}, Φ_{u5,i}; item i linked to item j by sim_{i,j}]
  30. Influence-based User Random Walk Probability:
      α * user-similarity(u,v) + (1 - α) * Influence Power(u,v)
      [Figure: User Layer with influence-based user transition probabilities]
      Yildirim, Hilmi, and Mukkai S. Krishnamoorthy. "A random walk method for alleviating the sparsity problem in collaborative filtering." Proceedings of the 2008 ACM conference on Recommender systems. ACM, 2008.
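The blend of user similarity and influence power can be turned into transition probabilities by normalizing over the candidate neighbors. The normalization step and the function name `transition_probs` are assumptions for illustration; the slide only gives the unnormalized mix.

```python
def transition_probs(u, user_sim, influence, alpha=0.001):
    """Blend similarity and influence into a probability distribution over neighbors.

    user_sim, influence: dict user -> dict neighbor -> non-negative weight
    Assumption: the blended weights are normalized to sum to 1.
    """
    sims = user_sim.get(u, {})
    infl = influence.get(u, {})
    raw = {
        v: alpha * sims.get(v, 0.0) + (1 - alpha) * infl.get(v, 0.0)
        for v in set(sims) | set(infl)
    }
    total = sum(raw.values())
    if total <= 0:
        return {}
    return {v: w / total for v, w in raw.items()}


probs = transition_probs(
    "u",
    {"u": {"v": 0.2, "w": 0.4}},
    {"u": {"v": 0.5}},
    alpha=0.5,
)
```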
  31. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithms
      • Temporal Decay Issues
        – ItemBetw
        – PastDecay
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  32. Exponential Time Decay Function
      Dunlavy, Daniel M., Tamara G. Kolda, and Evrim Acar. "Temporal link prediction using matrix and tensor factorizations." ACM Transactions on Knowledge Discovery from Data (TKDD) 5.2 (2011): 10.
  33. Time Interval Analysis – ItemBetw
      [Figure: User Layer and Item Layer; actions {u, i1, t1} and {u, i2, t2}: user u takes action i1 at timestamp t1 and i2 at timestamp t2]
  34. Time Interval – ItemBetw
      • By assumption: items that a user acted on within a short interval of each other gain higher similarity.
        “All of you items are my favorites of the past…”
        Items that the user acted on a long time apart get a similarity close to 0.
  35. Time Interval Analysis – PastDecay
      [Figure: User Layer and Item Layer; actions {u, i, t1}, {u, j, t2}, {u, k, t3}: user u takes action i at timestamp t1, j at timestamp t2, k at timestamp t3]
  36. Time Interval – PastDecay
      • By assumption: items that a user acted on shortly before now gain higher similarity.
        The newer, the better!
        Items that the user acted on a long time ago get a weight close to 0.
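The two temporal variants differ only in which time gap feeds the decay. The sketch below assumes a weight of the form base^Δt, which is a guess consistent with the decay values listed in the result slides (1.5 down to 0.7, with 1 meaning no decay); the deck's actual formula is not in the transcript, and the function names and the `unit` parameter are illustrative.

```python
def item_betw_weight(t1, t2, base=0.95, unit=1.0):
    """ItemBetw: weight for two items the same user acted on at t1 and t2.

    Assumed form: base ** (|t2 - t1| / unit). base = 1 disables the decay.
    """
    return base ** (abs(t2 - t1) / unit)


def past_decay_weight(t_action, t_now, base=0.95, unit=1.0):
    """PastDecay: weight for an action by its age relative to t_now.

    Assumed form: base ** ((t_now - t_action) / unit), with the age floored at 0.
    """
    return base ** (max(t_now - t_action, 0) / unit)
```

With a base below 1, closer or newer actions weigh more; a base above 1 inverts that preference, and a base of exactly 1 treats all actions equally.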
  37. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithms
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  38. Credibility & Accordance
      • What is the evidence for examining the recommendation quality of an algorithm?
        – The ranking of the testing item in our rank list!
        – For the best case: rank = 1. Presented as an average percentage: rank 3 out of top-15 => credibility_{u,i} = 20%
        – Metrics for a recommender system/method
      • Select the highest probability of a related item/user as the reference
  39. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithms
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
        – ALL-BUT-ONE Evaluation
        – Dataset Description
        – Experimental Result
      • Discussion and Conclusion
  40. ALL-BUT-ONE Evaluation
      • Also called the “leave-one-out” method
      • Predict the last item i that the target user u took
      • Output the top-N; if the held-out action item is contained, call it a HIT
      [Figure: Item Layer with actions {u, i1, t1}, {u, i2, t2}, {u, i3, t3} and the held-out action {u, i?, t_last}]
      • L denotes the testing set size.
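The evaluation loop can be sketched as follows: hold out each user's most recent action, recommend from the rest, and count a hit when the held-out item appears in the top-N. Dividing the hit count by the number of tested users (the testing set size L) is an assumption about how the hit ratio is defined, and `recommend` is a hypothetical callback standing in for any of the algorithms in the deck.

```python
def leave_one_out_hit_ratio(actions, recommend, n=10):
    """Hit ratio of a recommender under leave-one-out evaluation.

    actions:   iterable of (user, item, timestamp)
    recommend: callable (user, history_items, n) -> ranked list of items
    """
    by_user = {}
    for user, item, ts in actions:
        by_user.setdefault(user, []).append((ts, item))

    hits = tests = 0
    for user, events in by_user.items():
        if len(events) < 2:
            continue  # nothing left to learn from after holding one out
        events.sort()
        held_out = events[-1][1]  # the user's last action
        history = [item for _ts, item in events[:-1]]
        tests += 1
        if held_out in recommend(user, history, n)[:n]:
            hits += 1
    return hits / tests if tests else 0.0


SAMPLE = [("u1", "a", 1), ("u1", "b", 2), ("u2", "a", 1), ("u2", "c", 3), ("u3", "a", 1)]
ratio = leave_one_out_hit_ratio(SAMPLE, lambda user, history, n: ["b"], n=5)
```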
  41. Dataset of Experiment
      • Bookmark data
      • 68,215 bookmark URLs from 1,867 users
      • Friendship (“become mutual fans”) with timestamp information: <source_user, target_user, timestamp>
      • Actions also carry timestamps, to measure the interval influence: <user, item, timestamp>
  42. Dataset Description
      • The social degree of a node (trust) conforms to a power-law distribution.
      [Figure: User Action Times Distribution; axis labels: #(Item to Tag), #Days]
      [Figure: User Degree of Trust Distribution; axis labels: #Social Degree, #Users]
  43. Experiment – Learning Influence
      • User-based similarity:
        – Average correlation by Jaccard index: 2.58%
        – Average correlation in mutual trust: 8.28% (over 3 times the average!)
      [Figure: histogram of User-based Similarity]
      [Figure: histogram of User-based Similarity with Mutual Trust]
  44. Experimental Result
      [Figure: Recall and Top-K Size, MIN_ITEM_FOR_USER > 5; x-axis ticks 10–100; series: user-based, influence based, itemBased, itemEnhanced, relational, popular]
      [Figure: Recall and Top-K Size, MIN_ITEM_FOR_USER > 1; x-axis ticks 10–100; series: userbased, influenceBased, itemBased, itemEnhanced, Trustwalker, popular]
  45. Result for Cold Start User
      [Figure: Recall for Cold Start User with action-item < 10; y-axis: Hit Ratio (%); x-axis ticks 10–100; series: item-based RW, user-based RW, Influence, itemAdjust, TrustWalker]
  46. Results for Global/Friendship Ratio α
      α * user-similarity(u,v) + (1 - α) * Influence Power(u,v)
      [Figure: Recall for Cold Start User with action-item < 10; y-axis: Hit Ratio (%); x-axis ticks 1–10; series: alpha = 0.1, alpha = 0.01, alpha = 0.001, alpha = 0.0001]
  47. Time Interval Decay Result – ItemBetw
      • Setting the decay function to the constant 1 gives the best performance!
      [Figure: Time ItemBetw Top-K Curve; y-axis: Hit Ratio (%); decay values 1.5, 1.1, 1.05, 1.01, 1, 0.99, 0.95, 0.9, 0.8, 0.7]
      [Figure: Time ItemBetw Top-K Curve; x-axis ticks 10–90; one series per decay value]
  48. Time Interval Decay Result – PastDecay
      [Figure: TimeDecay; x-axis ticks 10–100; one series per decay value 1.5, 1.1, 1.05, 1.01, 1, 0.99, 0.95, 0.9, 0.8]
      [Figure: TimeDecay Top-k Curve; decay values 1.5, 1.1, 1.05, 1.01, 1, 0.99, 0.95, 0.9, 0.8, 0.7]
  49. Outline
      • Introduction
      • Related Works
      • Cold Start:
        – Random Walk and Probability Assignment
      • Algorithms
      • Temporal Decay Issues
      • Credibility & Accordance
      • Experiment Design
      • Discussion and Conclusion
  50. Discussion – Why Does TrustWalker Fail?
      • TrustWalker puts more emphasis on locally trusted users than on globally similar users.
      • It minimizes the Mean Square Error:
        – Similar to a non-personalized popular list
      • As mentioned, a top-N result is more user-friendly.
      • TrustWalker experiment on dataset: Epinions
        – Users become fans of experts and columnists
        – Trust > global similarity
  51. Discussion: Influence-Based Random Walk
      • For α near 0.001
        – User similarity and influence power are on different scales
      • Like a decision tree:
        – Similarity would be the primary comparison metric and influence power the secondary one
      [Figure: comparison metrics sim(u,v) and Influence Power(u,v)]
  52. Discussion: Time Interval Decay
      • The peak is achieved when all data keep the same weight regardless of time.
      • “In a predefined dataset, you should not easily abandon or underestimate the value of old data.”
  53. Conclusion
      • Propose a novel method based on influence.
        – Influence-based Random Walk
        – Intersection of item and user
      • Probe and leverage influence probabilities and user correlation for cold start users
      • Provide Credibility and Accordance for user experience and feedback in RS
      • Analyze time decay with 2 decay functions
        – PastDecay
        – ItemBetw
  54. Q&A
      Thanks for Your Attention!
