  1. FRank: A Ranking Method with Fidelity Loss
     Ming-Feng Tsai, Tie-Yan Liu, Tao Qin, Hsin-Hsi Chen, Wei-Ying Ma, SIGIR 2007
     Advisor: Chia-Hui Chang
     Student: Chen-Ling Chen
     Date: 2011-10-21
  2. Outline: Introduction, Related Work, Fidelity Rank, Experiments, Conclusions
  3. Introduction
     The IR problem can be formulated as a ranking problem: given a query and a set of documents, an IR system should return a ranked list of documents.
     In addition to traditional IR approaches, machine learning techniques are increasingly used for the ranking problem in IR:
     - learning-based methods use labeled data to learn an effective ranking function
     - several such methods, including RankBoost, RankSVM, and RankNet, treat IR directly as a ranking problem
  4. Introduction (cont.)
     The probabilistic ranking framework has many effective properties for ranking, such as a pair-wise differentiable loss, and can better model multiple relevance levels.
     This work further studies the probabilistic ranking framework and proposes a novel loss function named fidelity loss, which:
     - inherits the properties of the probabilistic ranking framework
     - has several additional characteristics that are useful for ranking
     Fidelity Rank (FRank) combines the probabilistic ranking framework with the generalized additive model (see below).
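For concreteness, the generalized additive model referred to here is the usual boosting-style combination of weak learners; the notation H, alpha_t, h_t is mine, not from the slide:

```latex
H(x) = \sum_{t=1}^{T} \alpha_t \, h_t(x)
```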
  5. Related Work
     Early work treated IR as a binary classification problem that directly classifies a document as relevant or irrelevant with respect to the given query.
     - However, in real Web search, a page has multiple relevance levels, such as highly relevant, partially relevant, and definitely irrelevant.
     - Therefore, some studies regard IR as a ranking problem.
  6. Related Work (cont.)
     RankBoost used the Boosting approach to learn a ranking function and solve the problem of combining preferences.
     - It aims to minimize the weighted number of instance pairs that are misordered by the final ranking function.
     - In each round t, RankBoost chooses the weight alpha_t and the weak learner h_t so as to minimize the pair-wise loss (see the equation below).
     - The adjustment rule decreases the weight of a pair if h_t ranks it correctly (h_t(x_i) > h_t(x_j)) and increases it otherwise.
     - Finally, the algorithm outputs a ranking function that combines the selected weak learners.
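The pair-wise loss mentioned above appeared only as an image in the deck; a plausible reconstruction in the standard RankBoost notation, where D_t is the distribution over pairs maintained at round t and a pair (x_i, x_j) means x_i should rank above x_j:

```latex
Z_t = \sum_{(x_i, x_j)} D_t(x_i, x_j) \, \exp\bigl( \alpha_t \, ( h_t(x_j) - h_t(x_i) ) \bigr)
```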
  7. Related Work (cont.)
     RankSVM used Support Vector Machines to optimize search performance using click-through data.
     - It aims to minimize the number of discordant pairs and to maximize the margin between pairs.
     - It is well founded in the structural risk minimization framework and was verified in a controlled experiment.
     - Its advantage is that it needs no human-labeled data and can automatically learn the ranking function from click-through data.
  8. Related Work (cont.)
     RankNet proposed a probabilistic ranking framework and used a neural network to minimize a pair-wise differentiable loss.
     - Compared to RankBoost and RankSVM, the loss function in RankNet is pair-wise differentiable, which can be regarded as an advantage.
     - However, the loss function in RankNet still has some problems: it has no real minimal loss and no appropriate upper bound (see the cross entropy loss below).
     - These problems may make the trained ranking function inaccurate and bias the training process toward some hard pairs.
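For reference, the RankNet pair-wise cross entropy loss being criticized here, where o_ij = f(x_i) - f(x_j) is the difference of model outputs and P-bar_ij is the target probability that x_i ranks above x_j:

```latex
P_{ij} = \frac{e^{o_{ij}}}{1 + e^{o_{ij}}}, \qquad
C_{ij} = -\bar{P}_{ij} \log P_{ij} - (1 - \bar{P}_{ij}) \log (1 - P_{ij})
```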
  9. Fidelity Rank
  10. Fidelity Rank (cont.)
  11. Fidelity Rank (cont.)
      Problems with the cross entropy loss function:
      - The cross entropy loss cannot achieve the real minimal loss, zero, except for pairs whose target probability is exactly 0 or 1.
      - The cross entropy loss of a pair has no upper bound.
      - Query-level normalization is hard to define for the cross entropy loss.
      The fidelity loss, shown below, addresses each of these.
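The fidelity loss itself appears only as images on the following slides; reconstructed from the SIGIR 2007 paper, with P*_ij the target probability and P_ij the modeled probability that x_i ranks ahead of x_j:

```latex
F_{ij} = 1 - \left( \sqrt{P^{*}_{ij} \, P_{ij}} + \sqrt{\bigl(1 - P^{*}_{ij}\bigr)\bigl(1 - P_{ij}\bigr)} \right)
```

Unlike the cross entropy loss, F_ij reaches its minimum of zero exactly when P_ij = P*_ij, is bounded in [0, 1], and its boundedness makes a query-level normalization (averaging the loss over the pairs of each query) straightforward.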
  12. Fidelity Rank (cont.)
  13. Fidelity Rank (cont.)
  14. Fidelity Rank (cont.)
  15. Fidelity Rank (cont.)
  16. Fidelity Rank (cont.)
  17. Fidelity Rank (cont.)
  18. Fidelity Rank (cont.)
      FRank Algorithm Derivation
  19. Fidelity Rank (cont.)
      FRank Algorithm Derivation
      - When a binary weak learner is introduced, the above equation can be simplified, because h_t(x_i) - h_t(x_j) only takes the values -1, 0, and 1.
      - Therefore, the equation can be expressed in a simplified form; a sketch of the resulting update step follows.
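A minimal sketch of one FRank-style boosting round under these assumptions: binary weak learners whose pair differences lie in {-1, 0, 1}, and a grid line search standing in for the paper's closed-form solution for alpha_t. All names here (fidelity_loss, frank_round, h_diffs) are illustrative, not from the paper:

```python
import numpy as np

def fidelity_loss(o, p_star):
    """Fidelity loss of pair score differences o = H(x_i) - H(x_j)
    against target probabilities p_star that x_i should rank above x_j."""
    p = 1.0 / (1.0 + np.exp(-o))  # logistic map from score difference to P_ij
    return 1.0 - (np.sqrt(p_star * p) + np.sqrt((1.0 - p_star) * (1.0 - p)))

def frank_round(h_diffs, o_prev, p_star, alphas=np.linspace(0.0, 2.0, 201)):
    """One boosting round: pick the weak learner and alpha that minimize
    the total fidelity loss over all pairs.

    h_diffs : (n_learners, n_pairs) array of h_t(x_i) - h_t(x_j) in {-1, 0, 1}
    o_prev  : (n_pairs,) ensemble score differences H_{t-1}(x_i) - H_{t-1}(x_j)
    p_star  : (n_pairs,) target pair probabilities
    """
    best = (np.inf, None, None)
    for t, hd in enumerate(h_diffs):
        for a in alphas:  # grid line search over alpha
            loss = fidelity_loss(o_prev + a * hd, p_star).sum()
            if loss < best[0]:
                best = (loss, t, a)
    return best  # (loss, weak learner index, alpha)
```

Repeating frank_round and accumulating o_prev += alpha * h_diffs[t] builds up the additive ranker H(x) = sum_t alpha_t h_t(x).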
  20. Fidelity Rank (cont.)
  21. Experiments
  22. Experiments (cont.)
  23. Experiments (cont.)
      Comparison methods: RankBoost, RankNet_Linear, RankNet_TwoLayer, RankSVM, BM25.
      Experiment on the TREC dataset: four-fold cross validation.
      Experimental results on the TREC dataset:
      - All learning-based methods outperform BM25.
      - FRank even obtains about a 40% improvement over BM25 in MAP (defined below).
      - FRank is suitable for Web search.
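Since the TREC comparison is reported in MAP, a reminder of the metric under the standard binary-relevance definition; R_q is the set of relevant documents for query q and rel(k) marks whether the document at rank k is relevant:

```latex
\mathrm{AP}(q) = \frac{1}{|R_q|} \sum_{k=1}^{n} P@k \cdot \mathrm{rel}(k),
\qquad
\mathrm{MAP} = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP}(q)
```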
  24. Experiments (cont.)
      Experiment on the Web search dataset:
      - The web pages are partially labeled with ratings from 1 to 5; unlabeled pages are given the rating 0.
      - 20 unlabeled documents are randomly selected for training.
      - For a given query, a page is represented by query-dependent (e.g., term frequency) and query-independent (e.g., PageRank) features; the total number of features is 619.
      Training process of FRank.
  25. Experiments (cont.)
      The representational power of RankBoost is quite limited with few weak learners.
      - The number of weak learners was set to 224 for FRank and 271 for RankBoost.
      The performance of RankNet_TwoLayer dropped when the number of epochs exceeded 10.
      - The number of epochs was set to 25 for RankNet_Linear and 9 for RankNet_TwoLayer.
  26. Experiments (cont.)
      An empirical approach using a linear combination of BM25 and PageRank was also evaluated.
      Using a conventional IR model alone is inadequate, because the learning-based methods are capable of leveraging various features in a large-scale Web search dataset.
      These results indicate that the probabilistic ranking framework is a suitable framework for learning to rank.
      The FRank algorithm significantly outperforms the other learning-based ranking algorithms:
      - the corresponding p-values are 0.0114 for NDCG@1, 0.007 for NDCG@5, and 0.0056 for NDCG@10 (NDCG@k is sketched below).
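The significance tests above are on NDCG@k; a minimal sketch of that metric, assuming the common exponential-gain, log2-discount formulation (the paper's exact choices may differ):

```python
import numpy as np

def dcg_at_k(ratings, k):
    """Discounted cumulative gain over the top-k positions."""
    r = np.asarray(ratings, dtype=float)[:k]
    return np.sum((2.0 ** r - 1.0) / np.log2(np.arange(2, r.size + 2)))

def ndcg_at_k(ratings, k):
    """NDCG@k: DCG of the given ranking divided by the ideal DCG."""
    ideal = dcg_at_k(sorted(ratings, reverse=True), k)
    return dcg_at_k(ratings, k) / ideal if ideal > 0 else 0.0
```

For example, ndcg_at_k([5, 3, 0, 4], 3) scores the top 3 positions of a ranking whose documents carry ratings 5, 3, 0, 4.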
  27. Experiments (cont.)
      Several observations:
      - The FRank algorithm performs effectively on small training sets, since it introduces query-level normalization in the fidelity loss function.
      - Using more training data does not guarantee an improvement in performance (e.g., RankBoost).
      - The probabilistic ranking framework performs well when the amount of training data is large, because more pair-wise information makes the trained ranking function more accurate.
      - FRank outperformed the other ranking algorithms in all cases.
  28. Conclusions
      • The FRank algorithm performs well in practice, on both the conventional TREC dataset and a real Web search dataset.
      • Future work:
        - On the theoretical side, investigate how to prove a generalization bound based on the probabilistic ranking framework.
        - Study whether it is more effective to combine the fidelity loss with other machine learning techniques, such as kernel methods.
        - On scalability, implement a parallel version of FRank that can handle even larger training datasets.
  29. Thank you for listening.
  30. Related Work (cont.)
      Joachims proposed the RankSVM algorithm, which uses Support Vector Machines to optimize search performance using click-through data.
      - RankSVM aims to minimize the number of discordant pairs and to maximize the margin between pairs.
      - Margin maximization is equivalent to minimizing the L2-norm of the hyperplane parameter w.
      - Given a query q, if the ground truth asserts that document d_i is more relevant than d_j, the constraint of RankSVM is given below, where Phi(q, d) is the feature vector computed from document d relative to query q.
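The constraint referenced above, reconstructed in the standard Joachims (2002) form:

```latex
w \cdot \Phi(q, d_i) > w \cdot \Phi(q, d_j)
```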
  31. Related Work (cont.)
      The ranking problem can then be expressed as the constrained optimization problem below.
      RankSVM is well founded in the structural risk minimization framework and was verified in a controlled experiment.
      - Its advantage is that it needs no human-labeled data and can automatically learn the ranking function from click-through data.
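A reconstruction of that constrained optimization problem in its usual soft-margin form; the slack variables xi and trade-off constant C are standard notation, not taken from the slide:

```latex
\min_{w, \, \xi \ge 0} \;\; \frac{1}{2} \|w\|^{2} + C \sum_{q,i,j} \xi_{q,i,j}
\quad \text{s.t.} \quad
w \cdot \Phi(q, d_i) \;\ge\; w \cdot \Phi(q, d_j) + 1 - \xi_{q,i,j}
\;\; \text{for every preference } d_i \succ d_j
```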
