FRank: A Ranking Method with
         Fidelity Loss
  Ming-Feng Tsai, Tie-Yan Liu, Tao Qin, Hsin-Hsi
        Chen, Wei-Ying Ma, SIGIR 2007


                              Advisor: Chia-Hui Chang
                              Student: Chen-Ling Chen
                              Date: 2011-10-21




Outline
   Introduction
   Related Work
   Fidelity Rank
   Experiments
   Conclusions



Introduction
The IR problem can be formulated as a ranking problem:
given a query and a set of documents, an IR system should
return a ranked list of the documents

 In addition to traditional IR approaches, machine learning
techniques are becoming more widely used for the ranking
problem of IR
   learning-based methods aim to use labeled data to
   learn an effective ranking function
   several such methods, for example RankBoost, RankSVM,
   and RankNet, treat IR directly as a ranking problem

Introduction(cont.)
The probabilistic ranking framework has many desirable
properties for ranking, such as a pair-wise differentiable loss,
and can better model multiple relevance levels

This paper further studies the probabilistic ranking framework and
proposes a novel loss function named fidelity loss
   it inherits the properties of the probabilistic ranking
   framework, and has several additional characteristics that are
   useful for ranking

Fidelity Rank (FRank) combines the probabilistic ranking
framework with the generalized additive model

Related work
Early studies treated IR as a binary classification problem that
directly classifies a document as relevant or irrelevant with
respect to the given query

– However, in real Web search, a page can have multiple relevance
  levels, such as highly relevant, partially relevant, and definitely
  irrelevant


– Therefore, some studies regard IR as a ranking problem instead



Related work(cont.)
RankBoost uses the boosting approach to learn a ranking
function, solving the problem of combining preferences
   it aims to minimize the weighted number of instance pairs that
   are misordered by the final ranking function
   at each round t, RankBoost chooses αt and a weak learner ht
   so as to minimize the pair-wise loss, defined by
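   (one standard statement of this round-t loss, following Freund et
   al.'s RankBoost paper — reconstructed here because the slide shows
   it only as an image; Dt denotes the weight distribution over pairs:)

       Zt = Σ(xi,xj) Dt(xi, xj) · exp( αt · ( ht(xj) − ht(xi) ) )

   where the sum ranges over pairs in which xi should be ranked above
   xj, and the pair weights are then updated as
   Dt+1(xi, xj) = Dt(xi, xj) · exp( αt · (ht(xj) − ht(xi)) ) / Zt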



   The rule of adjustment is to decrease the weight of a pair if ht
   gives the correct ranking (ht(xi) > ht(xj)) and to increase it otherwise
   Finally, the algorithm outputs the ranking function by combining the
   selected weak learners


Related work(cont.)
RankSVM uses Support Vector Machines to optimize search
performance using click-through data

  it aims to minimize the number of discordant pairs and to
  maximize the margin between pairs

RankSVM is well-founded in the structural risk minimization
framework, and is verified in a controlled experiment

   an advantage is that it needs no human-labeled data and can
   automatically learn the ranking function from click-through data



Related work(cont.)
RankNet proposed a probabilistic ranking framework and used a
neural network to minimize a pair-wise differentiable loss

compared to RankBoost and RankSVM, the loss function in RankNet
is pair-wise differentiable, which can be regarded as an advantage

    the loss function in RankNet still has some problems:
       it has no real minimal loss and no appropriate upper bound
       these problems may make the trained ranking function inaccurate and
       bias the training process toward some hard pairs




Fidelity Rank
[Slides 9–10: the probabilistic ranking framework adopted by FRank was
presented as equations/figures that are not preserved in this transcript]
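(For context, the framework — as defined in RankNet and reused by FRank,
reconstructed from those papers rather than from the slide images — models,
for a document pair (xi, xj) and ranking function f:

    oij = f(xi) − f(xj)
    Pij = e^{oij} / (1 + e^{oij})

where Pij is the modeled posterior probability that xi is ranked ahead of
xj, and P*ij is the target probability derived from the labels; the cross
entropy loss of the pair is:

    Cij = −P*ij · log Pij − (1 − P*ij) · log(1 − Pij) )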
Fidelity Rank(cont.)
Problems with the cross entropy loss function:

   the cross entropy loss cannot achieve the true minimal loss, zero,
   except for pairs whose target probability P*ij is exactly 0 or 1

   the cross entropy loss of a pair has no upper bound

   query-level normalization is hard to define for the cross entropy loss
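(A quick check of the first two problems, using the definitions above: for a
pair with target P*ij = 1, the loss is Cij = −log Pij = log(1 + e^{−oij}),
which approaches but never reaches zero for finite oij, and grows without
bound as oij → −∞ — so a few hard pairs can dominate the total loss, which
is the bias noted on the RankNet slide.)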
Fidelity Rank(cont.)
[Slides 12–17: the definition of the fidelity loss, its properties, the
query-level normalization, and the combination of the probabilistic ranking
framework with the generalized additive model were presented as
equations/figures that are not preserved in this transcript]
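(The key definitions, reconstructed from the FRank paper since the slide
equations are image-only: the fidelity loss of a pair, adapted from the
fidelity measure between probability distributions, is

    Fij = 1 − ( √(P*ij · Pij) + √((1 − P*ij) · (1 − Pij)) )

It is zero exactly when Pij = P*ij, is bounded in [0, 1], and supports
query-level normalization by weighting each query's pairs by 1/|Sq|:

    F = Σq (1/|Sq|) · Σ(i,j)∈Sq Fij

FRank minimizes F with a generalized additive model H(x) = Σt αt · ht(x),
adding one weak learner ht with coefficient αt per round.)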
Fidelity Rank(cont.)
FRank algorithm Derivation
[The derivation of the coefficient αt for each weak learner was shown as
equations not preserved in this transcript]
Fidelity Rank(cont.)
FRank algorithm Derivation
  When a binary weak learner is introduced, the above equation can be
  simplified, because the difference ht(xi) − ht(xj) only takes the
  values −1, 0, and 1
  Therefore, the equation can be expressed in a simplified form
  [equation image not preserved in this transcript]
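(To make the training structure concrete, below is a minimal Python sketch
of an FRank-style boosting loop: it greedily adds binary weak learners to an
additive model so as to minimize the query-normalized fidelity loss. This is
an illustration under stated assumptions, not the paper's exact algorithm —
the paper derives αt analytically for binary weak learners, while the sketch
uses a simple grid search, and the names fidelity_loss, train_frank,
alpha_grid, and the pair representation are invented for the example.)

    import numpy as np

    def fidelity_loss(o, p_star):
        # Fidelity loss of pairs, given score differences o = H(xi) - H(xj)
        # and target probabilities p_star (1.0 means xi should rank above xj).
        p = 1.0 / (1.0 + np.exp(-o))          # logistic pair probability Pij
        return 1.0 - (np.sqrt(p_star * p) +
                      np.sqrt((1.0 - p_star) * (1.0 - p)))

    def train_frank(pairs, weak_learners, n_rounds=50):
        # pairs: list of (xi, xj, p_star, query_id); weak_learners: callables
        # h(x) -> {0, 1}. Returns H(x) = sum_t alpha_t * h_t(x).
        alpha_grid = np.linspace(0.05, 2.0, 40)       # stand-in line search
        qids = np.array([q for (_, _, _, q) in pairs])
        # query-level normalization: each query's pairs get total weight 1
        weights = np.array([1.0 / np.sum(qids == q) for q in qids])
        p_star = np.array([p for (_, _, p, _) in pairs])
        o = np.zeros(len(pairs))                      # current H(xi) - H(xj)
        model = []
        for _ in range(n_rounds):
            best = None
            for h in weak_learners:
                # For binary weak learners this difference is -1, 0, or 1.
                dh = np.array([h(xi) - h(xj) for (xi, xj, _, _) in pairs])
                for a in alpha_grid:
                    loss = np.sum(weights * fidelity_loss(o + a * dh, p_star))
                    if best is None or loss < best[0]:
                        best = (loss, a, h, dh)
            _, a, h, dh = best
            model.append((a, h))
            o += a * dh                               # update pair margins
        return lambda x: sum(a * h(x) for (a, h) in model)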
Fidelity Rank(cont.)
[Slide 20: content shown as equations/figures not preserved in this
transcript]
Experiments
[Slides 21–22: content shown as tables/figures not preserved in this
transcript]
Experiments(cont.)
Comparison methods
   RankBoost, RankNet_Linear, RankNet_TwoLayer, RankSVM, BM25
Experiment on TREC Dataset
   Four-fold cross validation

Experimental Results on TREC Dataset
   All learning-based methods outperform BM25
   FRank even obtains about 40% improvement over BM25 in terms of MAP
   FRank is suitable for Web search
Experiments(cont.)
Experiment on Web Search Dataset
   The web pages are partially labeled with ratings from 1 to 5
        The unlabeled pages are given the rating 0
            20 unlabeled documents are randomly selected for training
   For a given query, a page is represented by query-dependent (e.g., term
   frequency) and query-independent (e.g., PageRank) features
   The total number of features is 619

Training process of FRank [figure not preserved in this transcript]
Experiments(cont.)
The representational power of RankBoost is quite limited with few
weak learners
   the number of weak learners is set to 224 for FRank and 271 for RankBoost
The performance of RankNet_TwoLayer dropped when the number of
epochs exceeded 10
   the number of epochs is set to 25 for RankNet_Linear and 9 for
   RankNet_TwoLayer
Experiments(cont.)
An empirical approach using a linear combination
of BM25 and PageRank is also evaluated
Using a conventional IR model alone is inadequate
   because the learning-based methods are
   capable of leveraging various features in a large-
   scale Web search dataset
These results indicate that the probabilistic
ranking framework is a suitable framework for
learning to rank
The FRank algorithm significantly outperforms the
other learning-based ranking algorithms
   the corresponding p-values are 0.0114 for
   NDCG@1, 0.007 for NDCG@5, and 0.0056 for
   NDCG@10
Experiments(cont.)

Several observations follow:
The FRank algorithm performs effectively on small training datasets,
since it introduces query-level normalization in the fidelity loss
function
Using more training data does not guarantee improved performance
(e.g., RankBoost)
The probabilistic ranking framework performs well when the
amount of training data is large
   this is because more pair-wise information makes the
   trained ranking function more accurate
FRank outperformed the other ranking algorithms in all cases
Conclusion
• The FRank algorithm performs well in practice, on both
  the conventional TREC dataset and a real Web search
  dataset

• Future work
• On the theoretical side, investigate how to prove a generalization bound
  based on the probabilistic ranking framework
• Study whether it is more effective to combine the fidelity loss with other
  machine learning techniques, such as kernel methods
• On scalability, implement a parallel version of FRank that can handle
  even larger training datasets
Thank you for listening.




Related work(cont.)
 Joachims proposed the RankSVM algorithm, which uses Support
 Vector Machines for optimizing search performance using
 click-through data
 RankSVM aims to minimize the number of discordant pairs and to
 maximize the margin between pairs
 Maximizing the margin is equivalent to minimizing the L2-norm
 of the hyperplane parameter w

 Given a query q, if the ground truth asserts that document di is
 more relevant than dj, the constraint of RankSVM is
     w · Φ(q, di) > w · Φ(q, dj)
 where Φ(q, d) is the feature vector calculated from document d
 relative to query q
Related work(cont.)
The ranking problem can then be expressed as the following
constrained optimization problem
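(reconstructed from Joachims' ranking SVM formulation, since the slide shows
it only as an image; the slack variables ξ and trade-off parameter C follow
the standard soft-margin notation and are not taken from the slide:)

    minimize     (1/2) · ||w||²  +  C · Σ ξ(i,j,q)
    subject to   w · Φ(q, di)  ≥  w · Φ(q, dj) + 1 − ξ(i,j,q)
                 for all pairs where di should rank above dj for query q
                 ξ(i,j,q) ≥ 0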




RankSVM is well-founded in the structural risk minimization
framework, and is verified in a controlled experiment
   an advantage is that it needs no human-labeled data and
   can automatically learn the ranking function from click-
   through data
