CONTEXT BASED CITATION RECOMMENDATION
Which work are you referring to? Know similar works …
Guided by: Dr. Animesh Mukherjee & Dr. Pawan Goyal
Citation and Citation Context
• Citations – crucial for assigning academic credit; they also support claims in
one's own work.
• Citation Context (c) – the sequence of words that appears around a particular citation.
• A citation context contains words that describe or summarize the cited paper.
• The semantics of a cited document should therefore be close to those of its citation context.[1]
• Citation Recommendation Engine – helps check the completeness of citations while
authoring a paper, find prior work related to the topic under investigation, and
surface missing relevant citations.[1]
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
Citation and Citation Context
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
(Figure from [1].)
Motivation and Approach
• Motivation: since all the words in a citation context are used to describe the same
citation, the semantics of these words should be similar.
• This motivates learning semantic embeddings for words in citation contexts and for
cited documents, and recommending citations based on semantic distance.
• Proposed approach: learn distributed semantic representations of words and documents.
• Using these representations, we train a neural network model that estimates the
probability of citing a paper given a citation context.[1]
• The neural network model tunes the distributed representations of words and documents
so that the semantic similarity between a citation context and its cited paper is high.[1]
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
Related work : Global Citation Recommendation
[1] McNee et al. (2002) : www-users.cs.umn.edu/~mcnee/konstan-nectar-2006.pdf
• Global recommendation suggests a list of references for an entire given manuscript.
◦ McNee et al. (2002) used a partial list of references as the input query and
recommended additional references based on collaborative filtering.[1]
◦ Strohman et al. (2007) assumed that the input is an incomplete paper manuscript
and recommended papers using text similarity combined with bibliography similarity.
◦ Bethard and Jurafsky (2010) introduced features such as topical similarity, author
behavioral patterns, citation count, and recency of publication to train a classifier
for global recommendation.
Related work : Local Citation Recommendation
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
• The input query is a particular context where a citation should be made, and the output
is a short list of papers that should be cited in that context.
◦ He et al. (2010) assumed that the user has provided placeholders for citations in a
query manuscript. A probabilistic model was trained to measure the relevance
between citation contexts and cited documents.[1]
◦ In a follow-up work (He et al. 2011), they first segmented the document into a
sequence of disjoint candidate citation contexts, and then applied a dependency
feature model to retrieve a ranked list of papers for each context.[1]
◦ Lu et al. (2011) trained IBM translation model-1 to learn the translation probabilities
between words in the citation context and the cited document, i.e., the probability of
citing a document given a word.[1]
Citation Recommendation Model : Problem Definition
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
• We propose to model the citation context given the cited paper: p(c|d).
• Given a dataset of training samples with |C| pairs of citation context c_t and cited
document d_t, the objective is to maximize the log-likelihood.[1]
• Assuming that the words in a citation context are mutually conditionally independent
given the cited document, the objective function can be rewritten as a sum over the
context words (see the reconstructed equations below).
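The equations on this slide were images in the original deck; a plausible reconstruction from the definitions above, following [1], is:

\[
\text{log-likelihood:}\quad \max_\theta \sum_{t=1}^{|C|} \log p(c_t \mid d_t)
\]
\[
\text{independence assumption:}\quad p(c \mid d) = \prod_{w_j \in c} p(w_j \mid d)
\]
\[
\text{resulting objective:}\quad \max_\theta \sum_{t=1}^{|C|} \sum_{w_j \in c_t} \log p(w_j \mid d_t)
\]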
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
[2] http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
Neural Probabilistic Model
• In a neural probabilistic model, the conditional probability p(w|d) can be defined
using a softmax function,[1,2] where |V| is the size of the vocabulary consisting of all
words appearing in the citation contexts, and s_θ(w, d) is a neural-network-based
scoring function with parameters θ.
• The scoring function s_θ(w, d) is a logistic function of the inner product of the word
representation v_w and the document representation v_d, which rescales that inner
product to (0, 1) (see the reconstruction below).
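The softmax and scoring-function equations appeared as images on this slide; reconstructed from the surrounding text (following [1, 2]), they plausibly read:

\[
p(w \mid d) = \frac{\exp\big(s_\theta(w, d)\big)}{\sum_{w_i \in V} \exp\big(s_\theta(w_i, d)\big)},
\qquad
s_\theta(w, d) = \sigma\big(v_w^\top v_d\big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
\]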
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
[2] http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
[3] http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/
Word Representation Learning
• Negative sampling is used to learn the distributed representations of words from
their surrounding words.[1]
• Given a pair of citation context c and cited document d, the skip-gram model goes
through all the words in the citation context using a sliding window.[1]
• For each word w_m that appears in citation context c, the words that appear within M
words before/after w_m are treated as positive samples. Suppose w_p is one of the
positive samples for w_m; the training objective is defined below, where w_{n_i} is a
negative sample randomly drawn from the noise distribution p_n(·) and k is the number
of negative samples.[1,2,3]
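The objective shown as an image here is, in all likelihood, the standard skip-gram negative-sampling objective of Mikolov et al. [2]:

\[
\log \sigma\big(v_{w_p}^\top v_{w_m}\big)
+ \sum_{i=1}^{k} \mathbb{E}_{w_{n_i} \sim p_n}\Big[\log \sigma\big(-v_{w_{n_i}}^\top v_{w_m}\big)\Big]
\]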
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
[2] http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
Document Representation Learning
• Given a pair of citation context c and cited document d, we assume that each word w
that appears in the context c, together with the cited document d, follows the real
data distribution p_r(w|d).[1, 2]
• Any other random word is noise generated from a noise distribution p_n(w).
• Assumption: the noise data is k times the size of the real data.
• The posterior probabilities that a pair of word w and document d comes from the
real vs. noise data distribution are (reconstructed below):
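These posteriors were images on the slide; assuming the usual noise-contrastive setup [2], they are:

\[
p(\text{real} \mid w, d) = \frac{p_r(w \mid d)}{p_r(w \mid d) + k\, p_n(w)},
\qquad
p(\text{noise} \mid w, d) = \frac{k\, p_n(w)}{p_r(w \mid d) + k\, p_n(w)}
\]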
Document Representation Learning (Cont …)
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
• Since we want to fit the neural probabilistic model p_θ(w|d) to the real data
distribution p_r(w|d), we substitute p_θ for p_r in the posteriors above.
• Given each word w_{t_i} that appears in the citation context c_t, and the cited
document d_t, we compute its contribution to the log-likelihood along with k randomly
generated noise words w_{n_1}, …, w_{n_k}, using the following objective function:
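The per-word objective was an image on this slide; assuming the standard noise-contrastive estimation form (with p_θ substituted into the posteriors above), it plausibly reads:

\[
J(w_{t_i}, d_t) =
\log \frac{p_\theta(w_{t_i} \mid d_t)}{p_\theta(w_{t_i} \mid d_t) + k\, p_n(w_{t_i})}
+ \sum_{j=1}^{k} \log \frac{k\, p_n(w_{n_j})}{p_\theta(w_{n_j} \mid d_t) + k\, p_n(w_{n_j})}
\]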
• The training objective is to learn both the word representations and
document representations that maximize the equation.[1]
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
[2] http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
• Noise-contrastive estimation treats the normalization constraint of p_θ as a
constant, so the model probability can be rewritten as:[1, 2]
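Assuming the softmax definition from the earlier slide, with its denominator treated as a constant Z(d), the rewritten form is plausibly:

\[
p_\theta(w \mid d) = \frac{\exp\big(s_\theta(w, d)\big)}{Z(d)},
\qquad Z(d)\ \text{held constant during training.}
\]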
• The parameters of the neural network are θ and the normalization constant, where θ
defines the projection that maps words and documents into the n-dimensional space.
• While learning, we tune both the document representation matrix and the word
representation matrix to optimize the probability p(w_m | d) (a small sketch follows
below).
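A minimal sketch (not the authors' code) of one such update for a single (context word, cited document) pair, using a simplified negative-sampling-style gradient in the spirit of the objective above; all names, array shapes, and the learning rate are illustrative assumptions:

```python
# Simplified update for one (word, document) pair. Assumptions (not from the paper):
# word_vecs has shape (|V|, n), doc_vecs has shape (|D|, n),
# noise_probs is the noise distribution p_n over the vocabulary.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_pair(word_vecs, doc_vecs, w, d, noise_probs, k=10, lr=0.025,
               rng=np.random.default_rng()):
    """Push s_theta(w, d) up for the observed word w and down for k noise words."""
    noise_words = rng.choice(len(word_vecs), size=k, p=noise_probs)
    targets = np.concatenate(([w], noise_words))        # observed word + negatives
    labels = np.concatenate(([1.0], np.zeros(k)))       # 1 = real pair, 0 = noise

    v_d = doc_vecs[d]
    scores = sigmoid(word_vecs[targets] @ v_d)          # sigma(v_w . v_d) per candidate
    grad = labels - scores                              # gradient of the log-likelihood

    doc_vecs[d] += lr * (grad @ word_vecs[targets])     # update document representation
    word_vecs[targets] += lr * np.outer(grad, v_d)      # update word representations
```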
Document Representation Learning (Cont …)
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
(Figure from [1].)
Citation Recommendation
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
• Using the fine-tuned word and document representations, we can obtain the
normalized probability distribution p(w|d).
• The table of p(d|w) is pre-calculated using Bayes' rule and stored as an
inverted index.
• Given a query q = [w_1, w_2, w_3, …, w_|q|], the task is to recommend a list of
documents R = [d_1, d_2, …, d_N] that need to be cited.[1]
• We use term frequency–inverse context frequency (TF-ICF) to measure p(w_j | q)
and rank candidate documents accordingly (a sketch follows below).[1]
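A minimal sketch of the recommendation step under these assumptions: the inverted index maps each word to (document, p(d|w)) pairs, the TF-ICF weights stand in for p(w_j|q), and candidates are scored by summing p(w_j|q)·p(d|w_j). Data structures and names are illustrative, not the paper's implementation:

```python
# Score candidate documents for one query (citation context).
from collections import defaultdict

def recommend(query_words, tficf, inverted_index, top_n=10):
    """query_words: tokens of the query q; tficf: word -> p(w|q)-style weight;
    inverted_index: word -> list of (doc_id, p(d|w)) pairs precomputed via Bayes' rule."""
    scores = defaultdict(float)
    for w in set(query_words):                     # TF-ICF already accounts for frequency
        for doc_id, p_d_given_w in inverted_index.get(w, []):
            scores[doc_id] += tficf.get(w, 0.0) * p_d_given_w
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [doc for doc, _ in ranked[:top_n]]
```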
Experiment (Data and Metrics)
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
• Snapshot of the CiteSeer paper and citation database – Oct. 2013.
• The dataset is split into two parts – papers crawled before 2011 (training data) and
papers crawled after 2011 (testing data).
• Citations are extracted along with their citation contexts (one sentence before and
after the sentence where the citation occurs).
• Number of citation context / cited paper pairs: |C| = 8,992,476 (training set) and
1,628,698 (testing set).[1]
• The number of recommendations is limited to 10 per query. The dimension of the
representation vectors is set to n = 600 and the number of negative samples to k = 10.[1]
• Metrics: MAP, Recall, MRR, nDCG (a sketch of two of these appears below).
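For reference, a hedged sketch of two of the listed metrics (Recall@10 and reciprocal rank) for a single query; the data layout is an illustrative assumption, not taken from [1]:

```python
# recommended: ranked list of doc ids; relevant: ground-truth cited doc ids.
def recall_at_k(recommended, relevant, k=10):
    return len(set(recommended[:k]) & set(relevant)) / len(relevant) if relevant else 0.0

def reciprocal_rank(recommended, relevant):
    for rank, doc in enumerate(recommended, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0
```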
Baseline Comparison of Neural Probabilistic Model (NPM)
[1] http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
(Three slides of result figures/tables from [1], comparing NPM against the baseline methods.)
Conclusion and Future Work
• Used distributed representations of words and documents to build a neural-network-
based citation recommendation model.
• The proposed model learns the word and document representations used to
calculate the probability of citing a document given a citation context.
• A comparative study on a snapshot of the CiteSeer dataset against existing state-of-
the-art methods showed that the proposed model significantly improves the quality of
context-based citation recommendation.
• Since only neural probabilistic models were considered for context-based local
recommendation, a combined model could be explored in future work.
References
1. Wenyi Huang, Zhaohui Wu, Chen Liang, Prasenjit Mitra, C. Lee Giles – A Neural
Probabilistic Model for Context Based Citation Recommendation –
http://www.cse.psu.edu/~zzw109/pubs/AAAI2015-NeuralProbabilisticModelCitationRecommendation.pdf
2. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean – Distributed
Representations of Words and Phrases and their Compositionality –
http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
3. McNee et al. (2002): www-users.cs.umn.edu/~mcnee/konstan-nectar-2006.pdf
4. Deep Learning, NLP, and Representations (posted July 7, 2014):
http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/
5. Standard IR measures:
https://en.wikipedia.org/wiki/Information_retrieval#Performance_and_correctness_measures
Any Questions?
Thank You!