Anaphoric Pronoun Resolution

Finding links
• Pronoun to antecedent

Enriching text
• Input: preprocessed document
• Output: All found anaphoric pronoun references to words/phrases
Areas of use

Document summarization
• Improving sentence comparisons

Ontology enrichment
• Populating with more data
• Extracting more RDF triples

Entity level sentiment analysis
• Adding more information to the input data

Question answering
• Enriching results
Preprocessing

Required
• Sentence splitting
• Tokenization
• Part-of-speech tagging

Additional
• Dependency parsing
• Named Entity Recognition
• Gender Detection
Model representation

Anaphora pairs
• Pronoun
• Antecedent
  - Entities
  - Nouns, cardinals, foreign words

Candidate selection/ranking
• Find pronoun
• Pair with antecedent candidates
• Filter out improbable pairs (rules)
• Rank candidate pairs
• Select the most probable candidate (if any)
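The candidate selection/ranking steps above can be sketched as a small pipeline. This is a minimal illustration, not the presented system: the `Token` class, the tag sets, the two-sentence window, the number-agreement rule, and `recency_score` are all assumptions made for the example, and entity spans are simplified to single tokens.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Penn Treebank tags; these tag sets are illustrative assumptions.
PRONOUN_TAGS = {"PRP", "PRP$"}
CANDIDATE_TAGS = {"NN", "NNS", "NNP", "NNPS", "CD", "FW"}  # nouns, cardinals, foreign words


@dataclass
class Token:
    text: str
    pos: str
    sentence: int            # index of the sentence containing the token
    index: int               # position of the token in the document
    number: str = "unknown"  # "sg", "pl" or "unknown"


def find_pronouns(tokens: List[Token]) -> List[Token]:
    """Step 1: find pronouns."""
    return [t for t in tokens if t.pos in PRONOUN_TAGS]


def candidates_for(pronoun: Token, tokens: List[Token], max_sentences: int = 2) -> List[Token]:
    """Step 2: pair the pronoun with preceding candidates in a sentence window."""
    return [
        t for t in tokens
        if t.pos in CANDIDATE_TAGS
        and t.index < pronoun.index
        and pronoun.sentence - t.sentence <= max_sentences
    ]


def passes_rules(pronoun: Token, candidate: Token) -> bool:
    """Step 3 (hypothetical rule): reject pairs whose grammatical number clashes."""
    if "unknown" in (pronoun.number, candidate.number):
        return True
    return pronoun.number == candidate.number


def resolve(tokens: List[Token],
            score: Callable[[Token, Token], float],
            threshold: float = 0.5) -> Dict[int, int]:
    """Steps 4 and 5: rank the surviving pairs, keep the best one if it is confident enough."""
    links: Dict[int, int] = {}
    for pronoun in find_pronouns(tokens):
        pairs = [c for c in candidates_for(pronoun, tokens) if passes_rules(pronoun, c)]
        if not pairs:
            continue  # "if any": the pronoun stays unresolved
        best = max(pairs, key=lambda c: score(pronoun, c))
        if score(pronoun, best) >= threshold:
            links[pronoun.index] = best.index
    return links


def recency_score(pronoun: Token, candidate: Token) -> float:
    """Stand-in for a trained ranking model: closer candidates score higher."""
    return 1.0 / (1 + pronoun.index - candidate.index)


# "John bought apples . He ate them ."
tokens = [
    Token("John", "NNP", 0, 0, "sg"), Token("bought", "VBD", 0, 1),
    Token("apples", "NNS", 0, 2, "pl"), Token(".", ".", 0, 3),
    Token("He", "PRP", 1, 4, "sg"), Token("ate", "VBD", 1, 5),
    Token("them", "PRP", 1, 6, "pl"), Token(".", ".", 1, 7),
]
links = resolve(tokens, recency_score, threshold=0.1)
print(links)  # {4: 0, 6: 2}
```

In the example, "He" links to "John" and "them" links to "apples"; raising the threshold above 0.2 leaves both pronouns unresolved.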
Feature representation

Distance Features
• Sentence distance
• Hobbs distance

Overlap Features/Filters
• Gender
• Animacy
• Number
• Entity

Antecedent Features
• PoS-tag
• Gender
• Animacy
• Number
• Entity tag
• ...

Pronoun Features
• Word string
• Gender
• Animacy
• ...
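As a rough illustration of how such features might be computed for one pronoun/antecedent pair, the sketch below builds a feature dictionary. The `Mention` class, the feature names, and the three-valued overlap encoding are assumptions made for the example. Hobbs distance is left out because it requires a parse tree.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Mention:
    text: str
    pos: str
    sentence: int
    gender: str = "unknown"   # e.g. "female", "male", "neuter"
    number: str = "unknown"   # e.g. "sg", "pl"
    animacy: str = "unknown"  # e.g. "animate", "inanimate"
    entity: str = "O"         # named entity tag, "O" if none


def overlap(a: str, b: str) -> str:
    """Compare two attribute values; unknown values neither match nor clash."""
    if "unknown" in (a, b):
        return "unknown"
    return "match" if a == b else "clash"


def pair_features(pronoun: Mention, antecedent: Mention) -> Dict[str, Union[int, str]]:
    return {
        # Distance features
        "sentence_distance": pronoun.sentence - antecedent.sentence,
        # Overlap features (a "clash" can also serve as a hard filter)
        "gender_overlap": overlap(pronoun.gender, antecedent.gender),
        "number_overlap": overlap(pronoun.number, antecedent.number),
        "animacy_overlap": overlap(pronoun.animacy, antecedent.animacy),
        # Antecedent features
        "ante_pos": antecedent.pos,
        "ante_gender": antecedent.gender,
        "ante_number": antecedent.number,
        "ante_entity": antecedent.entity,
        # Pronoun features
        "pron_word": pronoun.text.lower(),
        "pron_gender": pronoun.gender,
    }


she = Mention("She", "PRP", sentence=2, gender="female", number="sg", animacy="animate")
mary = Mention("Mary", "NNP", sentence=1, gender="female", number="sg",
               animacy="animate", entity="PERSON")
table = Mention("table", "NN", sentence=2, gender="neuter", number="sg",
                animacy="inanimate")

good_pair = pair_features(she, mary)
bad_pair = pair_features(she, table)
```

Here `good_pair` has matching gender, number and animacy at sentence distance 1, while `bad_pair` clashes on gender and animacy and could be filtered out before ranking.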
Machine learning models

Models
• Conditional Random Fields (CRF)
  - Mallet
• Logistic Regression
  - Liblinear

Training the models
• OntoNotes CoNLL 2012
• English
• 1667 documents
• Various domains

Running the models
• Control confidence threshold
  - Precision/Recall trade-off
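The precision/recall trade-off controlled by the confidence threshold can be illustrated with a few lines of code. This sketch assumes every pronoun in the evaluation set has a gold antecedent and that the model outputs one best candidate with a confidence per pronoun; the scores below are made up for the example and are not results from the presented system.

```python
from typing import List, Tuple


def precision_recall(scored: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """scored holds (confidence, is_correct) for each pronoun's best candidate.

    A link is only emitted when its confidence reaches the threshold.
    """
    emitted = [correct for confidence, correct in scored if confidence >= threshold]
    true_positives = sum(emitted)
    # Convention chosen here: with nothing emitted there are no false positives.
    precision = true_positives / len(emitted) if emitted else 1.0
    recall = true_positives / len(scored) if scored else 0.0
    return precision, recall


# Made-up confidences for five pronouns.
scored = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]

for threshold in (0.0, 0.5, 0.85):
    p, r = precision_recall(scored, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With these numbers, raising the threshold from 0.0 to 0.85 moves precision from 0.60 to 1.00 while recall drops from 0.60 to 0.20.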
Further Work/Ideas for Improvement

Full coreference/anaphora resolution
• Change model representations
  - Clusters
  - Chains
• Generalize comparisons (not only pronoun - antecedent)

Improved Features
• Improved gender detection
• Improved animacy detection
• Additional overlap features

Multi pass approach
• First pass(es) rule based
• Harder classifications with machine learning models

Non referential/cataphora detection
• Training separate models
• Rule based
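One way the multi pass idea could look is a chain of resolvers that is tried in order, with high precision rules first and a model as the fallback. The sketch below is purely hypothetical: `reflexive_rule` is an invented heuristic, and `model_pass` is a stand-in for a trained classifier rather than a real one.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A pass takes the pronoun and its candidates (in document order)
# and returns the chosen antecedent, or None to defer to the next pass.
ResolverPass = Callable[[str, Sequence[str]], Optional[str]]


def reflexive_rule(pronoun: str, candidates: Sequence[str]) -> Optional[str]:
    """Hypothetical rule: a reflexive pronoun takes the nearest preceding candidate."""
    if pronoun.lower().endswith(("self", "selves")) and candidates:
        return candidates[-1]
    return None


def model_pass(pronoun: str, candidates: Sequence[str]) -> Optional[str]:
    """Stand-in for a trained model; here it simply picks the first candidate."""
    return candidates[0] if candidates else None


def resolve_multipass(pronoun: str,
                      candidates: Sequence[str],
                      passes: List[ResolverPass]) -> Tuple[Optional[str], Optional[str]]:
    """Return (antecedent, name of the pass that decided), or (None, None)."""
    for resolver in passes:
        antecedent = resolver(pronoun, candidates)
        if antecedent is not None:
            return antecedent, resolver.__name__
    return None, None


PASSES: List[ResolverPass] = [reflexive_rule, model_pass]

easy = resolve_multipass("himself", ["John", "Peter"], PASSES)
hard = resolve_multipass("he", ["John", "Peter"], PASSES)
```

In this sketch the reflexive pronoun is settled by the rule, and only the remaining, harder case reaches the model.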