Can we use Linked DataSemantic Annotators forthe Extraction of Domain-Relevant Expressions?Prof. Michel Gagnon, Ecole Polytechnique de Montréal, CanadaProf. Amal Zouaq, Royal Military College of Canada, CanadaDr. Ludovic Jean-louis, Ecole Polytechnique deMontréal, Canada
IntroductionWhat is Semantic Annotation on the LOD?Candidate SelectionDisambiguationTraditionally focused on named entity annotation
Semantic AnnotatorsLOD-based Semantic AnnotatorsMost used annotatorsREST API / Web ServicesDefault configurationAlchemy(entities +keywords)YahooZemantaWikimetaDBPediaSpotlightOpenCalaisLupedia
Annotation EvaluationFew comparative studies have been publishedon the performance of linked data SemanticAnnotatorsThe available studies generally focus on“traditional” named entity annotation evaluation(ORG, Person, etc.)
NEs versus domain topicsSemantic Annotators start to go beyond thesetraditional NEsPredefined classes such as PERSON, ORGExpansion of possible classes (NERD Taxonomy :sport events, operating systems, political events, …) RDF (NE? Domain topic ?) Semantic Web (NE? Domain topic ?)Domain Topics: important “concepts” for thedomain of interest
Research Question 1Previous studies did not draw any conclusion aboutthe relevance of the annotated entitiesBut relevance might prove to be a crucial point fortasks such as information seeking or queryansweringRQ1: Can we use linked data-based semanticannotators to accurately identify Domain-relevantexpressions?
Research MethodologyWe relied on a corpus of 8 texts taken fromWikipedia, all related to the artificial intelligencedomain. Together, these texts represent 10570words.We obtained 2023 expression occurrences amongwhich 1151 were distinct.
Building the Gold StandardWe asked a human evaluator to analyze eachexpression and make the following decisions:Does the annotated expression represent anunderstandable named entity or topic?Is the expression a relevant keyword accordingto the document?Is the expression a relevant named entityaccording to the document?
Distribution of ExpressionsNote that only 639 out of 1151 detected expressionsare understandable (56%). Thus a substantialnumber of detected expressions represents noise.
Frequency of detected ExpressionsMost of the expressions were detected by onlyone (76%) or two (16%) annotators.Semantic annotators seemed complementary.(326 relevant)(117 relevant)(40 relevant)(15 relevant)(4 relevant)(4 relevant)(1 relevant)
EvaluationStandard precision, recall and F-score“Partial” recall, since our Gold Standardconsiders only the expressions detected byat least one annotator, instead of consideringall possible expressions in the corpus.
Few HighlightsF-score is unsatisfactory for almost all annotatorsBest F-score around 62-63 (All expressions, topics)Alchemy seems to stand outIt obtains a high value for recall when all expressionsor only topics are consideredBut the precision of Alchemy (59%) is not veryhigh, compared to the results of Zemanta(81%), OpenCalais (75%) and Yahoo (74%)None of the annotators was good at detecting relevantnamed entities.DBpedia Spotlight annotated many expressions thatare not considered understandable
Research Question 2Given the complementarity of SemanticAnnotators, we decided to experiment withthis second research question:Can we improve the overall performanceof semantic annotators using votingmethods and machine learning?
Voting and machine learningmethodsWe experimented with the followingmethods:Simple votesWeighted votes where the weight is theprecision of individual annotators on atraining corpusK-nearest-neighbours classifierNaïve Bayes classifierDecision tree (C4.5)Rule induction (PART algorithm)
Experiment SetupWe divided the set of annotatedexpressions into 5 balanced partitionsOne partition for testingRemaining partitions for trainingWe ran 5 experiments for each of thefollowing situations: all entities, onlynamed entities and only topics
HighlightsOn the average, weighted vote displays thebest precision (0.66) among combinationmethods if only topics are consideredAlchemy (0.59)Alchemy finds many relevant expressionsthat are not discovered by other annotatorsMachine learning methods achieve betterrecall and F-score on the average, especiallyif decision tree or rule induction (PART) isused
ConclusionCan we use Linked Data SemanticAnnotators for the Extraction of Domain-Relevant Expressions?Best overall performance (Recall/F-Score:Alchemy) (59%- 69%-63%)Best precision: Zemanta (81%)Need of filtering!Combination/machine learning methodsIncreased recall with weighted votingscheme
Future WorkLimitationsThe size of the gold standardA limited set of semantic annotatorsIncrease the size of the GSExplore various combinations for annotators/featuresCompare Semantic Annotators with keywordextractors
QUESTIONS?Thank you for your firstname.lastname@example.org://azouaq.athabascau.ca