Benchmarking Domain-specific Expert Search using Workshop Program Committees

Traditionally, relevance assessments for expert search have been gathered through self-assessment or based on the opinions of co-workers. We introduce three benchmark data sets for expert search that use conference workshops for relevance assessment. Our data sets cover entire research domains as opposed to single institutions. In addition, they provide a larger number of topic-person associations and allow a more objective and fine-grained evaluation of expertise than existing data sets do. We present and discuss baseline results for a language modelling and a topic-centric approach to expert search. We find that the topic-centric approach achieves the best results on domain-specific data sets.

Presented at the CSTA workshop, CIKM 2013, October 28, 2013

Transcript

  • 1. Benchmarking Domain-Specific Expert Search Using Workshop Program Committees
    Georgeta Bordea¹, Toine Bogers² & Paul Buitelaar¹
    ¹ Digital Enterprise Research Institute, National University of Ireland
    ² Royal School of Library & Information Science, University of Copenhagen
    CSTA workshop @ CIKM 2013, October 28, 2013
  • 2. Outline
    • Introduction
    • Domain-specific test collections for expert search
      - Information retrieval
      - Semantic Web
      - Computational linguistics
    • Benchmarking our new collections
      - Expert finding
      - Expert profiling
    • Discussion & conclusions
  • 3. Introduction
    • Knowledge workers spend around 25% of their time searching for information
      - 99% report using other people as information sources
      - 14.4% of their time is spent on this (56% depending on your definition)
      - Why do people search for other people? (Hertzum & Pejtersen, 2005)
        ‣ Search documents to find relevant people
        ‣ Search people to find relevant documents
    • Expert search engines support this need for people search
      - Searching for people instead of documents
  • 4. Introduction
    [Slide image: example expert-search queries "machine learning" and "speech recognition"]
  • 5. Related work
    • Historical solution (1980s and 1990s)
      - Manually constructing a database of people's expertise
    • Automatic approaches to expert search since the 2000s
      - Automatically retrieve expertise evidence and associate it with experts
      - Expert finding ("Who is the expert on topic X?")
        ‣ Find the experts on a specific topic
      - Expert profiling ("What is the expertise of person Y?")
        ‣ Find out what one expert knows about different topics
  • 6. Related work
    • TREC Enterprise track (2005-2008)
      - Focused on enterprise search → searching the data of an organization
      - W3C collection (2005-2006)
      - CSIRO collection (2007-2008)
    • UvT Expert Collection (2007, updated in 2012)
      - University-wide crawl of expertise evidence
        ‣ Publications, course descriptions, research descriptions, personal home pages
      - Topics & relevance (self-)assessments from a manual expertise database
  • 7. Related work
                     W3C       CSIRO     UvT
     # people        1,092     3,490     496
     # documents     331,037   370,715   36,699
     # topics        99        50        981
    • Problems with these data sets
      - Relevance assessments
        ‣ W3C → assessment by people outside the organization is inaccurate and incomplete
        ‣ CSIRO → assessment by co-workers is biased towards their social network
        ‣ UvT → self-assessment by experts is subjective and incomplete
      - Focus on a single organization → relatively few experts per expertise area
  • 8. Solution: Domain-specific test collections
    • Documents
      - Where? Collect publications from relevant journals and conferences in a specific domain
      - Why? More challenging because of the lower level of granularity
    • Topics
      - Where? Collect topic descriptions from conference workshop websites
      - Why? Rich descriptions with explicitly identified sub-topics ("areas of interest")
    • Relevance assessments
      - Where? Program committees listed on workshop websites
      - Why? Combines peer judgments with self-assessment
  • 9. Collection 1: Information retrieval (IR)
    • Research domain(s)
      - Information retrieval, digital libraries, and recommender systems
    • Topics
      - Workshops held at conferences with a substantial portion dedicated to these domains between 2001 and 2012
        ‣ CIKM ‣ SIGIR ‣ ECIR ‣ WWW ‣ WSDM ‣ IIiX ‣ RecSys ‣ ECDL ‣ JCDL ‣ TPDL
  • 10. Collection 1: Information retrieval (IR)
    • Documents
      - Based on the DBLP Computer Science Bibliography
        ‣ Good coverage of research domains
        ‣ ArnetMiner version available with (automatically extracted) citation information
      - Selected publications from all relevant IR venues
        ‣ Core venues → hosting conferences for the selected IR workshops (~9,000 docs)
        ‣ Curated venues → additional venues with substantial IR coverage (~16,000 docs)
        ‣ A venue has to have at least 5 publications in the ArnetMiner DBLP data set
        ‣ Resulted in ~25,000 publications
      - Collected full-text versions using Google Scholar for 54.1% of publications
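To make the venue-selection step concrete, here is a minimal sketch of the at-least-5-publications filter. It assumes a JSON-lines dump with a `venue` field per record; the field names and layout are illustrative assumptions, not the actual ArnetMiner DBLP format.

```python
import json
from collections import Counter

MIN_PUBLICATIONS = 5  # threshold from the slide: >= 5 papers per venue

def select_venues(dump_path, core_venues, curated_candidates):
    """Keep all core venues; keep a curated candidate venue only if it
    has at least MIN_PUBLICATIONS papers in the dump (assumed format:
    one JSON publication record per line, with a 'venue' field)."""
    counts = Counter()
    with open(dump_path, encoding="utf-8") as f:
        for line in f:
            counts[json.loads(line).get("venue", "")] += 1
    kept = set(core_venues)
    kept |= {v for v in curated_candidates if counts[v] >= MIN_PUBLICATIONS}
    return kept
```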
  • 11. Collection 2: Semantic Web (SW)
    • Research domain(s)
      - Semantic Web
    • Topics
      - Workshops held at conferences in the Semantic Web Dog Food data set
        ‣ ISWC ‣ WWW ‣ EKAW ‣ ASWC ‣ ESWC ‣ I-Semantics
    • Documents
      - Based on the Semantic Web Dog Food corpus (public SPARQL endpoint)
      - Full-text PDF versions available for all publications
  • 12. Collection 3: Computational linguistics (CL)
    • Research domain(s)
      - Computational linguistics, natural language processing
    • Topics
      - Workshops held at conferences in the ACL Anthology Reference Corpus
        ‣ ACL ‣ SemEval ‣ CoLing ‣ NAACL ‣ ANLP ‣ HLT ‣ EACL ‣ EMNLP ‣ LREC
    • Documents
      - Based on the ACL Anthology Reference Corpus
      - Full-text PDF versions available for all publications
  • 13. Topics & relevance assessments
    • Topic representations
      - Title
      - Long description (complete workshop description)
      - Short description (teaser description, typically the first paragraph)
      - Areas of interest
  • 14. Example topic:
    <topic id="014">
      <title>Workshop on Information Retrieval in Context (IRiX)</title>
      <year>2004</year>
      <website>http://ir.dcs.gla.ac.uk/context/</website>
      <short_description>This workshop will explore a variety of theoretical
        frameworks, characteristics and approaches to future interactive IR
        research.</short_description>
      <long_description>There is a growing realisation that relevant information
        [...] for future interactive IR (IIR) research.</long_description>
      <areas_of_interest>
        <area>Contextual IR theory - modeling context</area>
        [...]
      </areas_of_interest>
      <organizers>
        <name>Peter Ingwersen</name>
        [...]
      </organizers>
      <program_committee>
        <name>Pia Borlund</name>
        [...]
      </program_committee>
    </topic>
  • 15. Topics & relevance assessments
    • Topic representations
      - Title
      - Long description (complete workshop description)
      - Short description (teaser description, typically the first paragraph)
      - Areas of interest
      - Topics manually annotated with fine-grained expertise topics
    • Relevance assessments
      - PC members and organizers typically have expertise in one or more areas of interest → combination of peer judgments and self-assessment
      - Relevance value of '2' for organizers and '1' for PC members
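Given the topic format above and the 2/1 relevance scheme, the assessments can be generated mechanically from each topic file. A minimal sketch, assuming the tag names from the example topic; note that strict TREC qrels require whitespace-free identifiers, so in practice person names would first be mapped to IDs.

```python
import xml.etree.ElementTree as ET

def topic_to_qrels(topic_file):
    """Emit TREC-style qrels lines from one topic file:
    organizers get relevance 2, PC members relevance 1."""
    root = ET.parse(topic_file).getroot()
    topic_id = root.get("id")
    lines = []
    for name in root.findall("./organizers/name"):
        lines.append(f"{topic_id} 0 {name.text} 2")
    for name in root.findall("./program_committee/name"):
        lines.append(f"{topic_id} 0 {name.text} 1")
    return lines

# e.g. topic_to_qrels("topic-014.xml")
# -> ["014 0 Peter Ingwersen 2", "014 0 Pia Borlund 1", ...]
```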
  • 16. Collections by numbers
                                Information    Semantic    Computational
                                retrieval      Web         linguistics
     # (unique) authors         26,098         9,983       4,480
     # documents                24,690         10,921      2,311
     % full-text documents      54.1%          100%        100%
     # workshops (= topics)     60             340         190
     # expertise topics         488            4,660       6,751
     avg. # authors/document    2.7            2.2         3.3
     avg. # experts/topic       14.9           25.8        24.9
  • 17. Benchmarking the collections
    • Benchmark results on our collections using state-of-the-art approaches on two tasks
      - Profile-centric model (M1, "Model 1"): expert finding, expert profiling
        ‣ Aggregate all content for an expert into a single document representation and rank those representations
      - Document-centric model (M2, "Model 2"): expert finding, expert profiling
        ‣ Retrieve relevant documents, then associate them with experts and produce a ranking
      - Saffron (Bordea et al., 2012)
        ‣ Automatically extracts expertise terms from text, ranks them by term frequency, length, and 'embeddedness', and associates documents and experts with these terms
        ‣ Topic-centric extraction (TC): expert finding, expert profiling
        ‣ Document-count ranking (DC): expert finding
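The document-centric idea can be made concrete in a few lines. The sketch below is a minimal illustration rather than the implementation benchmarked here: `retrieve` and `authors_of` are hypothetical stand-ins for an IR engine and an authorship lookup, and the document-person association is assumed uniform and binary.

```python
from collections import defaultdict

def find_experts(query, retrieve, authors_of, top_k=100):
    """Document-centric ("Model 2") expert finding: score each candidate
    by summing the retrieval scores of the documents they authored.

    retrieve(query, k) -> iterable of (doc_id, score) pairs
    authors_of(doc_id) -> list of candidate expert names
    """
    scores = defaultdict(float)
    for doc_id, score in retrieve(query, top_k):
        for person in authors_of(doc_id):
            scores[person] += score  # uniform, binary doc-person association
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Expert profiling runs the same aggregation in the other direction: fix one person and rank the topics by their accumulated scores.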
  • 18. Expert finding
    [Bar chart: P@5 of the profile-centric, document-centric, Saffron-TC, and Saffron-DC approaches on the Information retrieval, Semantic Web, and Computational linguistics collections; y-axis 0.00-0.18]
  • 19. Expert profiling
    [Bar chart: MAP of the profile-centric, document-centric, and Saffron-TC approaches on the Information retrieval, Semantic Web, and Computational linguistics collections; y-axis 0.00-0.18]
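For reference, the two reported measures can be sketched as follows; `ranked` is a system's ordered expert list for one topic and `relevant` the set of judged experts. MAP, as reported on the profiling slide, is the mean of `average_precision` over all topics.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked experts that are judged relevant."""
    return sum(1 for e in ranked[:k] if e in relevant) / k

def average_precision(ranked, relevant):
    """Precision averaged at each rank where a relevant expert appears."""
    hits, ap_sum = 0, 0.0
    for rank, e in enumerate(ranked, start=1):
        if e in relevant:
            hits += 1
            ap_sum += hits / rank
    return ap_sum / len(relevant) if relevant else 0.0
```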
  • 20. Discussion & conclusions
    • Contributions
      - Three new domain-specific test collections for expert search
        ‣ Available at http://itlab.dbit.dk/~toine/?page_id=631
      - Workshop websites used for topic creation & relevance assessment
      - Benchmarked performance for expert finding and expert profiling
    • Findings
      - Term extraction approaches outperform language modeling on domain-centered collections (as opposed to organization-centric collections)
    • Caveats
      - Incomplete assessments & social selection bias for PC members?
  • 21. Future work
    • Expansion
      - Add additional domains
        ‣ Need an active workshop scene & access to documents
      - Add additional topics to existing collections
        ‣ The IR collection has 100+ workshops that need manual cleaning
        ‣ Conference tutorials could also be added (but with very incomplete relevance assessments!)
    • Drilling down
      - Incorporate social evidence in the form of citation networks
      - Investigate the temporal aspect (topic drift?)
  • 22. Questions? Comments? Suggestions?