Use of Wikipedia categories on information retrieval: a brief research


Paper presented at the V Spanish Conference on Information Retrieval (CERI 18), Zaragoza, June 2018.

Published in: Education

  1. Use of Wikipedia categories on information retrieval research: a brief review
     • Jesús Tramullas, Dept. Library & Information Science, Univ. of Zaragoza
     • Piedad Garrido-Picazo, Dept. Computer Science & Systems Engineering, Univ. of Zaragoza
     • Ana I. Sánchez Casabón, Dept. Library & Information Science, Univ. of Zaragoza
  2. About Wikipedia Categories
     • Wikipedia categories are a classification scheme built for organizing and describing Wikipedia articles.
     • Started in 2003.
     • The system combines a hierarchical organization with relations among different categories, which creates poly-hierarchies and associations.
  3. Research Questions
     • RQ1: to identify the uses and applications that researchers make of the Wikipedia category system in computer science research.
     • RQ2: to review how a collaboratively developed knowledge organization system is being used as a research tool in different approaches to information processing and retrieval.
  4. Research Method
     • Systematic literature review.
     • Sources: Scopus and WoS, searched Nov. 2017 to Jan. 2018.
     • Boolean query: "Wikipedia" AND "categories" in the title, keyword, and abstract fields, limited to 2002-2017.
     • Results: Scopus, 666; WoS, 311.
     • Processed dataset: reduced from 680 to 546 papers.
  5. RQ1: results and discussion
     • Beforehand, the bibliographic data were published openly on Zotero and Mendeley.
     • Answered in the affirmative: researchers apply the Wikipedia category structure in a wide variety of approaches, uses, and applications.
     • Precise divisions between them cannot be established.
  6. RQ1: two big groups
     • First, studies that analyze the category system itself within the context of Wikipedia.
     • Second, papers that use Wikipedia categories in studies on different aspects of information processing, usually on document corpora independent of Wikipedia.
  7. RQ2: results and discussion
     • Information retrieval.
     • Entity processing.
     • Indexing and classification of document corpora.
     • Creating and using taxonomies.
     • Creating and using ontologies.
     • Semantic treatment.
     • Other uses.
  8. Conclusions, 1
     • Wikipedia is an important field of research for different areas of computer science in general, and for information retrieval in particular.
     • The significant topics detected are closely related to one another and reflect the classic major topics of information retrieval.
  9. Conclusions, 2
     • Its use as a support and validation tool deserves emphasis in different approaches to the study and analysis of document corpora, including studies on information processing, classification, and retrieval.
     • It provides a broad field both for validating classification schemas and for creating new ones.
  10. Problems
     • The variety of terms researchers use to describe their work highlights an underlying problem for systematic reviews: authors differ widely in how they draft titles and abstracts and in how they select keywords.
  11. Future work
     • First, to apply text classification techniques to the corpus data and compare the results with our proposal.
     • Second, to complete the review with a quantitative or bibliometric analysis.
     • Finally, to study the research focused on applications of computer science to other fields.
  12. Questions?
     • This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.
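
The research method on slide 4 merges search results from two databases and then reduces them to a processed dataset. The slides do not describe how that was done, so the following is only a minimal sketch of one plausible approach, assuming each record is a dictionary with hypothetical `doi`, `title`, `abstract`, `keywords`, and `year` fields. Records are treated as duplicates when they share a DOI or a normalized title, and the scope check mirrors the Boolean query and the 2002-2017 limit.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def record_keys(record: dict) -> set:
    """Return the identity keys of a record (hypothetical field names)."""
    keys = set()
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        keys.add("doi:" + doi)
    title = normalize_title(record.get("title") or "")
    if title:
        keys.add("title:" + title)
    return keys


def merge_records(*sources) -> list:
    """Merge record lists, keeping the first copy of each duplicate."""
    seen = set()
    merged = []
    for source in sources:
        for record in source:
            keys = record_keys(record)
            if keys & seen:
                continue  # same DOI or same normalized title already kept
            seen |= keys
            merged.append(record)
    return merged


def in_scope(record: dict) -> bool:
    """Apply the query terms and the 2002-2017 limit from slide 4."""
    fields = ("title", "abstract", "keywords")
    text = " ".join(str(record.get(f, "")) for f in fields).lower()
    has_terms = "wikipedia" in text and "categories" in text
    return has_terms and 2002 <= int(record.get("year", 0)) <= 2017


# Invented sample records, for illustration only.
scopus = [
    {"doi": "10.1/A", "title": "Wikipedia categories for retrieval", "year": 2010},
    {"title": "Using Wikipedia Categories: a study", "year": 2001},
]
wos = [
    {"doi": "10.1/a", "title": "Wikipedia Categories for Retrieval.", "year": 2010},
    {"doi": "", "title": "Entity ranking with Wikipedia categories", "year": 2015},
]

merged = merge_records(scopus, wos)
selected = [r for r in merged if in_scope(r)]
```

Matching on either key catches a record that carries a DOI in one database but not in the other, at the cost of occasionally merging distinct papers that share a title; a real review would likely check such cases by hand.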