Use of Wikipedia Categories in IR Research


Jesús Tramullas

Paper presented at the V Congreso Español de Recuperación de Información (CERI 18), Zaragoza, June 2018.


Use of Wikipedia
categories on information
retrieval research:
a brief review
Jesús Tramullas
Dept. Library & Information Science, Univ. of Zaragoza
Piedad Garrido-Picazo
Dept. Computer Science & Systems Engineering, Univ. of Zaragoza
Ana I. Sánchez Casabón
Dept. Library & Information Science, Univ. of Zaragoza

About Wikipedia Categories
• Wikipedia categories are a classification
scheme built for organizing and describing
Wikipedia articles.
• Started in 2003.
• The system combines a hierarchical
organization with relations among different
categories, which creates poly-hierarchies
and associations.
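
A poly-hierarchy of this kind can be pictured as a directed graph in which a category may have more than one parent. The following is a minimal sketch under that assumption; the category names, the `PARENTS` mapping, and both helper functions are hypothetical illustrations and are not taken from Wikipedia data or from the reviewed papers.

```python
# Hypothetical toy data: each category maps to its parent categories.
# A category with more than one parent makes the scheme poly-hierarchical.
PARENTS = {
    "Information retrieval": ["Information science", "Computer science"],
    "Information science": ["Science"],
    "Computer science": ["Science"],
    "Science": [],
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another parent path
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def is_polyhierarchical(parents=PARENTS):
    """True if at least one category has more than one parent."""
    return any(len(parent_list) > 1 for parent_list in parents.values())
```

In this toy graph, "Information retrieval" reaches "Science" through two different parents, which a strict tree would not allow.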

Research Questions
• RQ1: to identify the uses and applications that
researchers are making of the Wikipedia category
system in computer science research.
• RQ2: to review how a knowledge organization
system, developed collaboratively, is being
used as a research tool in different approaches
to information processing and retrieval.

Research Method
• Systematic literature review.
• Sources: Scopus and WoS, Nov. 2017-Jan. 2018.
• Boolean query: "Wikipedia" AND "categories",
in title, keyword and abstract fields, limited
to 2002-2017.
• Scopus: 666; WoS: 311.
• Processed datasets: from 680 to 546 papers.
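
The selection step can be sketched in code. The slide states only the Boolean query, the searched fields, and the year limits; the record layout (dictionaries with `title`, `keywords`, `abstract`, `year`, `doi`), the function names, and the deduplication rule (DOI first, else normalized title plus year) are assumptions made for illustration, not the authors' documented procedure.

```python
import re


def matches_query(record):
    """True if 'wikipedia' AND 'categories' both occur in title, keywords or abstract."""
    text = " ".join(
        str(record.get(field, "")) for field in ("title", "keywords", "abstract")
    ).lower()
    return "wikipedia" in text and "categories" in text


def in_period(record, start=2002, end=2017):
    """True if the publication year falls inside the review window."""
    return start <= record.get("year", 0) <= end


def dedup_key(record):
    """Assumed rule: prefer the DOI, else fall back to normalized title and year."""
    doi = record.get("doi")
    if doi:
        return ("doi", doi.strip().lower())
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))


def merge_sources(*sources):
    """Filter each source by query and period, keeping the first copy of each record."""
    merged = {}
    for source in sources:
        for record in source:
            if matches_query(record) and in_period(record):
                merged.setdefault(dedup_key(record), record)
    return list(merged.values())
```

A call such as `merge_sources(scopus_records, wos_records)` would return one list with cross-database duplicates collapsed. A record that carries a DOI in one database and none in the other would not be matched by this simple rule, so real data would likely need a manual check as well.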

RQ1: results and discussion
• The bibliographic data were previously published
openly on Zotero and Mendeley.
• Answered in the affirmative: researchers apply
the Wikipedia category structure in a variety of
approaches, uses, and applications.
• It is impossible to establish precise divisions
among them.

RQ1: two big groups
• First, studies that analyze the category
system itself within the context of Wikipedia.
• Second, papers that use Wikipedia
categories in studies on different aspects of
information processing, usually on document
corpora independent of Wikipedia.

RQ2: results and discussion
• Information retrieval.
• Entity processing.
• Indexing and classification of document
corpora.
• Creating and using taxonomies.
• Creating and using ontologies.
• Semantic processing.
• Other uses.

Conclusions, 1
• Wikipedia is an important field of research for
different areas of computer science in
general, and for information retrieval in
particular.
• The significant topics detected are closely
related to one another, reflecting the
classic major topics of information retrieval.

Conclusions, 2
• It is necessary to emphasize its use as a
support and validation tool in different
approaches to the study and analysis of
document corpora, including studies on
information processing, classification and
retrieval.
• It provides a broad field both for validating
classification schemas and for
creating new ones.

Problems
• The variety of terms researchers use to
describe their work highlights an underlying
problem for systematic reviews, as does the
disparity among authors in drafting titles
and abstracts and in selecting
keywords.

Future work
• First, to carry out and survey the results of
applying text classification techniques to the
corpus data, for comparison with our proposal.
• Second, to complete the review with a
quantitative or bibliometric analysis.
• Finally, to study the research focused on
applications of computer science to other
fields.

Questions?
This work is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International license.


