Discovering Emerging Research Topics in Work and Organizational Psychology

eduworks-network.eu
facebook.com/eduworksnetwork
@EduworksNetwork
Discovering Emerging Research Topics and Trends in W&O Psychology by Text Mining Scientific Articles
Vladimer Kobayashi, Stefan T Mol, Ph.D., Gábor Kismihók, Ph.D.

Contents
• Background – Why this study? Hasn’t this been done before?
• Objectives – What are we really trying to do here?
• Materials – The ingredients
• Methods – The tools
• Results – Show me the outcome!
• Conclusion and Future Work – What has been achieved? How to proceed?
Kobayashi, Mol, & Kismihók - University of Amsterdam

Background
• The psychological literature is huge (PsycINFO abstracts 3.7 million documents and PubPsych has 900,000 searchable records)
• Text Mining applications
• Mining biomedical literature
• Web textual data
• Opinion and Sentiment mining from product reviews, microblogging, users’ posts and
comments.
• Text Mining opportunities for gaining insight into trends in the scientific
literature
• Key term extraction to support efficient document search and retrieval
• Identifying topics to group documents with similar themes
• So far, little text mining effort has been made in the W&O psychology literature

Objectives
• Apply text mining, specifically topic modeling techniques, to the W&O literature
• Pair topics and publication dates to reveal topical trends in this
field
Contributions
• Efficient search and retrieval of W&O psychology literature
• Supporting systematic literature review and automatic
knowledge discovery
• Identifying topics (or themes) and topic trends

Terminology
• Document – a file that contains a sequence of characters or text
• Corpus – collection of documents
• Term – smallest unit in a document (e.g. word, phrase,
sentence, or even a single character)
• Vocabulary or lexicon – set of all unique terms

SOURCE
• Abstracts from 4 journals
• Journal 1 (1975-2014): 1096 abstracts
• Journal 2 (2008-2014): 89 abstracts
• Journal 3 (1977-2014): 1115 abstracts
• Journal 4 (1991-2014): 602 abstracts
• Total number of abstracts: 2902

For this study…
• DOCUMENT
• A single abstract
• CORPUS
• Collection of abstracts
• TERMS
• Words
• VOCABULARY
• Set of all unique words (after preprocessing) in the corpus

Why Abstracts only?
• The abstract contains the gist of the whole article
• Commonly, articles are indexed based on titles, keywords and
abstracts.

Techniques
• String Processing
• Natural Language Processing
• Topic Modeling
• Latent Dirichlet Allocation Model
• Assumes that each document is a mixture of topics
• Each word is generated from a specific topic
• An algorithm for topic discovery
• Topical Trend Analysis
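
The two LDA assumptions listed above (each document is a mixture of topics, and each word is generated from a specific topic) can be illustrated with a minimal sketch of the generative process. The topic names, word probabilities, and the `alpha` value below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, length, rng):
    """Generate one document under the LDA assumptions."""
    num_topics = len(topic_word)
    # Each document gets its own mixture (proportions) over topics.
    theta = sample_dirichlet([alpha] * num_topics, rng)
    words = []
    for _ in range(length):
        # Pick a topic for this word position, then a word from that topic.
        z = rng.choices(range(num_topics), weights=theta)[0]
        vocab, probs = zip(*topic_word[z].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Hypothetical topics: each maps (stemmed) words to probabilities.
TOPICS = [
    {"leader": 0.5, "team": 0.3, "perform": 0.2},
    {"stress": 0.5, "burnout": 0.3, "health": 0.2},
]

rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=0.5, length=8, rng=rng)
print(theta, words)
```

Topic discovery runs this process in reverse: given only the words, it estimates the topics and each document's mixture.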

Analysis done separately for each journal

Preprocessing steps (original abstract to preprocessed abstract)
• Lower case transformation
• Stopword removal
• Punctuation removal
• Stemming
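
The four preprocessing steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the stopword list is a tiny hypothetical sample, `toy_stem` is a crude stand-in for a real stemmer such as Porter's, and punctuation is stripped before stopword filtering so that tokens like "the," still match the list.

```python
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "to",
             "is", "are", "this", "that", "for", "with"}

def toy_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    if word.endswith("ss"):
        return word
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(abstract):
    """Lower-case, strip punctuation, drop stopwords, and stem an abstract."""
    text = abstract.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return [toy_stem(t) for t in tokens]

print(preprocess("The leaders motivated teams, reducing stress."))
```

The returned list of stems is what the later steps treat as the terms of one document.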

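The document-by-term matrix
Built from the abstracts. Rows index the V terms of the vocabulary; columns index the N documents.

$$
A = \begin{pmatrix}
a_{11} & \cdots & a_{1N} \\
\vdots & \ddots & \vdots \\
a_{V1} & \cdots & a_{VN}
\end{pmatrix}
$$

The entries (the a's) are the tf-idf weights of the terms in each document.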

tf-idf
• There are many ways to assign weights to terms in the
documents
• The most popular is the tf-idf, computed by
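
$$
\text{tf-idf}_{t,d} = \text{tf}_{t,d} \times \text{idf}_t
$$

where $\text{tf}_{t,d}$ is the frequency of term $t$ in document $d$ and $\text{idf}_t$ is the inverse document frequency of term $t$:

$$
\text{idf}_t = \log \frac{N}{n_t}
$$

with $N$ the number of documents in the corpus and $n_t$ the number of documents in the corpus where $t$ occurs.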
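
As a worked illustration of the tf-idf weighting, the sketch below builds a small vocabulary-by-document matrix from already preprocessed documents. It assumes raw counts for the term frequency and the natural logarithm for the idf (the slides do not state the log base); the three toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is the tf-idf weight of term i
    in document j, giving V rows (terms) and N columns (documents)."""
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    vocab = sorted(set().union(*docs))
    # Number of documents in which each term occurs.
    doc_freq = {t: sum(1 for c in counts if t in c) for t in vocab}
    matrix = [
        [counts[j][t] * math.log(n_docs / doc_freq[t]) for j in range(n_docs)]
        for t in vocab
    ]
    return vocab, matrix

# Hypothetical preprocessed documents.
docs = [
    ["leader", "team", "team"],
    ["stress", "team"],
    ["stress", "burnout"],
]
vocab, A = tfidf_matrix(docs)
for term, row in zip(vocab, A):
    print(term, [round(x, 3) for x in row])
```

A term that occurs in every document gets idf = log(1) = 0, so it contributes nothing to any document's weights.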

Input: the document-by-term matrix (Vocabulary × Documents)
Apply Latent Dirichlet Allocation Model
Output:
1. List of Topics
2. Topic classification of documents
Apply separately for each journal
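
A minimal sketch of how the two outputs could be obtained is given below. It uses collapsed Gibbs sampling, which is only one of several ways to fit LDA; the slides do not say which inference method or software settings were used. LDA is normally fit on raw term occurrences, so the sketch takes tokenized documents as input. The hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all assumptions.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.
    Returns (topic_word_counts, doc_topic_counts)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic for every word position.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, doc_topic

def top_words(topic_word, n=3):
    """Output 1: the n most frequent words of each topic."""
    return [sorted(tw, key=tw.get, reverse=True)[:n] for tw in topic_word]

def classify(doc_topic):
    """Output 2: the dominant topic of each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]

# Hypothetical preprocessed abstracts.
docs = [
    ["leader", "team", "perform", "leader"],
    ["team", "leader", "perform"],
    ["stress", "burnout", "health", "stress"],
    ["burnout", "health", "stress"],
]
topic_word, doc_topic = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word))
print(classify(doc_topic))
```

Assigning each document to its dominant topic is one simple way to read "topic classification"; a document's full topic proportions could be kept instead.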

Topical Trends
• Topic for each document
• Publication dates of documents
• Create a chart depicting the evolution of topics from the
publication dates and topics of the documents

Document      Topic      Publication Date
Document 1    Topic 3    1990
Document 2    Topic 5    1993
…             …          …
Document N    Topic 12   1998

Publication Date   Topic 1                  …   Topic T
1975               Number of publications   …   Number of publications
1976               Number of publications   …   Number of publications
…                  …                        …   …
2014               Number of publications   …   Number of publications
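
The step from the per-document table to the year-by-topic table is a simple count. The sketch below shows one way to do it; the records are hypothetical (topic, year) pairs that echo the example rows and are not results from the study.

```python
from collections import Counter

def topic_trends(records):
    """records: iterable of (topic, year) pairs, one per document.
    Returns (topics, table), where table maps each year to a list of
    publication counts aligned with the sorted topics."""
    counts = Counter((year, topic) for topic, year in records)
    years = sorted({y for y, _ in counts})
    topics = sorted({t for _, t in counts})
    table = {y: [counts[(y, t)] for t in topics] for y in years}
    return topics, table

# Hypothetical (topic, publication year) pairs.
records = [(3, 1990), (5, 1993), (3, 1990), (12, 1998), (5, 1990)]
topics, table = topic_trends(records)
print("Year", topics)
for year, row in table.items():
    print(year, row)
```

Only years that occur in the data appear as rows; for a chart over a fixed range such as 1975-2014, the missing years would be filled with zeros.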

Conclusion
• Demonstrated the use of text mining for this type of application
• Gives an idea of what is keeping W&O psychology researchers busy
• Offers a view of how W&O psychology topics evolve and gain attention (which might reflect the development and maturation of the field)
• Can be an alternative to traditional content analysis
• Can facilitate the peer review process by suggesting to researchers the outlet most likely to accept their work

Future Work
• Aside from extracting topics one can also extract concepts,
techniques, and key issues
• Create a hierarchy of topics
• Consider other parts of the document and not just the abstract.

MAIN REFERENCES
• Learning Topic Models by Arora, Ge, and Moitra (2012)
• Text Mining Infrastructure in R by Feinerer, Hornik, and Meyer
(2008)
• Understanding Evolution of Research Themes by Wang, Zhai,
and Roth (2013)

ACKNOWLEDGEMENT
• We would like to thank our colleague Ms Sofija Pajic for helping
us out in interpreting the topics.
