FSHP Poster
Detection of Emerging Research Trends by Biomedical Text Mining Algorithm
Majid Mirzai, Tyler Chia (MPH), Reid Orenstein, Anish Patel, Sachin Devi (Ph.D)
LECOM School of Pharmacy, 5000 Lakewood Ranch Blvd. Bradenton, Florida 34211
Objective
The objective was to demonstrate how a newly developed
text mining tool can be used to identify emerging research
trends.
Methods
Article titles containing the word “obesity” were downloaded
from PubMed. A primary text mining algorithm developed in
Visual Basic 6.0 (VB6) was used to calculate word frequency.
Words from all titles published since 1880 were used to create
a data visualization known as a “word cloud.” Words that
appeared more frequently were presented larger than words
that appeared less frequently. Average percentage increase
in word frequency was then calculated for all words over the
period of five years (2011-2015) to identify the emerging
trends in obesity. A secondary text mining algorithm was
developed to filter unique words that appeared for the very
first time in 2015 along with the word “obesity” in order to
identify the most recent scientific trends.
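The word-frequency step described above can be illustrated with a minimal sketch. The original tool was written in VB6 and its code is not shown here, so this Python version is only an assumption about the general approach; the function name `word_frequencies`, the tokenizing pattern, and the sample titles are all invented for illustration.

```python
import re
from collections import Counter

def word_frequencies(titles):
    """Count how often each lowercase word occurs across all titles."""
    counts = Counter()
    for title in titles:
        # Assumed tokenizer: runs of letters/digits, keeping internal hyphens.
        counts.update(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", title.lower()))
    return counts

# Made-up titles, not taken from PubMed.
sample_titles = [
    "Childhood obesity and insulin resistance",
    "Treatment of obesity in adults",
    "Obesity treatment: a review",
]
freq = word_frequencies(sample_titles)
print(freq["obesity"], freq["treatment"])
```

Lowercasing before counting treats “Obesity” and “obesity” as the same word, which is one plausible reading of how title words were tallied.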
Article titles containing the word “obesity” were downloaded
from PubMed and imported into a custom text analytics
program written in Visual Basic 6.0. PubMed article titles were
analyzed to identify emerging research trends and novel
scientific breakthroughs.
A word cloud of all the words that appeared along with the word
“obesity” in the titles of all the articles published since 1880.
The font size of each word is directly proportional to its
frequency. The most prominent themes in
obesity research were treatment, childhood, overweight,
diabetes, insulin, metabolic, etc.
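As a rough illustration of the proportional sizing described in this caption, the sketch below scales font size linearly with word frequency. The function name `font_sizes` and the 72-point maximum are assumptions, not details from the poster; the four counts are the ones reported in the Results.

```python
def font_sizes(freq, max_pt=72.0):
    """Give the most frequent word max_pt and scale the rest proportionally."""
    top = max(freq.values())
    return {word: max_pt * count / top for word, count in freq.items()}

# Counts reported in the Results section of this poster.
reported = {"obesity": 39830, "treatment": 3758, "risk": 3382, "childhood": 3052}
sizes = font_sizes(reported)
for word, size in sizes.items():
    print(f"{word}: {size:.1f} pt")
```

With strictly proportional scaling, a dominant word such as “obesity” makes every other word very small, which is one reason word cloud tools often apply a minimum size or a nonlinear scale instead.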
A representative list of the words that appeared for the very first
time in the year 2015. Over 100 obesity-related therapeutic
targets were identified. In addition, several biomarkers, genes,
proteins, etc.
associated with obesity were also identified.
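A filter of this kind can be sketched as a set difference: collect every word seen in earlier years, then keep the words from the target year that are not in that set. This is an assumed reconstruction in Python, not the poster's VB6 code; `first_seen_in` and the sample titles are hypothetical.

```python
import re

def first_seen_in(titles_by_year, target_year):
    """Return words whose earliest appearance in any title is target_year."""
    earlier, current = set(), set()
    for year, titles in titles_by_year.items():
        words = {w for t in titles for w in re.findall(r"[a-z0-9]+", t.lower())}
        if year < target_year:
            earlier |= words
        elif year == target_year:
            current |= words
    return current - earlier

# Made-up titles for illustration only.
titles_by_year = {
    2013: ["Obesity and insulin"],
    2014: ["Obesity and the microbiome"],
    2015: ["Allulose, insulin and obesity"],
}
new_words = first_seen_in(titles_by_year, 2015)
print(sorted(new_words))
```

A word that is new in this sense still needs manual review, since misspellings and one-off abbreviations also show up as first appearances.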
Publication trends for the PubMed articles containing the words
“nonalcoholic” and “obesity” in their titles. The word
“nonalcoholic” first appeared in a title along with the
word “obesity” in 1986. The next publication containing the
words “nonalcoholic” and “obesity” together appeared in 1987,
followed by a 12-year period with no such publications.
In the last 15 years, however, there has been a continuous stream
of publications containing the words “nonalcoholic” and “obesity”.
These data clearly indicate a growing research trend in the
“nonalcoholic” + “obesity” research area.
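A publication trend like this one can be tallied by counting, per year, the titles that contain both words. The sketch below assumes the input is a list of (year, title) pairs; the function name `cooccurrence_by_year` and the sample records are invented for illustration and do not reproduce the PubMed data.

```python
import re
from collections import Counter

def cooccurrence_by_year(records, word_a, word_b):
    """Count titles per year that contain both words (case-insensitive)."""
    counts = Counter()
    for year, title in records:
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        if word_a in words and word_b in words:
            counts[year] += 1
    return counts

# Made-up records for illustration only.
records = [
    (1986, "Nonalcoholic fatty liver and obesity"),
    (1987, "Obesity in nonalcoholic liver disease"),
    (1999, "Obesity and diabetes"),
    (2000, "Nonalcoholic steatohepatitis, obesity and insulin"),
    (2000, "Obesity as a risk factor for nonalcoholic fatty liver"),
]
trend = cooccurrence_by_year(records, "nonalcoholic", "obesity")
for year in sorted(trend):
    print(year, trend[year])
```

Years with no matching titles are simply absent from the result, so gaps such as the one described above appear as missing years when the counts are plotted.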
Average percentage change (APC) in word frequency was
calculated to identify emerging trends in obesity research over a
five-year period (2011-2015). This is a representative list of
interesting emerging trends in the past five years.
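The poster does not give the exact formula used for this measure. One common reading, assumed here, is the mean of the year-over-year percentage changes across the period. The function name `average_percentage_change` and the yearly counts below are made up to show the arithmetic.

```python
def average_percentage_change(yearly_counts):
    """Mean of year-over-year percentage changes; skips steps that start at zero."""
    changes = [
        100.0 * (curr - prev) / prev
        for prev, curr in zip(yearly_counts, yearly_counts[1:])
        if prev > 0
    ]
    return sum(changes) / len(changes) if changes else None

# Made-up counts for one word in 2011, 2012, 2013, 2014, 2015.
counts_2011_to_2015 = [2, 4, 6, 9, 18]
apc = average_percentage_change(counts_2011_to_2015)
print(apc)
```

Under this definition, words with very small starting counts can show very large percentage changes, so a minimum-count threshold may be worth applying before ranking words.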
Results
A total of 58,215 articles containing the word “obesity” were
downloaded from PubMed. The primary text mining algorithm
found that the most frequent words were “obesity”
(n=39,830), “treatment” (n=3,758), “risk” (n=3,382), and
“childhood” (n=3,052). Over the five years from 2011 to 2015,
the average percentage change showed increases in the terms
“nonalcoholic” (750%), “placental” (680%), “microbiome”
(680%), and “dopamine” (580%). In 2015 alone, the
secondary algorithm uncovered over 100 terms that had not
previously appeared. These terms included biomarkers, genes,
proteins, etc. associated with obesity, and they represent
potential novel obesity-related therapeutic targets for
future research.
Conclusion
This study demonstrates the ability of a text mining algorithm
to uncover emerging research trends that would normally be
buried under the vast number of publications.
Background
PubMed is the largest database of biomedical literature,
containing over 25 million citations. Analyzing this vast number
of articles, coupled with the rapid rate of publication, presents
a challenge to the scientific community. Therefore, there is a
need for a high-performing, scalable tool to identify emerging
novel scientific trends. We hypothesized that analyzing the
titles of scientific articles can assist in identifying
emerging research trends. In the present study, text mining
algorithms were used to unearth novel and emerging
scientific trends, using “obesity” as a case-study research
interest.
Top 25 most
frequently appearing
words along with the
word “obesity” in the
titles of the PubMed
articles.
Top 25 most frequently
appearing words along with
the word “obesity” in the
titles of PubMed articles
after ignoring articles,
prepositions, conjunctions,
etc.
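The filtering behind the second list can be sketched as removing a stop-word set before ranking. The stop-word list and the function name `top_words` are assumptions; the poster does not state which words were excluded. The first four counts come from the Results, and the counts for “of”, “and”, and “in” are made up.

```python
from collections import Counter

# Assumed stop words; the poster's actual exclusion list is not given.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "with", "for", "to"}

def top_words(freq, n=25, stop_words=STOP_WORDS):
    """Return the n most frequent words after dropping stop words."""
    kept = Counter({w: c for w, c in freq.items() if w not in stop_words})
    return kept.most_common(n)

freq = {
    "obesity": 39830, "treatment": 3758, "risk": 3382, "childhood": 3052,
    "of": 30000, "and": 25000, "in": 20000,  # made-up counts
}
ranked = top_words(freq, n=3)
print(ranked)
```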
Overview of Methodology
Most Frequently Appearing Words
Overall Analysis - Word Cloud
5 Year Analysis - Emerging Trends in Obesity Research
Publication Trends of “nonalcoholic”
1 Year Analysis - Novel Scientific Breakthroughs in 2015