The presentation gives an overview of the integration of artificial intelligence into epidemiology and health analytics, covering data interpretation and the use of algorithms to predict epidemics and analyse related data.
2. Contents
Introduction
AI techniques used in epidemiological research & health analytics
Natural language processing
Machine learning
Artificial neural networks
Decision Tree
Naïve Bayes Algorithm
AI algorithms in epidemics research (Covid 19)
AI algorithms in epidemics research (Dengue)
Merits of ML algorithms in epidemiological research & health analytics
Challenges of ML algorithms in epidemiological research & health analytics
Future of ML algorithms in epidemiological research & health analytics
References
Princi Thapak, AUUP
3. Introduction
• Epidemiology is a branch of medicine that deals with the study of the incidence, distribution and
control of diseases and other health-related factors. (Frérot et al., 2018)
• Monitoring disease incidence in the population has an enormous impact on preventive health.
• The large majority of AI applications for epidemiology are predictive analysis applications.
• An important element is tracking both epidemiological phenomena and the related preventive actions.
(Timothy et al., 2019)
• Machine learning and natural language processing are the techniques most widely used in
epidemiological research.
4. Big Data, Machine Learning, Deep Learning, Artificial Intelligence
• Artificial Intelligence: use of computerised techniques that mimic human activities
• Machine learning: computational and statistical tools that use algorithms
• Deep Learning: the use of huge amounts of data as input to various algorithms
Image 1: An overview of the relationship between artificial intelligence (AI), machine learning (ML), big data, and deep neural
networks. (Fatima et al., 2018)
5. AI techniques used in epidemiological research & health analytics
• Natural language processing (NLP) methods
• Machine learning (ML) techniques
6. Natural language processing
Natural language processing (NLP) methods extract information from unstructured data
such as clinical notes and medical journals to supplement and enrich structured medical
data. They turn text into machine-readable structured data, which can then be analyzed.
(Murff et al., 2011)
Natural language processing can provide access to information contained in free text
that may not be fully captured by conventional diagnostic coding. (Klien et al., 2020)
Instead of hand-coding large sets of rules, NLP can rely on machine learning to
automatically learn these rules by analyzing a set of examples (a large corpus, like a
book, down to a collection of sentences), and making a statistical inference.
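To make "turning text into structured data" concrete, here is a minimal sketch in Python. It deliberately uses a few hand-written rules rather than the learned rules described above, because that is the shortest way to show the input and output shapes. The symptom vocabulary, the negation cues, the three-token window and the function name extract_symptoms are all illustrative assumptions, not part of any cited system.

```python
import re

# Hypothetical symptom vocabulary; a real system would use a clinical terminology.
SYMPTOMS = ["fever", "cough", "rash", "headache"]
# Hypothetical negation cues (hand-written rules, not learned from data).
NEGATION_CUES = ("no", "denies", "without")

def extract_symptoms(note: str) -> dict:
    """Map each mentioned symptom to True (present) or False (negated)."""
    found = {}
    # Split the note into rough sentences or clauses.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for i, token in enumerate(tokens):
            if token in SYMPTOMS:
                # Negated if a cue appears within the three preceding tokens.
                window = tokens[max(0, i - 3):i]
                negated = any(cue in window for cue in NEGATION_CUES)
                found[token] = not negated
    return found

note = "Patient reports high fever and headache. Denies cough; no rash."
record = extract_symptoms(note)
print(record)
```

The resulting dictionary is the kind of structured record that can sit alongside coded fields. A learned NLP model would replace the fixed word lists with patterns inferred from annotated examples.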
8. Machine learning
Machine learning is a branch of computer science that has the potential to transform epidemiologic
sciences. (Subramanian et al., 2018)
It’s an umbrella term used to describe a wide variety of models and strategies that focus on
algorithmic modeling. (Darcy et al., 2015)
With its focus on “Big Data,” it offers epidemiologists new tools to tackle problems for which
classical methods are not well-suited.
A distinct advantage of machine learning methods is the robust handling of large numbers
of variables, combined in interacting linear and nonlinear ways, to detect patterns in the data
for prediction. (Catherine et al., 2018)
For example, machine learning allows epidemiologists to evaluate as many variables as desired
without increasing statistical error. (Philemon et al., 2019)
10. Artificial neural networks
An artificial neural network (ANN) is a computational model that consists of several processing
elements that receive inputs and deliver outputs based on their predefined activation functions.
It uses the processing of the brain as a basis to develop algorithms that can be used to model
complex patterns and prediction problems. (Olden et al., 2002)
ANNs are used to solve problems related to data optimisation and prediction.
In epidemiology, they are efficient at assessing the risk, growth, severity, and control of
disease spread.
Because an ANN is modelled on the natural neuron system, it serves as an intelligent
forecasting model. (Philemon et al., 2019)
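The idea of a processing element that receives inputs and delivers an output through an activation function can be sketched as a single artificial neuron. This is only an illustration: the sigmoid activation, the three input values, the weights and the bias are assumptions chosen for the example, not values from any cited model.

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical standardised inputs (for example, three risk factors).
features = [0.5, -1.0, 2.0]
# Hypothetical weights and bias; in practice these are learned from data.
weights = [0.4, 0.3, 0.1]
bias = -0.1

risk = neuron(features, weights, bias)
print(round(risk, 3))
```

A full network stacks many such neurons in layers, and training adjusts the weights and biases so that the outputs match observed outcomes.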
11. Image 2: Depiction of a single artificial neuron in an artificial neural network for diabetic
patients. (Qifang et al., 2019)
12. Decision Tree
A decision tree is a diagrammatic representation of a decision-making process.
Decision trees in AI are used to arrive at conclusions based on the data available
from decisions made in the past and to predict an outcome. (Breiman et al., 1984)
In epidemiology, they are used to determine the risk of disease transmission by
affected individuals. (James et al., 2013)
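A decision tree can be read as a chain of threshold questions that ends in a predicted label. The sketch below walks a small tree that has already been built. The feature names, thresholds and labels are hypothetical and chosen only for illustration; they are not clinical guidance and are not taken from the cited sources.

```python
# Hypothetical tree: each internal node tests one feature against a threshold,
# and each leaf carries the predicted label.
TREE = {
    "feature": "contact_with_case",
    "threshold": 0.5,
    "left": {"label": "low risk"},           # taken when value <= threshold
    "right": {                               # taken when value > threshold
        "feature": "days_since_exposure",
        "threshold": 14,
        "left": {"label": "high risk"},
        "right": {"label": "low risk"},
    },
}

def predict(node: dict, record: dict) -> str:
    """Follow the branches until a leaf is reached, then return its label."""
    while "label" not in node:
        go_left = record[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

person = {"contact_with_case": 1, "days_since_exposure": 3}
print(predict(TREE, person))
```

In practice the tree structure and thresholds are learned from past data, for example by choosing at each node the split that best separates the outcome classes.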
13. Image 3: A hypothetical classification decision tree for predicting a binary outcome, type
2 diabetes mellitus. (Qifang et al., 2019)
14. Decision Tree Software
• Kernlab
• Caret
• PROC SVM in SAS
• e1071
• Sklearn in Python
15. Naïve Bayes Algorithm
The naïve Bayes classifier is one of the simplest and most effective classification
algorithms; it helps build fast machine learning models that can make quick
predictions.
Naïve Bayes is suitable for solving multi-class prediction problems.
The naïve Bayes classifier analyzes the information in each protein sequence and
identifies whether the sequence is diseased or normal.
Initially, the classifier calculates the probabilities of the normal and the
diseased sequences from the training data. (Shenglei et al., 2020)
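The two steps described above (estimate class probabilities from training data, then score a new sequence) can be sketched with a small multinomial naïve Bayes model over amino-acid letters. This is an assumption-laden illustration, not the selective algorithm of the cited paper: the toy sequences, the labels, the per-letter features and the add-one smoothing are all chosen only for the example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino-acid letters

def train(sequences, labels):
    """Estimate log priors and smoothed per-letter log likelihoods per class."""
    priors, likelihoods = {}, {}
    for c in sorted(set(labels)):
        members = [s for s, y in zip(sequences, labels) if y == c]
        priors[c] = math.log(len(members) / len(sequences))
        counts = Counter("".join(members))
        # Add-one (Laplace) smoothing so unseen letters keep a non-zero probability.
        total = sum(counts[a] for a in ALPHABET) + len(ALPHABET)
        likelihoods[c] = {a: math.log((counts[a] + 1) / total) for a in ALPHABET}
    return priors, likelihoods

def classify(sequence, priors, likelihoods):
    """Return the class with the highest log posterior score."""
    scores = {
        c: priors[c] + sum(likelihoods[c][a] for a in sequence if a in likelihoods[c])
        for c in priors
    }
    return max(scores, key=scores.get)

# Toy training data (hypothetical sequences and labels).
train_seqs = ["AAGAVA", "GAAVAG", "WWYFWY", "YFWWYW"]
train_labels = ["normal", "normal", "diseased", "diseased"]
priors, likelihoods = train(train_seqs, train_labels)

print(classify("AGAVAA", priors, likelihoods))
print(classify("WYWFYW", priors, likelihoods))
```

The "naïve" part is the assumption that each letter contributes independently to the score, which is what lets the model train and predict quickly.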
16. AI algorithms in epidemics research
(Covid 19)
A model based on the k-means clustering algorithm was developed and used to
categorize countries based on the number of confirmed COVID-19 cases, using a dataset
that contains features such as the prevalence of HIV/AIDS, diabetes, and tuberculosis in
156 countries, in addition to data on the number of COVID-19-related deaths, confirmed
cases and recovered cases. (Carrillo-Larco and Castillo-Cara 2020)
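As a rough sketch of how k-means groups countries, the code below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The six data points, the two features and the starting centroids are invented for illustration and are not the dataset used in the cited study. It needs Python 3.8 or later for math.dist.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, assignments

# Hypothetical countries: (confirmed cases per 100,000, diabetes prevalence in %).
countries = [(10, 5.0), (12, 6.0), (11, 5.5), (200, 9.0), (210, 10.0), (190, 8.5)]
start = [countries[0], countries[3]]  # fixed starting centroids keep the run reproducible

centroids, assignments = kmeans(countries, start)
print(assignments)
```

With real data the features are usually standardised first, since otherwise the variable with the largest numeric range dominates the distance calculation.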
An ANFIS-based (adaptive-network-based fuzzy inference system) model was developed to
estimate and forecast the number of confirmed cases of COVID-19 10 days ahead, using data
on previously confirmed cases. (Al-Qaness et al., 2020) & (Raul G et al., 2021)
In 10 Brazilian states with a high daily COVID-19 incidence, a stacked ensemble of
learning algorithms such as cubist regression (CUBIST), ridge regression (RIDGE) and
SVM was used to forecast, as a time series, the COVID-19 cumulative confirmed cases
ahead of time. (Ribeiro et al., 2020)
17. AI algorithms in epidemics research
(Dengue)
WHO reported that the incidence of dengue increased from 2.2 million cases in 2010 to
3.2 million in 2015.
Artificial Intelligence in Medical Epidemiology’s (AIME) platform provides critical
information for disease prediction and outbreak management via machine learning and
artificial intelligence, with the goal of identifying the location of the next disease outbreak
up to three months before it occurs. (Dhesi et al., 2019)
AIME can predict outbreaks of dengue fever with 87% accuracy through an algorithm
that uses 11 different variables, such as weather data, construction data, dengue death data
and other data points. (Bala et al., 2019)
18. AI models used for outbreak predictions
• Long short-term memory (LSTM)
• Partial derivative regression and non-linear machine learning (PDR-NML)
• Convolutional Neural Network
• Adaptive-network-based fuzzy inference system (ANFIS)
Smart Chart: Various AI models used for outbreak predictions
19. Merits of ML algorithms in epidemiological research
& health analytics
Improves the efficacy of epidemiological predictive models.
Evaluates as many variables as desired without increasing statistical error.
Avoids multiple testing bias.
Ensures that the computer completes the task in the most efficient manner.
Is flexible enough that little manual effort is required.
Is more fault tolerant.
Reduces the chances of manual errors.
20. Challenges of ML algorithms in epidemiological research
& health analytics
Require lengthy offline batch training.
Require large amounts of hand-crafted, structured training data.
Lack of data, and lack of good data.
Black box problems.
Data can be manipulated.
Reasoning power still doesn’t mimic humans.
21. Future of AI in epidemiological research and health
analytics
AI in epidemiology can help identify the emerging risk of future pandemics, notify
governments and hospitals, and identify the epicenters of a pandemic.
AI can’t replace direct medical testing and surveillance in any way. But it can help
governments and healthcare policy makers make appropriate decisions, such as identifying
hot spots where more testing should be directed.
As medical information gets digitized, AI will continue to convert it into usable data,
making epidemiological predictions a lot more manageable and efficient.
AI will enable more advanced tools that predict epidemics more efficiently.
22. References
Bala Murali Sundram, Dhesi Baha Raja, Fazilah Mydin, Ting Choo Yee, Kamesh Raj and Fadzilah Kamaludin. Utilizing Artificial Intelligence as a Dengue Surveillance and Prediction Tool. J Appl Bioinforma Comput Biol 2019, 8:1. DOI: 10.4172/2329-9533.1000165
Breiman L, Friedman J, Stone CJ, et al. Classification and Regression Trees. 1st ed. London, United Kingdom: Chapman & Hall Ltd; 1984.
Catherine Kreatsoulas, S.V. Subramanian. Machine learning in social epidemiology: learning from experience. SSM - Population Health, Volume 4, April 2018, Pages 347-349. https://doi.org/10.1016/j.ssmph.2018.03.007
Dhesi Baha Raja, Rainier Mallol et al., 2019. Artificial intelligence model as predictor for dengue outbreaks. Malaysian Journal of Health and Medicine. DOI: https://doi.org/10.37268/mjphm/vol.19/no.2/art.176
Darcy AM, Louie AK, Roberts LW. Machine learning and the profession of medicine. JAMA 2016;315:551–2. doi:10.1001/jama.2015.18421
Fatima Rodriguez, David Scheinker and Robert A. Harrington. Promise and perils of big data and artificial intelligence in clinical medicine and biomedical research. Dec 2018. https://doi.org/10.1161/circresaha.118.314119
Frérot M, Lefebvre A, Aho S, Callier P, Astruc K, Aho Glélé LS. 2018. What is epidemiology? Changing definitions of epidemiology 1978–2017. PLOS ONE 13(12):e0208442
James G, Witten D, Hastie T, et al. An Introduction to Statistical Learning: with Applications in R. New York, NY: Springer Publishing Company; 2013.
Murff HJ, FitzHenry F, Matheny ME, et al. Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA 2011;306:848–55. doi:10.1001/jama.2011.1204
Philemon Manliura Datilo, Zuhaimy Ismail, Jayeola Dare. A review of epidemic forecasting using artificial neural networks. 2019. doi: 10.15171/ijer.2019.24
Qifang Bi, Katherine E Goodman, Joshua Kaminsky, Justin Lessler. What is Machine Learning? A Primer for the Epidemiologist. American Journal of Epidemiology, Volume 188, Issue 12, December 2019, Pages 2222–2239. https://doi.org/10.1093/aje/kwz189
Raul G. Nogueira, Davies, Rishi Gupta et al., 2021. Epidemiological surveillance of the impact of the COVID-19 pandemic on stroke care using artificial intelligence. Originally published 4 Mar 2021. Stroke 2021;52:1682–1690. https://doi.org/10.1161/strokeaha.120.031960
Shenglei Chen, Geoffrey I. Webb, Linyuan Liu, Xin Ma. A novel selective naïve Bayes algorithm. Volume 192, 15 March 2020, 105361. https://doi.org/10.1016/j.knosys.2019.105361
Timothy L. Wiemken and Robert R. Kelley. Machine Learning in Epidemiology and Health Outcomes Research. Annual Review of Public Health. https://doi.org/10.1146/annurev-publhealth-040119-094437