University of Connecticut (MS-BAPM) Data Mining and Business Intelligence
Dengue Endemic
Forecasting for San Juan
Team 2:
▪ Saurav Gupta
▪ Sasidhar Konda
▪ Ankita Paunikar
▪ Parmod Rathee
▪ Huixian Wang
1. Executive Summary:
Dengue is a major cause of illness and death in Puerto Rico. Around 400 million people are
infected worldwide every year [1]. Worse yet, the dengue virus spreads quickly through mosquito
bites, and for now there is no effective vaccine to prevent the spread of the disease. It is
therefore valuable to analyze the historical data on dengue and use it to forecast future
outbreaks. The objective of our project is to determine the relationship between environmental
factors, such as temperature and humidity, and the number of dengue cases, and to examine other
intervention factors that might affect the spread of the disease. Through model selection and
analysis of the forecasting results, we aim to provide the public and governmental health services
with relatively accurate information. This paper presents our conclusions from the time series
models and our recommendations for preparing for future dengue outbreaks with limited resources.
2. Statement of the Problem:
Historical surveillance data are provided for San Juan, Puerto Rico. The data include weekly
laboratory-confirmed and serotype-specific cases for the location. Environmental data (such as
temperature and humidity) from weather stations, satellites, and climate models are also provided [2].
The forecasting model should answer the following key points.
A. Timing of peak incidence, i.e., when the highest incidence of dengue occurs, which is between
July and October every year.
B. Maximum quarterly incidence, i.e., the number of dengue cases reported during the quarter
when incidence peaks, which is during July 2008.
3. Background:
From the start, our team agreed that the project would be a forecasting exercise. When looking
for data, we had a variety of datasets to choose from, ranging from one for forecasting shampoo
sales to one for forecasting a stock price. We selected the dengue dataset because dengue is still
an ongoing struggle for Puerto Rico and because we wanted to apply our analytics learning to
something beyond the business domain. To familiarize ourselves with the dataset we read the
articles and documents listed in the references section. They provided a great deal of clarity, as
there were a few biological variables in the dataset. We understand the concern of the US
government about dengue control, as it is devoting resources to containing these outbreaks. On the
official website of the Centers for Disease Control and Prevention we found that
“Travel-associated dengue infections occur and several dengue outbreaks have been detected in
the continental United States, most dengue cases among U.S. citizens occur because of endemic
transmission in some U.S. territories, such as Puerto Rico” [3].
4. Methodology:
The forecasting models are based on dengue case data collected by the Centers for Disease Control
and Prevention (CDC), together with satellite precipitation, humidity, and temperature data, from
1990 to 2007. We developed the models using autoregressive integrated moving average (ARIMA)
models and produced quarterly forecasts over a 3-year forecasting period.
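The report does not show how the weekly counts were rolled up into the quarterly series. A minimal Python sketch of one way to do it follows. The function names, the rule of assigning each week to the quarter that contains its start date, and the sample numbers are all assumptions made for illustration, not the team's actual procedure.

```python
from datetime import date, timedelta

def quarter_of(d):
    """Return (year, quarter) for a date, with quarter in 1..4."""
    return d.year, (d.month - 1) // 3 + 1

def weekly_to_quarterly(week_starts, weekly_cases):
    """Sum weekly case counts into calendar-quarter totals.

    Assumption: each week belongs to the quarter containing its start date.
    The report does not say how weeks that straddle a boundary were handled.
    """
    totals = {}
    for start, cases in zip(week_starts, weekly_cases):
        key = quarter_of(start)
        totals[key] = totals.get(key, 0) + cases
    return dict(sorted(totals.items()))

# Made-up illustrative data: 26 weeks from 1990-01-01 with 10 cases each.
weeks = [date(1990, 1, 1) + timedelta(weeks=i) for i in range(26)]
cases = [10] * 26
quarterly = weekly_to_quarterly(weeks, cases)
print(quarterly)
```

With real surveillance data, `weeks` and `cases` would come from the weekly case file, and the resulting quarterly totals would form the series passed to the forecasting model.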
Event Description: In Puerto Rico, the 1994 water shortage most affected the agricultural sector,
with losses estimated at more than $94 million. Production was off by more than 50 percent for
vegetables like plantains, a staple in Puerto Rican cuisine whose price has doubled. Tourism did
not suffer, government officials say, but hotels, hospitals and other commercial customers spent
thousands of dollars a month on water to stay open. And health officials said water stored in open
containers everywhere bred mosquitoes and was a factor in the worst outbreak of dengue fever
since the 1960's [4]. Dengue is endemic to Puerto Rico, which sees 3,000 to 9,000 cases in
non-epidemic years. The worst epidemics since 1990 saw 24,700 cases in 1994, 17,000 in 1998, and
10,508 in 2007. During the most recent epidemic in 2007, half of the cases were hospitalized, and
one third were hemorrhagic [5].
Environmental data: Satellite-derived environmental data were obtained from the National Oceanic
and Atmospheric Administration, which provided weekly average temperature (in Kelvin) and average
specific humidity (g/kg).
5. Results:
We modeled quarterly total cases using the Point: 7 + Point: 18 + Point: 19 + Point: 34 + Point: 35 +
Point: 62 + AR(15) model, that is, point interventions at quarters 7, 18, 19, 34, 35, and 62 plus
an autoregressive term of order 15. Although temperature and humidity are not significant
regressors for quarterly dengue cases, the explosive dengue outbreaks in San Juan all occurred in
seasons when both temperature and humidity were higher than normal.
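To make the "Point: k" terms concrete, the sketch below builds one 0/1 pulse column per intervention quarter, which is how a point intervention is commonly encoded as a regressor. This is an illustration in Python rather than the SAS tooling named in the deck title. The 1-based quarter indexing, the series length of 72 quarters (18 years, if the data run from 1990 through 2007), and all function names are assumptions.

```python
# Intervention quarters taken from the model specification in the text.
INTERVENTION_QUARTERS = [7, 18, 19, 34, 35, 62]

def pulse(n_obs, quarter):
    """0/1 column that equals 1 only at the given 1-based quarter index."""
    return [1 if t == quarter else 0 for t in range(1, n_obs + 1)]

def intervention_columns(n_obs, quarters):
    """Build one pulse column per intervention quarter, keyed by a label."""
    return {f"point_{q}": pulse(n_obs, q) for q in quarters}

# Assumed series length: 18 years x 4 quarters.
n_quarters = 72
columns = intervention_columns(n_quarters, INTERVENTION_QUARTERS)

for name, col in columns.items():
    # Each column flags exactly one quarter; index() is 0-based, so add 1.
    print(name, "is 1 at quarter", col.index(1) + 1)
```

Each column lets the model absorb one outbreak quarter as a one-off shock, so the autoregressive part is estimated mainly from the non-outbreak dynamics.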
6. Conclusions and recommendations:
Environmental data such as temperature, precipitation, humidity, and vegetation could be better used
to improve the accuracy of dengue predictions.
Based on our analysis we give the following recommendations:
▪ Based on the weather conditions that characterize peak transmission seasons, apply appropriate
insecticides to water storage areas during rainy seasons to prevent mosquitoes from accessing
egg-laying habitats.
▪ Raise awareness about dengue in San Juan, specifically before the June and October months,
since that is the peak dengue period every year.
▪ The government should set up mobile medical units in line with the forecast for the upcoming period.
▪ Raise awareness among citizens about keeping the environment clean and hygienic.
APPENDIX:
[Figure panels: Time Series; Prediction Errors; Autocorrelation Plots; Stationary test probability]
The series graph does not suggest any strong
trend or seasonality, however, there are at least
four major outbreaks corresponding to those
peaks in the plot. A plot of the predicted values
of the models and the original data is displayed.
The prediction errors appear to be random.
The autocorrelation plots of the residuals are
displayed and none of the spikes are significantly
different from zero. These plots confirm that the
model explains all the significant autocorrelation
that was in the original data.
The white noise tests indicate that there is no
significant autocorrelation in the residuals.
Residuals also pass tests for stationarity.
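The report does not name the white noise test it used. One common choice is the Ljung-Box Q statistic, which combines the squared residual autocorrelations up to a chosen lag. The sketch below computes the sample autocorrelation and Q in plain Python. The function names and the toy residual series are assumptions, and the series is deliberately autocorrelated so that the statistic comes out large.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(x, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..max_lag} r_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Toy residuals that alternate in sign, i.e. clearly not white noise.
residuals = [1, -1, 1, -1, 1, -1, 1, -1]
r1 = autocorrelation(residuals, 1)
q1 = ljung_box_q(residuals, 1)
print(r1, q1)
```

A large Q relative to the chi-square reference distribution points to leftover autocorrelation. For residuals of a fitted model, the degrees of freedom are usually reduced by the number of estimated ARMA parameters, so the reference value depends on the model.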
[Figure panels: Parameter Estimates; Statistics Fit; Forecast; Forecast dataset]
The autoregressive coefficients at lag 12 and lag 15 are not
significantly different from zero. Apart from these
two lags, all other AR coefficients are
significant at the 10% level. This alone should
not disqualify the model, because statistical significance
can be misleading and large p-values for individual
estimates are not enough to reject a model.
The RMSE for this model is 116.86, which is
lower than that of the other models we tried. The MAPE can be
interpreted as follows: on average, the forecasts are off by
43.42%.
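For readers unfamiliar with the two fit statistics, the sketch below shows how RMSE and MAPE are commonly defined. It is a plain Python illustration with made-up numbers, not a reproduction of the reported values. The function names are assumptions, and the software the team used may apply a slightly different convention (for example in how it handles zero actual values).

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Observations with an actual value of zero are skipped, because the
    percentage error is undefined there (an assumption of this sketch).
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100 * sum(terms) / len(terms)

# Made-up quarterly case counts and forecasts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 180, 500]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is in the same units as the series (cases per quarter) and is dominated by large misses, while MAPE is scale-free, which is why the two can rank models differently.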
The forecasts look plausible based on a visual
inspection of the historical data.
The evidence suggests that this is the best model we
found, and we accept it because its
performance is superior to that of every other model we
tried.
Developed Models
7. References
1. Official CDC website.
https://www.cdc.gov/dengue/index.html
2. Official website of National Oceanic and Atmospheric Administration.
http://dengueforecasting.noaa.gov/
3. Official CDC website.
https://www.cdc.gov/dengue/about/inpuerto.html
4. New York Times article from 1995.
http://www.nytimes.com/1995/01/23/us/taps-go-dry-as-puerto-rico-copes-with-drought.html
5. World Health Organization
http://www.who.int/mediacentre/factsheets/fs117/en/
6. Puerto Rico Declares Dengue Epidemic
http://www.healthmap.org/site/diseasedaily/article/puerto-rico-declares-dengue-epidemic-101812
The screen capture above shows all the models we developed.
During quarters 34, 19, 7, 18, 62, and 35, the
total cases of dengue spiked, so we
added 6 interventions to the model. An
AR(15) term is also included, based on the previous
analysis.

Dengue Outrage Forecasting via SAS

  • 1. University of Connecticut (MS-BAPM) Data Mining and Business Intelligence 1 Dengue Endemic Forecasting for San Juan Team 2: ▪ Saurav Gupta ▪ Sasidhar Konda ▪ Ankita Paunikar ▪ Parmod Rathee ▪ Huixian Wang 1. Executive Summary:
  • 2. University of Connecticut (MS-BAPM) Data Mining and Business Intelligence 2 Dengue is the major cause of death and illness in Puerto Rico. There are around 4001 million people infected every year in the world and worse yet, Dengue virus can be transmitted by mosquito bites very quickly and for now, there is no effective vaccines to prevent the spread of this disease. Therefore, it is extremely meaningful to analyze the historical data of Dengue endemic, and use forecast the disease outbreaks in the future. The objective of our project is to determine the relationship between environmental factors, such as temperature and humidity, and the amount of disease cases, and to examine other intervention factors which might affect the spread of Dengue disease. Through the process of model selection and analysis of forecasting results, we are able to provide the public and governmental health services with relatively accurate information by implementing our model. This paper reflects our conclusions based on our findings of times series models and recommendations to better prepare for future Dengue endemic outbreaks within limited resources. 2. Statement of the Problem: Historical surveillance data is supported San Juan, Puerto Rico. The data include weekly laboratory-confirmed and serotype-specific cases for the location. Environmental data (like temperature and humidity) from weather stations, satellites, and climate models are also provided.2 Forecasted model will be able to answer following key points. A. Timing of peak incidence, i.e when the highest incidence of dengue occurs during July and October every year.
  • 3. University of Connecticut (MS-BAPM) Data Mining and Business Intelligence 3 B. Maximum Quarterly incidence, the number of dengue cases reported during the quarter when incidence peaks, is during July 2008. 3. Background: From the starting, as team, we had the consensus that we are going to do the forecasting. When we’re looking for the datasets for this project we had verity of datasets to choose from. We had a dataset to forecast the sales of the shampoo to the dataset where we should forecast the price of a stock. We selected Dengue dataset as it is still an ongoing struggle of the state and to utilize our analytics learning to something beyond business domain. To familiarize ourselves with the dataset we read articles and documents which are listed in the references section. They provided us a great deal of clarity as there were few biological variables in the dataset. We understand the concern of US government in dengue control as they are devoting resources for the containment of these endemic. On the official website of the Centers for Diseases Control and Prevention we found that “Travel-associated dengue infections occur and several dengue outbreaks have been detected in the continental United States, most dengue cases among U.S. citizens occur because of endemic transmission in some U.S. territories, such as Puerto Rico”.3 4. Methodology: Forecasting models are based on an infectious disease – dengue cases data collected by the Centers for Disease Control and Prevention (CDC), which include satellite precipitation, humidity and temperature from 1990 to 2007. Dengue cases forecasting models were developed using autoregressive integrated moving average models and produced quarterly forecast over a 3-year forecasting period.
  • 4. Event Description:
    In Puerto Rico, the 1994 water shortage most affected the agricultural sector, with losses estimated at more than $94 million. Production was off by more than 50 percent for vegetables like plantains, a staple in Puerto Rican cuisine whose price has doubled. Tourism did not suffer, government officials say, but hotels, hospitals and other commercial customers spent thousands of dollars a month on water to stay open. And health officials said water stored in open containers everywhere bred mosquitoes and was a factor in the worst outbreak of dengue fever since the 1960s [4].
    Dengue is endemic to Puerto Rico, which sees 3,000 to 9,000 cases in non-epidemic years. The worst epidemics since 1990 saw 24,700 cases in 1994, 17,000 in 1998, and 10,508 in 2007. During the most recent epidemic in 2007, half of the cases were hospitalized, and one third were hemorrhagic [5].
    Environmental data:
    Satellite-derived environmental data were obtained from the National Oceanic and Atmospheric Administration, which provided weekly average temperature (in Kelvin) and average specific humidity (g/kg).
    5. Results:
    We modeled quarterly total cases using the model Point: 7 + Point: 18 + Point: 19 + Point: 34 + Point: 35 + Point: 62 + AR(15), that is, point interventions at quarters 7, 18, 19, 34, 35, and 62 plus an AR(15) term. Although temperature and humidity are not significant regressors for quarterly dengue cases, the several explosive dengue outbreaks in San Juan all occurred in seasons when both temperature and humidity were higher than normal.
    6. Conclusions and recommendations:
  • 5. Environmental data such as temperature, precipitation, humidity, and vegetation could be better used to improve the accuracy of dengue predictions. Based on our analysis, we make the following recommendations:
    ▪ Based on the weather characteristics of peak transmission seasons, apply appropriate insecticides to water storage areas during rainy seasons to keep mosquitoes from accessing egg-laying habitats.
    ▪ Run awareness campaigns about dengue in San Juan, specifically ahead of the June to October period, since that is the peak dengue period every year.
    ▪ Have the government set up mobile medical units in line with the forecast for the upcoming period.
    ▪ Raise awareness among citizens of the importance of a clean and hygienic environment.
    APPENDIX:
  • 6. [Figures: Time Series; Prediction Errors; Autocorrelation Plots; Stationary test probability]
    The series graph does not suggest any strong trend or seasonality; however, there are at least four major outbreaks corresponding to the peaks in the plot. A plot of the predicted values of the models and the original data is displayed. The prediction errors appear to be random. The autocorrelation plots of the residuals are displayed, and none of the spikes is significantly different from zero. These plots confirm that the model explains all the significant autocorrelation that was in the original data. The white noise tests indicate that there is no significant autocorrelation in the residuals. The residuals also pass the tests for stationarity.
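The report does not name the white noise test it used. As an illustration only, the sketch below assumes the Ljung-Box statistic, a common choice for checking residual autocorrelation: Q = n(n+2) Σ r_k² / (n − k), summed over lags k = 1..h, where r_k is the lag-k sample autocorrelation of the residuals. The function names and the sample series are hypothetical.

```python
def autocorrelations(residuals, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    acf = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))

# A deliberately patterned (alternating) series, not residuals from the report.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
q_lag1 = ljung_box_q(alternating, 1)
```

Q is compared against a chi-square critical value. For one lag tested on a raw series the 5% critical value is about 3.84, so a strongly patterned series like the one above would be flagged as not white noise. For residuals from a fitted model, the degrees of freedom are usually reduced by the number of estimated ARMA parameters.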
  • 7. [Figures: Parameter Estimates; Statistics Fit; Forecast; Forecast dataset]
    The autoregressive coefficients at lags 12 and 15 are not significantly different from zero. Apart from these two lags, all the other AR coefficients are significant at the 10% level. This does not disqualify the model: statistical significance can be misleading, and large p-values for some estimates are not enough on their own to reject a model. The RMSE for this model is 116.86, which is lower than that of the other models we tried. The MAPE of 43.42% means that the forecasts are off by 43.42% on average. The forecasts look plausible based on a visual inspection of the historical data. The evidence suggests that this is the best model we found, and we accept it because it outperforms every other model we tried.
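For readers unfamiliar with the two fit statistics quoted above, here is a minimal sketch of how RMSE and MAPE are computed from actual and predicted values. The function names and the sample numbers are hypothetical and are not the report's data. Skipping periods with zero actual cases in the MAPE is an assumption of this sketch, since the percentage error is undefined there.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the series (cases)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Periods with zero actual cases are skipped (assumption of this sketch).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up quarterly totals for illustration only.
actual = [100, 200, 400]
predicted = [110, 180, 400]
fit_rmse = rmse(actual, predicted)
fit_mape = mape(actual, predicted)
```

RMSE penalizes large misses more heavily, which matters for outbreak quarters, while MAPE expresses the typical miss relative to the size of each quarter's case count.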
  • 8. [Figure: Developed Models]
    7. References
    1. Official CDC website. https://www.cdc.gov/dengue/index.html
    2. Official website of the National Oceanic and Atmospheric Administration. http://dengueforecasting.noaa.gov/
    3. Official CDC website. https://www.cdc.gov/dengue/about/inpuerto.html
    4. New York Times article from 1995. http://www.nytimes.com/1995/01/23/us/taps-go-dry-as-puerto-rico-copes-with-drought.html
    5. World Health Organization. http://www.who.int/mediacentre/factsheets/fs117/en/
    6. Puerto Rico Declares Dengue Epidemic. http://www.healthmap.org/site/diseasedaily/article/puerto-rico-declares-dengue-epidemic-101812
    The Developed Models screen capture shows all the models we built. During quarters 34, 19, 7, 18, 62, and 35, total dengue cases spiked, so we added 6 interventions to the model. In addition, the AR(15) term is based on the previous analysis.
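A point intervention can be represented as an indicator variable that equals 1 in the intervention period and 0 everywhere else. The sketch below builds one such column for each of the six quarters named in the report. It assumes 1-based quarter numbering and 72 quarterly observations (18 years from 1990 to 2007, times 4), neither of which the report states explicitly. The function name is hypothetical.

```python
# Quarter numbers taken from the report's model description.
INTERVENTION_QUARTERS = [7, 18, 19, 34, 35, 62]

# Assumption: 18 years (1990-2007) x 4 quarters = 72 observations.
N_QUARTERS = 72

def point_intervention_columns(n_periods, event_periods):
    """Build one 0/1 indicator column per event period.

    Periods are numbered from 1 (assumption). Each column is 1 only
    at its own event period and 0 everywhere else.
    """
    columns = {}
    for event in event_periods:
        if not 1 <= event <= n_periods:
            raise ValueError(f"event period {event} is outside 1..{n_periods}")
        columns[event] = [1 if t == event else 0 for t in range(1, n_periods + 1)]
    return columns

dummies = point_intervention_columns(N_QUARTERS, INTERVENTION_QUARTERS)
```

In a time series package, columns like these would be supplied as regressors alongside the autoregressive terms, so that each outbreak quarter gets its own estimated effect instead of distorting the AR coefficients.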