NDGeospatialSummit2022 - Using Machine Learning and Quasi Binomial Model to Predict and Understand the Impact of COVID 19 in North Dakota

Using Machine Learning and Quasi-Binomial Model to Predict and Understand the Impact of COVID-19 in North Dakota
Valquiria F. Quirino;
Avram Slone;
Jerry Dogbey-Gakpetor;
Karen L. Olson;
Nancy M. Hodur
North Dakota Geospatial Summit - Sept. 14-15, 2022

Funding and Genesis
• The NDSU Center for Social Research was the recipient of a CDC grant to study
COVID-19 disparities in underserved populations in North Dakota.
• We’ve partnered with the ND Dept. of Health (NDDoH), and are using data
provided by them to research COVID-19 disparities from various perspectives.
• One such perspective is a county-level model of COVID-19 infections and
outcomes based on the sociodemographic, health, and geographic features of
each county’s population.

Project Goals
1. Use random forest to predict the proportion of the county level population
with confirmed and probable COVID-19 cases (Y1), confirmed and probable
COVID-19 cases resulting in hospitalization (Y2), and confirmed and probable
COVID-19 cases resulting in death (Y3).
2. Use quasi-binomial models to investigate the effect of various independent
variables on county level population with confirmed and probable COVID-19
cases (Y1), confirmed and probable COVID-19 cases resulting in hospitalization
(Y2), and confirmed and probable COVID-19 cases resulting in death (Y3).

Modeling COVID-19
• From the early days of the pandemic, COVID-19 models were used to inform
public policy decisions worldwide.
• The first model to garner substantial public attention was published on March
16th, 2020 and predicted that with no coordinated intervention COVID-19 would
kill 2.2 million people in the USA and infect over 80% of the country’s
population.1,2
• Modeling has also been used frequently to determine predictors of serious illness
and mortality from COVID-19.3

Data-Driven and Epidemiological Forecasts
• Predictive models of the COVID-19 pandemic can be broadly categorized as
either data-driven or epidemiological.4
• Data-driven models typically take the form of predictive curves, using past data
to predict future outcomes.
• Epidemiological models divide the population of study into recognized
epidemiological groups and model the movement of individuals between groups
based on individual and environmental features.

Data-Driven Forecasts
• These models are built using assumptions about the future, for example the
implementation of a mask mandate or a social distancing policy.
• As such, they are most useful as short-term predictive tools, since policies are
changed and updated frequently, and practical policy implementation usually
does not match theoretical policy goals.5
• They highlight the need for harm-reduction policy, but fail to account for the
shifting dynamics of spread.6

Epidemiological Forecasts
• These models are built by separating the population into recognizable
epidemiological groups such as Susceptible, Infected, and Recovered (SIR
Model).7
• Dividing the population in this way allows for modeling that can be tailored to
social, health, and geographical patterns that impact each group differently.
• Because they are models within models, they are particularly susceptible to
changes in on-the-ground pandemic dynamics, but are also more easily
adaptable to them.
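As a minimal sketch of the SIR idea described above, the following Python steps a discrete-time SIR model forward one day at a time. The transmission rate `beta`, the recovery rate `gamma`, the population size, and the number of days are illustrative assumptions, not values from this study.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model with daily steps.

    beta and gamma are hypothetical per-day transmission and recovery rates.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population  # S -> I
        new_recoveries = gamma * i                  # I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


history = simulate_sir(population=10_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=120)
peak_day, peak_state = max(enumerate(history), key=lambda item: item[1][1])
print(f"Peak infections on day {peak_day}: {peak_state[1]:.0f}")
```

Individuals only move between groups, so the three group sizes always sum to the total population.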

Random Forest Modeling and COVID-19
• Random forest is a machine learning algorithm commonly used for prediction and
variable selection.
• Random forest has been used frequently in modeling COVID-19 patient outcomes
as well as ranking the importance of various sociodemographic, geographic,
socioeconomic, and health variables.8,9,10,11
• It is among the most popular and reliable machine learning algorithms for
predicting health outcomes.12,13,14

Quasi-Binomial Model and COVID-19
• Quasi-binomial logistic regression allows for fitting a model while taking
dispersion into account.
• COVID-19 mortality data have been used to show that the quasi-binomial model
is effective for analyzing overdispersed data.15
• It has been frequently used in predictive analyses of COVID-19 spread and
mortality, as well as in studies of pandemic behaviors.16,17,18
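One common way to quantify the dispersion mentioned above is the Pearson dispersion estimate, which compares observed counts with the counts a plain binomial model expects. The sketch below uses made-up county counts and an intercept-only fit (all names and numbers are hypothetical). A value well above 1 signals overdispersion, which a quasi-binomial model accounts for by inflating the standard errors by the square root of the dispersion.

```python
def pearson_dispersion(cases, populations, fitted_probs, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi_square = 0.0
    for y, n, p in zip(cases, populations, fitted_probs):
        expected = n * p
        variance = n * p * (1.0 - p)  # variance under a plain binomial model
        chi_square += (y - expected) ** 2 / variance
    residual_df = len(cases) - n_params
    return chi_square / residual_df


# Hypothetical counts for five counties and an intercept-only fit.
cases = [120, 310, 95, 400, 260]
populations = [1000, 1200, 800, 1500, 900]
pooled_rate = sum(cases) / sum(populations)
fitted = [pooled_rate] * len(cases)

phi = pearson_dispersion(cases, populations, fitted, n_params=1)
print(f"Estimated dispersion: {phi:.1f}")
```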

Distribution of COVID-19 Cases (Y1)
• Slope County was the least affected by COVID-19 between 2020 and 2022.
• The eight counties shown in light green (Rolette, Eddy, Cass, Stutsman,
Burleigh, Morton, Hettinger and Stark) were the most affected by COVID-19.

Distribution of COVID-19 Cases Resulting in Hospitalization (Y2)
• The percent of the population hospitalized due to COVID-19 was less than 4
percent for all ND counties.
• The five counties shown in light green (Sheridan, Grant, Sioux, Emmons and
McIntosh) had the highest percent of COVID-19 hospitalizations.

Distribution of COVID-19 Cases that Resulted in Deaths (Y3)
• The percentage of the population that died due to COVID-19 was less than 1
percent for all ND counties.
• The two counties shown in light green (Pierce and Dickey) had the highest
percentage of COVID-19 deaths.
Note: COVID-19 deaths in Slope County are unknown.

Data Acquisition
The data used for this work come from several different sources:
1. North Dakota Dept. of Health (NDDoH)
• A dataset with information about 234,998 individuals who were diagnosed with
COVID-19 between March 11th, 2020 and February 13th, 2022 was provided by
NDDoH. The dataset had information about demographics, vaccination status,
self-reported underlying diseases, and county of residency of individuals who
had or probably had COVID-19.
• Additionally, a two-class county categorization (Rural or Urban) and the total
number of COVID-19 tests per county from March 11, 2020 to February 13,
2022 (the sum of PCR tests and rapid antigen tests) were obtained from
NDDoH’s publicly available data.

Data Acquisition
2. United States Census Bureau
• The population estimates from the 2020 Decennial Census were bridged
with race categories and the vintage 2020 population estimates. These
datasets were used to obtain county level estimates of total population
and population by gender, race, and age range.
• The 2020 Decennial Census state redistricting summary file was used to
obtain the total group quarters population. This population includes
persons living in both institutionalized and non-institutionalized group
quarters.
• Current county land areas were obtained from the US Census Bureau.
• The Census was also the source for three economic variables (per capita
income, median household income and median family income).

Data Acquisition
3. 2020 North Dakota Behavioral Risk Factor Surveillance System (BRFSS) data
• Per-county weighted estimates of four comorbid conditions (asthma, chronic
lung disease, diabetes, and angina or coronary heart disease).
4. Centers for Disease Control and Prevention (CDC)
• The number of individuals who received COVID-19 vaccines per county
between March 11, 2020 and February 13, 2022 was obtained from the CDC’s
publicly available data.
5. Other
• The percentage of the adult population (age 18 and older) reporting a body
mass index (BMI) greater than or equal to 30 kg/m² (age-adjusted) in 2019 was
obtained from the County Health Rankings website.19

Materials and Methods
• Case data from NDDoH were cleaned, and 223,349 observations were used,
covering each individual’s first known COVID-19 infection.
• Using various data sources, general population variables and variables for the
COVID-19 infected population were calculated (see next slide).
• Descriptive statistics were obtained for the dependent and independent variables.
• Random forest models were produced to predict the proportion of the county level
population with confirmed and probable COVID-19 cases (Y1), hospitalization (Y2),
and death (Y3).
• The models were produced using twenty-six variables.
• Prediction accuracies as well as the most important variables for each model are
reported.
• Finally, quasi-binomial models were produced. They were used to investigate the
impact of the independent variables on the dependent variables.

Variables Used
General Population
1. Population demographics (gender,
age, race)
2. Comorbidities
3. County and place of residence
4. Population density
5. COVID-19 tests performed
6. COVID-19 vaccination status
7. Household and per-capita income
variables
Population with COVID-19
1. Population demographics (gender,
age, race) of those with COVID-19
2. Comorbidities of those with COVID-
19
3. COVID-19 vaccination status of
those with COVID-19
4. Place of residence of those with
COVID-19

How does random forest work?
• Random forest is a supervised machine learning algorithm.
• It builds a number of decision trees on bootstrapped samples of the dataset.
• It uses averaging to improve prediction accuracy and control overfitting.
• When building these trees, a random sample of m predictors is chosen as split
candidates from the full set of p predictors (usually m is approximately sqrt(p)).
• Random forest makes it easy to evaluate variable importance.
• The variable importance feature of random forest modelling serves as a source
for variable selection for simpler models.
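The steps above can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the software used for this study: each tree is a depth-one regression tree (a stump), so the random subset of m predictors is drawn once per tree, whereas full implementations grow deeper trees and redraw the m candidates at every split. The toy data and all function names are hypothetical.

```python
import math
import random


def fit_stump(rows, targets, feature_ids):
    """Find the best single split (lowest squared error) among candidate features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left_mean, right_mean)
    if best is None:  # no valid split: always predict the sample mean
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]


def fit_forest(rows, targets, n_trees, rng):
    p = len(rows[0])
    m = max(1, round(math.sqrt(p)))  # number of split candidates, about sqrt(p)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature_ids = rng.sample(range(p), m)              # random predictor subset
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample], feature_ids))
    return forest


def predict(forest, row):
    preds = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return sum(preds) / len(preds)  # average over all trees


rng = random.Random(0)
rows = [[rng.random() for _ in range(4)] for _ in range(60)]
targets = [10 * row[0] + rng.gauss(0, 0.5) for row in rows]  # only feature 0 matters
forest = fit_forest(rows, targets, n_trees=200, rng=rng)
low = predict(forest, [0.1, 0.5, 0.5, 0.5])
high = predict(forest, [0.9, 0.5, 0.5, 0.5])
print(f"prediction at x0=0.1: {low:.2f}, at x0=0.9: {high:.2f}")
print("m for p=26 predictors:", round(math.sqrt(26)))
```

With p = 26 predictors the square-root rule gives m = 5.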

Source: 20

How does quasi-binomial model work?
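In its standard textbook form (assumed here; the slides give no formula), a quasi-binomial model relates the expected proportion in county i to the predictors through a logit link and lets the variance exceed the binomial variance by a dispersion factor. The symbols below are notation introduced for this sketch: Y_i is the observed proportion, n_i the county population, x_ij the predictors, and phi the dispersion parameter.

```latex
\begin{aligned}
\operatorname{logit}(\mu_i) &= \log\frac{\mu_i}{1-\mu_i} = \beta_0 + \sum_{j} \beta_j x_{ij}, \\
\operatorname{Var}(Y_i) &= \phi \, \frac{\mu_i (1-\mu_i)}{n_i}.
\end{aligned}
```

Setting phi = 1 recovers the ordinary binomial model, and exp(beta_j) is the multiplicative change in the odds for a one-unit increase in x_j with the other predictors held fixed.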

Descriptive Statistics for the Dependent Variables and Categorical Variable
Variables | Counties | N | N Missing | Min | Mean* | Median | Max | Std Dev
Y1 - Percent of Population with COVID-19 | All | 53 | 0 | 8.2 | 25.4 | 25.7 | 37.8 | 5.3
Y2 - Percent of Population who was Hospitalized with COVID-19 | All | 53 | 0 | 0.1 | 1.1 | 1.1 | 2.4 | 0.5
Y3 - Percent of Population who Died with COVID-19 | All | 52 | 1 | 0.1 | 0.4 | 0.3 | 0.8 | 0.2
Y1 - Percent of Population with COVID-19 | Rural | 45 | 0 | 8.2 | 24.5 | 25.3 | 37.8 | 5.1
Y2 - Percent of Population who was Hospitalized with COVID-19 | Rural | 45 | 0 | 0.1 | 1.2 | 1.1 | 2.4 | 0.5
Y3 - Percent of Population who Died with COVID-19 | Rural | 44 | 1 | 0.1 | 0.4 | 0.3 | 0.8 | 0.2
Y1 - Percent of Population with COVID-19 | Urban | 8 | 0 | 25.5 | 30.5 | 30.5 | 33.8 | 2.8
Y2 - Percent of Population who was Hospitalized with COVID-19 | Urban | 8 | 0 | 0.7 | 1.0 | 1.0 | 1.6 | 0.3
Y3 - Percent of Population who Died with COVID-19 | Urban | 8 | 0 | 0.2 | 0.3 | 0.3 | 0.5 | 0.1
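Urban counties had a higher mean percent of the population with COVID-19 than rural
counties (30.5 vs 24.5), while mean hospitalization and death percentages were slightly
higher in rural counties (1.2 vs 1.0 and 0.4 vs 0.3).
Note: Statistics were calculated for each column independently without regard for missing values in
other columns.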
*See graph in the next slide

*Y1 vs Y2 vs Y3 and Rural vs Urban
[Chart: Averages of Dependent Variables]
Mean of All Counties: Y1 25.4, Y2 1.1, Y3 0.4
Mean of Rural Counties: Y1 24.5, Y2 1.2, Y3 0.4
Mean of Urban Counties: Y1 30.5, Y2 1.0, Y3 0.3
Y1 = Percent of the Population with COVID-19; Y2 = Percent of Population Hospitalized with COVID-19; Y3 = Percent of Population that Died with COVID-19

Descriptive Statistics for the Numerical Independent Variables
Note 1: Statistics were calculated for each column independently without regard for missing values in other columns.
Note 2: Variable names shown in black are measures of a general population characteristic. Variable names shown in blue are measures of the population with COVID-19.
Variable # | Variable Name | N | N Missing | Min | Mean | Median | Max | Std Dev
X1 | Percent Difference in the Female and Male Population | 53 | 0 | -8.0 | -2.7 | -2.0 | 2.0 | 2.6
X2 | Percent Difference in the Female and Male Population with COVID-19 | 53 | 0 | -0.5 | 4.1 | 4.1 | 10.6 | 2.2
X3 | Percent of the Population 17-years-old and younger | 53 | 0 | 17.7 | 23.3 | 22.3 | 36.0 | 3.9
X4 | Percent of the Population between 18- and 34-years-old | 53 | 0 | 12.7 | 18.8 | 17.1 | 35.1 | 4.8
X5 | Percent of the Population between 35- and 64-years-old | 53 | 0 | 29.9 | 35.9 | 36.0 | 39.4 | 2.0
X6 | Percent of the Population 65-years-old and older | 53 | 0 | 9.0 | 22.1 | 23.2 | 34.0 | 6.1
X7 | Percent of the Population with COVID-19 17-years-old and younger | 53 | 0 | 6.2 | 18.4 | 19.0 | 40.4 | 6.0
X8 | Percent of the Population with COVID-19 between 18- and 34-years-old | 53 | 0 | 8.6 | 32.6 | 32.7 | 43.8 | 7.1
X9 | Percent of the Population with COVID-19 between 35- and 64-years-old | 53 | 0 | 9.4 | 28.7 | 28.8 | 38.0 | 5.7
X10 | Percent of the Population with COVID-19 65-years-old and older | 53 | 0 | 6.1 | 21.3 | 21.5 | 30.9 | 5.3
X11 | Percent of the Population who is Black | 53 | 0 | 0.1 | 1.3 | 0.8 | 6.4 | 1.3
X12 | Percent of the Population with COVID-19 who is Black | 53 | 0 | 0.0 | 21.1 | 17.4 | 166.7 | 25.3
X13 | Percent of the Population who has Two or More Races | 53 | 0 | 0.4 | 1.9 | 1.9 | 3.9 | 0.8
X14 | Percent of the Population with COVID-19 who has Two or More Races | 53 | 0 | 2.3 | 25.3 | 21.8 | 100.0 | 17.3
X15 | Percent of the Population who is Asian | 53 | 0 | 0.0 | 0.9 | 0.6 | 4.3 | 0.9
X16 | Percent of the Population with COVID-19 who is Asian | 52 | 1 | 0.0 | 16.6 | 13.7 | 57.1 | 15.1
X17 | Percent of the Population who is Native Hawaiian or Pacific Islander | 53 | 0 | 0.0 | 0.1 | 0.0 | 0.2 | 0.1
X18 | Percent of the Population with COVID-19 who is Native Hawaiian or Pacific Islander | 36 | 17 | 0.0 | 29.7 | 8.1 | 300.0 | 56.9
X19 | Percent of the Population who is American Indian | 53 | 0 | 0.6 | 7.1 | 2.0 | 82.4 | 16.9
X20 | Percent of the Population with COVID-19 who is American Indian | 53 | 0 | 0 | 17.6 | 15.4 | 56.5 | 12.0
X21 | Percent of the Population who is White | 53 | 0 | 12.9 | 88.8 | 94.1 | 98.2 | 17.1
X22 | Percent of the Population with COVID-19 who is White | 53 | 0 | 6.7 | 18.7 | 18.7 | 25.8 | 4.5
X23 | Population Density (people per square mile) | 53 | 0 | 0.6 | 9.4 | 3.8 | 104.2 | 17.3
X24 | COVID-19 Tests performed per Capita | 53 | 0 | 0.9 | 4.4 | 4.2 | 8.5 | 1.7
X25 | Percent of Individuals with COVID-19 within those who Reside in Congregate Settings | 44 | 9 | 8.9 | 48.7 | 52.3 | 98.3 | 18.8
X26 | Percent of Total Population who live in Institutionalized and Not-Institutionalized Group Quarters | 53 | 0 | 0.0 | 2.7 | 2.3 | 9.1 | 2.1
X27 | Percent of Individuals within each County with Asthma | 53 | 0 | 0.0 | 9.7 | 8.9 | 37.7 | 7.1
X28 | Percent of Individuals within each County with Diabetes | 53 | 0 | 0.0 | 12.7 | 11.6 | 31.4 | 8.0
X29 | Percent of Individuals within each County with Lung Disease | 53 | 0 | 0.0 | 4.9 | 5.0 | 14.7 | 3.5
X30 | Percent of Individuals within each County with Angina or Coronary Heart Disease | 53 | 0 | 0.0 | 5.7 | 4.3 | 25.1 | 4.7
X31 | Percent of the Total Population with COVID-19 and Asthma | 53 | 0 | 0.1 | 0.7 | 0.7 | 1.4 | 0.2
X32 | Percent of the Total Population with COVID-19 and Diabetes | 53 | 0 | 0.5 | 1.6 | 1.5 | 3.5 | 0.5
X33 | Percent of the Total Population with COVID-19 and Chronic Lung Disease | 53 | 0 | 0.3 | 1.5 | 1.4 | 2.6 | 0.4
X34 | Percent of the Total Population with COVID-19 and Cardiovascular Disease (CVD) | 53 | 0 | 0.5 | 1.5 | 1.4 | 2.5 | 0.4
X35 | Percent of the Adult Population who is Obese | 53 | 0 | 32.0 | 37.3 | 37.0 | 49.0 | 3.1
X36 | Percent of Total Population with COVID-19 who is Unvaccinated | 53 | 0 | 6.8 | 20.7 | 21.3 | 28.6 | 4.0
X37 | Percent of Total Population who Received 1-Dose of the COVID-19 Vaccine | 53 | 0 | 14.3 | 52.1 | 54.1 | 85.2 | 14.0
X38 | Percent of Total Population with COVID-19 who Received 1-Dose of the COVID-19 Vaccine | 53 | 0 | 0.1 | 0.4 | 0.4 | 1.2 | 0.2
X39 | Percent of Total Population who Received 2-Doses of the COVID-19 Vaccine | 53 | 0 | 11.0 | 44.8 | 46.0 | 72.3 | 12.5
X40 | Percent of Total Population with COVID-19 who Received 2-Doses of the COVID-19 Vaccine | 53 | 0 | 0.9 | 3.4 | 3.4 | 8.7 | 1.3
X41 | Percent of Total Population who Received 3-Doses of the COVID-19 Vaccine | 53 | 0 | 5.0 | 20.4 | 20.9 | 42.1 | 7.1
X42 | Percent of Total Population with COVID-19 who Received 3-Doses of the COVID-19 Vaccine | 53 | 0 | 0.2 | 0.8 | 0.7 | 3.2 | 0.5
X43 | Median Household Income in the Past 12 Months in 2020 (Inflation Adjusted Dollars) | 53 | 0 | 41,893.0 | 62,256.8 | 61,477.0 | 82,750.0 | 9,711.6
X44 | Per Capita Income in the Past 12 Months in 2020 (Inflation Adjusted Dollars) | 53 | 0 | 17,460.0 | 35,216.8 | 35,278.0 | 45,782.0 | 5,030.3
X45 | Median Family Income in the Past 12 Months in 2020 (Inflation Adjusted Dollars) | 53 | 0 | 46,071.0 | 81,682.0 | 81,439.0 | 108,816.0 | 11,899.1

Random Forest
REMINDER: 26 variables were used.
Data partitioning: Training (70%) and Testing (30%).
Number of trees: 500. Number of variables considered at each split: 5.
Model Development
Dependent Variable | Mean of squared residuals (n=37) | % Var Explained
Y1 - Percent of the Population with COVID-19 | 12.098 | 60.70
Y2 - Percent of the Population Hospitalized with COVID-19 | 0.180 | 20.01
Y3 - Percent of the Population who Died with COVID-19 | 0.026 | 3.62
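The reported sample sizes (37 training counties and 16 testing counties) are consistent with a 70/30 split of the 53 counties. A minimal sketch, assuming a simple random shuffle; the seed and the use of integer county indices are illustrative assumptions, since the slides do not state how the partition was drawn:

```python
import random


def split_indices(n_units, train_fraction, seed):
    """Shuffle unit indices and cut them into training and testing sets."""
    indices = list(range(n_units))
    random.Random(seed).shuffle(indices)
    n_train = int(n_units * train_fraction)  # 53 * 0.7 = 37.1, truncated to 37
    return indices[:n_train], indices[n_train:]


train, test = split_indices(n_units=53, train_fraction=0.7, seed=2022)
print(len(train), len(test))  # 37 16
```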

Random Forest Results
Model Evaluation
Dependent Variable | Testing Dataset - Pseudo R-square (n=16) | Full Model - Pseudo R-square (n=53)
Y1 - Percent of the Population with COVID-19 | 53.1% | 84.7%
Y2 - Percent of the Population Hospitalized with COVID-19 | 54.3% | 78.9%
Y3 - Percent of the Population who Died with COVID-19 | 14.6% | 61.4%
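The slides do not define pseudo R-square. A common choice for regression forests, assumed here, is one minus the ratio of the residual sum of squares to the total sum of squares. The observed and predicted values below are made up for illustration.

```python
def pseudo_r_squared(observed, predicted):
    """Return 1 - SSE/SST, expressed as a percentage."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 100.0 * (1.0 - sse / sst)


observed = [20.0, 25.0, 30.0, 35.0]   # hypothetical percent with COVID-19
predicted = [22.0, 24.0, 29.0, 33.0]  # hypothetical model output
print(f"{pseudo_r_squared(observed, predicted):.1f}%")  # 92.0%
```

Under this definition the value drops to 0% when the model does no better than predicting the mean, and it can be negative when the model does worse.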

Variable Importance for COVID-19 Cases (Y1)
Variable Importance - Random Forest - Y1
Variable | % Increase in MSE | Increase in Node Purity
COVID19TestsPerformed_perCapita | 9.64 | 69.01
PercentTotalPopulation_LiveinInstitutandNotInstitutionilizedGroupQuarters | 7.90 | 56.07
PopulationDensity_in_PeopleperSquaremiles | 7.51 | 73.13
PercentTotalPopulation_COVID19Vaccine_1Dose | 6.87 | 74.98
PercentIndividuals_withinCounty_withAsthma | 6.58 | 28.26
PercentTotalPopulation_COVID19Vaccine_2Doses | 5.81 | 52.42
Percent_Population_2orMoreRaces | 5.40 | 42.85
Percent_Population_18to34yo | 5.04 | 24.24
Percent_Population_65plusyo | 4.81 | 15.60
PercentDifference_FemaleandMale_Population | 4.51 | 17.98
Percent_Population_NativeHawaiianPacificIslander | 3.75 | 26.25
Percent_Population_White | 2.97 | 13.55
Percent_Population_AmericanIndian | 2.92 | 12.15
Urban_or_Rural_County | 2.51 | 3.80
PercentIndividuals_withinCounty_withAnginaorcoronaryheartdisease | 1.91 | 14.57
PercentObeseAdults | 1.21 | 6.55
MedianHouseholdIncome_inpast12monthsin2020_InflationAdjustedDollars | 0.34 | 10.66
Percent_Population_Asian | -0.02 | 14.85
PerCapitaIncome_inpast12monthsin2020_InflationAdjustedDollars | -0.40 | 5.99
PercentIndividuals_withinCounty_withLungDisease | -0.53 | 21.65
MedianFamilyIncome_inpast12monthsin2020_InflationAdjustedDollars | -0.82 | 10.65
PercentIndivivuals_withinCounty_withDiabetes | -1.51 | 10.03
Percent_Population_Black | -2.33 | 4.46
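The "% Increase in MSE" measure is permutation-based: the values of one predictor are shuffled and the resulting change in prediction error is recorded. The sketch below illustrates the idea with a hypothetical fixed prediction function and made-up data. Random forest software typically computes this on out-of-bag samples and may rescale the result, so the numbers here are not comparable to the table. A value at or below zero, as for several variables above, means shuffling did not worsen the predictions, so the variable carries little usable signal.

```python
import random


def mse(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, rng):
    """Percent increase in MSE after shuffling one feature column."""
    baseline = mse(model, rows, targets)
    shuffled_column = [r[feature] for r in rows]
    rng.shuffle(shuffled_column)
    permuted = [r[:feature] + [v] + r[feature + 1:]
                for r, v in zip(rows, shuffled_column)]
    return 100.0 * (mse(model, permuted, targets) - baseline) / baseline


def model(row):
    """Hypothetical fitted model: uses feature 0 and ignores feature 1."""
    return 5 * row[0]


rng = random.Random(1)
rows = [[rng.random(), rng.random()] for _ in range(200)]
targets = [5 * r[0] + rng.gauss(0, 0.1) for r in rows]

importance_used = permutation_importance(model, rows, targets, 0, rng)
importance_ignored = permutation_importance(model, rows, targets, 1, rng)
print(importance_used)     # large and positive
print(importance_ignored)  # 0.0, because the model never reads feature 1
```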

Variable Importance for COVID-19 Hospitalizations (Y2)
Variable Importance - Random Forest - Y2
Variable | % Increase in MSE | Increase in Node Purity
MedianFamilyIncome_inpast12monthsin2020_InflationAdjustedDollars | 6.28 | 0.33
Percent_Population_White | 4.43 | 0.27
Percent_Population_Black | 4.06 | 0.19
Percent_Population_Asian | 2.52 | 0.19
PerCapitaIncome_inpast12monthsin2020_InflationAdjustedDollars | 1.06 | 0.16
Percent_Population_NativeHawaiianPacificIslander | 0.26 | 0.04
PercentIndivivuals_withinCounty_withDiabetes | 0.01 | 0.12
Percent_Population_2orMoreRaces | -0.29 | 0.09
PercentIndividuals_withinCounty_withLungDisease | -0.40 | 0.06
Percent_Population_0to17yo | -0.74 | 0.23
PercentIndividuals_withinCounty_withAnginaorcoronaryheartdisease | -1.97 | 0.15
Urban_or_Rural_County | -2.03 | 0.00

Variable Importance for COVID-19 Deaths (Y3)
Variable Importance - Random Forest - Y3
Variable | % Increase in MSE | Increase in Node Purity
PerCapitaIncome_inpast12monthsin2020_InflationAdjustedDollars | 7.34 | 0.05
MedianFamilyIncome_inpast12monthsin2020_InflationAdjustedDollars | 5.66 | 0.05
Percent_Population_2orMoreRaces | 1.12 | 0.01
PercentIndividuals_withinCounty_withLungDisease | 0.58 | 0.02
Percent_Population_Black | 0.28 | 0.01
Percent_Population_White | -0.71 | 0.02
PercentTotalPopulation_COVID19Vaccine_3Doses | -0.81 | 0.01
PercentIndividuals_withinCounty_withAnginaorcoronaryheartdisease | -1.02 | 0.01
Percent_Population_Asian | -1.38 | 0.01
Urban_or_Rural_County | -1.40 | 0.00
Percent_Population_NativeHawaiianPacificIslander | -1.47 | 0.01
PercentIndivivuals_withinCounty_withDiabetes | -3.92 | 0.04

Quasi-Binomial Model Results for the Analysis of Percent of
the Population with COVID-19 (Y1)

Quasi-Binomial Model Results for the Analysis of Percent of
the Population who was Hospitalized with COVID-19 (Y2)

Quasi-Binomial Model Results for the Analysis of Percent of
the Population who Died with COVID-19 (Y3)

Take Away Points
• From the random forest models:
• The pseudo R-square of the models produced using the full dataset ranged from
about 61% for COVID-19 cases resulting in death (Y3) to about 85% for COVID-19
cases (Y1).
• There was a large discrepancy between the percent of variance explained on the
training dataset and the pseudo R-square of the full model for hospitalization due
to COVID-19 (Y2) (20.01% vs 78.94%) and COVID-19 cases resulting in death (Y3)
(3.62% vs 61.38%).
• The variable importance results showed that:
• COVID-19 vaccination (1 dose), population density, and COVID-19 tests performed per
capita were important variables for predicting COVID-19 cases (Y1).
• Income was an important predictor of COVID-19 hospitalizations (Y2) and COVID-19
deaths (Y3).

Take Away Points
• From the quasi-binomial model:
• 84.53% of the variation in the percentage of the population with COVID-19 (Y1)
was explained by the independent variables.
• Based on the most important variables for Y1, and holding all other variables constant:
• Y1 increases by 51% for a one-percent increase in COVID-19 tests per capita.
• Y1 increases by 50% for a one-percent increase in population density (people per
square mile).
• Y1 decreases by 50% for a one-percent increase in the proportion of the total population
that received dose 1 of the vaccine.
• 52.85% of the variation in the percentage of the population hospitalized with
COVID-19 (Y2) was explained by the independent variables.
• 46.2% of the variation in the percentage of the population who died of
COVID-19 (Y3) was explained by the independent variables.

Next Steps
• We will reevaluate the initial variables used in the random forest models and add
further variables (e.g., distance to COVID-19 test sites and health professional
shortage areas).
• We will run partial dependence analyses for each dependent variable in random
forest to learn the direction of influence of the most important predictive variables.

Questions? Suggestions?

How to find us?
Email:
• Valquiria Quirino (valquiria.qurino@ndsu.edu)
• Avram Slone (avram.slone@ndsu.edu)
• Jerry Dogbey-Gakpetor (jerry.dogbeygakpetor@ndsu.edu)
Address:
Center for Social Research at NDSU
1616 12th Ave. N
Prairie Hall, Room 210
Fargo, ND 58102

Thank you!

Sources
1. Neil M Ferguson, Daniel Laydon, Gemma Nedjati-Gilani, et al. (2020) Impact of non-
pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare
demand. Imperial College London (16-03-2020). https://doi.org/10.25561/77482.
2. Biggs, A. T., & Littlejohn, L. F. (2021). Revisiting the initial COVID-19 pandemic
projections. The Lancet Microbe, 2(3). https://doi.org/10.1016/s2666-5247(21)00029-x
3. Ansari, R. M., & Baker, P. (2021). Identifying the predictors of COVID-19 infection
outcomes and development of prediction models. Journal of Infection and Public Health,
14(6), 751–756. https://doi.org/10.1016/j.jiph.2021.03.006
4. Martin-Moreno, J. M., Alegre-Martinez, A., Martin-Gorgojo, V., Alfonso-Sanchez, J. L.,
Torres, F., & Pallares-Carratala, V. (2022). Predictive models for forecasting public health
scenarios: Practical experiences applied during the first wave of the COVID-19 pandemic.
International Journal of Environmental Research and Public Health, 19(9), 5546.
https://doi.org/10.3390/ijerph19095546

Sources
5. Chintalapudi, N., Battineni, G., & Amenta, F. (2020). COVID-19 virus outbreak
forecasting of registered and recovered cases after sixty day lockdown in Italy: A Data
Driven Model Approach. Journal of Microbiology, Immunology and Infection, 53(3), 396–
403. https://doi.org/10.1016/j.jmii.2020.04.004
6. Shankar, S., Mohakuda, S. S., Kumar, A., Nazneen, P. S., Yadav, A. K., Chatterjee, K., &
Chatterjee, K. (2021). Systematic review of predictive mathematical models of COVID-19
epidemic. Medical Journal Armed Forces India, 77.
https://doi.org/10.1016/j.mjafi.2021.05.005
7. Kermack, W. O., & McKendrick, A. G. (1927). A contribution to the mathematical theory
of epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a
Mathematical and Physical Character, 115(772), 700–721.
https://doi.org/10.1098/rspa.1927.0118
8. Wang, J., Yu, H., Hua, Q., Jing, S., Liu, Z., Peng, X., Cao, C., & Luo, Y. (2020). A descriptive
study of random forest algorithm for predicting COVID-19 patients outcome. PeerJ, 8.
https://doi.org/10.7717/peerj.9945

Sources
9. Grekousis, G., Feng, Z., Marakakis, I., Lu, Y., & Wang, R. (2022). Ranking the
importance of demographic, socioeconomic, and underlying health factors on US
COVID-19 deaths: A geographical random forest approach. Health & Place, 74,
102744. https://doi.org/10.1016/j.healthplace.2022.102744
10. Gangloff, C., Rafi, S., Bouzillé, G., Soulat, L., & Cuggia, M. (2021). Machine
learning is the key to diagnose COVID-19: A proof-of-concept study. Scientific
Reports, 11(1). https://doi.org/10.1038/s41598-021-86735-9
11. Zhang, X., Maggioni, V., Houser, P., Xue, Y., & Mei, Y. (2022). The impact of
weather condition and social activity on COVID-19 transmission in the United
States. Journal of Environmental Management, 302, 114085.
https://doi.org/10.1016/j.jenvman.2021.114085
12. Alrehaili, Meaad, & Assiri, Fatmah. (2022). Development of Ensemble Machine
Learning Model to Improve COVID-19 Outbreak Forecasting. Jordanian Journal of
Computers and Information Technology, 8(2).

Sources
13. Wang, J., Yu, H., Hua, Q., Jing, S., Liu, Z., Peng, X., Cao, C., & Luo, Y.
(2020). A descriptive study of random forest algorithm for predicting
COVID-19 patients outcome. PeerJ, 8. https://doi.org/10.7717/peerj.9945
14. Iwendi, C., Bashir, A. K., Peshkar, A., Sujatha, R., Chatterjee, J. M.,
Pasupuleti, S., Mishra, R., Pillai, S., & Jo, O. (2020). COVID-19 patient health
prediction using boosted random forest algorithm. Frontiers in Public
Health, 8. https://doi.org/10.3389/fpubh.2020.00357
15. Shoukri, M. M., & Aleid, M. M. (2022). Quasi-binomial regression model
for the analysis of data with extra-binomial variation. Open Journal of
Statistics, 12(01), 1–14. https://doi.org/10.4236/ojs.2022.121001
16. Brainard, J., Rushton, S., Winters, T., & Hunter, P. R. (2020). Introduction
to and spread of COVID-19-like illness in care homes in Norfolk, UK. Journal
of Public Health, 43(2), 228–235. https://doi.org/10.1093/pubmed/fdaa218

Sources
17. Oliver, N., Barber, X., Roomp, K., & Roomp, K. (2020). Assessing the
impact of the COVID-19 pandemic in Spain: Large-scale, online, self-
reported population survey. Journal of Medical Internet Research, 22(9).
https://doi.org/10.2196/21319
18. Nordsci Conference Proceedings 2021 book 1, volume 4. (2021).
https://doi.org/10.32008/nordsci2021/b1/v4
19. How healthy is your county?: County Health Rankings. County Health
Rankings & Roadmaps. (n.d.). Retrieved September 14, 2022, from
http://www.countyhealthrankings.org/
20. Introduction to random forest in machine learning. Section. (n.d.).
Retrieved September 14, 2022, from https://www.section.io/engineering-
education/introduction-to-random-forest-in-machine-learning/
