Multiple regression, Covid mobility, and Covid-19 policy recommendation

Mobility Tracking During Covid-19:
A Multiple Regression Analysis
Kan Yuenyong

Presented: Wednesday, September 8, 2021, Revised: Thursday, September 23, 2021
GSPA
NIDA DA8120 Quantitative Research II

Why This Paper?
• Multiple regression analysis and Covid-19 policy are on
the contemporary agenda

• It demonstrates how to use Python for data wrangling
and R for statistical analysis, producing work that can
be published in a standard academic journal

• The data source is Descartes Labs, a spinoff startup
from the Los Alamos National Laboratory

• Open data, accessible at
http://github.com/descarteslabs/DL-COVID-19

• A chance to compare: (1) Apple mobility
trend report, (2) Google community reports, (3) Twitter
mobility report, and (4) Descartes Labs’ mobility
report

What is Descartes Labs?
• Descartes Labs is building a digital twin of the world by applying machine learning to satellite imagery and
other massive data sets, such as weather data, pricing and customer data. The solution is based in the
cloud, which means it can scale storage for the massive data sets, and scale compute capability to enable
analysis results and data to be returned more quickly.

• The Descartes Labs data refinery offers geographic data including the entire library of satellite data from
the NASA Landsat and ESA Sentinel missions, the entire Airbus OneAtlas catalog, and NOAA’s Global
Surface Summary of the Day weather dataset. The data has been combined and cleaned, so it is ready for
machine learning analysis.

• Looking ahead, the company is building what it describes as a “digital twin” of the earth, the idea being
that in doing so it can better model the imagery that it ingests and link up data from different regions more
seamlessly (since, after all, a climatic event in one part of the world inevitably impacts another). Notably,
“digital twinning” is a common concept that we see applied in other AI-based enterprises to better predict
activity: this is the approach that, for example, Forward Networks takes when building models of an
enterprise’s network to determine how apps will behave and identify the reasons behind an outage.

• Extracting, managing, and predicting from genuine big data, online and in real time

A comparison between mobility data
• Human mobility data provide valuable insight into how
we adjust our travel behaviors during the COVID-19
pandemic.

• Human mobility records from Descartes Labs, Apple,
Google, and Twitter are compared.

• Multi-source mobility datasets capture the general
impact of the COVID-19 pandemic on mobility in the U.S.
well, but present unique and even contrasting
characteristics.

• The proposed responsive index quantifies the level of
mobility-based reaction in response to the COVID-19
pandemic

• All selected mobility datasets suggest a statistically
significant positive correlation between the responsive
index and median income at the U.S. county level.

Mobility data comparison on four platforms

Research Question -> Policy Recommendation
• Is lockdown policy relevant to controlling the Covid-19 outbreak?

• Do people stay home voluntarily, because of the government’s order, or both?

• How can we track people’s mobility and activity? And with what tool?

• What model explains lockdown policy?

• Evidence-based Policy (EBP) recommendation for the Pincer tactic: (1) a (quasi)
lockdown to slow the spread + (2) vaccination of at least 70% of the total population to
build herd immunity; optimizability between national health security and economic security

• Next stage = Crossing the Rubicon (Open the country + New normal health policy to
contain the Covid-19 + Full economic stimulus campaign)

Thailand Moving Average Trend in Google mobility report
Panels: data as of early July 2021; data as of September 23, 2021
Orange = 7-day moving average; Dark Blue = 20-day moving average

The original code for the Google mobility report in Thailand is on my Kaggle:

https://www.kaggle.com/kanyuenyong/covid-19-community-mobility-reports-in-thailand

The updated version, i.e. with the moving averages, is on my local Anaconda-Jupyter platform.
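
The 7-day and 20-day moving averages shown in the chart can be computed along the following lines. This is only a minimal sketch with pandas, not the Kaggle notebook itself: the column names ("date", "retail_change"), the helper add_moving_averages, and the synthetic values are all assumptions made for illustration.

```python
import pandas as pd


def add_moving_averages(df, value_col, short_window=7, long_window=20):
    """Return a date-sorted copy of df with two rolling-mean columns added."""
    out = df.sort_values("date").reset_index(drop=True)
    # The first (window - 1) rows stay NaN because the window is not yet full.
    out[f"ma_{short_window}"] = out[value_col].rolling(short_window).mean()
    out[f"ma_{long_window}"] = out[value_col].rolling(long_window).mean()
    return out


# Synthetic daily series standing in for one mobility category (hypothetical values).
dates = pd.date_range("2021-06-01", periods=30, freq="D")
demo = pd.DataFrame({"date": dates, "retail_change": [float(-i) for i in range(30)]})

smoothed = add_moving_averages(demo, "retail_change")
print(smoothed.tail(3))
```

The short window reacts quickly to recent changes, while the long window shows the slower trend; plotting both columns against "date" gives the two lines described above.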

The recent available vaccine technologies
Covid-19/Operation Warp Speed gao-21-319.pdf

Researchers discover hidden SARS-CoV-2 'gate' that opens to allow COVID infection

https://phys.org/news/2021-08-hidden-sars-cov-gate-covid-infection.html
“N343”, a possible key to pave the way for “universal”
Covid-19 therapeutic and vaccine

https://www.bangkokbiznews.com/news/detail/954822
Thailand Self-reliance Vaccine Technology

https://www.bangkokpost.com/thailand/general/2186103/bangkok-could-open-by-nov-1-says-ccsa

Evidence-Based Policy
The latest development in policy science can be called “evidence-based policy” (EBP). It focuses on systems thinking that “recognizes the
world as a complex system composed of a large number of influences that, for the most part, are interacting simultaneously, rather than focus
on component parts in isolation”, and thus on the causal linkage between evidence and relevant knowledge as well as between evidence and
short-term, intermediate, and long-term policy outcomes. EBP has been applied largely in health policy and medicine, for example by the Cochrane
Collaboration (C1) and the Campbell Collaboration (C2). Both C1 and C2 use research synthesis and meta-analysis to assess causally
relevant outcomes as well as medical interventions (Dunn, 2018: 41).

https://covid19scenariomodelinghub.org/viz.html
Scenario Projection with different variables

Possible Vaccine Approval (scenario)
https://www.covid19-predictions.org/?page=editparameters

https://www.scmi.de/en/blog/item/265-scenarios-corona-epidemic;

https://www.atlanticcouncil.org/in-depth-research-reports/2025-post-covid-scenarios-latin-america-and-the-caribbean/#scenarios

Steps in Data Wrangling
• The overall process of extracting the primary data (State, Date, and Mobility) involved using SQL queries to filter the
Descartes Labs data set into groupings for the individual states of California, Florida, New York, Pennsylvania, and Texas.

• Another partition was applied on the date, in the form of a range, to keep the statewide data consistent from March 10,
2020, to May 28, 2020. Once the data had been filtered under these two constraints, an API call to
www.data.world was created so that the data could be read and manipulated further in Python (Python 3.8.1). A Python
script then preprocessed the dataset by organizing the filtered data into data frames.

• Additionally, to construct a variable for the effect or contribution of government-imposed restrictions, a logical comparison was
implemented to make a binary coded variable that takes the value 0 (no restriction present) or 1 (restriction
present) based upon the respective date values in the Tracking Involuntary Government Restrictions (TIGR)
Dataset.

• The Cases and Deaths data were compiled from JHU CSSE, Worldometer, and IHME and integrated into the same Python script,
adding on to the initial data frame.

• Random sample from the source data: a random subset of the data was taken for each of the individual states (20%) via
Python’s random sample() function and flattened as averages to yield distinctive data records for each date in the range

• Finally, the data was converted from a DataFrame to CSV, which enabled direct migration to RStudio (Version
3.6.3) for statistical modeling and analysis. A variety of packages were used in RStudio, namely broom for
summarizing the model results, ggplot2 as a graphical visualization tool, readr to process the CSV data file, dplyr for data
manipulation, lindia [12] for creating regression diagnostic plots and verifying linear model assumptions, and knitr [13] for
printing and exporting the results of our analyses.
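
The wrangling steps above can be sketched in pandas as follows. This is a hedged illustration rather than the script used in the study: the column names ("state", "county", "date", "mobility"), the helper functions, the synthetic rows, and the restriction start dates are all hypothetical, and the restriction flag here assumes a single start date per state, whereas the TIGR dataset records actual dates. Merging the cases and deaths data is omitted.

```python
import pandas as pd

STATES = ["California", "Florida", "New York", "Pennsylvania", "Texas"]
START, END = "2020-03-10", "2020-05-28"


def filter_mobility(df):
    """Keep only the five study states within the study date range (inclusive)."""
    in_states = df["state"].isin(STATES)
    in_range = df["date"].between(pd.Timestamp(START), pd.Timestamp(END))
    return df[in_states & in_range].copy()


def add_restriction_flag(df, restriction_start):
    """Add a binary column: 1 if a restriction is in force on that date, else 0.

    restriction_start maps state -> first restricted date (hypothetical input).
    """
    out = df.copy()
    starts = out["state"].map(restriction_start)
    out["restriction"] = (out["date"] >= starts).astype(int)
    return out


def sample_and_average(df, frac=0.2, seed=0):
    """Draw a 20% random sample per state, then average per state and date."""
    sampled = df.groupby("state").sample(frac=frac, random_state=seed)
    return sampled.groupby(["state", "date"], as_index=False)[
        ["mobility", "restriction"]
    ].mean()


# Synthetic county-level rows (hypothetical values), including one non-study state.
dates = pd.date_range("2020-03-01", "2020-06-05", freq="D")
rows = [
    {"state": s, "county": c, "date": d, "mobility": 100.0 - 0.5 * i}
    for s in STATES + ["Ohio"]
    for c in range(10)
    for i, d in enumerate(dates)
]
raw = pd.DataFrame(rows)
restriction_start = {s: pd.Timestamp("2020-03-20") for s in STATES}  # hypothetical

tidy = sample_and_average(add_restriction_flag(filter_mobility(raw), restriction_start))
csv_text = tidy.to_csv(index=False)  # CSV text of the kind handed over to R
print(tidy.head())
```

Averaging after sampling yields at most one record per state and date, which is the shape the regression step expects; dates for which no row happened to be sampled are simply absent.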

The Mobility Model
• Y = β0 + β1X1 + β2X2 + β3X3 + ε

• Where Y = Mobility Index, X1 = Restriction (Policy), X2 = Cases, X3 =
Deaths, and ε has a normal distribution with zero mean and unit variance
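
As an illustration of how the coefficients β0 to β3 can be estimated, the sketch below fits the model by ordinary least squares with NumPy. It is not the R analysis from the study: the data are simulated, the "true" coefficients are arbitrary assumptions chosen only to show that the fit recovers them, and the helper fit_ols is hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares estimates [b0, b1, ..., bk] for y = b0 + X @ b + error."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Simulated predictors (hypothetical values, not the study data).
rng = np.random.default_rng(0)
n = 500
restriction = rng.integers(0, 2, size=n)  # X1: binary policy variable
cases = rng.uniform(0, 10, size=n)        # X2: cases (arbitrary units)
deaths = rng.uniform(0, 1, size=n)        # X3: deaths (arbitrary units)
eps = rng.normal(0.0, 1.0, size=n)        # error term: zero mean, unit variance

true_beta = np.array([100.0, -20.0, -1.5, -3.0])  # assumed for the simulation only
X = np.column_stack([restriction, cases, deaths])
y = true_beta[0] + X @ true_beta[1:] + eps

beta_hat = fit_ols(X, y)
print(dict(zip(["b0", "b1_restriction", "b2_cases", "b3_deaths"], beta_hat.round(2))))
```

In this setup β1 is read as the average change in the mobility index when a restriction is present, holding cases and deaths fixed.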

Future research recommendation
• To tackle highly complex data sets, a logistic
model tree (LMT), which combines logistic
regression and model tree learning in a supervised
training algorithm, will be chosen

• A quantum-based algorithm, e.g. the Deutsch-Jozsa
algorithm, a basic quantum algorithm, could be applied to a machine
learning-based logistic regression problem to handle
complex big data analytics (data from Descartes
Labs falls into this category), using the Hadamard
transformation to speed up the simulation

• Evidence-Based Policy (EBP) is a problem of policy
optimizability (between emergence/forces in complex
adaptive system) practiced in the futuristic algorithmic
governance (plus holacracy)

Q&A?
END
