Data driven forecast of the covid-19 death toll
Mohamed Bouanane
Management Consulting Director
Toulouse – France
Version of May 1, 2020
COVID-19 is an emerging pandemic infection that has spread worldwide since January 2020,
starting from Wuhan, Hubei province, in China. Many epidemiologists and mathematicians are
trying to find the most accurate model to predict the magnitude and the end of the pandemic,
and to advise governments on the best date for reopening their economies after a lock-down.
On the 22nd of March, we started estimating the evolution of the death toll for some European
countries out of personal curiosity. As the cumulative death toll followed an exponential curve,
we used a geometric sequence to estimate the next day's total death cases. The cumulative death
(CD) toll is a time-dependent function which requires a common ratio, the r-value, that evolves
each day, in order to calculate the next term. Based on observation of the countries' data, the
best function for estimating the next day's death cases would be a composite growth rate
function as follows:
$$CD_t = CD_0 \cdot (r\text{-value})^{t}$$
where t is time in days, and
$$r\text{-value} = 1 + \mathrm{Cagr}_t$$
$$\mathrm{Cagr}_{t-1} = \left(\frac{CD_{t-1}}{CD_0}\right)^{\frac{1}{t-1}} - 1$$
The compound average growth rate Cagr_t at day t is then estimated from the previous value
Cagr_{t-1} and the historic data (growth or decline) of each country, usually the three to five
previous terms (days). The forecast was first published as a PDF document on the 26th of March
and updated several times [1].
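As a rough illustration of the composite growth rate forecast described above, the following Python sketch computes Cagr from a cumulative series and projects the next day. The text does not spell out how Cagr_t is derived from Cagr_{t-1} and the last three to five days, so the extrapolation rule used here (adding the average recent day-to-day change in Cagr) is an assumption, as are the function names and the sample series.

```python
def cagr(cd, t):
    """Compound average growth rate between day 0 and day t (requires t >= 1)."""
    return (cd[t] / cd[0]) ** (1.0 / t) - 1.0


def forecast_next(cd, window=3):
    """Project the cumulative death toll one day past the end of `cd`.

    ASSUMPTION: the next Cagr is the latest Cagr plus the mean day-to-day
    change of Cagr over the last `window` days. The source only says that
    three to five previous terms are used, not how they are combined.
    """
    t = len(cd) - 1  # index of the latest observed day
    if t < 1:
        raise ValueError("need at least two observations")
    first = max(1, t - window)
    rates = [cagr(cd, i) for i in range(first, t + 1)]
    steps = [b - a for a, b in zip(rates, rates[1:])]
    drift = sum(steps) / len(steps) if steps else 0.0
    next_rate = rates[-1] + drift
    # CD_{t+1} = CD_0 * (1 + Cagr_{t+1})^(t+1)
    return cd[0] * (1.0 + next_rate) ** (t + 1)


# Made-up series growing by exactly 10% per day.
series = [100 * 1.1 ** day for day in range(6)]
print(round(forecast_next(series), 2))
```

On a perfectly geometric series the Cagr is constant, so the sketch simply continues the sequence; on real data the drift term lets the rate rise or fall with the recent trend.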
Methodology
The next step was to predict the ultimate death toll, find out when it would be reached and, if
possible, estimate the end of the pandemic. Like any epidemic infection, Covid-19 follows a
life cycle in which the daily death (DD) curve over time is bell-shaped (Figure 1), while the
cumulative death toll is S-shaped. The life cycle is composed of different phases: incubation,
spreading, acceleration, inflection, deceleration, flattening, and then ending.
Such a life cycle is very similar to that of the extraction of a finite mineral resource:
production rises gradually from zero, then grows rapidly, reaches a peak representing the
maximum production, and then falls slowly towards zero. The duration of this life cycle depends
heavily on each country's strategy to counter the epidemic, and the bell-shaped curve is often
not symmetrical, simply because the growth rate does not match the decline rate.
The Hubbert bell-shaped curve has been used to model the depletion of crude oil and to predict
its peak and its ultimately recoverable resource. Using such a curve, which is the probability
density function of a Logistic distribution (whose cumulative form is a common S-shaped curve),
to model the Covid-19 death toll will help to determine the peak and the ultimate total death
cases.
We define the following parameters as per the Hubbert’s equation:
CD(t) is cumulative death cases at day t;
UCD is the ultimate cumulative death cases;
DD(t) = d CD / dt is the daily death cases at day t;
k is a Logistic growth rate.
Then, Hubbert's polynomial equation (EQ#1) can be expressed in differential form:
$$\frac{dCD}{dt} = DD(t) = k \cdot CD(t) \cdot \left(1 - \frac{CD(t)}{UCD}\right) \tag{EQ\#1}$$
At the start of the epidemic, CD/UCD is very small, so Equation (EQ#1) reduces to DD = k*CD,
which describes exponential growth at rate k. At the end of the epidemic, CD almost equals UCD,
so Equation (EQ#1) reduces to DD = k*(UCD - CD): the remaining death cases, and hence the daily
deaths, decline exponentially.
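The two limiting regimes can be checked numerically with the standard closed-form solution of the logistic equation. The sketch below is illustrative only: k and UCD are borrowed from the caption of Figure 3, while the peak day `T_PEAK` and the function names are made-up assumptions.

```python
import math


def logistic_cd(t, ucd, k, t_peak):
    """Closed-form solution of EQ#1: cumulative deaths at day t."""
    return ucd / (1.0 + math.exp(-k * (t - t_peak)))


def daily_deaths(cd, ucd, k):
    """Right-hand side of EQ#1: DD = k * CD * (1 - CD / UCD)."""
    return k * cd * (1.0 - cd / ucd)


# k and UCD come from the caption of Figure 3; T_PEAK is a hypothetical value.
K, UCD, T_PEAK = 0.113, 262108.0, 50.0

for day in (5, 50, 95):
    cd = logistic_cd(day, UCD, K, T_PEAK)
    dd = daily_deaths(cd, UCD, K)
    # Early on DD/CD is close to k; at the peak it equals k/2; late it tends to 0.
    print(day, round(cd), round(dd), round(dd / cd, 4))
```

Well before the peak the ratio DD/CD stays close to k (exponential growth), and at the peak CD equals UCD/2, where the daily deaths reach their maximum k*UCD/4.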
Dividing Equation (EQ#1) by CD gives the second form, called the Hubbert Linearization [2]
equation (EQ#2):
$$\frac{DD(t)}{CD(t)} = k \cdot \left(1 - \frac{CD(t)}{UCD}\right) \tag{EQ\#2}$$
Equation (EQ#2) is linear in the (CD; DD/CD) plane. Consequently, a linear regression on the data
points gives the axis intercepts: the k parameter is the intercept with the Y-axis (DD/CD) at
CD=0, and the UCD value is the intercept with the X-axis (CD) at DD=0. The parameters of
Hubbert's curve can also be derived from the slope of the line, which equals -k/UCD (Figure 2).
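The regression step can be sketched in a few lines of Python. This is a minimal ordinary least-squares fit written for illustration, not the author's actual tooling; the function name and the synthetic input values are assumptions.

```python
def hubbert_linearization(cd, dd):
    """Least-squares line through the points (CD, DD/CD); returns (k, UCD)."""
    xs = list(cd)
    ys = [d / c for c, d in zip(cd, dd)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx            # equals -k / UCD under EQ#2
    k = mean_y - slope * mean_x  # Y-axis intercept (CD = 0)
    ucd = -k / slope             # X-axis intercept (DD/CD = 0)
    return k, ucd


# Synthetic points lying exactly on EQ#2 (hypothetical k and UCD).
TRUE_K, TRUE_UCD = 0.13, 250000.0
cd_points = [175000.0, 185000.0, 195000.0, 205000.0, 215000.0]
dd_points = [TRUE_K * c * (1.0 - c / TRUE_UCD) for c in cd_points]

k_fit, ucd_fit = hubbert_linearization(cd_points, dd_points)
print(round(k_fit, 4), round(ucd_fit))
```

Applying the same X-intercept rule to the regression line printed in Figure 2 (intercept 1.319E-01, slope -5.216E-07) gives roughly 252 900, in line with the UCD quoted in that figure's caption; the small gap presumably comes from rounding of the displayed coefficients.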
Hubbert's equation can be extended to the second derivative (EQ#3) by differentiating equation
(EQ#1). The left-hand term, called the decline rate, represents the relative daily increase of
the daily death toll [3]. Equation (EQ#3) is therefore also a linear function of the cumulative
death toll. In this case the Hubbert Linearization line intercepts the X-axis at half the value
of the ultimate cumulative death toll (UCD).
$$\frac{dDD}{dt} \cdot \frac{1}{DD} = k \cdot \left(1 - 2\,\frac{CD}{UCD}\right) \tag{EQ\#3}$$
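For readers who want the intermediate step, (EQ#3) follows from (EQ#1) by the product rule, using dCD/dt = DD. This is a standard derivation, written out here for convenience:

```latex
\begin{aligned}
\frac{dDD}{dt}
  &= \frac{d}{dt}\left[ k \cdot CD \cdot \left(1 - \frac{CD}{UCD}\right) \right] \\
  &= k \cdot \frac{dCD}{dt} \cdot \left(1 - \frac{CD}{UCD}\right)
     - k \cdot CD \cdot \frac{1}{UCD} \cdot \frac{dCD}{dt} \\
  &= k \cdot DD \cdot \left(1 - 2\,\frac{CD}{UCD}\right)
\end{aligned}
```

Dividing both sides by DD gives (EQ#3). The right-hand side vanishes at CD = UCD/2, which is why the line crosses the X-axis at half of UCD.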
Another way to plot the HL equation is to combine equations (EQ#2) and (EQ#3): they differ only
by a factor of two in their slopes, and their intercept with the Y-axis is the same and equal to
k. The data points of the two representations can therefore be mixed in a single representation,
the Hybrid Hubbert Linearization, by multiplying the cumulative death toll by a factor of two in
(EQ#3).
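A minimal sketch of how the two families of points could be merged is given below. The source does not say how dDD/dt is evaluated on daily data, so the central difference used here is an assumption, as are the function name and the synthetic series (k and UCD are borrowed from the caption of Figure 3, the peak day is made up).

```python
import math


def hybrid_points(cd, dd):
    """Return the (x, y) points of a hybrid Hubbert Linearization plot.

    EQ#2 points are (CD, DD/CD). EQ#3 points are (2*CD, relative change of DD),
    where the relative change is approximated by a central difference
    (ASSUMPTION: the source does not specify the discretization).
    """
    points = [(c, d / c) for c, d in zip(cd, dd)]
    for i in range(1, len(dd) - 1):
        rel_change = (dd[i + 1] - dd[i - 1]) / (2.0 * dd[i])
        points.append((2.0 * cd[i], rel_change))
    return points


# Synthetic logistic data sampled once per day around a hypothetical peak day.
K, UCD, T_PEAK = 0.113, 262108.0, 50
cd = [UCD / (1.0 + math.exp(-K * (t - T_PEAK))) for t in range(45, 56)]
dd = [K * c * (1.0 - c / UCD) for c in cd]

for x, y in hybrid_points(cd, dd):
    # Both families of points should sit close to the line y = k * (1 - x / UCD).
    print(round(x), round(y, 4), round(K * (1.0 - x / UCD), 4))
```

With noisy real data the finite-difference points scatter much more than the DD/CD points, so some smoothing of DD would probably be needed before mixing them.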
The next form of Hubbert's equation is simply the polynomial function (EQ#4) below, in the
(CD; DD) plane, where the points follow the Hubbert Parabola passing through the origin (0;0)
and the point (UCD;0).
$$DD = k \cdot CD - \frac{k}{UCD} \cdot CD^{2} \tag{EQ\#4}$$
R. Canogar [4] used the Hubbert Parabola to model oil depletion. In this paper we use the
Hubbert Parabola to model the cumulative death (CD) toll of the Covid-19 pandemic. In a first
plot, we place the points (CD; DD) and determine the polynomial regression, the Parabola, that
passes through the origin of the plane (Figure 3). The intercept of this parabola with the
X-axis gives the estimated ultimate cumulative death toll (UCD).
In a second plot, we study the evolution of the expected UCD over time. We define the function
UCD(t) as the UCD estimated at day t via Hubbert's equation (EQ#4), and place all the data
points (CD(t); UCD(t)) in the plot of Figure 4.
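Both plots can be sketched in Python as follows. This is an illustrative least-squares fit of a parabola forced through the origin, followed by a re-fit on the data available up to each day; the function names, the rescaling trick and the synthetic data are assumptions rather than the author's implementation.

```python
def fit_hubbert_parabola(cd, dd):
    """Least-squares fit of DD = a*CD + b*CD^2 (no constant term).

    Returns (k, UCD) with k = a and UCD = -a / b, or (k, None) when the
    fitted parabola does not open downwards (no X-axis crossing yet).
    """
    scale = max(cd)                    # rescale CD to keep the sums well conditioned
    xs = [c / scale for c in cd]
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t1 = sum(x * y for x, y in zip(xs, dd))
    t2 = sum(x ** 2 * y for x, y in zip(xs, dd))
    det = s2 * s4 - s3 ** 2
    a = (t1 * s4 - t2 * s3) / det      # linear coefficient (scaled units)
    b = (s2 * t2 - s3 * t1) / det      # quadratic coefficient (scaled units)
    k = a / scale
    ucd = -a * scale / b if b < 0 else None
    return k, ucd


def ucd_over_time(cd, dd, min_points=3):
    """Pairs (CD(t), UCD(t)) using only the data available up to each day t."""
    estimates = []
    for t in range(min_points, len(cd) + 1):
        _, ucd = fit_hubbert_parabola(cd[:t], dd[:t])
        estimates.append((cd[t - 1], ucd))
    return estimates


# Synthetic points lying exactly on EQ#4 (k and UCD from the caption of Figure 3).
K, UCD = 0.113, 262108.0
cd = [1000.0, 5000.0, 20000.0, 60000.0, 120000.0, 200000.0]
dd = [K * c - (K / UCD) * c ** 2 for c in cd]

print(fit_hubbert_parabola(cd, dd))
print(ucd_over_time(cd, dd))
```

On exact logistic data every UCD(t) equals the true UCD; on real data the early estimates are unstable, which is what the second plot is meant to reveal.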
At the beginning of the Covid-19 epidemic, the cumulative death toll CD(t) is very low compared
to the ultimate death toll UCD(t), so the data points (CD(t); UCD(t)) lie above the line
[UCD=2*CD] (apart from a few anomalous data points). When the epidemic reaches its peak, CD(t)
reaches half of UCD(t), and the data points (CD(t); UCD(t)) then move below the line
[UCD=2*CD]. After the peak, as the epidemic advances, the data points approach the line
[UCD=CD]: CD(t) approaches the expected UCD(t) and equals it at the end of the epidemic.
According to the Logistic model, if the point (CD(t); UCD(t)) lies below the line [UCD=2*CD],
then the time is after the peak day, i.e. CD(t) > UCD(t)/2.
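The peak criterion amounts to a one-line comparison. The sketch below applies it to the current CD and mean ultimate values reported in the table of the Results section; the function name and labels are illustrative assumptions.

```python
def epidemic_phase(cd, ucd):
    """Compare a data point (CD; UCD) with the line UCD = 2*CD."""
    return "before peak" if cd < ucd / 2 else "at or after peak"


# (current CD, mean ultimate UCD) as of 29 April, copied from the Results table.
regions = {
    "World Wide": (226470, 321590),
    "Europe": (135659, 193956),
    "North America": (66425, 86508),
}

for name, (cd, ucd) in regions.items():
    print(name, epidemic_phase(cd, ucd))
```

All three regions fall on the "at or after peak" side of the line with these figures.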
Results
It would be hazardous to make any prediction about the Covid-19 epidemic based on the number of
infected cases. Different countries follow different policies to counter the epidemic, so
infected cases are not counted in the same way everywhere: the count depends heavily on how
patients are screened. Moreover, many people are infected without knowing it because they are
paucisymptomatic (they show few symptoms). The most realistic and reliable data is the count of
death cases, provided it is reported in time and transparently.
However, estimating the end date of the epidemic would be highly risky and of little use.
In practice, we estimate the expected ultimate death toll using the model and the methodology
explained above, and give a plausible date at which it would be reached, keeping in mind that
such an estimate may change the next day, since the model is highly dependent on changes in
complex real life. Each predicted value should therefore be read with caution.
An estimate of the average, minimum and maximum ultimate death toll [i] is reported along with a
predicted date. The expected ultimate death toll is estimated from both forms of the Hubbert
equation, i.e. the Parabola and the Linearization (where appropriate), while the predicted date
is determined from a data forecast using the composite growth rate function (geometric
sequence).
As a starting point, the table below reports the predicted ultimate death toll for Europe, North
America and worldwide, although it is very difficult to make a reliable forecast for a whole
region composed of many countries following very different, sometimes even contradictory,
policies. In the near future we will focus on North American and some European countries.
According to the plot in Figure 4, all the regions (World, Europe and North America) have reached
their respective peaks, since their recent data points (CD(t); UCD(t)) lie between the two
dashed lines. However, the decrease in daily death cases has not yet stabilized, particularly in
Europe (as of 29 April).
[i] Data source: Daily updated data from Wikipedia & BNO.
[Figure 1: WW Daily death (bell-shaped) & Cumulative death toll (S-shaped). Series: WW daily deaths, moving average of WW daily deaths, WW cumulative deaths. X-axis: days 1 to 63 (22nd Feb - 27th Apr). Y-axes: daily death cases (0 to 12 000) and cumulative death cases (0 to 225 000).]
[Figure 2: Hubbert Linearization on the WW Daily death cases / Cumulative death cases (k = 13.2% & UCD = 252 826). Panel title: Hubbert Linearization for the WW Death Toll (21st - 27th April). X-axis: WW cumulative death cases CD (175 000 to 215 000). Y-axis: WW daily death cases DD/CD (1.00% to 4.00%). Regression line: f(x) = -5.216E-07 x + 1.319E-01, R² = 7.284E-01.]
[Figure 3: Hubbert Parabola on the WW Death Toll (k = 11.3% & UCD = 262 108). Panel title: Hubbert Parabola on the WW Death Toll (22nd Feb - 27th Apr). Series: WW daily deaths and their polynomial fit. X-axis: WW cumulative death cases (0 to 225 000). Y-axis: WW daily death cases (0 to 12 000). Fitted parabola: f(x) = -4.31E-07 x² + 1.13E-01 x, R² = 9.63E-01.]
[Figure 4: Estimated Ultimate Death Toll (as of 29 April 2020). Panel title: Covid-19 Expected Ultimate Death Toll vs Cumulative Deaths. Series: line U=2*CD, one unnamed series, WW Expected UCD, EU Expected UCD, NA Expected UCD. Axis ticks run from 0 to 220 000 and from 0 to 400 000.]
| Region | World Wide | Europe | North America |
|---|---|---|---|
| Current CD | 226 470 | 135 659 | 66 425 |
| Date (current CD) | 29-Apr | 29-Apr | 29-Apr |
| Max. Ultimate | 346 374 | 228 922 | 91 819 |
| Date (max.) | 11-May | > 13-May | > 11-May |
| Min. Ultimate | 296 806 | 158 990 | 81 198 |
| Date (min.) | 06-May | 04-May | > 11-May |
| Mean Ultimate | 321 590 | 193 956 | 86 508 |
| Date (mean) | 09-May | > 13-May | > 11-May |
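A quick arithmetic check on the table, offered as an observation rather than a statement of the author's method: for each region the reported mean ultimate toll coincides with the midpoint of the minimum and maximum estimates (rounded down for North America). The dictionary below simply copies the table values.

```python
# (min, max, mean) ultimate death toll per region, copied from the table above.
table = {
    "World Wide": (296806, 346374, 321590),
    "Europe": (158990, 228922, 193956),
    "North America": (81198, 91819, 86508),
}

for region, (low, high, mean) in table.items():
    midpoint = (low + high) // 2  # integer midpoint, rounded down
    print(region, midpoint, midpoint == mean)
```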
[1] M. Bouanane, "Covid-19 – Forecast for Western European Countries & USA", 26th Mar 2020.
[2] M. King Hubbert, "Techniques of Prediction as Applied to the Production of Oil and Gas", in: Saul I. Gass (ed.), Oil and Gas Supply Modeling, National Bureau of Standards Special Publication 631, Washington, National Bureau of Standards, 1982, pp. 16-141.
[3] Khebab, "A Different Way to Perform the Hubbert Linearization", 18th Aug 2006.
[4] R. Canogar, "The Hubbert Parabola", GraphOilogy, 6th Sept 2006.