Amy Ko
Barbara Kuzmak
STAT 4893W
4/19/2015
Forecasting the Population and Estimating the Population Trend
1. Background
Many research articles project each country’s population through 2050. The
projections recently issued by the United Nations suggest that “world population by
2050 could reach 8.9 billion, but in alternative scenarios could be as high as 10.6
billion or as low as 7.4 billion” (United Nations Population Fund, 2004).
The table below shows the populations of the most populated countries in 2010, based on
the UN Population Fund data (UN Population Fund 2004).
Table 1 The top 10 most populated countries in 2010 (UN Population Fund 2004)
Current Ranking Country Population (millions)
1 China 1275.2
2 India 1016.9
3 USA 285.0
4 Indonesia 211.6
5 Brazil 171.8
6 Russian Federation 145.6
7 Pakistan 142.7
8 Bangladesh 138.0
9 Japan 127.0
10 Nigeria 114.7
The table below shows the population prediction for 2050 from the United Nations
Population Fund.
Table 2 The top 10 most populated countries in 2050 (UN Population Fund)
Ranking Country Population (millions)
1 India 1531.4
2 China 1395.2
3 USA 408.7
4 Pakistan 348.7
5 Indonesia 293.8
6 Nigeria 258.5
7 Bangladesh 254.6
8 Brazil 233.1
9 Ethiopia 171.0
10 Congo, DR 151.6
Comparing the two tables, we can observe that India’s population will surpass China’s
population within 50 years. One of the most plausible factors behind this prediction
is the fertility rate. China has enforced a “one-child policy” in order to control its
population. “In 1979, the Chinese government embarked on an ambitious
program of market reform following the economic stagnation of the Cultural Revolution.
At the time, China was home to a quarter of the world's people, who were occupying just
7 percent of world's arable land. Two thirds of the population was under the age of 30
years, and the baby boomers of the 1950s and 1960s were entering their reproductive
years. The government saw strict population containment as essential to economic reform
and to an improvement in living standards. So the one-child family policy was
introduced.” (Hesketh & Zhu, 2015) The graph below shows the fertility rate in China
from 1970 to 2014 based on the UN data.
Figure 1 The fertility rate in China from 1960 to 2005 [plot not reproduced; x-axis:
Year, ticks 1970 to 2005; y-axis: Fertility rate, ticks 0 to 6]
It shows that the fertility rate dropped from 4.7 to 1.7 over 40 years, which implies
that the growth of the Chinese population is likely to slow or stagnate. Also, Japan,
the 9th most populated country in the world, drops out of the top 10 most populated
countries in the 2050 population prediction. Likewise, this can be attributed to a low
fertility rate, which has stayed between 1 and 2 since 1975 according to the UN data.
2. Objective
This research aims to project the population for every decade from 2010 to 2050
and to rank the top ten most populated countries. Time series analysis is applied as
the prediction method. The two main causes of population growth are the fertility
rate and the migration trend. The earlier part of this paper demonstrated how much the
fertility rate in China has decreased, so obtaining fertility rate data would be
essential for more accurate forecasting. The population prediction for each decade
until 2050 is followed by a GDP prediction for the top ten most populated countries.
Although the United Nations has already presented population predictions using
professional methods, I aim to apply the knowledge and technical skills that I have
gained in my time series analysis class.
3. Analysis Method
Many research papers have used time series analysis to predict growth rates and
population. “Time Series Test of Endogenous Growth Models” (Jones 1995), one of the
most cited economics articles, applied the log transformation to the data to achieve
a stationary series. Jones applied the augmented Dickey-Fuller (ADF) test to GDP per
capita from 1929 to 1987. On the other hand, “On the Appropriate Transformation
Technique and Model Selection in Forecasting Economic Time Series: An Application to
Botswana GDP Data” (Shangodoyin et al. 2010) applied an AR(1) model to predict the GDP
of Botswana. The authors of the article used the AIC, the SIC and the model residuals
to judge which model is the most precise.
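As a rough illustration of how such criteria compare models, the sketch below computes the AIC and the SIC (also known as the BIC) from a model's log-likelihood. The function names and the two log-likelihood values are hypothetical and chosen only for illustration; they are not results from any of the cited papers.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Schwarz/Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fitted models (made-up log-likelihoods, for illustration only).
n_obs = 60
candidates = {
    "AR(1)": {"loglik": -120.0, "k": 2},
    "AR(2)": {"loglik": -119.5, "k": 3},
}

scores = {
    name: (aic(m["loglik"], m["k"]), bic(m["loglik"], m["k"], n_obs))
    for name, m in candidates.items()
}

# The preferred model is the one with the smallest criterion value.
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(best_by_aic, best_by_bic)
```

Both criteria reward a higher likelihood and penalize extra parameters; the SIC penalty grows with the sample size, so it tends to favor smaller models than the AIC.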
I learned the ARIMA model and the GARCH model in the Time Series Analysis class
this semester. ARIMA stands for autoregressive integrated moving average. The model
is a generalization of the autoregressive moving average
(ARMA) model. “Classical linear regression is often insufficient for explaining all of the
interesting dynamics in time series. The introduction of correlation as a phenomenon that
may be generated through lagged linear relations leads to proposing the
autoregressive (AR) and autoregressive moving average models. Adding nonstationary
models to the mix leads to the ARIMA models.” (Shumway & Stoffer 2011) GARCH
stands for generalized ARCH. “ARCH is an acronym meaning
AutoRegressive Conditional Heteroscedasticity. In ARCH models the conditional
variance has a structure very similar to the structure of the conditional expectation in an
AR model.” (Ruppert 2011) For example, an AR(1) process has a nonconstant
conditional mean but a constant conditional variance, while an ARCH(1) process is just
the opposite. “A deficiency of ARCH(q) models is that the conditional standard deviation
process has high-frequency oscillations with high volatility coming in short bursts.
However, GARCH models permit a wider range of behavior, in particular, more
persistent volatility. Because past values of the σt process are fed back into the present
value, the conditional standard deviation can exhibit more persistent periods of high or
low volatility than seen in an ARCH process.” (Ruppert 2011) The literature on time
series analysis suggests that it is crucial to choose the model that generates the
lowest AIC and BIC, and that the best model differs from one data set to another.
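The contrast between the AR(1), ARCH(1) and GARCH models described above can be written out in standard textbook form. The symbols below (μ, φ, ω, α, β, and the unit-variance noise z_t) are notational assumptions introduced here for illustration and do not appear in the original analysis.

```latex
\begin{align*}
\text{AR(1):}\quad
  & y_t - \mu = \phi\,(y_{t-1} - \mu) + \epsilon_t,
  && \epsilon_t \sim \mathrm{WN}(0, \sigma_\epsilon^2) \\
  & E(y_t \mid y_{t-1}) = \mu + \phi\,(y_{t-1} - \mu),
  && \mathrm{Var}(y_t \mid y_{t-1}) = \sigma_\epsilon^2 \\[6pt]
\text{ARCH(1):}\quad
  & a_t = \sigma_t z_t,
  && z_t \sim \mathrm{i.i.d.}(0, 1) \\
  & E(a_t \mid a_{t-1}) = 0,
  && \sigma_t^2 = \mathrm{Var}(a_t \mid a_{t-1}) = \omega + \alpha\, a_{t-1}^2 \\[6pt]
\text{GARCH(1,1):}\quad
  & a_t = \sigma_t z_t,
  && \sigma_t^2 = \omega + \alpha\, a_{t-1}^2 + \beta\, \sigma_{t-1}^2
\end{align*}
```

In the AR(1) model only the conditional mean depends on the past, while in the ARCH(1) model only the conditional variance does. The extra term in the GARCH(1,1) variance equation feeds the previous value of the σt process back into the present value, which is the source of the more persistent volatility.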
The starting point of the analysis was obtaining population data from 1950 to 2050
for the top 20 most populated countries listed in the UN Population Fund research
paper. The research is based on the population data of 17 countries.
These plots are time series plots for the 17 selected countries from 1950 to 2010. The
Democratic Republic of the Congo, Ethiopia and Nigeria show patterns similar to the
graph of an exponential function, which implies that these countries have more rapid
growth rates than the other countries. In contrast, the growth rates of Japan and
Russia show decreasing patterns, which implies that the growth of these two countries
is slowing. In short, this observation suggests that it is important to select the
best model for each country rather than applying a single model to all 17 countries.
According to “Growth Forecasts Using Time Series and Growth Models”, “it’s difficult to
choose the “best” model for forecasting real GDP per capita” (Kraay & Monokroussos 1999).
To select the most appropriate ARMA model for each country, I took the first
difference of the log-transformed data and examined the ACF and PACF graphs.
Consequently, for nearly all the countries the ARMA model is either an AR(1) model or
an AR(2) model. In selecting the most suitable GARCH model, it turned out that the AIC
values of GARCH(2,1) are closer to zero for most countries, except for Mexico, where
GARCH(1,1) performs better. Then I calculated the error of the predicted values for
GARCH(2,1) and for the AR model. The AR model generated the lower error, so I chose
ARIMA models for my prediction. The next step was determining the ARIMA model order
for each country. The judgment was based on the behavior of the autocorrelation
function (ACF) and the partial ACF (PACF). If the PACF cuts off after lag p, the model
is AR(p). If the ACF cuts off after lag q, the model is MA(q). The table below shows
the selected model for each of the 17 countries.
Table 3 Selected model for each of the 17 countries
Model Countries
AR(1) Brazil, China, Indonesia, Japan, Mexico, Nigeria, Philippines, Russia, Turkey, Vietnam
AR(2) USA, Bangladesh, Congo DR, Egypt, Ethiopia, India
MA(2) Pakistan
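The order-identification step described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used for the analysis: the helper names are hypothetical, the PACF is computed with the Durbin-Levinson recursion, the series is simulated rather than taken from the UN data, and the 1.96/sqrt(n) band is only an approximate significance rule.

```python
import math
import random


def log_diff(series):
    """First difference of the natural log (approximate growth rate per step)."""
    logs = [math.log(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]


def acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(x, max_lag):
    """Sample partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    r = acf(x, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


def last_significant_lag(values, n_obs):
    """Largest lag whose value lies outside the approximate 95% band, else 0."""
    band = 1.96 / math.sqrt(n_obs)
    lags = [lag for lag, v in enumerate(values, start=1) if abs(v) > band]
    return max(lags) if lags else 0


# Simulated AR(1) series with coefficient 0.7 (synthetic data, for illustration).
rng = random.Random(0)
x = [0.0]
for _ in range(4999):
    x.append(0.7 * x[-1] + rng.gauss(0.0, 1.0))

sim_pacf = pacf(x, 5)
print("PACF lags 1-5:", [round(v, 3) for v in sim_pacf])
print("suggested AR order:", last_significant_lag(sim_pacf, len(x)))

# The same functions apply to a differenced log population series.
growth = log_diff([100.0, 110.0, 121.0])  # hypothetical population values
```

For an AR(p) process the PACF is close to zero after lag p, while the ACF decays gradually; for an MA(q) process the roles are reversed. Applying `last_significant_lag` to the ACF values for lags 1 and above gives the corresponding suggestion for an MA order.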
4. Result
Here is the ranking by population for 2020.
Table 4 Estimated top 10 most populated countries in 2020
Ranking Country Population (thousands)
1 China 1,357,250
2 India 1,349,486
3 USA 338,972.1
4 Indonesia 240,168.9
5 Pakistan 203,699
6 Brazil 194,800.5
7 Bangladesh 161,916
8 Nigeria 151,714
9 Russia 143,374.9
10 Japan 127,182.8
Compared to the population in 2010, there are some changes. Although China remains
the most populated country in the world in 2020, the gap between India and China is
much narrower than in 2010. In 2010, Brazil and Russia ranked higher than Pakistan,
but Pakistan’s population is expected to surpass those of Brazil and Russia within 10
years. As shown in the time series plots, the ranks of Russia and Japan drop compared
to 2010.
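A forecast of this kind can be sketched as follows, assuming an AR(1) model fitted to the first difference of the log population. The function names and the input series are hypothetical, the coefficient is estimated by the lag-1 sample autocorrelation (a Yule-Walker style estimate), and the sketch is not the code that produced the tables in this section.

```python
import math


def fit_ar1(x):
    """Estimate the mean and AR(1) coefficient from the lag-1 autocorrelation."""
    n = len(x)
    mu = sum(x) / n
    dev = [v - mu for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        return mu, 0.0  # constant series: no autocorrelation to estimate
    phi = sum(dev[t] * dev[t - 1] for t in range(1, n)) / denom
    return mu, phi


def forecast_levels(population, steps):
    """Forecast population levels from an AR(1) fitted to log growth rates."""
    logs = [math.log(p) for p in population]
    growth = [b - a for a, b in zip(logs, logs[1:])]
    mu, phi = fit_ar1(growth)
    last_log, last_growth = logs[-1], growth[-1]
    forecasts = []
    for _ in range(steps):
        last_growth = mu + phi * (last_growth - mu)  # AR(1) point forecast
        last_log += last_growth                      # undo the differencing
        forecasts.append(math.exp(last_log))         # undo the log transform
    return forecasts


# Hypothetical population by decade (made-up values, not UN data).
hypothetical = [100.0, 120.0, 138.0, 152.0, 162.0, 170.0, 176.0]
forecasts = forecast_levels(hypothetical, 4)
print([round(v, 1) for v in forecasts])
```

Because the model is fitted to differenced logs, each forecast growth rate is added back to the last log level and then exponentiated to return to the original population scale.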
Here is the rank for 2030.
Table 5 Estimated top 10 most populated countries in 2030
Ranking Country Population (thousands)
1 India 1,467,972
2 China 1,354,696
3 USA 363,675.5
4 Indonesia 239,664.8
5 Pakistan 234,249
6 Brazil 194,393.4
7 Bangladesh 161,627.4
8 Nigeria 149,711.3
9 Russia 143,134.8
10 Ethiopia 128,273.96
Based on this analysis, India’s population is expected to outnumber China’s, making
India the most populated country in the world. There are not many changes in rank
compared to 2020. However, Japan will no longer be among the 10 most populated
countries within 15 years. Instead, Ethiopia will appear on the list for the first
time.
Here is the rank for 2040.
Table 6 Estimated top 10 most populated countries in 2040
Ranking Country Population (thousands)
1 India 1,555,249
2 China 1,352,158
3 USA 385,668.9
4 Pakistan 264,799
5 Indonesia 239,163.7
6 Brazil 193,988.6
7 Bangladesh 150,317.5
8 Nigeria 148,004.3
9 Ethiopia 144,599.3
10 Russia 142,897.7
In the 2040 forecast, we can observe that the population gap between China and India
will be larger than in 2030. By 2040, Pakistan’s population will surpass Indonesia’s.
Likewise, the population of Ethiopia is expected to surpass the population of Russia.
Here is the rank for 2050.
Table 7 Estimated top 10 most populated countries in 2050
Ranking Country Population (thousands)
1 India 1,607,982
2 China 1,349,636
3 USA 404,353.2
4 Pakistan 295,349
5 Indonesia 238,665.5
6 Brazil 193,586.2
7 Ethiopia 154,891.81
8 Nigeria 146,549.3
9 Russia 142,663.4
10 Congo, DR 134,223.19
Compared to the 2040 analysis, the Democratic Republic of the Congo makes the list
for the first time. In the earlier part of the paper, Congo, DR showed a rapid
growth pattern in its time series plot, resembling an exponential function. In the
longer term, there is a high possibility that the population of the Democratic
Republic of the Congo will be larger than the population of Russia.
Overall, the analysis shows a pattern similar to the United Nations report,
although there are some numerical differences in the population figures. Both
show that India’s population will outnumber China’s within 35 years at the
most. In particular, the ranking is exactly the same as in the report for ranks 1 to 5
and rank 10. My time series analysis estimates higher growth rates for Ethiopia and
Russia than the UN report, but a lower growth rate for Bangladesh from 2040 to 2050.
5. Conclusion
I performed population forecasting based on data for the 17 most populated countries
from 1950 to 2010. I used an AR(1), AR(2) or MA(2) model for each country, based on
the ACF and PACF patterns of its population data. I then showed the estimated
populations in 2020, 2030, 2040 and 2050 and compared my 2050 analysis to the report
from the UN Population Fund. The analysis shows that India’s population will exceed
China’s within 15 years and that the population gap between India and China will widen
once India becomes the most populated country in the world. The United States is
likely to remain the third most populated country in the world from 2010 onward. Due
to its low growth rate, Japan is likely to disappear from the list within 15 years. In
2050, it is estimated that there will be more African countries on the list than in
2010, as Ethiopia, Nigeria and Congo, DR have high population growth rates, while Asia
will retain its status as the most populated continent in the world. Overall, the
analysis shows a pattern similar to the United Nations population report. However, the
prediction from my time series method showed a relatively more positive outlook on the
population growth rates of Russia and Ethiopia than the UN population report. On the
other hand, my time series analysis estimated a lower growth rate for Bangladesh.
Generally, the ARIMA model is a simple and relatively accurate time series method for
anyone who can interpret the behavior of the ACF and PACF. As long as the data can be
made stationary, the ARIMA model is widely applicable to forecasting.
Reference
Kraay, Aart, & Monokroussos, George (1999). Growth Forecasts Using Time Series and
Growth Models. Washington, DC: World Bank, Development Research Group, Macroeconomics
and Growth.
Jones, Charles I. (1995). Time Series Test of Endogenous Growth Models. The Quarterly
Journal of Economics, 110(2), 495-525.
Hesketh, T., Lu, L., & Xing, Zhu W. (2015). The effect of China's one-child family
policy after 25 years. New England Journal of Medicine, 353(11), 1171-1176.
Ruppert, D. (2011). GARCH Models. In Ruppert, D., Statistics and Data Analysis for
Financial Engineering. New York, NY: Springer.
Shangodoyin, D. K., Setlhare, K., Moseki, K. K., & Sediakgotla, K. (2010). On the
Appropriate Transformation Technique and Model Selection in Forecasting Economic
Time Series: An Application to Botswana GDP Data. Journal of Modern Applied
Statistical Methods, 9(1), Article 28.
Shumway, Robert, & Stoffer, David (2011). Time Series Analysis and Its Applications.
New York, NY: Springer.
United Nations, Department of Economic and Social Affairs, Population Division (2004).
World Population to 2300.