Slides from ICWSM'17 workshop on Social Media for Demographic Research (Montreal, May 2017)
Overview of demography
How can demographers contribute to the analysis of big data (social media)? How can social media contribute to population studies?
Concerns over data quality.
Data Revolution and the SDGs: overview and value, the huge challenges of attaining an
economic-demographic-environmental balance, and the urgent need for data scientists and demographers to work on these issues.
Demography, data and development
1. Demography, Data and Development
TOC
Overview of demography / population sciences
Demography and new types of data.
How might social media be of use to demographers?
How might demography contribute to the analysis of
social media?
Data quality concerns
The « Data Revolution » and the UN Sustainable
Development Goals – the place for data scientists and
demographers.
2. Central core:
Description/measurement
Mathematical relations
Population structure (stocks):
- Size
- Composition (age, sex…)
- Distribution across time and space
Processes of demographic change (flows):
- Births
- Deaths
- Migrations
→ Changes in population size & structure, over time and space
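The link between stocks and flows is commonly written as the demographic balancing equation: the end-of-period population equals the starting population plus births and in-migrants, minus deaths and out-migrants. A minimal sketch, assuming purely hypothetical numbers (the function name and all figures are illustrative and not taken from the slides):

```python
def project_population(start_pop, births, deaths, in_migrants, out_migrants):
    """Balancing equation: end stock = start stock + inflows - outflows."""
    return start_pop + births - deaths + in_migrants - out_migrants


# Hypothetical one-year flows for an illustrative population of 1,000,000.
end_pop = project_population(
    start_pop=1_000_000,
    births=15_000,
    deaths=8_000,
    in_migrants=4_000,
    out_migrants=3_000,
)
natural_increase = 15_000 - 8_000  # births minus deaths
net_migration = 4_000 - 3_000      # in-migrants minus out-migrants
print(end_pop)  # 1008000
```

The change in the stock is fully accounted for by the two flow components, natural increase and net migration.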
3. Demography, Data and Development
The training of technical demographers gives them
a rigorous understanding of the linkages between
human population stocks and flows across space
and time.
Capacity to assess what is feasible with data on
populations and the limitations of those data.
Ability to evaluate data quality, to link and process
data from disparate sources, and to see data as
part of a larger, systemic framework.
4. Demography, Data and Development
A useful conceptual tool
is the Lexis diagram
Presentation of three
conceptual dimensions
that exist in two-
dimensional space:
age, period and cohort
(or generation).
Consider those born
during 1950, near the
start of the baby boom.
[Figure: Lexis diagram. Horizontal axis: calendar year (1950 to 1954); vertical axis: age (0 to 4).]
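The reason three dimensions fit in two-dimensional space is that any two of them determine the third: cohort is approximately period minus age. A minimal sketch, assuming age in completed years and ignoring the one-year ambiguity this creates (the function name is illustrative):

```python
def birth_cohort(calendar_year, age):
    """Approximate birth year of someone aged `age` in `calendar_year`."""
    return calendar_year - age


# Follow the 1950 birth cohort along the diagonal of the Lexis diagram.
diagonal = [(1950 + age, age) for age in range(5)]
cohorts = {birth_cohort(year, age) for year, age in diagonal}
print(diagonal)  # [(1950, 0), (1951, 1), (1952, 2), (1953, 3), (1954, 4)]
```

Every (year, age) point on the diagonal maps back to the same cohort, 1950, which is why a cohort appears as a diagonal line in the diagram.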
5. Demography, Data and Development
With age, they will
physically change: they
first become larger and
stronger; subsequently
they grow older and
face physical infirmities.
These are examples
of true “age” effects
on their health (the
vertical arrow)
[Figure: Lexis diagram. Horizontal axis: calendar year (1950 to 1954); vertical axis: age (0 to 4).]
6. Demography, Data and Development
Time “period effects”
are shown by the
horizontal arrow.
Over time, medical
discoveries are made,
prices and access to
different foods change,
and lifestyles evolve –
for instance, access to
exercise classes. All of
these affect health and
mortality risks.
[Figure: Lexis diagram. Horizontal axis: calendar year (1950 to 1954); vertical axis: age (0 to 4).]
7. Demography, Data and Development
“Cohort” effects include
cumulative effects. For
example: an epidemic
affecting children may affect
the cohort’s susceptibility to
risks later in life.
Being part of a large
generation may also give
rise to lasting cohort effects:
the behaviors of baby
boomers may simply
differ from those of other
generations.
[Figure: Lexis diagram. Horizontal axis: calendar year (1950 to 1954); vertical axis: age (0 to 4).]
8. Demography, Data and Development
Each of these effects has consequences for the health
and survival of the group.
Each is selective in different ways, affecting the “frailty”
and representativeness of those who remain “alive”, that is,
under observation.
The conceptual and formal tools of demography can
be transposed to other areas including social media.
For example: studying the population of Facebook users
(by birth cohort or by cohort of first registration), with
period effects (changes to the package, introduction of
competing software, numbers of users), etc.
How does the set of users selectively evolve over time?
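One way to approach this question is a cohort retention ("survival") table by year of first registration. A minimal sketch, assuming hypothetical user records of the form (user id, registration year, last active year); the data and function name are illustrative, and right-censoring of recent cohorts is ignored here, which a real analysis would have to handle:

```python
from collections import defaultdict

# Hypothetical records: (user_id, registration_year, last_active_year).
users = [
    ("u1", 2008, 2017),
    ("u2", 2008, 2010),
    ("u3", 2008, 2015),
    ("u4", 2012, 2017),
    ("u5", 2012, 2013),
]


def retention_by_cohort(records, durations):
    """Share of each registration cohort still active d years after joining."""
    spans_by_cohort = defaultdict(list)
    for _, reg_year, last_year in records:
        spans_by_cohort[reg_year].append(last_year - reg_year)  # years "survived"
    return {
        cohort: {d: sum(s >= d for s in spans) / len(spans) for d in durations}
        for cohort, spans in spans_by_cohort.items()
    }


table = retention_by_cohort(users, durations=(1, 5))
print(table)
```

Comparing rows of such a table across registration cohorts is the social media analogue of comparing survival across birth cohorts.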
9. Demography, Data and Development
Broad demography (population studies)
Causes and consequences of population phenomena
Marriage behaviors, family structures, sexuality, gender,
contraceptive use…
Health, risky behaviors, effects of health services…
Labor vs. refugee migrations, migrant integration, effects
on sending and host countries…
Population aging: pensions, support services…
Demographic dividend: falling fertility, changes to age
structure and economic development…
Population growth and environmental sustainability…
10. Demography, Data and Development
Demographers have much to gain from, and
contribute to, the study of new types of data
Some examples:
Using demographic methods to study populations of
Facebook or Twitter users by “age”, time and cohort.
Spatial analysis (often neglected by demographers)
For example: “now-casting” disaggregated population
estimates using data from cellphones, satellites and
drones.
11. Demography, Data and Development
Better measuring & explaining socio-demographic
phenomena
Big data sets often lack the variables needed for
rigorous hypothesis testing grounded in theory.
Traditional data too are lacking for some topics.
Examples:
o Use of social media data and cellphone records
to study mobility (migration and integration).
o Using Facebook postings to gain insight into
attitudes and strategies regarding behaviors
(cf. anthropological data)
12. Demography, Data and Development
Data quality concerns
Distinguish two cases with different standards
Targeting populations, exploring attitudes, getting
preliminary information on a topic with no good data
Rigorous scientific studies (precise measurement and
modeling of causal effects): replicability, robustness
and generalizability across populations; uncertainty
(significance levels); assessing and avoiding biases.
Demographers tend to be more interested in the second,
more demanding, case.
13. Demography, Data and Development
Essential that data be validated before use
Internal: ensuring consistency
External, via comparison with other data
There is enormous value to linking data or using
other means to confront new types of data with
old (census, survey and administrative data, etc.).
Goals: Devise robust standards of quality and
precision. Assess and avoid biases.
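The two kinds of validation can be sketched as follows. This is a minimal illustration, assuming hypothetical counts by age group for a census benchmark and a new data source; none of these numbers come from the slides:

```python
# Hypothetical counts by age group (illustrative only).
census = {"15-24": 1200, "25-44": 2500, "45-64": 1800, "65+": 900}
new_source = {"15-24": 600, "25-44": 1000, "45-64": 360, "65+": 45}
reported_total = 2005  # total reported by the new source

# Internal check: do the age-group counts add up to the reported total?
internally_consistent = sum(new_source.values()) == reported_total

# External check: coverage of the new source relative to the census benchmark.
coverage = {group: new_source[group] / census[group] for group in census}

for group, ratio in coverage.items():
    print(f"{group}: {ratio:.2f}")
```

In this made-up example the coverage ratio falls steadily with age, the kind of selectivity that would bias any estimate that treats the new source as representative.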
14. Demography, Data and Development
Big issues include:
The very rapid pace of change in new types of data,
The potential for large and evolving selectivity over
time, space and subgroups,
The disparate nature of new data, often with non-transparent
(non-public) algorithms for searches,
imputations… that can affect results,
Limits on access to data (and the level of access
may itself vary over time).
How can effective data quality assessments be
made in this situation?
15. Demography, Data and Development
Demographers are not the only ones concerned
by these issues.
But technical demographers have always been
keenly interested in assessing data quality, and
some demographic methods may be of value for
this work.
16. Demography, Data and Development
Call for a “Data Revolution” to support the
UN 2030 Development Agenda (SDGs)
The data landscape is very rapidly changing at
present: volume, speed and type of data; technologies;
number of producers and users, etc.
Notion that better evidence should lead to more
effective decisions, development initiatives, and
improved accountability.
Core elements of the DR: making better data more
rapidly available, more open and accessible, and
integrating new and old types of data.
17. Demography, Data and Development
MDGs versus SDGs
SDGs: massive increase in data requirements
Number of quantifiable indicators
Agenda            Goals   Indicators   Geography
2000-2015 MDGs    8       60           LMICs
2015-2030 SDGs    17      230          World
These SDG indicators should be disaggregated
(by space, sex, subgroup) to ensure that “no one is
left behind”.
(SDGs = Sustainable Development Goals, 2015-2030;
MDGs = Millennium Development Goals, 2000-2015)
18. Demography, Data and Development
Yet there were already big data gaps for the MDGs. Examples for Africa:
2005: 11 of 51 countries had comparable poverty
estimates.
CRVS (civil registration and vital statistics): < 6% of countries have barely viable data.
To address this, there is an urgent need to:
make effective use of new types of data,
develop new methods,
restructure and link data sources, and
develop the capacity of institutions in LMICs to
collect, edit, analyse and use new data.
Enormous challenges
19. Demography, Data and Development
Aren’t these goals unrealistic?
In part yes, especially in the short term.
Then should we care about them?
Yes. This agenda addresses hugely important
and urgent issues facing the world. Even partial
success is important.
Demographers and computer data scientists have
important roles to play in this effort.
20. Demography, Data and Development
The SDGs aim to improve human welfare through
inclusive socioeconomic progress, while protecting
the environment.
Focus on three core issues
Poverty reduction (economic development),
Environmental sustainability, and
Population growth.
21. Demography, Data and Development
Good news: Economic growth and related advances
(health, nutrition, schooling…) have greatly reduced
extreme poverty
Absolute poverty
is defined as living
on < $1.90 US per
day (in 2011 $)
Past 25 years:
improvements
driven in large part
by rapid economic
growth in Asia.
22. Demography, Data and Development
Today, about 700 million people live in absolute (extreme)
poverty. Many more still live in very difficult conditions.
80% of the extreme poor are in sub-Saharan Africa and
South Asia – the two regions that will have rapid
population growth in upcoming decades.
Most live in rural areas and are poorly educated. Over
half are < 18 years of age. The quality of their upbringing
will have large effects on the future of our world.
To alleviate suffering and provide these people
with a chance at decent lives, inclusive
economic development must continue (SDG 8…)
23. Demography, Data and Development
Worrisome and urgent: the global environment
Clear evidence of rapid climate change and other types
of environmental degradation (e.g. destruction of ocean
fish stocks from overfishing, pollution and rising acidity).
Many scientists doubt that the 2015 Paris Agreement
aiming to limit global warming to 2 degrees is achievable.
Others argue that even attaining this goal is insufficient to
avoid devastating climatic change.
We are already living beyond long-term sustainable thresholds.
The growth needed to reduce poverty must occur in
ways that simultaneously reduce the impact of human
activities on the environment (SDGs 12-15).
24. Demography, Data and Development
Major complication: rapid population growth
2017-2100: projected 3.7 billion increase in population
size (roughly the entire world population in 1970), to 11.2 billion.
79% of this growth will occur in sub-Saharan Africa;
most of the rest in South/Central Asia.
The population of SSA will grow by nearly 3 billion
(= 1.4 times the current population of North and South America and Europe).
Growth is caused by high fertility relative to mortality,
and by the young age structure (SSA median age is
18.5 → population growth momentum).
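The figures on this slide can be cross-checked with simple arithmetic. A minimal sketch using only the numbers quoted above; the average growth rate assumes smooth exponential growth between the two end points, which is a simplification and not something the slide states:

```python
import math

pop_2100 = 11.2   # billions, projected (from the slide)
increase = 3.7    # billions, 2017-2100 (from the slide)
ssa_share = 0.79  # share of the growth occurring in sub-Saharan Africa (from the slide)

pop_2017 = pop_2100 - increase       # implied 2017 population, about 7.5 billion
ssa_increase = ssa_share * increase  # about 2.9 billion, i.e. "nearly 3 billion"
years = 2100 - 2017

# Average annual exponential growth rate implied by the two end points.
avg_growth_rate = math.log(pop_2100 / pop_2017) / years

print(round(pop_2017, 1), round(ssa_increase, 1), f"{avg_growth_rate:.2%}")
```

The 79% share and the "nearly 3 billion" figure for SSA are consistent with each other, and the implied world average is roughly half a percent per year.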
26. Demography, Data and Development
Prospects for environmental sustainability
As income levels and population sizes grow, there will
be a strong tendency for carbon use in Africa and
South Asia to increase significantly.
As incomes rise from very low levels, people will seek
to improve their living standards by buying fridges,
TVs, mopeds, cars… → ↑ in energy use.
The challenge is to improve human welfare while
accommodating rapid population growth and reducing
impacts (pollution…) on the environment → achieving a
long-term population-economic-environmental balance as soon as possible.
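One common way to reason about this challenge is a multiplicative decomposition in the spirit of the IPAT and Kaya identities, which the slides do not name: emissions = population × income per person × emissions per unit of income. A minimal sketch with hypothetical index values (baseline = 1) and a made-up scenario:

```python
def emissions(population, income_per_capita, emissions_per_unit_income):
    """Total emissions as the product of three factors (IPAT-style identity)."""
    return population * income_per_capita * emissions_per_unit_income


baseline = emissions(1.0, 1.0, 1.0)

# Hypothetical scenario: population doubles and income per person triples.
no_tech_change = emissions(2.0, 3.0, 1.0)

# Emission intensity needed to hold total emissions at the baseline level.
required_intensity = baseline / (2.0 * 3.0)

print(no_tech_change, round(required_intensity, 3))
```

Under these assumed values, emissions per unit of income would have to fall to about one sixth of the baseline just to keep total emissions flat.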
27. Demography, Data and Development
Success will require environmentally friendly growth,
technological advances, and rapid falls in fertility.
Fertility reductions require parents in LMICs to be
motivated to invest more in each child by limiting
their fertility (better schooling…), along with the
ability to do so (good access to contraception).
Sustainable development and falling fertility are thus
interrelated: improved schooling, lower child mortality
and urbanisation entice parents to opt for fewer
children, good access to contraception provides the
means, and lower fertility is an essential ingredient in
attaining a sustainable world.
28. Demography, Data and Development
Achieving this will be extraordinarily difficult.
Not achieving it will be a catastrophe: continued misery
for large parts of the world population, and the risk of
disastrous environmental change.
Improvements to the evidence base – the “Data
Revolution” – will be of real value for the design of
more effective policy and interventions.
The participation of cutting-edge computer data scientists
and demographers in Data Revolution work is critical.
[Example: Big Data and the Well-Being of Women and Girls:
Applications on the Social Scientific Frontier (Bapu Vaitla et al.,
4/2017, http://data2x.org/wp-content/uploads/2017/03/Big-Data-and-the-Well-Being-of-Women-and-Girls.pdf)]