business conducted, humanitarian assistance handled, public
officials elected, and
governance enacted. Economists rely on data to describe,
interpret, and forecast
economic activity. Despite the rich tradition of using large
datasets, institutional
economists have shied away from big data. This article
describes, reviews, and
reflects on big data, with a particular focus on economic
development. It illustrates
the vast opportunities and challenges for big data as an
important tool for the
benefit of the public. It suggests that big data and data
analytics, if used properly,
can provide real-time actionable information that can be used to
identify problems
and needs, offer services, and provide feedback on the
effectiveness of policy
action.
Keywords: big data, Google trends, humanitarian assistance
JEL Classification Codes: O1, O4
The world is undergoing a data revolution. The revolution is
already reshaping how
knowledge is produced, business conducted, humanitarian
assistance handled, public
officials elected, and governance enacted (Kitchin 2014). Data
now pours in from
nearly everywhere at all times and from every device — this is
undeniably an era of big
data. Big data is generated as a by-product of other activities (data exhaust), it is often
accessible in real time,
and it arises from the merging of different sources. It is an
endless source of data for
the economic and social world. Because of its impact on the economy, data has
been referred to as
“the new oil” (Pringle 2017).
Government agencies, international organizations, and private
institutions have
been collecting economic and social data for a long time.
Economists have relied on
these sources to describe, interpret, and forecast economic
activity. Macroeconomists,
in particular, have been at the forefront of exploiting large
datasets. For example,
Arthur F. Burns and Wesley C. Mitchell’s (1946) pioneering
search for patterns and
581
Big Data
regularities in the data led to the identification of the business
cycle. Similar work by
Simon Kuznets (1941) led to the creation of the National
Income and Product
Accounts. Unfortunately, current institutional economists have
shown very little
interest in big data. A review of the tables of contents and abstracts
of the Journal of
Economic Issues, the Journal of Institutional and Theoretical
Economics, and the Journal of
Institutional Economics found no articles on big data. This is
surprising because early
institutionalists displayed a particular penchant for data to
understand economic
issues and to make policy recommendations.
My objective in this article is to describe, review, and reflect on
big data, with a
particular focus on economic development. I illustrate the vast
opportunities and
challenges that big data presents as an important tool for the
public good. I also show
that big data and data analytics, if used properly, can provide
real-time actionable
information that can be used to identify problems and needs,
offer services, and
provide feedback on the effectiveness of policy action. My
inspiration for this study
comes from the work of Wesley C. Mitchell, who believed that
acquiring the facts and
“detailed sifting of data outside the context of a worked out
model” (Hirsch 1976,
206) is the correct approach to understanding economic issues.
What Is Big Data?
The term “big data” emerged in the 1990s and gained
momentum in the early 2000s.
Similar to many new concepts, big data has been variously
defined and
operationalized. Clearly, size often comes to mind when
referring to big data. It is
commonly defined as the astonishing amount of structured and
unstructured data
that are being generated, captured, and stored at an amazing
speed. An example of big
data would be Walmart’s customer transaction data. Every hour,
Walmart handles
over one million transactions, which are captured into its
databases that are estimated
to contain over 2,560 terabytes of data (1 terabyte = 1024^4
bytes), equivalent to 167
times the information contained in all the books in the Library
of Congress (Economist
2010). In a single day, there are about 5.2 billion Google
searches, twenty-two billion
texts sent, and more than four million hours of content uploaded
to YouTube, with
users watching 5.97 billion hours of YouTube videos (Schultz
2017). In regard to
hardware and software, big data is often defined as data that is
too large and complex
for processing with traditional database management tools.
Paradoxically, what is
considered big data today may become small data in five years
due to advances in
technologies, platforms, and analytical capabilities. The data
science community
concentrates on its characteristics and defines big data in terms
of the 3V model:
volume (amount of data), velocity (speed of data flow), and
variety (range of data types
and sources). Other dimensions, such as variability (data flows that are highly
inconsistent, with periodic
peaks) and veracity (trust and uncertainty), are also added to the
3Vs to characterize
big data (Gandomi and Haider 2015).
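The storage figure quoted above for Walmart can be checked with a few lines of arithmetic. The Python sketch below assumes the binary definition of a terabyte given in the text (1024^4 bytes). The variable names are illustrative, and the implied size of the Library of Congress book collection is only what the 167-times comparison would suggest, not an independently sourced number.

```python
# Unit arithmetic for the Walmart example; the figures are those quoted in the text.
BYTES_PER_TERABYTE = 1024 ** 4      # binary terabyte, as defined in the text
TERABYTES_PER_PETABYTE = 1024

walmart_terabytes = 2_560           # estimated database size (Economist 2010)
library_multiple = 167              # "167 times" the Library of Congress books

walmart_bytes = walmart_terabytes * BYTES_PER_TERABYTE
walmart_petabytes = walmart_terabytes / TERABYTES_PER_PETABYTE

# Size of the Library of Congress book collection implied by the comparison.
implied_library_terabytes = walmart_terabytes / library_multiple

print(f"Walmart databases: {walmart_bytes:,} bytes ({walmart_petabytes} petabytes)")
print(f"Implied Library of Congress books: {implied_library_terabytes:.1f} terabytes")
```

Under these assumptions, 2,560 terabytes is exactly 2.5 petabytes, and the comparison implies a book collection of roughly 15 terabytes.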
The United Nations’ (UN) Department of Economic and Social
Affairs (2015)
classifies big data into three categories: (i) social networks
(human-sourced
information, such as Facebook, Twitter, blogs, Instagram,
YouTube, Internet searches,
582
Baban Hasnat
text messages, etc.), (ii) traditional business systems (process-mediated data, such as
data generated in the context of business transactions, e-commerce, credit cards, and
medical records), and (iii) Internet of Things (machine-generated data, such as data
produced by weather, pollution, and traffic sensors, in addition
to mobile phone
tracking, satellite images and logs registered by computer
systems). Danah Boyd and
Kate Crawford (2012) describe big data as a cultural,
technological, and scholarly
phenomenon that rests on the interplay of technology (tools and
algorithms to gather,
store, and otherwise handle data), analysis (identifying patterns to understand
economic, social,
political, technical, and legal issues), and mythology (the
widespread belief that
large data sets offer a higher form of intelligence and
knowledge).
The Use of Big Data for Development and Humanitarian
Assistance
Big data increasingly concerns people’s real behavior, not just
the topics on which
people seek information through searching Google or through
posting on Facebook.
Posts on social media may or may not represent a person, but
how that person spends
time, whom he/she associates with, what he/she buys, where
he/she goes, and so on,
can reveal an enormous amount about that person. Data
scientists can predict, with
reasonable accuracy, if the person will take out a payback loan,
develop diabetes, or
buy tickets (Pentland 2018). Thus, the growth of new
technologies and new sources of
data, often available in real time, offers a number of important
dividends for
development. It can improve the efficiency of low-income
people because they can
access a wide range of information on price and cost, thereby
allowing them to save
money and time. Development programs can be inclusive as
socially and economically
excluded groups increasingly voice their positions in defining
development priorities.
This gives people access, empowerment, voice, opportunity, and
security — something
that Amartya Sen (1999) has been advocating as the goal of
development.
Highlighting the importance of big data, the United Nations
declares: “It is time for
the development community and policymakers around the world
to recognize and
seize this historical opportunity to address twenty-first century
challenges, including
the effects of global volatility, climate change, and
demographic shifts, with twenty-
first century tools” (United Nations Global Pulse 2012, 6).
Big data and data analytics have appeared on policymakers’
radars only in the
last few years. They are still in the early years of understanding
big data and its
application in international development. Data analytics can be
used to predict
outcomes for sub-groups, for example in school
dropout rates and social
welfare programs. An analysis of Twitter, Google Trends, and
other social media
can be used to assess the attitude of different groups to social
problems and issues or
their response to different prevention strategies. Big data can
allow the integration of
multiple sources of data into a data platform (UN Food and
Agriculture
Organization’s AQUASTAT n.d.), mapping (Ebola outbreaks,
the spread of crop
diseases, the location of victims in an earthquake, etc.),
monitoring trends (rural
poverty in China), and real-time early-warning signals (hunger,
drought, and ethnic
conflict). These tools are now starting to be used in
development programs and
emergency management. Below I highlight some successful
cases in the use of big data
in economic development and humanitarian assistance:
• No census has been possible in Afghanistan since 1979 due to
security concerns.
By combing through satellite imagery, remote sensing data,
geographic information
system modeling, and demographic surveys, the United Nations’
Fund for
Population Activities was able to generate population maps for
Afghanistan.
• Combining satellite and other sources of data, the Food and
Agriculture
Organization has developed AQUASTAT, which is a global
water information
system that collects, analyses, and disseminates data and
information on water
resources, water use, agricultural water management and other
information
related to water (FAO).
• As mobile phones are becoming ever-present in the developing
world, it is now
possible to turn mobile phone-generated data into an economic
development
tool. For example, when mobile operators see airtime top-off
amounts
decreasing in a certain region, it is a sign of loss of income in
the region.
Policymakers can take action based on such information before
the information
appears in official indicators (World Economic Forum 2012).
Mobile payments
for agricultural products, input purchases, and subsidies,
combined with satellite
images, may improve predictions of food production trends and
incentives.
Early detection of production trends can help governments
provide targeted
assistance. By mining mobile phone data, proxies for poverty
indicators have
been developed, which give policymakers a much more
economical and
continuous source of data on poverty trends (United Nations
Global Pulse
2016).
• Policymakers are increasingly resorting to big data to manage
epidemics and
healthcare. For example, human population movement is a
challenge to
eliminating malaria in developing countries. Amy Wesolowski et
al. (2012)
analyzed the travel patterns of fifteen million mobile phone
owners in Kenya
over a period of twelve months. Combining travel data with
census and survey
data, together with spatially referenced malaria data, geographic
information
systems, and network analysis tools, the authors were able to
identify, map, and
quantify malaria risk areas. People’s lifestyles can be analyzed
from the data
generated by the use of smartphones and apps, which offer
opportunities for
primary prevention. In Iceland’s capital, Reykjavik, a
combination of behavioral
economics, big data, and mobile technology has helped identify
individuals at
increased risk of lifestyle-related diseases (e.g., diabetes) and
reverse their
condition (Thorgeirsson 2017). Global Viral, a non-profit
organization based in
San Francisco, uses big data to identify the locations, sources,
and drivers of
local outbreaks of global epidemics up to a week ahead of
global bodies, such as
the World Health Organization, that depend on traditional
techniques and
indicators.
• Big data shows particular promise in emergency management.
Immediately after
the April 2015 earthquake in Nepal, Flowminder/WorldPop used
mobile phone
data to create a report on population displacement, which the
UN used to
coordinate humanitarian assistance. When a devastating
earthquake struck
Haiti in 2010, a group of volunteers took it upon themselves to
analyze
informational content on Facebook, Twitter, and text messages
to locate
affected areas and victims of the earthquake. The information
was quickly
loaded — with more than 1.4 million edits — on street maps to
construct a crisis
street map to assist humanitarian action.
• Big data and data analytics can be used to gain insight into
how firms respond
to trade reforms or economic shocks. For example, the US-based company
Panjiva collects customs transaction information (e.g., source,
destination, types
of goods) via a machine-learning algorithm that covers data for
eight countries,
with 190 partner countries comprising 450 million records. The
data can convey
anticipated action from the US, China, and Europe in terms of
trade policies in
2017, the prospects for the shipping industry, and the industries
that have the
most to win and lose from trade.
• Combining real-time traffic conditions with past traffic patterns
and weather
forecasts, urban planners are better able to manage public
transportation and the
police and fire departments, and to save time and gasoline for
citizens and
businesses.
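The mobile airtime signal described in the list above can be sketched in a few lines. The Python example below is a minimal sketch, not any operator's actual method: the region names, the top-off amounts, and the 15 percent threshold are all hypothetical values chosen for illustration, and the function assumes that average top-offs in the earlier period are positive.

```python
from statistics import mean

def flag_income_stress(topoffs_by_region, threshold=0.15):
    """Return regions whose average top-off fell by more than `threshold`.

    `topoffs_by_region` maps a region name to a pair of lists:
    (amounts in the earlier period, amounts in the later period).
    """
    flagged = {}
    for region, (earlier, later) in topoffs_by_region.items():
        before, after = mean(earlier), mean(later)
        decline = (before - after) / before   # fractional drop in the average
        if decline > threshold:
            flagged[region] = round(decline, 3)
    return flagged

# Hypothetical data: top-off amounts in local currency units.
sample = {
    "Region A": ([10, 12, 11, 9], [10, 11, 12, 10]),   # roughly stable
    "Region B": ([10, 10, 12, 8], [7, 8, 6, 7]),       # clear decline
}

print(flag_income_stress(sample))
```

A real early-warning system would also need to separate income effects from price changes, seasonality, and shifts in the subscriber base, which this sketch ignores.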
Applications of Big Data: Two Case Studies
Several sectors of the economy that are important for
development are also quite data-intensive.
I present two case studies to show the use of big data.
The first case shows
the tracking of words. Figure 1 combines the actual
unemployment data from the
U.S. Bureau of Labor Statistics in October 2017 with simple
Google searches for the
word “unemployment” in the fifty U.S. states and Washington,
D.C., at the same
time. The figure clearly shows that the Google Trends data
correlate very closely with
the actual unemployment statistics. The potential for
development is straightforward.
Each month, the Bureau of Labor Statistics’ employees survey
60,000 households
(approximately 110,000 individuals) over the phone or in person
and inquire about
labor force activities. The survey results are published with a
time lag of one month.
Google search trend data are available for free and can be
accessed with a simple
computer in real time.
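The comparison behind Figure 1 amounts to a correlation across states between a search index and an unemployment rate. The Python sketch below computes a Pearson correlation coefficient by hand. The five pairs of numbers are made-up placeholders, not the Bureau of Labor Statistics or Google Trends values used in the figure, and the article does not state which correlation measure was used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder values for five hypothetical states (NOT real data).
unemployment_rate = [3.1, 3.8, 4.2, 4.9, 5.5]   # percent
search_index = [40, 52, 55, 70, 81]             # 0-100 search interest

r = pearson(unemployment_rate, search_index)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 would indicate that states with more searches for the word also tend to report higher unemployment, which is the pattern the figure is said to show.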
Figure 2 shows two indexes for China’s manufacturing capacity.
The PMI index
provides an overall view of activity in the manufacturing sector.
It is calculated from a
monthly survey of approximately 430 purchasing managers in
China. The SMI index
was created by SpaceKnow, a company that specializes in
geospatial analysis.
SpaceKnow has taken over two billion satellite photos in China
over the last fifteen
years. By analyzing changes in images across 6,000 industrial
sites and incorporating
the number of trucks in industrial parks and the frequency of
turnovers, the
company is able to measure the manufacturing sector and competitive
capacity. The PMI
index comes with a four-week time lag, while the SpaceKnow
index can be received in
real time.
Figure 1. State Unemployment Rate and Google Trend (October
2017)
Figure 2. Index for China’s Manufacturing Sector Activity
Based on Actual Survey
and Satellite Image
The Challenge
Despite its availability and advances in technological and
analytical capacity, big data
has not been widely adopted as a tool for economic development
because of a
number of challenges. One of the most sensitive issues for
anyone wishing to explore
the use of big data for economic development and policymaking
is privacy. Safety,
diversity, pluralism, and democracy are compromised without
privacy. Recent
research has shown that it is possible to “de-anonymize”
previously anonymized
datasets. Much big data belongs to private companies,
and they may have
little incentive to share proprietary data because of security and privacy
concerns. Convincing
private companies to allow economists to access business data
is difficult because
there are important privacy and competitive issues that a private
company must
consider before it allows a researcher to access company data
(Hilbert 2016).
Access to big data is a major challenge. Economists
traditionally rely on their
own survey data or government survey data for their research.
Just because a
government entity collects data (e.g., the IRS or the Social
Security Administration)
does not mean that economists will be able to access it easily.
Certain protocols must
be followed, which is generally time-consuming. For example, a
Harvard researcher
needed very high-level security clearance, which took months to
obtain, and he also
had to submit information on all his places of residence in the
last ten years and
could only access the IRS data set in secure data rooms
authorized by the central
office (Einav and Levin 2013; Taylor, Schroeder and Meyer
2014). In addition, the
process could favor researchers who have the resources,
influence, and network to
gain access to the data, which may lead to “data haves” and
“data have-nots” (Boyd
and Crawford 2012).
Big data is worthless unless it is used for improved decision-making.
To do this,
organizations must turn to data management (acquisition and
recording; extraction,
cleaning, and annotation; integration, aggregation, and
representation) and data
analytics (modeling and analysis; interpretation). Data
management for
computation may be a challenge for developing countries and
will require major
investments in information and communication technology.
Accurate and actionable
data mining and analysis, particularly in real-time, requires
extensive technical skills.
Developing countries may not be able to afford the data
scientists and infrastructure.
A significant share of big data is generated from people’s
perceptions, intentions,
and desires. Policymakers have to be careful not to jump to conclusions
about
what the data are really conveying, because
perceptions, intentions, and
desires can change rapidly. Additionally, combining data from
multiple sources may
also mean magnifying the data flaws (Bollier 2010). Thus,
theory and context matter
even more for extremely large data sets. A case in point is how
Google Trends data
failed to predict flu trends. Google Flu Trends (GFT) is a big
data tool that claimed to
accurately predict flu epidemics in the US. Because GFT could
predict an increase in
cases of flu before the Centers for Disease Control and Prevention, it was
trumpeted as the beginning
of the big data era. Unfortunately, GFT’s predictions did not
match reality.
Despite improving its model, Google has been persistently
overestimating the flu
since at least 2011 (Fung 2014).
Economists typically look for a particular dataset to answer an
unsettled
question, but data mining leads to searches for the unsettled
question. Noting that big
data often involves billions of observations, Hal Varian (2014)
argued that the
concept of statistical significance, a mainstay in hypothesis
testing, may be useless in
certain situations. Others worry that a substantial project that
uses big data is
essentially descriptive because the data will reveal correlations
rather than causality.
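One way to see why statistical significance loses its discriminating power with very large samples is to test a tiny correlation at two sample sizes. The Python sketch below uses the standard Fisher z-transformation for a correlation coefficient. The correlation of 0.001 and the two sample sizes are hypothetical, and the example illustrates the general point rather than reproducing a calculation from Varian (2014).

```python
from math import atanh, erfc, sqrt

def correlation_z_test(r, n):
    """Two-sided test of H0: rho = 0 using the Fisher z-transformation.

    Returns (z statistic, two-sided p-value under a normal approximation).
    """
    z = atanh(r) * sqrt(n - 3)
    p_value = erfc(abs(z) / sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_value

tiny_r = 0.001                         # hypothetical, negligible correlation
for n in (1_000, 1_000_000_000):       # survey-sized vs. big-data-sized sample
    z, p = correlation_z_test(tiny_r, n)
    print(f"n = {n:>13,}  z = {z:8.3f}  p = {p:.3g}")
```

With the same negligible correlation, the small sample gives no evidence against the null hypothesis, while the billion-observation sample rejects it overwhelmingly. The test therefore says little about whether the relationship is large enough to matter.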
Conclusion
It is clear that the size, speed, and nature of big data are
extremely valuable in certain
situations and can be a powerful tool to address various social
ills and development
efforts by providing early warnings, real-time awareness, and
real-time feedback.
Nevertheless, we cannot ignore the data context and cultural
context. We must not
forget that big data has its limitations and biases. We need to
consider these and use
caution in interpreting the data. Correlation is not causation, and big data
should not replace
or act as a proxy for official statistics. Rather, big data should
complement the existing
data. At present, some motivated persons and non-profit
organizations are
spearheading the use of big data for public benefit. The
prerequisites for making big
data effective for development are extensive technological
infrastructure, generic
software services, and human capacities and skills. Developing
countries have a long
way to go before big data becomes an everyday tool.
References
Bollier, David. The Promise and Peril of Big Data.
Communications and Society Program. The Aspen
Institute, 2010.
Boyd, Danah and Kate Crawford. “Critical Questions for Big
Data.” Information, Communication & Society
15, 5 (2012): 662-679.
Burns, Arthur F. and Wesley C. Mitchell. Measuring Business
Cycles. New York, NY: Columbia University
Press, 1946.
Economist. “Data, Data Everywhere.” Special report. The
Economist, February 25, 2010. Available at
http://www.economist.com/node/15557443. Accessed November 1, 2017.
Einav, Liran and Jonathan D. Levin. “The Data Revolution and
Economic Analysis.” Working Paper No.
19035. NBER, May 2013. Available at
http://www.nber.org/papers/w19035.pdf. Accessed August
1, 2017.
Fung, Kaiser. “Google Flu Trends’ Failure Shows Good Data >
Big Data.” Harvard Business Review, March
25, 2014.
Gandomi, Amir and Murtaza Haider. “Beyond the Hype: Big
Data Concepts, Methods, and Analytics.”
International Journal of Information Management 35, 2 (2015):
137-144.
Hilbert, Martin. “Big Data for Development: A Review of
Promises and Challenges.” Development Policy
Review 34, 1 (2016): 135-174.
Hirsch, Abraham. “The A Posteriori Method and the Creation of
New Theory: W.C. Mitchell as a Case
Study.” History of Political Economy 8, 2 (1976): 195-206.
Kitchin, Rob. The Data Revolution: Big Data, Open Data, Data
Infrastructures and Their Consequences.
Thousand Oaks, CA: Sage Publishing, 2014.
Kuznets, Simon. National Income and Its Composition, 1919–1938.
New York, NY: National Bureau of
Economic Research, 1941.
Pentland, Alex Sandy. “Reinventing Society in the Wake of Big
Data: A Conversation with Alex ‘Sandy’
Pentland.” Edge, August 30, 2018. Available at
https://www.edge.org/conversation/alex_sandy_pentland-reinventing-society-in-the-wake-of-big-data.
Accessed November 19, 2018.
Pringle, Ramona. “Data Is the New Oil.” CBC News, August 25,
2017. Available at
http://www.cbc.ca/news/technology/data-is-the-new-oil-1.4259677.
Accessed November 27, 2018.
Sen, Amartya. Development as Freedom. New York, NY:
Oxford University Press, 1999.
Taylor, Linnet, Ralph Schroeder and Eric Meyer. “Emerging
Practices and Perspectives on Big Data
Analysis in Economics: Bigger and Better or More of the
Same?” Big Data & Society, July-December
2014, pp. 1-10.
Thorgeirsson, Tryggvi. “Hospital Impact — Behavioral
Economics and Big Data May Improve Health and
Reduce Healthcare Costs.” FierceHealthcare, September 26,
2017. Available at
https://www.fiercehealthcare.com/hospitals/hospital-impact-behavioral-economics-may-improve-health-and-reduce-healthcare-costs.
Accessed December 8, 2017.
Schultz, Jeff. “How Much Data Is Created on the Internet Each
Day?” Micro Focus Blog, August 10, 2017.
Available at
https://blog.microfocus.com/how-much-data-is-created-on-the-internet-each-day.
Accessed December 10, 2018.
United Nations. Department of Economic and Social Affairs
Statistics Division. Classification of Types of Big
Data, ESA/STAT/AC.289/26 11. UNSTAT, May 2015.
Available at
https://unstats.un.org/unsd/class/intercop/expertgroup/2015/AC289-26.PDF.
Accessed December 1, 2017.
United Nations. Food and Agriculture Organization.
AQUASTAT, n.d. Available at
http://www.fao.org/nr/water/aquastat/main/index.stm. Accessed December 5, 2017.
United Nations Global Pulse. Big Data for Development:
Challenges and Opportunities. UN Global Pulse, May
2012. Available at
http://www.unglobalpulse.org/sites/default/files/BigDataforDevelopment-UNGlobalPulseMay2012.pdf.
Accessed November 15, 2017.
———. Integrating Big Data into the Monitoring and Evaluation
of Development Programs. UN Global Pulse, 2016.
Available at
http://unglobalpulse.org/sites/default/files/IntegratingBigData_intoMEDP_web_UNGP.pdf.
Accessed November 16, 2017.
Varian, Hal. “Big Data: New Tricks for Econometrics.” Journal
of Economic Perspectives 28, 2 (2014): 3-28.
Wesolowski, Amy, Nathan Eagle, Andrew J. Tatem, David L.
Smith, Abdisalan M. Noor, Robert W. Snow
and Caroline O. Buckee. “Quantifying the Impact of Human
Mobility on Malaria.” Science 338,
6104 (2012): 267-270.
World Economic Forum. Big Data, Big Impact: New
Possibilities for International Development. World
Economic Forum, 2012. Available at
http://www3.weforum.org/docs/WEF_TC_MFS_BigDataBigImpact_Briefing_2012.pdf.
Accessed December 10, 2017.
Copyright of Journal of Economic Issues (Taylor & Francis Ltd)
is the property of Taylor &
Francis Ltd and its content may not be copied or emailed to
multiple sites or posted to a
listserv without the copyright holder's express written
permission. However, users may print,
download, or email articles for individual use.