Big Data for Development: Opportunities and Challenges, Summary Slidedeck
Big Data for Development:Challenges and Opportunities May 2012 United Nations Global Pulse Download the full paper at:www.unglobalpulse.org/BigDataforDevWhitePaper
• Innovations in technology and greater affordability of digital devices have ushered in an Age of “Big Data,” an umbrella term for the explosion in the quantity and diversity of high frequency digital data.• “Big Data” comes from everywhere: sensors used to gather climate information, social media, digital pictures and videos posted online, transaction records of online purchases, and from cell phone GPS signals…• Real-time data can provide a snapshot of collective behavior changes at high frequency, high degrees of granularity and from a wide range of angles, which was never before possible• This data can be analyzed in real-time, which ultimately creates the potential for more efficient and effective decision making
―It is time for the development community and policymakers around the world to recognize and seize this historic opportunity to address 21stCentury challenges, including the effects of globalvolatility, climate change and demographic shifts, with 21st Century tools.‖
I. OPPORTUNITYThe world is experiencing a data revolution, or “data deluge.” the stock of digital data is expected to increase 44 times between 2007 and 2020, doubling every 20 months
• Relevance: Even in the developing world there is an increasing amount of mobile phones, social media and internet traffic• Intent: With increasing global volatility, policymakers must respond more quickly and more efficiently to crises; knowing what is happening, in real-time, can help policymakers respond to mitigate the effects of disasters.• Capacity: Big Data analytics can be used to discover trends in large, digital data sets• A Growing Body of Evidence: Social-science research has shown that Twitter, Mobile phone data analysis, and Google Analytics can reveal issues and trends of concern to global development, such as disease outbreaks or mobility patterns.
A loose taxonomy of types of digital data sources that are relevant to global development:Data Exhaust Passively collected -- mobile phone data, These digital services create networked transactional data, purchases, web sensors of human behavior searches etc. and/or real-time data collected by UN agencies, NGOs and other aid organizations to monitor their projects (e.g. stock levels or school attendance).Online Information Web content such as news media and This approach considers web usage and social media interactions (e.g. content as a sensor of human intent, blogs, Twitter), new sentiments, perceptions and want articles, obituaries, e- commerce, job postingsPhysical Sensors Satellite or infrared imagery of changing This approach focuses on remote sensing landscapes, traffic patterns, light of changes in human activity emissions, urban development and topographic changes, etc.Citizen Reporting or Crowd-sourced Data Information actively produced or submitted While not passively produced, this is a key by citizens through mobile phone-based information source for verification and surveys, hotlines , user-generated maps feedback etc
―…Big Data analytics refers to tools and methodologies that aim to transform massivequantities of raw data into ‗data about data‘ – for analytical purposes.‖ ―Some kinds of new digital data sources, particularly transactional records – such as thenumber of microfinance loan defaults, number oftext messages sent, or mobile phone-based food vouchers activated – are as close as it gets to indisputable, hard data.‖
• Privacy – The most sensitive issue, with conceptual, legal and technological implications• Access and Sharing – While some new data sources which may be relevant for development are accessible on the open web, there is still a great deal of data that is privately held by corporations and not available for analytical purposes• Analysis & Interpretation – What type of data is being analyzed? Who is a representative sample of the population? How do we avoid misunderstanding relationships within data? (for example, correlation does not necessarily mean causation)• Anomaly Detection – Characterizing (ab)normality in human ecosystems is very challenging. Must develop ways to characterize and detect socioeconomic anomalies in context.
―Because privacy is a pillar of democracy, we must remain alert to the possibility that it might be compromised by the rise of new technologies, and put in place all necessary safeguards.‖ ―Another challenge lies partly in the fact that a significant share of the new data sources in question reflects perceptions, intentions, and desires. Understanding the mechanisms by which people express perceptions, intentions and desires—as well as how they differ between region or linguistic culture, and change over time—is hard.‖
III. APPLICATION Changes in the number of Tweets mentioning the price of rice and actual food price inflation(official statistics) in Indonesia proved to be closely correlated in a recent collaborative research project between Global Pulse and social media analytics company Crimson Hexagon.
• It is important to know your data. Big Data for Development are certainly not perfect data, but their value is tremendous if they are both properly understood and analyzed.• The promise of Big Data for Development is, and will be, best fulfilled when its limitations, biases and features are adequately understood and taken into account when interpreting data.• Properly analysed, Big Data offers the opportunity for an improved understanding of human behaviour that can support the field of global development: 1) Early warning, 2) Real-time awareness and 3) Real-time feedback.
Contextualization is key:1) Data context. Indicators should not be interpreted in isolation. If one is concerned with anomaly detection, it is not so much the occurrence of one seemingly unusual fact or trend that should be concerning, but that of two, three or more.2) Cultural context. Knowing what is „normal‟ in a country or regional context is prerequisite for recognising anomalies. Cultural practices vary widely around the world and these differences certainly extend to the digital world. There is a deeply ethnographic dimension in using Big Data. Different populations use services in different ways, and have different norms about how they communicate publicly about their lives.
Download the full paper at:www.unglobalpulse.org/BigDataforDevWhitePaper