Evolution of the Humanitarian Data Ecosystem
Sara Terp, AAAI 2015

Slides for presentation given at AAAI Spring 2015 on humanitarian linked open data.

Published in: Technology

  1. Evolution of the Humanitarian Data Ecosystem
     Sara Terp, AAAI 2015
  2. SJ’s Stages of Data Use
     • Hand-scraping (including lists of where to look), random categories, SMS, maps
     • Standards and dataset visualisations
     • Mashups and statistical analysis
     • Stable datastores and local data scientists
  3. 2004-2009
     • December 2004: Boxing Day Tsunami kills 230,000 people. Sri Lankan techs create Sahana
     • January 2008: Kenyan news blackout during post-election violence. Bloggers create Ushahidi
     • June 2009: CrisisCommons forms after a tweet-up
     • October 2009: ICCM conference, Cleveland
     • 2009: Ushahidi creates CrisisMappers
     • 2009: First RHOK hackathon creates PeopleFinder
     • 2009: CDAC forms after a discussion in a bar
  4. Intelligence Systems
     • Humans
       – Good at: complex analysis, heuristics, pragmatic translations, creative data finding, sudden onset
       – Not so good at: high volume, repetitive, 24/7 accurate
     • Bots
       – Good at: high volume, repetitive, complex pattern finding, long term
       – Not so good at: complexity, human foibles
  5. Unmanned Vehicle Control
     Columns: PACT locus of authority; computer autonomy; PACT level: Sheridan & Verplank description
     • Computer monitored by human; Full
       – 5b: Computer does everything autonomously
       – 5a: Computer chooses action, performs it & informs human
     • Computer backed up by human; Action unless revoked
       – 4b: Computer chooses action & performs it unless human disapproves
       – 4a: Computer chooses action & performs it if human approves
     • Human backed up by computer; Advice, and if authorised, action
       – 3: Computer suggests options and proposes one of them
     • Human assisted by computer; Advice
       – 2: Computer suggests options to human
     • Human assisted by computer only when requested; Advice only if requested
       – 1: Human asks computer to suggest options and human selects
     • Operator; None
       – 0: Whole task done by human except for actual operations
  6. 2010: Haiti, VTCs
  7. “Don’t be Imperial”
     • Pro: “Laboratory” = on behalf of
     • Per: “Community” = alongside
     • Para: “Grassroots” = by and within
  8. Volunteer Skills Used
     • Programming
     • Telecommunications
     • Mapping
     • User Experience
     • IT project management
     • Data analysis
     • Relief work experience
     • Local knowledge
     • Translation
     • Communications & PR
     • Facilitation and admin
     • Making tea!
  9. Data Scientist Skills
  10. Data Process
     • Ask a good question…
     • Obtain datasets
     • Clean, combine, transform data
     • Explore the data
     • Try models (classification, machine learning etc)
     • Interpret and communicate your results
  11. People started conversations…
     • Twitter
     • Facebook
     • SMS
     • Phones
     • Photos
     • News
     • Sneakernet
     Diagram labels: Decisions; GAP; Overworked Field People
  12. SMS to Map
  13. Creating Datasets
     @bodaceacat http://blog.overcognition.com/
     • People add features to OpenStreetMap
     • Person sends SMS to 4636
     • Message goes to CrowdFlower
     • Person translates and geolocates message
     • Message goes to Ushahidi display
     • Message gets to responders, public, aunts, Sahana etc.
  14. Interpreting Aerial Images
  15. Building Technologies
     Ongoing:
     • CDAC website review
     • Field Voices
     • Haiti Amps Network
     • Haitian Voices
     • Machine Translation System
     • Oil Spill Response
     • PAP outskirts food relief
     • Telecommunications technical project
     • Low-bandwidth Ushahidi
     • Kapab Medical Facility Capacity Finder
     • Disaster Accountability Public Database
     • Sync the Sheet
     • Testing Crabgrass
     Closed:
     • Translators in Action - other translation tools were developed
     Proposed:
     • Mining Relief Data
     • Automating Aid Request via a Voice Phone Call
     • Building A Refugee Camp Cell Phone Early Warning System
     • Community Tool Box
     • CrisisCommons Roledex
     • Facebook for ARC Safe and Well site
     • Haitian Skilled Workforce Retention
     • Post Disaster Child Protection
     • CDAC Radio Website
     Unknown:
     • Disaster Accountability Hotline
     • Incident visualisation
     • Needs Categorization
     • World Academic TeaCHing Hospitals disaster relief
  16. Improving Technologies
     • ReliefWeb UX redesign
     • Ushahidi UX redesign
     • CDAC website review
     • OpenStreetMap developers at one end of the table; OpenStreetMap users at the other
  17. Building Interfaces
  18. Creating Community Sensors
  19. What’s an appropriate crisis to help?
     • Information
       – Information deluge
       – Knowledge drought
     • Infrastructure
       – Local infrastructure is overwhelmed
       – Existing information channels
     • Stages
       – Mitigation
       – Preparedness
       – Response
       – Recovery
       – Sustainability
  20. User questions for pkfloods
     • Where can I find out who needs my help?
     • Where can I find people to help me deliver aid?
     • Where can I find out information?
     • How do I find out if I'm about to be flooded?
     • Who should I alert/give my information to?
     • Where can I find out general information about #pkfloods?
     • Where can I search for people? (I cannot find my grandmother/relative)
     • I have been 'found' - who should I alert/give my status to?
     • I need food/water/supplies, how can I tell people I need something?
     • I have food/water/supplies, how can I find out where there's a need?
     • I want to get to location x, where can I find out about the state of the roads?
     • I am observing/know the state of the roads, who should I alert/give my information to?
     • How can I find out where there are information blackspots/there is no telecoms coverage?
     • I know where the telecoms/information blackspots are, who should I alert/give my information to, and how?
  21. Pkfloods Use Cases
  22. What if the datapoints move?
     • Ash cloud from Eyjafjallajökull left planes on the ground and thousands of people stranded
     • UK crisis mappers started news and Twitter watches
     • Needed a tool that let us track who was stranded and ways for people to get home
     • But all the methods we had were static
  23. The 2010 Vision: effective crisis information ecosystems
  24. Responder-triggered VTCs
  25. Task Types
     • Message level:
       – Media monitoring, source checking (e.g. SMS), summarisation, translation, geolocation, cleaning (e.g. PII removal), categorising (e.g. grouping)
     • Meta level:
       – Analysis (producing graphs, explanations, connections)
       – Verification
       – Tasks / team control
       – Communication
       – After-action reporting (inc evaluation)
  26. Sudden-Onset Crisis
     • Fire, flood, heat, cold, tsunami, earthquake, storm, tornado, hurricane, cyclone, refugees, bombings, election issues / violence etc
  27. 2011: UN Data Science
  28. Slow-Burn Crises
     Droughts, agriculture, food insecurity, conflict, education, disease, employment, shelter, trade, endemic violence, GBV etc.
     “Human development is a process of enlarging people’s choices. The most critical ones are to lead a long and healthy life, to be educated and to enjoy a decent standard of living. Additional choices include political freedom, guaranteed human rights and self-respect – what Adam Smith called the ability to mix with others without being ashamed to appear in public” – UNDP Human Development Report
  29. Crisismapping Early 2011: radiation
  30. Category Standards
  31. Human/Machine Data Generation
  32. Data CrossWalks
     DR Congo in Data.UN.Org:
     • “Congo, Democratic Republic of the”
     • “Congo Democratic”
     • “Democratic Republic of the Congo”
     • “Congo (Democratic Republic of the)”
     • “Congo, Dem. Rep.”
     • “Congo Dem. Rep.”
     • “Congo, Democratic Republic of”
     • “Dem. Rep. of Congo”
     • “Dem. Rep. of the Congo”
     DR Congo in common standards:
     • “Democratic Republic of the Congo” (UN Stats)
     • “Congo, The Democratic Republic of the” (ISO3166)
     • “Congo, Democratic Republic of the” (FIPS10, Stanag)
     • “180” (UN Stats)
     • “COD” (ISO3166, Stanag)
     • “CG” (FIPS10)
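The crosswalk idea on the slide above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the talk: the function names `normalise` and `to_iso3` are hypothetical, and the lookup table holds only the DR Congo name variants quoted on the slide, all mapped to the ISO 3166 code “COD”. Short codes such as “CG” are left out on purpose, since their meaning depends on which standard they come from.

```python
import re

# Name variants quoted on the slide, all mapped to the ISO 3166 alpha-3 code.
DR_CONGO_VARIANTS = [
    "Congo, Democratic Republic of the",
    "Congo Democratic",
    "Democratic Republic of the Congo",
    "Congo (Democratic Republic of the)",
    "Congo, Dem. Rep.",
    "Congo Dem. Rep.",
    "Congo, Democratic Republic of",
    "Dem. Rep. of Congo",
    "Dem. Rep. of the Congo",
    "Congo, The Democratic Republic of the",
]


def normalise(name):
    """Lowercase and collapse punctuation so near-identical spellings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Hypothetical crosswalk table: normalised name -> ISO 3166 alpha-3 code.
CROSSWALK = {normalise(variant): "COD" for variant in DR_CONGO_VARIANTS}


def to_iso3(name):
    """Return the code for a known variant, or None if the name is unmapped."""
    return CROSSWALK.get(normalise(name))


if __name__ == "__main__":
    for raw in ["congo, dem. rep.", "Dem. Rep. of the Congo", "Congo"]:
        print(raw, "->", to_iso3(raw))
```

Returning `None` for an unmapped name, rather than guessing, keeps ambiguous entries such as a bare “Congo” visible for a human to resolve.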
  33. 2012: Partial Automation
  34. ACAPS DNA
  35. Data Finding
  36. Common Data Needs
     • Rolodexes: which response groups to follow, and who’s likely to bring what
     • 3Ws: who’s doing what where
     • GIS data: knowing where medical facilities, schools, roads, bridges are
     • Communications: cell tower locations and signal maps
     • Demographics
     • Technology and social media use to demographics
  37. Commonly Available Data
     • Direct messages (SMS etc)
     • Social media messages (tweets etc)
     • Demographic data (e.g. surveys)
     • News reports
     • 3Ws, situation reports (both official, via news sources and on social media), field notes
     • Photos: ground, aerial, satellite, videos
     • CSVs, webpages, PDFs, audio recordings (e.g. radio)
  38. Common Issues
     • Massively dispersed and unstructured data (still)
     • Named entity and category mismatches between datasets
     • Trust
     • Personally Identifiable Information (and risk)
     • Crisis response is time-limited
     • Crisis data response is resource-limited
     • Crisis preparation is attention-limited (if you want resilience, either pay or lead)
  39. (Some of) What’s Broken
     • Crisis Data
       – Remote vs Ground disconnect
       – Crisis vs Development disconnect
       – Deployment lead overload
     • Development Data
       – Broken data formats, access, coverage, standards
       – Ignored data sources
       – Human vs Data disconnect
     • Communities
       – Stovepipes, fiefdoms, imperialism, finding…
  40. 2013: Data Overloads
  41. Cleaner Workflows
  42. More Maps
  43. 2013 Boston bombings
  44. My Personal Three Vs
     • Variety
       – Data all over the place
       – CSV, JSON, XML, Excel, PDF, text, webpages, RSS, scanned pages, images, videos, audio files, maps, proprietary, etc.
     • Velocity
       – Streams updating too fast for a mapping team (100-200 people) to handle
       – Pages updating too frequently to check by hand
     • Volume
       – Can’t open the data in a spreadsheet
       – Can’t fit the data on my laptop
       – Maxes out my credit card (thank you Amazon!)
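One common way around the “can’t open the data in a spreadsheet” problem on the slide above is to stream a file row by row rather than load it whole. The sketch below assumes a CSV with a header row; the function name `count_by_column` and the `category` column are hypothetical examples, not taken from the talk.

```python
import csv
from collections import Counter


def count_by_column(lines, column):
    """Tally the values in one CSV column, reading one row at a time.

    `lines` is any iterable of text lines, such as an open file handle,
    so the whole dataset never has to fit in memory.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        # Rows with an empty or absent value are grouped under "(missing)".
        counts[row.get(column) or "(missing)"] += 1
    return counts


if __name__ == "__main__":
    sample = ["category,text", "medical,a", "food,b", "medical,c", ",d"]
    print(count_by_column(sample, "category"))
```

For a real file, the same function could be called as `count_by_column(open(path, newline="", encoding="utf-8"), "category")`, where `path` is a placeholder for the file's location.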
  45. The other Vs: Veracity
  46. Mappers Needed More Data Science Literacy
  47. Datastores
  48. 2014: Datastores
  49. We Build Community Data Tools
  50. Ushahidi is a Dataset
  51. Ushahidi Platform
     PHOTOS, VIDEOS
  52. Ushahidi Platform as Data
  53. Non-Expert Visualisations
  54. Word-level analysis
  55. Typhoon Ruby, Dec 2014
  56. Where to Map?
  57. Stuff Happens
  58. Lots of groups curate data
  59. Including volunteer mappers
  60. Ruby Datastores
  61. Local wins. Local should (almost) always win
  62. 2015: NGO Data Scientists
  63. Ushahidi Platforms as Datasets
  64. Datastores and Viz
  65. Resilience
  66. And are making it part of “normality”
  67. Here are some missing pieces
     • Basic vocabularies, e.g. stopword lists for most languages (including SMSspeak in different languages)
     • Pre-crisis datasets for many crisis-prone countries
     • Philippines: local response groups set up
     • Missing Maps project for GIS data
     • What about the rest?
     • User datasets in existing tools
     • E.g. adding own gazetteers into Ushahidi.
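To show why the stopword lists mentioned on the last slide matter for word-level analysis, here is a minimal sketch of counting terms across messages. The stopword set is a tiny made-up English sample, including a couple of SMS abbreviations, and `top_terms` is a hypothetical helper. A real deployment would need a proper list per language, which is the gap the slide describes.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption, not a real resource).
STOPWORDS = {"the", "a", "an", "is", "are", "in", "at", "to", "of", "and", "we", "u", "pls"}


def top_terms(messages, stopwords=STOPWORDS, n=5):
    """Return the n most frequent non-stopword terms across all messages."""
    counts = Counter()
    for message in messages:
        # Split into runs of letters; digits and punctuation are dropped.
        for token in re.findall(r"[^\W\d_]+", message.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["We need water in the camp", "pls send water to the clinic"]
    print(top_terms(sample, n=3))
```

Without the stopword filter, words like “the” would dominate the counts and hide the terms responders care about.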
