Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Crowdsourcing and Natural Language Processing for Humanitarian Response

815 views

Published on

Lecture given to the Disaster Resilience Leadership Academy at Tulane University in 2012

Recommendations
- Engage people with local knowledge
- Employ people with local knowledge
- Statistically cross-validate on-the-fly
- Default to private data practices
- Scale via natural language processing
- Use professional technology solutions

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Crowdsourcing and Natural Language Processing for Humanitarian Response

  1. 1. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Crowdsourcing and Natural Language Processing for Humanitarian Response Robert Munro Tulane Disaster Resilience Leadership Academy 2013
  2. 2. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Crowdsourcing and Natural Language Processing: • Mission 4636, Haiti • Pakreport, Pakistain • MapMill, Sandy • Libya Crisis Map • EpidemicIQ Outline
  3. 3. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Information is increasing • Scale (well-known) • Diversity (less understood) – On a given day, what is the average number of languages that someone could potentially hear? – How has this changed? Processing information at scale
  4. 4. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University 99% of languages don’t have machine-translation or similar services: • Disproportionately lower healthcare & education • Disproportionately greater exposure to disasters Crowdsourcing can bridge part of the gap. Diversity
  5. 5. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University GRAPH OF DEPLOYMENTS Crowdsourcing
  6. 6. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Crowdsourced processing of information in Haitian Kreyol. 1000s of Haitians in Haiti and among the diaspora. Haiti – Mission 4636 Apo Dalila Haiti (18.4957, -72.3185) “I need Thomassin Apo please” “Kenscoff Route: Lat: 18.4957, Long:-72.3185” “This Area after Petion-Ville and Pelerin 5 is not on Google Map. We have no streets name”
  7. 7. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Lopital Sacre-Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. “Sacre-Coeur Hospital which located in this village of Okap is ready to receive those who are injured. Therefore, we are asking those who are sick to report to that hospital.”
  8. 8. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Lopital Sacre-Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. “Sacre-Coeur Hospital which located in this village of Okap is ready to receive those who are injured. Therefore, we are asking those who are sick to report to that hospital.”
  9. 9. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Lopital Sacre-Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. “Sacre-Coeur Hospital which located in this village of Okap is ready to receive those who are injured. Therefore, we are asking those who are sick to report to that hospital.”
  10. 10. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Evaluating local knowledge Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Haitians (volunteers and paid), and those working with them. “Non-Haitians, Ushahidi@Tufts 3,000 messages < 5 minutes each > 4 hours each 45,000 messages Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la. Lopital Sacre- Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la.
  11. 11. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University The public facing map The map was not time-critical. This email chain was about successful responses to 4636 already sent through private channels. This was the majority decision. Ushahidi@Tufts might not have realized that Mission 4636 was 20x larger, but they did know we had the majority of Haitian volunteers. Mission 4636 stayed with the decision. Ushahidi@Tufts rejected the decision, citing permission from lawyers. (exact response from Ushahidi@Tufts is omitted as we don’t have permission to reprint their communications)
  12. 12. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University • In 2012 we found that no legal support existed. Just two (non-expert) emails: • “If people are texting you, with the intent of getting aid or reaching out to someone, then consent would be implied. They know what you're doing, right?” • “I am not sure who would have the expertise on this, but it seems quite clear to me that if you are able to obtain their numbers and they are sending you this information, consent is implied.” • The public map was not known in Haiti (Clemenzo, 2011), so we assume the lawyers agree with Mission 4636. (NB: these emails have been reprinted elsewhere with various edits) Haiti – Mission 4636
  13. 13. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University • Broadly, so we do not repeat the mistake. • More narrowly, while the response is over in Haiti, people’s identities remain at risk, as Ushahidi@Tufts members are still putting 4636 messages online. Last month, the full name and phone number of a school girl. They refused to remove it, supported by post-hoc cover-ups: – “I did not receive a single email from anyone involved in Mission 4636 demanding that the SMS’s not be made public.” (Patrick Meier; compare to prior slide) – Consent was given from ‘experts’, note again: “I am not sure who would have the expertise on this, but it seems quite clear to me that if you are able to obtain their numbers and they are sending you this information, consent is implied.” (part in red ommited from public statements by Meier) Why it still matters
  14. 14. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Haiti – Mission 4636 Lessons learned • Default to private data practices Majority decision was not to use a public map Duplicated data on the map led to false responses • Find volunteers through strong social ties 10x larger/faster than the publicized efforts • Avoid anyone with their own agenda coming in (‘bloggers’, ‘crisis-mappers’ …) • Localize to the crisis-affected community 25% of work was by paid workers in Haiti
  15. 15. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Haiti – Mission 4636 Paid workers in Mirebalais, Haiti (FATEM) Benchmarks we can use:* $ 0.25 per translation $ 0.20 per geolocation $ 0.05 per categorization / filtering 4:00 minutes per report processed Can volunteerism undercut this cost? * Munro. 2012. Crowdsourcing and crisis-affected community: lessons learned and looking forward from Mission 4636. Journal of Information Retrieval
  16. 16. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Data-structuring for 2010 floods in Pakistan Pakreport *Chohan, Hester and Munro. 2012. Pakreport: Crowdsourcing for Multipurpose and Multicategory Climate-related Disaster Reporting. Climate Change, Innovation & ICTs Project. CDI Multiple inexperienced people are more accurate than one experienced person.*
  17. 17. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Pakreport Lessons learned • Default to private data practices (!) Taliban threatened to attack mapped aid workers • Cross-validate tasks across multiple workers We used CrowdFlower, as with Mission 4636 • Localize to the crisis-affected community Data obtained by hand / created jobs
  18. 18. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Sandy Mapmill Volunteer damage assessment of Civil Air Patrol photographs
  19. 19. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Sandy Mapmill Used to populate a damage map for FEMA. 30,000+ images. (run by Idibon’s CTO, Schuyler Erle, on his first 2 days on the job!)
  20. 20. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University • f Volunteers v professionals
  21. 21. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Inter-annotator agreement can smooth errors
  22. 22. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University MapMill Sandy Lessons learned • We were able to successfully get enough volunteers at scale. • Volunteers were substantially less accurate than professionals, but multiple judgments help mitigate via inter-annotator agreement. • Despite the scale, it still would have only cost ~$3000 to pay crowdsourcing professionals
  23. 23. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Libya Crisis Map A negative example • 2283 reports already-open, English sources • 1 month of full-time management and contributions from >100 volunteers
  24. 24. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Libya Crisis Map Equivalent cost from paid workers • $575.75 - or about $800 with multiple steps Equivalent time cost from Libyan nationals: • 152.2 hours = less than 1 month for 1 person - would also address some security concerns
  25. 25. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Libya Crisis Map Lessons learned • Crowdsourced volunteers were not required - cost more to run than was saved by not paying - a single in-house Libyan could have achieved more • Default to private data practices - assume all identities of volunteers were exposed - many Libyans expressed concern about the public map and decided not to take part
  26. 26. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University • Imagine if there was too much information, even for large volumes of crowdsourced workers? Natural Language Processing
  27. 27. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Scaling beyond purely manual processing. Disease outbreaks are the world’s single greatest killer. No organization is tracking them all. Epidemics
  28. 28. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Diseases eradicated in the last 75 years: Increase in air travel in the last 75 years: smallpox
  29. 29. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University 90% of ecological diversity 90% of linguistic diversity
  30. 30. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Reported locally before identification H1N1 (Swine Flu) months (10% of world infected) HIV decades (35 million infected) H1N5 (Bird Flu) weeks (>50% fatal) Simply finding these early reports can help prevent epidemics.
  31. 31. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University epidemicIQ Machine - learning (millions) Reports (millions) Microtaskers (thousands) Analysts – domain experts (capped number) в предстоящий осенне- зимний период в Украине ожидаются две эпидемии гриппа ‫من‬ ‫مزيد‬‫انفلونز‬‫الطيور‬ ‫ا‬‫مصر‬ ‫في‬ 香港现1例H5N1禽流感病例曾 游上海南京等地
  32. 32. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University E Coli in Germany The AI head-start
  33. 33. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University epidemicIQ Lessons learned • Current data privacy practices are insufficient - reports from areas where victims are vilified • Crowdsourcing can provide needed skill-sets - 100s of German speakers at short notice • Natural language processing can scale beyond human processing capacity
  34. 34. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Crowdsourcing and risk People’s real-time locations are their most sensitive personal information. Crowdsourcing distributes information to a large number of individuals for processing. For information about at-risk individuals: • Is it right to crowdsource the processing? • Is it right to use a public-facing map?
  35. 35. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Natural Language Processing and risk We can trust machines with some information that we don’t want people to see. Machine-learning is never 100% accurate: • What is the risk of having less time for a human to evaluate each data point? • Does the pattern of the machine-learning errors contain inherent biases?
  36. 36. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Recommendations • Engage people with local knowledge • Employ people with local knowledge • Statistically cross-validate on-the-fly • Default to private data practices • Scale via natural language processing • Use professional technology solutions Conclusions
  37. 37. Crowdsourcing and Natural Language Processing for Humanitarian Response | Robert Munro Disaster Resilience Leadership Academy | Tulane University Thank you Robert Munro @WWRob

×