Crowdsourcing, Mapping and Verification - PICNIC2012
Crowdsourced information can play a crucial role in unexpected circumstances like political uprisings and natural disasters. But how can the data best be verified, and what is the role of the media? With the expansion of social media and live mapping, crowdsourced information has begun to play a significant role in sudden, unexpected circumstances such as natural disasters and political uprisings. So is it possible for humans to replace algorithms in certain situations? Volunteer contributions have the potential to save lives and support local communities, but the real challenge is verifying crowdsourced information and "big data" – particularly in crisis situations requiring accurate validation under significant time constraints.
Good afternoon everyone, and many thanks to the organizers for putting this exciting conference together. As you may have already noticed (and hopefully verified), I am not Dr. Patrick Meier. But the good Doctor asked me to personally extend his sincerest apologies for not being here in person today, due to matters completely out of his control. He is very, very disappointed he could not be here. The good news is that I have worked closely with Patrick and other colleagues for over two years on crisis mapping, the challenges of verification and the role of the media in this new information landscape. What I'd like to do with this talk is provide some background on the field of crisis mapping and the challenge of verification, in order to frame the panels that follow. I will therefore keep the presentation rather broad and general, and ask my fellow panelists to dive deeper into some of these issues, both during their presentations and during the questions and answers. So I will provide a brief introduction to crisis mapping and, in the process, highlight how the challenge of verification has become increasingly central to the field.
It is widely agreed that the field of crisis mapping as we know it today was launched at the Harvard Humanitarian Initiative (HHI) by Patrick Meier and Jennifer Leaning. They coined the term back in 2006 and launched a two-year Program on Crisis Mapping and Early Warning in 2007. The purpose was to explore the use of new technologies, particularly map-based platforms, for improving humanitarian response.
In 2009, at the end of the 2 year program, Patrick and Jen Ziemke organized the first CrisisMappers Conference where they launched the CrisisMappers Network. The purpose of this global network is to catalyze information sharing and partnerships on crisis mapping and humanitarian projects. In his keynote address at this conference, Patrick offered a simple taxonomy and research agenda for the field of crisis mapping.
So as you can see, there is a lot more to crisis mapping than the simple map.
In January 2010, a devastating earthquake struck Haiti. What followed was a turning point for the young field of crisis mapping. Students at the Fletcher School at Tufts University, supported by Ushahidi, launched the Ushahidi Haiti Project. I'm sure most if not all of you are already familiar with this project, so I won't go into details here.
Volunteers from OpenStreetMap were invaluable in making the entire project successful. Indeed, the OSM team crowdsourced the most detailed street map of Haiti in just a matter of days. They used satellite imagery provided by the World Bank and traced this imagery onto a digital map. Here is an awesome animation of these efforts. Some 700 volunteers contributed, making over 1.4 million edits to the map in a matter of weeks. Harry will speak more about OSM's efforts since Haiti in the panel that follows.
Shortly after the Haiti earthquake, a devastating snow storm paralyzed Washington DC, so the Washington Post launched a crisis map. But the Washington Post didn't only crowdsource problems; they also crowdsourced solutions. In effect, they created a self-help map: a self-recovery map.
Several months later, the Standby Volunteer Task Force (SBTF) was launched at the second CrisisMappers Conference. The rationale for the SBTF came from the experiences in Haiti, Chile and Pakistan: volunteers proved invaluable in these efforts, but the efforts were completely reactive, unplanned and chaotic. The purpose of the SBTF was therefore to streamline this work and be more proactive. Helena will present specifically on the SBTF in the following panel.
The SBTF was officially activated by the UN on March 3rd, 2011 to create a live crisis map of Libya. Again, Helena will provide more insights into this, but I am highlighting the project here because it was really the first time the SBTF actively verified crowdsourced social media for crisis mapping, with the intention of turning social media content into actionable information for the humanitarian community responding to the crisis.
Right before the second CrisisMappers Conference, I oversaw the use of crisis mapping to monitor Egypt's Parliamentary Elections (yes, when Mubarak was still around). The reason I'm sharing this particular project is that it was the very first time a crisis mapping initiative specifically included a verification component.
The project was built on a decentralized model in which each region had its own team. The verification process was handled centrally by a team of former journalists from Thomson Reuters, but the actual verification was carried out case by case on the ground. The project is one of its kind: it achieved a result of 91% of reports verified.
At the same time that the Libya project was launched, Harvard University partnered with several professionals from the Syrian Diaspora to monitor, document and map human rights abuses in Syria. The project has been going on uninterrupted for 18 months now. The reason this has been possible is because the project has combined crowdsourcing with data mining.
Harvard University's HealthMap project was used for the data mining. HealthMap automatically monitors news and social media for early signs of disease outbreaks. The team created a "specially crafted gazetteer, which was built incrementally by adding relevant geographic phrases extracted from the specific kind of news report intended for mapping," which was then used in a "look-up tree algorithm which tries to find a match between the sequences of words in the alert and the sequences of words in the entries of the gazetteer. The system also implement[s] a set of rules which use the position of the phrase in the alert to decide whether or not the phrase is related to the reported diseases." The Syria Tracker team repurposed HealthMap to monitor 2,000 English-language news sources covering Syria, automatically searching for evidence of killings, torture and so on. What is really important here is that by combining crowdsourcing and data mining, Syria Tracker has been able to fully verify over 80% of the killings documented on the site.
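To make the look-up tree idea concrete, here is a minimal sketch of how such gazetteer matching can work. This is not HealthMap's actual code; the gazetteer entries and the alert text are invented examples, and a real system would add the positional rules mentioned above.

```python
# Sketch of a word-level look-up tree (trie) for matching gazetteer
# phrases against the text of an incoming alert.

def build_lookup_tree(phrases):
    """Build a word-level trie from a list of gazetteer phrases."""
    root = {}
    for phrase in phrases:
        node = root
        for word in phrase.lower().split():
            node = node.setdefault(word, {})
        node["$end"] = phrase  # marks a complete gazetteer entry
    return root

def find_locations(alert, tree):
    """Scan the alert and return the longest phrase matched at each position."""
    words = alert.lower().split()
    matches = []
    i = 0
    while i < len(words):
        node, j, best = tree, i, None
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if "$end" in node:           # a full gazetteer phrase ends here
                best = (node["$end"], j)
        if best:
            matches.append(best[0])
            i = best[1]                  # resume after the matched phrase
        else:
            i += 1
    return matches

gazetteer = ["Homs", "Rif Dimashq", "Deir ez-Zor"]
tree = build_lookup_tree(gazetteer)
print(find_locations("clashes reported in rif dimashq and homs", tree))
# → ['Rif Dimashq', 'Homs']
```

The trie makes matching multi-word place names a single left-to-right pass over the alert, which is what lets a system like this keep up with thousands of news sources.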
A few months after Syria Tracker was launched, an earthquake struck the city of Van in Turkey. Al Jazeera activated the SBTF to create a live crowdsourced social media map of impact and needs. Because this was in effect a news site, Al Jazeera posted a disclaimer in the green message area just above the map, noting that Al Jazeera could not guarantee the accuracy of all the information provided on the map. Users and viewers of the information should therefore make their own assessment of the validity of the reports and use the contact numbers and links provided to investigate the information themselves. So one way around the challenge of verification is to shift the burden from the producers of information to its viewers and consumers.
In other words, they followed Ronald Reagan’s famous saying “Trust but Verify”
In the Central African Republic, Internews launched a project in 2012 based on what Patrick Meier calls "bounded crowdsourcing." The idea is to use a small but trusted number of reporters who each report information on a particular event. They collect information via crowdsourcing from the local population and trusted authorities on the ground, but the middle point is a network of 15 radio stations that verify each piece of information one by one and process it before it is visualized on an interactive map used by the humanitarian community in the capital, Bangui.
As the BBC's User-Generated Content (UGC) Hub in London and Storyful have both demonstrated, journalists are already committing time and resources to verifying crowdsourced content. So is Al Jazeera: recall that during the Egyptian revolution last year, 75% of its video footage was actually crowdsourced user-generated content. So it is very important for the field of crisis mapping to learn from existing efforts in the journalism space. As mentioned in the introduction to my talk, I too have been spending a considerable amount of time developing verification strategies. One area that is particularly useful as an entry point and use case is verification around election monitoring. As noted vis-à-vis Egypt, I have already been involved in this area, and I was recently in Ukraine working on an Internews project to develop such a strategy for the upcoming elections.
Here is a first draft of the strategies and tactics we plan to use for verification purposes. The point I want to hammer home is that verification is possible. More and more people leave digital footprints online rather than offline; indeed, there is now more evidence to collect online than offline. So that's where we need to be, without forgetting that offline follow-up and interviews are still very important. In this specific project, 40 journalists around the country will work together with more than 200 electoral monitors to triangulate reports from the crowd. They will take advantage of the crowd by collecting far more information than they could possibly gather on their own, while at the same time verifying this information on the ground via a complex system based on both machine learning and human investigation.
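The core of the triangulation step can be sketched very simply: a crowd report stays "unverified" until it has been independently corroborated by enough distinct trusted sources. This is an illustrative toy, not the project's actual system; the threshold, field names and example reports are all invented for the sketch, and the real pipeline layers machine learning and on-the-ground investigation on top.

```python
# Sketch of report triangulation: group crowd reports by (event, location)
# and flag a report as verified once enough distinct sources corroborate it.
from collections import defaultdict

MIN_CORROBORATIONS = 2  # assumed threshold for this sketch

def triangulate(reports, min_sources=MIN_CORROBORATIONS):
    """Return a verification status for each (event, location) pair."""
    corroborations = defaultdict(set)
    for report in reports:
        key = (report["event"], report["location"])
        # A set of sources means duplicate reports from the same
        # source do not count as extra corroboration.
        corroborations[key].add(report["source"])
    return {key: ("verified" if len(sources) >= min_sources else "unverified")
            for key, sources in corroborations.items()}

reports = [
    {"event": "ballot shortage", "location": "Kyiv", "source": "monitor_01"},
    {"event": "ballot shortage", "location": "Kyiv", "source": "journalist_07"},
    {"event": "late opening", "location": "Lviv", "source": "monitor_02"},
]
print(triangulate(reports))
# → {('ballot shortage', 'Kyiv'): 'verified', ('late opening', 'Lviv'): 'unverified'}
```

Counting distinct sources rather than raw report volume is the point of triangulation: one monitor posting the same claim ten times is still a single, uncorroborated source.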
One other point to keep in mind is that we are not restricted to text-based evidence collection. This graphic, from the recent Nieman Report "Truth in the Age of Social Media," depicts Storyful's verification approach for photographic and video content. I'm sure that David Clinch from Storyful will say more about this in his presentation during the second panel.
The verification processes that journalists have always used are still very much relevant, and they can now be integrated with new tools such as machine learning and geo-parsing, as well as with new methodologies like crowdsourcing and sampling. What is important to keep in mind, in my view, is that we are facing new possibilities and new paths to be discovered, but verification is, and will remain, something to be carefully designed on the basis of years of experience and learning from the media field.
Many thanks for your time. I can be contacted at this email address. I look forward to moderating the upcoming panels. Thanks again.