FOSS data triage platform
Crowdsourced Crisis Data Media Analytics
(Africa) (Middle East)
May 6, 2009 Team Update
Where are we today?
Reeling from an awesome conference.
Enjoying the good company of InSTEDD.
International coworkers in person for the first time.
Swift is going live after months of collaborative design work.
Numerous deployments are scheduled in the very near term.
Currently iterating “live” for votereport.in *gulp :)
the deployment problem
If you have to deploy, you
have a problem
How might we imagine a better
crisis reporting tool?
By designing a better crisis listening tool.
“At present, early warning units within the UN ... use manual
labor to collect relevant information from online sources.
Most units employ full-time staff for this, often
meaning that 80% of an analyst’s time is actually used to
collect pertinent articles and reports, leaving only 20% of
the time for actual analysis, interpretation and policy
recommendations. We can do better. Analysts ought
to be spending 80% of their time analyzing.”
Patrick Meier, “Crimson Hexagon: Early Warning 2.0?” February 17, 2009
We need new listening tools.
Ability to report eyewitness stories
Ability to act on them
1910 1940 1980 2010
What can we build when we focus
on the user interface needs of citizen editors?
How does this change our idea of an “analyst”?
Can data entry actually make an impact?
Could it actually be ... enjoyable to use?
What if we assume that citizen editors deserve the best?
What if we treat their data with the same respect?
The revolution will not be transmitted via giant .xls files.
Output: The Crisis API
The resulting crisis database is open, with distributed storage
and standards-based portability, based on:
ICAL / FOAF / Dublin Core / Etc.
RSS / CSV / JSON / SQL / Etc.
and all of it is handled by Freebase.com
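To make the portability idea concrete, here is a minimal sketch of one set of crisis records exported to two of the formats listed above (JSON and CSV). The record fields (`id`, `title`, `place`, `reported_at`) and the function names are hypothetical, chosen only for illustration; they are not Swift's actual schema or API.

```python
import csv
import io
import json

# Hypothetical crisis records; field names are assumptions for this sketch.
reports = [
    {"id": 1, "title": "Road blocked", "place": "Nairobi",
     "reported_at": "2009-05-06T10:00:00Z"},
    {"id": 2, "title": "Clinic open", "place": "Mombasa",
     "reported_at": "2009-05-06T11:30:00Z"},
]

def to_json(records):
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2, sort_keys=True)

def to_csv(records):
    """Serialize the records as CSV, using the first record's keys as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

if __name__ == "__main__":
    print(to_json(reports))
    print(to_csv(reports))
```

The point of the sketch is that the same records feed every output format, so consumers can pick whichever one their tools already read.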
Swift the method has evolved into a platform:
1. data gathering engine (aggregator)
2. data structuring tool (wiki)
3. most importantly, an API for crisis data
Swift is an aggregator
with entity extraction
By “roping together” relevant feeds, then parsing their
content, we can get a rich database of people, places
and organizations in real time.
We are working with
Freebase.com and Calais.
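A rough sketch of the "roping together" idea: parse several feeds, pull entities out of each item, and count mentions across all of them. The two inline feeds and the small `GAZETTEER` lookup table are hypothetical stand-ins; a real deployment would fetch live feeds and call an extraction service such as Calais, whose API is not shown here.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Two tiny hypothetical RSS feeds, inlined so the sketch runs offline.
FEED_A = """<rss><channel>
<item><title>Flooding reported near Nairobi</title>
<description>Red Cross volunteers are helping families in Nairobi.</description></item>
</channel></rss>"""

FEED_B = """<rss><channel>
<item><title>Relief update</title>
<description>The Red Cross opened a shelter in Mombasa.</description></item>
</channel></rss>"""

# Assumption: a hand-made lookup table stands in for a real entity extractor.
GAZETTEER = {
    "Nairobi": "place",
    "Mombasa": "place",
    "Red Cross": "organization",
}

def items(feed_xml):
    """Yield the combined title and description text of each RSS item."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        desc = item.findtext("description", default="")
        yield f"{title}. {desc}"

def extract_entities(text):
    """Return (name, kind) pairs for every known entity mentioned in the text."""
    found = []
    for name, kind in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, kind))
    return found

def aggregate(feeds):
    """Count, across all feeds, how many items mention each entity."""
    counts = Counter()
    for feed_xml in feeds:
        for text in items(feed_xml):
            counts.update(extract_entities(text))
    return counts

entity_counts = aggregate([FEED_A, FEED_B])

if __name__ == "__main__":
    for (name, kind), n in entity_counts.most_common():
        print(f"{name} ({kind}): mentioned in {n} item(s)")
```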
Swift is mostly a Rails app,
Twitter Vote Report.
Good with Data? Now you can help.
Swift is designed for
Improving information findability in a crisis
Making it easier to find things that you didn't know you were looking for
Better understanding media from other parts of the world
Making urgent data more sharable (structured, published and accessible)
Making it more obvious what information is missing about a crisis
Promoting the work of eyewitnesses with prepared crisis editors
Expanding the grassroots reporting network
Preserving information across crises
GEEK OUT A SEC
Every incoming URI is parsed into an object with:
1. URI.body (the text of the URL)
2. URI.rating (anyone can rate through a web UI)
3. URI.submitters (anyone who linked to it)
4. URI.history (every revision preserved)
5. URI.tags (added by humans and machines)
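One way to picture that object is the sketch below. The class name `CrisisURI`, the method names, and the choice to average ratings are all assumptions made for illustration; the slides name only the five fields, not how Swift implements them.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class CrisisURI:
    """Hypothetical model of a parsed URI with the five fields listed above."""
    url: str
    body: str = ""                                    # text found at the URL
    ratings: list = field(default_factory=list)       # individual scores
    submitters: set = field(default_factory=set)      # everyone who linked to it
    history: list = field(default_factory=list)       # earlier versions of body
    tags: dict = field(default_factory=dict)          # tag -> "human" or "machine"

    @property
    def rating(self):
        """Average of all scores so far, or None if nobody has rated yet."""
        return mean(self.ratings) if self.ratings else None

    def rate(self, score):
        """Record one score from a rater."""
        self.ratings.append(score)

    def tag(self, label, source):
        """Attach a tag and remember whether a human or a machine added it."""
        self.tags[label] = source

    def revise(self, new_body):
        """Replace the body, preserving the previous version in history."""
        self.history.append(self.body)
        self.body = new_body

# Usage with made-up data.
report = CrisisURI(url="http://example.org/report/1", body="Bridge closed.")
report.submitters.update({"alice", "bob"})
report.rate(4)
report.rate(5)
report.tag("bridge", "human")
report.tag("infrastructure", "machine")
report.revise("Bridge closed; ferry running.")
```

Keeping every earlier body in `history` is what lets a revision be undone or audited later.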
People Finder Interchange Format (PFIF)
used for 90,000 entries after Katrina
Grassroots reporting: Database-driven journalism and data
Weather Related Disasters
Urban and Rural Fire
Internal Displacement & Refugee
Volcano
Missing Person
Structure Collapse
Railroad Disasters
Foot and Mouth
Possibilities with speed:
What to do in a ___ based on my location?
Report a ___ location
Report a ___ accident.
___ detection and reporting.
“There was a ___. Are you ok?” alerts.
Neighborhood-level ___ warnings.
Swift’s realtime strengthens the Ushahidi
Subscribe to keyword-based disaster
alerts about an emergency
eg: “san francisco, earthquake”
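A minimal sketch of how such a subscription could be matched against incoming reports. The `AlertBook` class is hypothetical, and it assumes that a comma-separated subscription means every phrase must appear in the report; the slides do not say whether Swift uses "all" or "any" matching.

```python
import re

def normalize(text):
    """Lowercase the text and reduce it to space-separated word tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

class AlertBook:
    """Hypothetical registry of keyword subscriptions."""

    def __init__(self):
        self.subscriptions = []  # list of (subscriber, [normalized phrases])

    def subscribe(self, subscriber, keywords):
        """Register a comma-separated keyword string such as 'san francisco, earthquake'."""
        phrases = [normalize(k) for k in keywords.split(",") if k.strip()]
        self.subscriptions.append((subscriber, phrases))

    def match(self, report_text):
        """Return subscribers whose phrases ALL occur in the report (assumed rule)."""
        padded = f" {normalize(report_text)} "
        hits = []
        for subscriber, phrases in self.subscriptions:
            # Padding with spaces ensures whole-word (whole-phrase) matches only.
            if all(f" {phrase} " in padded for phrase in phrases):
                hits.append(subscriber)
        return hits

# Usage with made-up subscribers and reports.
book = AlertBook()
book.subscribe("alice", "san francisco, earthquake")
book.subscribe("bob", "nairobi, flood")

sf_hits = book.match("Earthquake felt across San Francisco this morning")
la_hits = book.match("Earthquake reported in Los Angeles")
```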
Crimson Hexagon state of the art NLP.
VRA's GeoMonitor, a natural language parser that reads the headlines of Reuters and AFP news
wires and codes to state “who did what, to who, where and when?” Rated “virtually identical” to
human event summarizing.
JRC's European Media Monitor, which can parse thousands of different news sources but faces
limitations since analysts still need to read each article to understand the nature of the terrorist
Tabari text parsing engine, “event data coder that has been used in at least five NSF-sponsored
projects and produced data used in a number of refereed articles in political science.”
FORECITE Forecasting of Crises and Instability Using Text-Based Events, developed by the US
Center for Army Analysis (CAA)
UN's HEWSweb A relatively new early warning system
Biowarn Textual analysis with infectious disease focus
FAST Comprehensive system by Swisspeace
A minimal implementation: an analogy
We’ll put a card table at the public library and
work on reports with highlighters, put
everything into a card catalog and leave it in