Scraping the Social Graph with Ushahidi and SwiftRiver
1. SCRAPING THE SOCIAL GRAPH
CRISIS MONITORING WITH SOCIAL MEDIA
Georgetown University
jongos@gmail.com
@jongos
2. About Ushahidi
Ushahidi is a free, open-source platform used for crowdsourcing and visualizing data geospatially. It was born out of the 2008 election unrest, when founders Juliana Rotich, Erik Hersman, Ory Okolloh and David Kobia wanted to give Kenyan citizens a way to send SMS reports of incidents and to know what was occurring around them. This was one of the earliest uses of crowdsourcing for crisis response.
Notable Uses
Ushahidi has been deployed in major global crisis scenarios, allowing organizations to draw situational awareness from the crowd. To date it’s been downloaded over 15,000 times. Some of the more notable deployments include, most recently, Egypt, as well as the Haiti earthquakes, the fires in Russia, and the Queensland floods in Australia.
The Challenge
As the amount of data aggregated by Ushahidi users grows, they face a common problem. How do they effectively manage this realtime data? How can we help them discover credible and actionable info from the deluge of reports they’ll get from the public? The SwiftRiver initiative was created to begin to answer some of these questions for Ushahidi deployers.
16. PLATFORM GOALS
Consider the context; relevance is defined by the user
Offer an opt-in global database of trust and authority
Algorithms augment, but do not define, human decision making
Work across media channels (Twitter, Email, Feeds, SMS)
Be accessible (offline/online/mobile)
Index massive amounts of the mobile/social web
17. KNC AWARD & RIVER ID
final component of the veracity algorithm
needs to be able to scale massively
changing the backend (Hadoop & MongoDB)
research by data scientists
use-cases at scale and iterative improvements
24. NETWORK DYNAMICS
Good crowdsourcing campaigns build upon the existing ties
between people and their networks. There’s a natural
multiplier, where the people in the original network become
nodes for new networks and so on.
25. EARNING TRUST
❖ Participation is permission
❖ Consent is not carte blanche
❖ Clarity is critical
❖ Trust is Earned or Burned
❖ Transparency is hard to teach
26. PRIVACY
❖ Protection of data is different than the
protection of people/identity
❖ Standards like HTTPS or SSL
❖ Encryption
❖ Anonymity is not a given (Tor Project)
❖ The usual fail-points are still threats (weak
passwords, compromised servers, careless
employees)
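To illustrate the first point, that protecting data and protecting identity are separate concerns, here is a minimal Python sketch. It is not SwiftRiver or Ushahidi code: the report fields (`sender`, `text`), the function names, and the keyed-hash approach are all assumptions made for illustration. The sender is replaced by an opaque token before the report is stored or shared, so the report content can travel without the contributor's identity.

```python
import hashlib
import hmac


def pseudonymize(contributor_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 token standing in for a contributor.

    The same contributor always maps to the same token, so reports can
    still be grouped by source without exposing who the source is.
    """
    digest = hmac.new(secret_key, contributor_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_identity(report: dict, secret_key: bytes) -> dict:
    """Copy a report, replacing the hypothetical 'sender' field with a token."""
    public = {key: value for key, value in report.items() if key != "sender"}
    public["sender_token"] = pseudonymize(report["sender"], secret_key)
    return public


# Hypothetical report; the field names are illustrative only.
key = b"keep-this-secret-off-the-public-server"
report = {"sender": "+254700000000", "text": "Road blocked near the market"}
shareable = strip_identity(report, key)
```

The later bullets still apply: if the key sits on a compromised server or behind a weak password, the tokens protect no one.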
27. VALIDATION
❖ Verify factual occurrences (location, time,
date)
❖ Verify contributor identity (who?)
❖ Verify contributor credentials
Everything beyond these three points is an educated
guess. Anyone looking to game the campaign will only
be effective if they are able to compromise the
aforementioned.
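The three checks above can be sketched as a simple record with one flag per check. This is a hypothetical Python illustration, not the SwiftRiver veracity algorithm: the `Report` class, its field names, and the three labels are assumptions. How each flag gets set (by a moderator, a trusted partner, or cross-referencing other reports) is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """A crowdsourced report with one flag per validation check."""
    text: str
    occurrence_verified: bool = False   # location, time and date confirmed
    identity_verified: bool = False     # we know who the contributor is
    credentials_verified: bool = False  # contributor's credentials confirmed


def verification_level(report: Report) -> str:
    """Summarize how many of the three checks a report has passed."""
    passed = sum([
        report.occurrence_verified,
        report.identity_verified,
        report.credentials_verified,
    ])
    if passed == 3:
        return "verified"
    if passed == 0:
        return "unverified"
    return "partially verified"


# Hypothetical example: the facts check out, but the contributor is unknown.
example = Report(text="Bridge closed by flooding", occurrence_verified=True)
level = verification_level(example)
```

Keeping the three flags separate, rather than collapsing them into a single score, makes it visible which check an attacker would have had to compromise.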
28. MOTIVATION
❖ Ease of participation
❖ Low risk of failure or shame
❖ Social Capital
❖ Repute & Accolade
❖ Barter
❖ Strategic Spending ($)
❖ Data Sharing
❖ Altruism & Charity