Big Data and Me

Talk given to Touro College's Leadership in Digital Technology innovation course, April 22nd 2013

Published in: Technology, Travel, Business

Transcript

  • 1. Big Data and Me: experiences from the front line. Sara-Jayne Farmer, Change Assembly, April 22nd 2013
  • 2. ME
  • 3. Me
      • Data Scientist
      • Using data to:
          – connect communities
          – improve access to information
          – so people can make better decisions
          – on both small and large scales
      • It’s all about people:
          – Local people: know their needs; need more information
          – Local technologists: have skills; need connections
          – Large organisations: have resources; need guidance
  • 4. Some of those People (smart, talented, dedicated hackers in Haiti, January 2013)
  • 5. My Personal Three Vs
      • Variety
          – Data all over the place
          – CSV, JSON, XML, Excel, PDF, text, webpages, RSS, scanned pages, images, videos, audio files, maps, proprietary formats, etc.
      • Velocity
          – Streams updating too fast for a mapping team (100-200 people) to handle
          – Pages updating too frequently to check by hand
      • Volume
          – Can’t open the data in a spreadsheet
          – Can’t fit the data on my laptop
          – Maxes out my credit card (thank you Amazon!)
  • 6. VARIETY
  • 7. “more people have mobile phones than toilets” – UN, March 2013
  • 8. But… but… there are always data issues…
      • Datasets were difficult to find
      • No data available after 2010
      • Hard to track provenance – e.g. what decisions did the people creating these datasets make? What assumptions?
      • Data was rounded up
      • Country names didn’t match between sets
      • Multiple character sets (e.g. Å, A, Ԇ)
      • Messy formatting (merges, ‘explanations’ etc)
  • 9. e.g. Country Names
      • DR Congo in Data.UN.Org: “Congo, Democratic Republic of the”, “Congo Democratic”, “Democratic Republic of the Congo”, “Congo (Democratic Republic of the)”, “Congo, Dem. Rep.”, “Congo Dem. Rep.”, “Congo, Democratic Republic of”, “Dem. Rep. of Congo”, “Dem. Rep. of the Congo”
      • DR Congo in common standards: “Democratic Republic of the Congo” (UN Stats), “Congo, The Democratic Republic of the” (ISO3166), “Congo, Democratic Republic of the” (FIPS10, Stanag), “180” (UN Stats), “COD” (ISO3166, Stanag), “CG” (FIPS10)
  • 10. And coding
  • 11. And interpretation
      • Hang on… don’t some people have more than one phone?
      • And how do you count the people without toilets?
      • What if the cities have lots of phones and toilets, and the rural areas don’t?
      • Where does my composting toilet fit in this?
      • How big were these surveys?
      • What do we do with the zeros?
      • Etc…
  • 12. And purpose
  • 13. And Communication
  • 14. And Alternative Data Sources
  • 15. And alternative alternatives…
      • Social media proxies
      • Grassroots maps
      • Etc.
  • 16. VELOCITY AND VOLUME
  • 17. 2013 Boston bombings
  • 18. The Humans+Tools Solution: Crisismapping
  • 19. Find…
  • 20. Listen…
  • 21. Estimate…
  • 22. Geolocate…
  • 23. Create maps…
  • 24. Analyse
  • 25. Explain
  • 26. Use
  • 27. BUT WE NEED MORE DATA SCIENTISTS…
  • 28. Build and Connect Communities
  • 29. Train Non-Techies
  • 30. Create Higher-level Tools
  • 31. Big Data and Me: experiences from the front line. Sara-Jayne Farmer, http://www.changeassembly.com/, @bodaceacat
  • 32. MORE REFERENCES
  • 33. strataconf.com
  • 34. datasciencecentral.com
  • 35. analytictalent.com
  • 36. Tools
  • 37. Formal (Free) Training
  • 38. NYC Meetups (see meetup.com)
  • 39. Volunteering: datakind.org
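
Slide 9's country-name problem (one country, many spellings across datasets) is the kind of cleaning step that is easy to sketch in code. The snippet below is a minimal illustration, not the method used in the talk: it reduces each name to a set of significant words and looks that set up in a small hand-built table of ISO 3166 alpha-3 codes. The helper names (`name_key`, `to_iso3`), the abbreviation list, and the lookup table are all assumptions made up for this example; real data would need a much larger table.

```python
import re

# Hypothetical abbreviation expansions and filler words; extend for real data.
ABBREVIATIONS = {"dem": "democratic", "rep": "republic", "dr": "democratic republic"}
FILLER = {"the", "of"}


def name_key(name):
    """Reduce a country name to an order-independent set of significant words."""
    words = re.findall(r"[a-z]+", name.lower())
    expanded = []
    for word in words:
        # Expand known abbreviations; leave every other word unchanged.
        expanded.extend(ABBREVIATIONS.get(word, word).split())
    return frozenset(w for w in expanded if w not in FILLER)


# Hand-built table for illustration only: word set -> ISO 3166 alpha-3 code.
ISO3_BY_KEY = {
    frozenset({"congo", "democratic", "republic"}): "COD",
    frozenset({"congo", "democratic"}): "COD",
    frozenset({"congo", "republic"}): "COG",
}


def to_iso3(name):
    """Return the alpha-3 code for a name variant, or None if it is unknown."""
    return ISO3_BY_KEY.get(name_key(name))


if __name__ == "__main__":
    for variant in ["Congo, Dem. Rep.", "Dem. Rep. of the Congo", "Republic of the Congo"]:
        print(variant, "->", to_iso3(variant))
```

Returning `None` for unknown names, rather than guessing, keeps unmatched rows visible so a person can review them, which fits the talk's humans-plus-tools theme.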