Combining data with Google Refine

Presentation at the Global Investigative Journalism Conference, Kiev, Ukraine, 15 October 2011

Transcript

  • 1. Get yourself ready • Google ‘Google Refine download’ http://code.google.com/p/google-refine/wiki/Downloads • Download and install Google Refine • Open it up - it should open in a browser at http://127.0.0.1:3333/ (Saturday, 15 October 2011)
  • 2. Google Refine Combining data OnlineJournalismBlog.com Twitter.com/PaulBradshaw
  • 3. In a nutshell... cell.cross("GPdata2008", "Practice Code").cells["Total Listsize"].value[0] • Using GREL to combine datasets • Using APIs to grab geographical data • Using Reconcile services to grab company data
  • 4. GREL Google Refine Expression Language
  • 5. cell.cross("GPdata2008", "Practice Code").cells["Total Listsize"].value[0]
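What the expression on slide 5 does can be sketched outside Refine. The Python below is a minimal analogue, not GREL and not how Refine implements cross: it assumes the other project (GPdata2008) is available as a list of rows, and the practice codes and list sizes are invented sample values. The helper name cross_lookup is hypothetical.

```python
# Rough Python analogue of the GREL expression:
#   cell.cross("GPdata2008", "Practice Code").cells["Total Listsize"].value[0]
# The rows below are invented sample data, not real figures.

gp_data_2008 = [
    {"Practice Code": "A81001", "Total Listsize": 4100},
    {"Practice Code": "A81002", "Total Listsize": 19500},
]


def cross_lookup(value, other_rows, key_column, wanted_column):
    """Return wanted_column from the first row whose key_column equals value.

    Returns None when no row matches, so unmatched rows get no value.
    """
    matches = [row[wanted_column] for row in other_rows if row[key_column] == value]
    return matches[0] if matches else None


# The "current" dataset: each row only knows its practice code.
current_rows = [{"Practice Code": "A81002"}, {"Practice Code": "Z99999"}]

# Fill a new column by looking each code up in the other dataset.
for row in current_rows:
    row["Total Listsize"] = cross_lookup(
        row["Practice Code"], gp_data_2008, "Practice Code", "Total Listsize"
    )
```

The shared column ("Practice Code") acts as the join key; the lookup only works when the codes are written identically in both datasets.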
  • 6. Using APIs Getting contextual data.
  • 7. What’s an API again? Ask it a question, it gives you an answer: “For each of these codes, give me the region.” “For each of these names, tell me their political party”
  • 8. Useful APIs Geo: UK Postcodes, Google Maps Social: Twitter, Facebook, Flickr Politics: They Work For You, Data.gov.uk News: Guardian, NYT, USA Today, NPR Health, business, etc. Search for specific ones
  • 9. API keys Sometimes needed - apply through the site Use it in the request as a password
  • 10. API limits Can prevent you getting data for all your records. Try multiple APIs or split your data into multiple sheets - or buy a licence
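Splitting data to stay under an API limit, as slide 10 suggests, can be sketched as follows. This is an illustrative Python sketch only: the batch size of 3 and the CODE1..CODE7 records are made-up assumptions, and split_into_batches is a hypothetical helper name. A real limit would come from the API's own documentation.

```python
def split_into_batches(records, batch_size):
    """Split records into consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        records[start:start + batch_size]
        for start in range(0, len(records), batch_size)
    ]


# Seven invented record codes, split into sheets of at most three each.
codes = [f"CODE{i}" for i in range(1, 8)]
batches = split_into_batches(codes, 3)

for number, batch in enumerate(batches, start=1):
    print(f"Sheet {number}: {batch}")
```

Each batch can then be saved as its own sheet and processed separately.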
  • 11.
  • 12. Get data from an API www.chasedavis.com/refine.html
  • 13. Reconciling An easier way to get data
  • 14. OpenCorporates.com http://vimeo.com/17924204
  • 15. Walkthrough: Reconciliation with Open Corporates • Click on arrow at top of column • Select Reconcile > Start Reconciling... • Click on Add Standard Service... • http://opencorporates.com/reconcile • And start...
  • 16. Walkthrough: Reconciliation with Open Corporates • Click ‘Search for Match’ and select • Click double tick icon to bulk reconcile • Reconcile > Action > Match each cell to its best candidate
  • 17. Freebase
  • 18. Freebase and namespaces
  • 19. Search for match
  • 20. Walkthrough: Using Google Refine and APIs
  • 21.
  • 22. Escaping values for URLs "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=" + escape(value, "url")
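The reason for escaping on slide 22 is that spaces, commas and similar characters are not safe inside a URL. A minimal Python sketch of the same idea follows; it builds the URL but does not call the service. The sample address and the helper name geocode_url are assumptions for illustration, and Python's quote may encode some characters differently from Refine's escape (for example, it writes a space as %20).

```python
from urllib.parse import quote

# Base URL as shown on slide 22; the cell value is appended after "address=".
BASE = "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address="


def geocode_url(address):
    """Return the request URL with the address percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "/".
    return BASE + quote(address, safe="")


# Invented sample value standing in for one cell of the column.
url = geocode_url("10 Downing Street, London")
print(url)
```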
  • 23. JSON explained {category : value} {category {nested category : value {nested category 2 : value } {category 2 : value}
  • 24. JSON explained {name : citytown} {geo {latitude : 42 {longitude : 2 } {administrative : citytown council}
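Slides 23 and 24 show the shape of JSON schematically. In actual JSON the keys and text values are quoted, pairs are separated by commas, and nested categories sit inside their own braces. The Python sketch below is one reading of slide 24 written as valid JSON; the exact layout is an assumption, since the slide gives only the outline.

```python
import json

# Slide 24's example rewritten as valid JSON (an interpretation of the slide).
raw = """
{
  "name": "citytown",
  "geo": {
    "latitude": 42,
    "longitude": 2
  },
  "administrative": "citytown council"
}
"""

record = json.loads(raw)

# A nested value is reached by stepping through each level in turn.
latitude = record["geo"]["latitude"]
longitude = record["geo"]["longitude"]
print(record["name"], latitude, longitude)
```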
  • 25. Walkthrough: Using Google Refine to pull out data > Create new column based on this one... GREL: value.parseJson().item1.part2[1]
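The GREL path on slide 25 reads left to right: parse the cell's text as JSON, step into item1, then part2, then take the element at index 1. A rough Python equivalent is sketched below. The names item1 and part2 are the slide's own placeholders, and the JSON text is invented to match them; a real API response will have its own field names.

```python
import json

# Invented cell value shaped to match the placeholders item1 and part2.
value = '{"item1": {"part2": ["first", "second", "third"]}}'

# Equivalent of value.parseJson().item1.part2[1]
parsed = json.loads(value)
extracted = parsed["item1"]["part2"][1]

print(extracted)
```

Indexes start at 0, so [1] picks the second element of the list.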
  • 26. Links Delicious.com/paulb/kiev11 Delicious.com/paulb/googlerefine OnlineJournalismBlog.com/tag/google-refine
