Combining data with Google Refine
Presentation at the Global Investigative Journalism Conference, Kiev, Ukraine, 15 October 2011

Usage Rights: CC Attribution-NonCommercial License

Presentation Transcript

  • Get yourself ready
      • Google ‘Google Refine download’: http://code.google.com/p/google-refine/wiki/Downloads
      • Download and install Google Refine
      • Open it up - it should open in a browser at http://127.0.0.1:3333/
    Saturday, 15 October 2011
  • Google Refine: Combining data
    OnlineJournalismBlog.com
    Twitter.com/PaulBradshaw
  • In a nutshell...
    cell.cross("GPdata2008", "Practice Code").cells["Total Listsize"].value[0]
      • Using GREL to combine datasets
      • Using APIs to grab geographical data
      • Using Reconcile services to grab company data
  • GREL: Google Refine Expression Language
  • cell.cross("GPdata2008", "Practice Code").cells["Total Listsize"].value[0]
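The expression on the slide looks up the current cell's value in the "Practice Code" column of another project ("GPdata2008") and returns the "Total Listsize" from the first matching row. As a rough illustration of that lookup outside Refine, here is a minimal Python sketch. The sample rows and the helper names `cross` and `lookup_listsize` are made up for the example; only the project and column names come from the slide.

```python
# Hypothetical sample rows standing in for the "GPdata2008" project.
gp_data_2008 = [
    {"Practice Code": "A81001", "Total Listsize": 4200},
    {"Practice Code": "A81002", "Total Listsize": 19500},
]

# Hypothetical rows standing in for the project you are working in.
current_rows = [
    {"Practice Code": "A81002"},
    {"Practice Code": "A81001"},
    {"Practice Code": "Z99999"},  # no match in the other dataset
]


def cross(value, other_rows, key_column):
    """Return every row of other_rows whose key_column equals value."""
    return [row for row in other_rows if row[key_column] == value]


def lookup_listsize(row):
    """Fetch "Total Listsize" from the first matching row, or None."""
    matches = cross(row["Practice Code"], gp_data_2008, "Practice Code")
    # Taking matches[0] plays the role of the [0] at the end of the slide's expression.
    return matches[0]["Total Listsize"] if matches else None


# Build the combined dataset: each current row plus the looked-up value.
combined = [{**row, "Total Listsize": lookup_listsize(row)} for row in current_rows]
```

Rows without a match get `None` here; how you want to treat unmatched rows is a choice to make per dataset.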
  • Using APIs: Getting contextual data
  • What’s an API again?
    Ask it a question, it gives you an answer:
      • “For each of these codes, give me the region.”
      • “For each of these names, tell me their political party”
  • Useful APIs
      • Geo: UK Postcodes, Google Maps
      • Social: Twitter, Facebook, Flickr
      • Politics: They Work For You, Data.gov.uk
      • News: Guardian, NYT, USA Today, NPR
      • Health, business, etc.: search for specific ones
  • API keys
      • Sometimes needed - apply through the site
      • Use it in the request as a password
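Many APIs expect the key as one more parameter in the request URL. The sketch below shows one way that might look. The endpoint `https://api.example.org/lookup`, the parameter name `api_key`, and the helper `build_request_url` are all placeholders, since each API documents its own names.

```python
from urllib.parse import urlencode


def build_request_url(base_url, params, api_key, key_param="api_key"):
    """Append the query parameters and the API key to base_url."""
    query = dict(params)          # copy so the caller's dict is not modified
    query[key_param] = api_key    # the key travels like any other parameter
    return base_url + "?" + urlencode(query)


# Placeholder endpoint and key, for illustration only.
url = build_request_url(
    "https://api.example.org/lookup",
    {"code": "A81001"},
    "YOUR_KEY_HERE",
)
```

Some services take the key in a request header instead, so check the documentation for the API you are using.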
  • API limits
      • Can prevent you getting data for all your records.
      • Try multiple APIs or split your data into multiple sheets - or buy a licence
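Splitting the data into several sheets amounts to cutting the records into batches that each stay under the limit. A minimal sketch, assuming a made-up limit of 3 requests per batch and placeholder record values:

```python
def split_into_batches(records, batch_size):
    """Cut records into consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        records[start:start + batch_size]
        for start in range(0, len(records), batch_size)
    ]


# Placeholder records; in practice these would be your codes or addresses.
codes = [f"code-{n}" for n in range(7)]
batches = split_into_batches(codes, 3)
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be run on a different day, against a different API, or saved as its own sheet.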
  • Get data from an API: www.chasedavis.com/refine.html
  • Reconciling: An easier way to get data
  • OpenCorporates.com: http://vimeo.com/17924204
  • Walkthrough: Reconciliation with Open Corporates
      • Click on arrow at top of column
      • Select Reconcile > Start Reconciling...
      • Click on Add Standard Service...
      • http://opencorporates.com/reconcile
      • And start...
  • Walkthrough: Reconciliation with Open Corporates
      • Click ‘Search for Match’ and select
      • Click double tick icon to bulk reconcile
      • Reconcile > Action > Match each cell to its best candidate
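"Match each cell to its best candidate" can be pictured as picking the highest-scoring suggestion for each name. The sketch below is only an illustration of that idea, not how Refine implements it. The candidate records, their `id`, `name` and `score` fields, and the helper `best_candidate` are assumptions made for the example.

```python
def best_candidate(candidates, min_score=0.0):
    """Return the highest-scoring candidate at or above min_score, or None."""
    eligible = [c for c in candidates if c["score"] >= min_score]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["score"])


# Made-up candidates of the kind a reconciliation service might suggest.
candidates = [
    {"id": "/companies/xx/001", "name": "Example Holdings Ltd", "score": 62.5},
    {"id": "/companies/xx/002", "name": "Example Holdings Limited", "score": 88.0},
]

match = best_candidate(candidates)
strict_match = best_candidate(candidates, min_score=95.0)  # nothing scores this high
```

A minimum score is worth considering in practice, because the best candidate is not always a correct one.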
  • Freebase
  • Freebase and namespaces
  • Search for match
  • Walkthrough: Using Google Refine and APIs
  • Escaping values for URLs
    "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=" + escape(value, "url")
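Escaping matters because addresses contain spaces, commas and other characters that are not allowed as-is in a URL. The same step can be sketched in Python with the standard library's `quote_plus`. The base URL is the one from the slide, the sample address and the helper `geocode_url` are made up, and whether GREL's `escape` encodes every character identically is not verified here.

```python
from urllib.parse import quote_plus

# Base URL copied from the slide; the address is appended after escaping.
GEOCODE_BASE = "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address="


def geocode_url(address):
    """Build a request URL with the address safely escaped."""
    return GEOCODE_BASE + quote_plus(address)


url = geocode_url("10 Downing Street, London")
# The address part becomes: 10+Downing+Street%2C+London
```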
  • JSON explained
    { category : value,
      category : { nested category : value,
                   nested category 2 : value },
      category 2 : value }
  • JSON explained
    { name : citytown,
      geo : { latitude : 42,
              longitude : 2 },
      administrative : citytown council }
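Written out as valid JSON (quoted names and text values, commas between pairs), the example above can be read with any JSON parser. A minimal Python sketch, using the slide's own made-up values:

```python
import json

# The slide's example as valid JSON text.
raw = """
{
  "name": "citytown",
  "geo": {"latitude": 42, "longitude": 2},
  "administrative": "citytown council"
}
"""

record = json.loads(raw)

# Top-level categories are reached by name...
name = record["name"]
# ...and nested categories by following the names inward.
latitude = record["geo"]["latitude"]
longitude = record["geo"]["longitude"]
```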
  • Walkthrough: Using Google Refine to pull out data
      • > Create new column based on this one...
      • GREL: value.parseJson().item1.part2[1]
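The GREL expression parses the cell's JSON text, steps into `item1`, then `part2`, and takes the item at position 1 (the second one, since counting starts at 0). A rough Python equivalent is sketched below. The helper `extract` and the sample JSON are assumptions for the example, with `item1` and `part2` kept as the placeholder names from the slide.

```python
import json


def extract(value, *path):
    """Parse JSON text and follow path (names or positions); None if missing."""
    node = json.loads(value)
    for step in path:
        try:
            node = node[step]
        except (KeyError, IndexError, TypeError):
            return None  # the path does not exist in this record
    return node


# Made-up cell content matching the slide's placeholder names.
sample = '{"item1": {"part2": ["first", "second"]}}'

result = extract(sample, "item1", "part2", 1)
missing = extract(sample, "item1", "part3", 0)
```

Returning `None` for a missing path keeps one odd record from stopping the whole column, which is handy when some API responses come back empty.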
  • Links
      • Delicious.com/paulb/kiev11
      • Delicious.com/paulb/googlerefine
      • OnlineJournalismBlog.com/tag/google-refine