
Scraping the Olympics

Presentation for a workshop at the BBC Data Journalism Day, July 2012


Transcript of "Scraping the Olympics"

  1. Scraping the Olympics. Paul Bradshaw, author: Scraping for Journalists. Leanpub.com/scrapingforjournalists
  2. Scraping basics; Combining data; Finding stories in data
  3. (no text)
  4. Function (Parameters)
  5. Function (Parameters): =SUM(A2:A50), =AVERAGE(B2:B300), =COUNTIF(A10:A3000,"Smith")
  6. ("string", index)
  7. Tip: search for documentation
  8. Tip: search for structure around data
  9. (no text)
  10. //div[starts-with(@class, 'jobWrap')]
  11. (no text)
  12. Combining data
  13. Question: Which torchbearers are from Dorset?
  14. (no text)
  15. (no text)
  16. (no text)
  17. (no text)
  18. (no text)
  19. (no text)
  20. (no text)
  21. (no text)
  22. Finding leads: Corporate torchbearers?
  23. (no text)
  24. (no text)
  25. (no text)
  26. (no text)
  27. New entries - or disappearing ones
  28. (no text)
  29. (no text)
  30. (no text)
  31. (no text)
  32. Leanpub.com/scrapingforjournalists, @paulbradshaw, onlinejournalismblog.com, helpmeinvestigate.com, slideshare.net/onlinejournalist, linkedin.com/in/onlinejournalist
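
The XPath on slide 10 selects every `div` whose `class` attribute begins with `jobWrap`. The sketch below makes the same selection in Python. It swaps in the standard library's `HTMLParser` for an XPath engine, because the XPath subset in the standard library's `xml.etree.ElementTree` has no `starts-with()` function. The function name `divs_with_class_prefix` and the sample markup are made up for illustration and do not come from the workshop.

```python
from html.parser import HTMLParser


class _ClassPrefixFinder(HTMLParser):
    """Collect the text of each <div> whose class attribute starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.depth = 0      # div nesting depth inside the current match (0 = outside)
        self.matches = []   # accumulated text, one string per matching div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # A div nested inside a match: track it so the right </div> closes the match.
            self.depth += 1
        elif (dict(attrs).get("class") or "").startswith(self.prefix):
            self.depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.matches[-1] += data


def divs_with_class_prefix(html, prefix):
    """Return the stripped text of every outermost div whose class starts with prefix."""
    finder = _ClassPrefixFinder(prefix)
    finder.feed(html)
    finder.close()
    return [text.strip() for text in finder.matches]


# Hypothetical markup: only the 'jobWrap' prefix comes from the slide.
SAMPLE_HTML = (
    '<div class="jobWrap odd"><a href="#">Reporter</a></div>'
    '<div class="sidebar">Ads</div>'
    '<div class="jobWrapEven">Editor</div>'
)

print(divs_with_class_prefix(SAMPLE_HTML, "jobWrap"))
```

Matching on a prefix rather than the exact class value catches variants such as `jobWrap odd` and `jobWrapEven` in one pass, which is the point of `starts-with()` in the original expression.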
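
The question on slide 13 (which torchbearers are from Dorset?) is one case of the "combining data" step. One possible approach, sketched here under assumptions: join a scraped list of torchbearers to a lookup table that maps towns to counties, then filter on the county. The column names, the placeholder names and the three sample rows are hypothetical. The transcript does not say which datasets or tools were used in the workshop.

```python
import csv
import io

# Hypothetical sample data: placeholder names and assumed column headings.
TORCHBEARERS_CSV = """name,hometown
Torchbearer A,Weymouth
Torchbearer B,Exeter
Torchbearer C,Poole
"""

# Hypothetical lookup table mapping each town to its county.
PLACES_CSV = """town,county
Weymouth,Dorset
Poole,Dorset
Exeter,Devon
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def torchbearers_in_county(torchbearers, places, county):
    """Join torchbearers to places on town name and keep those in the given county."""
    county_of = {row["town"]: row["county"] for row in places}
    return [
        row["name"]
        for row in torchbearers
        if county_of.get(row["hometown"]) == county
    ]


dorset = torchbearers_in_county(
    read_rows(TORCHBEARERS_CSV), read_rows(PLACES_CSV), "Dorset"
)
print(dorset)
```

Towns missing from the lookup table are silently left out of the result, so with real scraped data it is worth listing unmatched hometowns separately before drawing conclusions.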
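
Slide 27 points to new or disappearing entries as a source of leads. A minimal sketch of that comparison, assuming two runs of the same scraper each produce a list of entry names: set differences give what was added and what was removed. The function name `compare_scrapes` and the sample entries are illustrative only.

```python
def compare_scrapes(previous, current):
    """Return entries that appeared or disappeared between two scrapes."""
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),          # in the latest scrape only
        "disappeared": sorted(previous - current),  # in the earlier scrape only
    }


# Hypothetical results from two runs of the same scraper.
yesterday = ["Entry 1", "Entry 2", "Entry 3"]
today = ["Entry 2", "Entry 3", "Entry 4"]

changes = compare_scrapes(yesterday, today)
print(changes)
```

This treats entries as exact strings, so a renamed or re-spelled entry shows up as one disappearance plus one new entry.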