Scraping the Olympics
Paul Bradshaw
Presentation for a workshop at the BBC Data Journalism Day, July 2012
Scraping the Olympics
1.
Scraping the Olympics
Paul Bradshaw, author: Scraping for Journalists
Leanpub.com/scrapingforjournalists
2.
Scraping basics
Combining data
Finding stories in data
3.
[No text on slide]
4.
Function (Parameters)
5.
Function (Parameters)
=SUM(A2:A50)
=AVERAGE(B2:B300)
=COUNTIF(A10:A3000,"Smith")
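The same function(parameters) pattern carries over from spreadsheets to code. As a rough sketch in Python, with made-up values standing in for the cell ranges on the slide (the variable names are illustrative, not from the deck):

```python
from statistics import mean

# Hypothetical column values standing in for spreadsheet ranges.
column_a = [12, 7, 30]
column_b = [4.0, 6.0, 8.0]
surnames = ["Smith", "Jones", "Smith", "Patel"]

total = sum(column_a)              # like =SUM(A2:A50)
average = mean(column_b)           # like =AVERAGE(B2:B300)
smiths = surnames.count("Smith")   # like =COUNTIF(A10:A3000,"Smith")

print(total, average, smiths)
```

In each case the function name says what to do and the parameters in brackets say what to do it to.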
6.
("string", index)
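The slide shows two kinds of parameter: a string in quotation marks and a numeric index. Spreadsheet import functions such as Google Sheets' IMPORTHTML take arguments of this shape, though whether that is the function meant here is an assumption. A minimal Python sketch of the same idea, using a hypothetical helper `nth_table` and a made-up page, might look like this:

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect the cell text of every <table> in a page, table by table."""

    def __init__(self):
        super().__init__()
        self.tables = []       # each table is a list of rows
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self.tables[-1][-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.tables[-1][-1][-1] += data.strip()


def nth_table(html, index):
    """Return the rows of the index-th table, counting from 1."""
    parser = TableCollector()
    parser.feed(html)
    return parser.tables[index - 1]


# Made-up page containing two tables; we want the second one.
page = """
<table><tr><td>ignore</td></tr></table>
<table><tr><th>Name</th><th>Town</th></tr>
<tr><td>A. Runner</td><td>Poole</td></tr></table>
"""
rows = nth_table(page, 2)
print(rows)
```

The sketch does not handle nested tables; it is only meant to show a string parameter and an index parameter working together.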
7.
Tip: search for documentation
8.
Tip: search for structure around data
9.
[No text on slide]
10.
//div[starts-with(@class, 'jobWrap')]
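This expression selects every div, at any depth, whose class attribute begins with "jobWrap". Python's standard-library ElementTree supports only a subset of XPath and has no starts-with(), so the sketch below applies the same test in plain Python; with the third-party lxml library the expression could instead be passed to `.xpath()`. The markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup; real pages usually need an HTML parser.
page = """
<html><body>
  <div class="jobWrap featured"><a href="/job/1">Reporter</a></div>
  <div class="jobWrapAlt"><a href="/job/2">Editor</a></div>
  <div class="sidebar"><a href="/about">About</a></div>
</body></html>
"""

root = ET.fromstring(page)

# Equivalent of //div[starts-with(@class, 'jobWrap')]:
# every <div> at any depth whose class attribute begins with "jobWrap".
matches = [
    div for div in root.iter("div")
    if div.get("class", "").startswith("jobWrap")
]

titles = [div.find("a").text for div in matches]
print(titles)
```

Matching on the start of the class name catches variants such as "jobWrap featured" that an exact match would miss.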
11.
[No text on slide]
12.
Combining data
13.
Question: Which torchbearers are from Dorset?
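One way to answer a question like this is to join the scraped list to a second dataset that supplies the missing column. The sketch below assumes the scrape gives each torchbearer's home town and that a separate town-to-county lookup is available; all names and rows are hypothetical, not data from the deck.

```python
# Hypothetical scraped rows: torchbearer name and home town.
torchbearers = [
    {"name": "A. Runner", "town": "Poole"},
    {"name": "B. Walker", "town": "Leeds"},
    {"name": "C. Carrier", "town": "Weymouth"},
]

# Hypothetical second dataset: which county each town belongs to.
town_to_county = {
    "Poole": "Dorset",
    "Weymouth": "Dorset",
    "Leeds": "West Yorkshire",
}

# Join the two on the town column, then filter on the county.
combined = [
    {**person, "county": town_to_county.get(person["town"], "unknown")}
    for person in torchbearers
]
from_dorset = [p["name"] for p in combined if p["county"] == "Dorset"]
print(from_dorset)
```

Towns missing from the lookup are marked "unknown" rather than dropped, so gaps in the second dataset stay visible.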
14-21.
[No text on these slides]
22.
Finding leads: Corporate torchbearers?
23-26.
[No text on these slides]
27.
New entries, or disappearing ones
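If the same page is scraped more than once, comparing the runs shows what has been added or removed. A minimal sketch using Python sets, with hypothetical names in place of real scraped entries:

```python
# Hypothetical results of scraping the same list on two different days.
earlier_scrape = {"A. Runner", "B. Walker", "C. Carrier"}
later_scrape = {"A. Runner", "C. Carrier", "D. Sprinter"}

# In the later scrape but not the earlier one: new entries.
new_entries = later_scrape - earlier_scrape

# In the earlier scrape but not the later one: entries that vanished.
disappeared = earlier_scrape - later_scrape

print(sorted(new_entries))
print(sorted(disappeared))
```

This only works if each entry has a stable identifier; names alone may collide or change spelling between scrapes.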
28-31.
[No text on these slides]
32.
Leanpub.com/scrapingforjournalists
@paulbradshaw
onlinejournalismblog.com
helpmeinvestigate.com
slideshare.net/onlinejournalist
linkedin.com/in/onlinejournalist