Slides from my talk on web scraping at BrisJS, the Brisbane JavaScript meetup.
You can find the code on GitHub: https://github.com/ashleydavis/brisjs-web-scraping-talk
7. Web scraping is a horrible idea
- The scripts are tightly coupled to the HTML
- The scripts are fragile and prone to breaking
- Identifying HTML elements to extract is messy work
- Legal gray area
- You could be blocked from the web site
8. Sometimes web scraping is all we have
- The data isn't accessible any other way
- We still need the data
13. Production issues...
Performance
- Cache the Nightmare object / batch requests
- Disable image download
Debugging
- Show the Electron window
- Enable devtools
- Handle errors from Nightmare
- Display logging from the headless browser
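
The performance and debugging points above can be sketched roughly as follows. This is a minimal sketch, not the code from the talk: the function names (`buildNightmareOptions`, `createBrowser`, `scrapeAll`) and the `debug` flag are hypothetical, and the Nightmare constructor is passed in as a parameter so the dependency isn't hard-coded. It assumes Nightmare's `show`, `openDevTools` and `webPreferences` options, with `images: false` being the Electron setting that turns off image loading.

```javascript
// Build Nightmare constructor options from a hypothetical `debug` flag.
function buildNightmareOptions(debug) {
    const options = {
        show: debug,                        // Show the Electron window while debugging.
        webPreferences: { images: false },  // Electron setting: don't load images.
    };
    if (debug) {
        options.openDevTools = { mode: "detach" }; // Open devtools in its own window.
    }
    return options;
}

// Create a single Nightmare instance and forward the headless browser's logging.
// `Nightmare` is the constructor exported by the nightmare package.
function createBrowser(Nightmare, debug) {
    const nightmare = new Nightmare(buildNightmareOptions(debug));
    // console.log and friends called inside the page arrive as 'console' events.
    nightmare.on("console", (type, ...args) => {
        console.log("[browser " + type + "]", ...args);
    });
    // Uncaught JavaScript errors in the page arrive as 'page' events.
    nightmare.on("page", (type, message, stack) => {
        if (type === "error") {
            console.error("[page error]", message, stack);
        }
    });
    return nightmare;
}

// Scrape a batch of pages one after another, reusing the same instance
// rather than starting a new Electron process for every page.
async function scrapeAll(nightmare, urls, selector) {
    const results = [];
    try {
        for (const url of urls) {
            const text = await nightmare
                .goto(url)
                .wait(selector)
                .evaluate(sel => document.querySelector(sel).innerText, selector);
            results.push({ url, text });
        }
    } catch (err) {
        // Handle errors from Nightmare (navigation failures, timeouts, ...).
        console.error("Scraping failed:", err);
        throw err;
    } finally {
        await nightmare.end(); // Shut down the Electron process.
    }
    return results;
}
```

Usage would look something like `scrapeAll(createBrowser(require("nightmare"), true), urls, "h1")`, where the URL list and the `"h1"` selector are placeholders for whatever you are scraping.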
14. Resources
- Code
  - github.com/ashleydavis/brisjs-web-scraping-talk
- Contact
  - Email: ashley@codecapers.com.au
  - Twitter: @ashleydavis75
  - GitHub:
    - ashleydavis
    - data-forge
- Data Wrangling with JavaScript
  - datawranglingwithjavascript.com
- The Data Wrangler
  - the-data-wrangler.com
My book