3. What is Web Scraping?
“Web scraping, web harvesting, or web data extraction is data
scraping used for extracting data from websites”
https://en.wikipedia.org/wiki/Web_scraping
4. Throw Your Hands in The Air Like You Just Don't Care
Who has web scraped?
Why did you web scrape?
5. Reasons
● Poor (or No) API (Application Programming Interface)
○ “Where there is a will, there is a way”
● Poor design
○ Non-responsive
○ Cluttered
○ E.g. https://whereisthemyciti.com/
● Curiosity
○ Python
○ Web-scraping
○ New skill
28. Summary
1. Identify the page structure and its interesting components, e.g. <table>, ids, classes
2. Identify how to reach the data, e.g. URLs
3. ‘Scrape’ the data with code
4. Profit??
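As a rough illustration of steps 1 to 3, here is a minimal sketch using only Python's standard library. The HTML snippet, the table id "timetable", and the names TableScraper and scrape_table are all made up for this example; a real scraper would first download the page from the URL found in step 2, and many people would reach for a third-party parsing library instead.

```python
from html.parser import HTMLParser

# Hypothetical page content. A real scraper would fetch this from a URL (step 2).
SAMPLE_HTML = """
<html><body>
<table id="timetable">
  <tr><th>Route</th><th>Departs</th></tr>
  <tr><td>101</td><td>08:15</td></tr>
  <tr><td>102</td><td>08:40</td></tr>
</table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collect the rows of the <table> whose id matches target_id (step 1)."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.target_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])        # start a new row
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self.rows[-1].append("")    # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a cell of the target table.
        if self.in_table and self.in_cell:
            self.rows[-1][-1] += data.strip()


def scrape_table(html, target_id):
    """Step 3: run the parser over the HTML and return the rows as lists of strings."""
    parser = TableScraper(target_id)
    parser.feed(html)
    return parser.rows


rows = scrape_table(SAMPLE_HTML, "timetable")
for row in rows:
    print(row)
```

The sketch assumes one simple, non-nested table; real pages usually need more care.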
29. Summary - Web Scraping Sometimes #2
● Inconsistent
○ Structure can change
● Code can be messy
● Lots of data manipulation
○ Paying for a well-maintained API is better than headaches
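One way to soften the "structure can change" problem is to check that scraped data still has the shape you expect before using it. The sketch below is only an illustration: the column names and the function rows_to_records are hypothetical, and it assumes the scraper has already produced rows as lists of strings with the header row first.

```python
# Hypothetical column names the scraper was written against.
EXPECTED_HEADER = ["Route", "Departs"]


def rows_to_records(rows, expected_header=EXPECTED_HEADER):
    """Turn scraped rows into dicts, failing loudly if the page layout changed."""
    if not rows or rows[0] != expected_header:
        found = rows[0] if rows else None
        raise ValueError(f"unexpected table header: {found}")
    records = []
    for row in rows[1:]:
        if len(row) != len(expected_header):
            raise ValueError(f"unexpected row shape: {row}")
        records.append(dict(zip(expected_header, row)))
    return records


scraped = [["Route", "Departs"], ["101", "08:15"], ["102", "08:40"]]
records = rows_to_records(scraped)
print(records)
```

Failing early like this does not remove the maintenance burden, but it turns a silent change in the site into an obvious error.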