Web scraping in Python

Introduction to Web Scraping in Python

Published in: Internet

  1. Web Scraping with Python, by @sauravtom (work in progress …)
  2. Data scraping: an automated process. Specify a CSS or XML path, grab the content, and store it in a database.
  3. Who uses scrapers? Scrapers are a backbone of Big Data and matter in industry as well as in indie projects.
  4. Why choose Python? Robust, flexible, and powerful; relatively short development time; easy to learn and use; huge standard library, thorough documentation, and a helpful community.
  5. Scraping libraries in Python: lxml, BS4, Scrapy, Mechanize, twill, ...
  6. Scraper demonstration in bs4: inspect the element, find the node, plug it in (some code and pictures).
  7. Making scrapers faster: threads and queues (some code ...).
  8. Detecting bottlenecks: introduction to profiling in Python (some code).
  9. Making scrapers even faster: using memcache to reduce redundant scraping (some code).
  10. That's it! (links to the code present in these slides)
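
The slides' own code is not included in this transcript. As a rough sketch of the process described in slides 2 and 6 (specify a CSS path, grab the content, store it in a database), something like the following could work. The HTML snippet, the selector `ul#posts li.post a`, the `links` table, and the helper names `extract_links` and `store_links` are all assumptions made for illustration; a real scraper would download the page first and use a selector found by inspecting the element in the browser.

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical page source; a real scraper would download this first.
HTML = """
<html><body>
  <ul id="posts">
    <li class="post"><a href="/a">First post</a></li>
    <li class="post"><a href="/b">Second post</a></li>
  </ul>
</body></html>
"""


def extract_links(html, css_path="ul#posts li.post a"):
    """Return (title, href) pairs for every node matching the CSS path."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"]) for a in soup.select(css_path)]


def store_links(conn, rows):
    """Store the scraped pairs in a (hypothetical) 'links' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (title TEXT, href TEXT)")
    conn.executemany("INSERT INTO links (title, href) VALUES (?, ?)", rows)
    conn.commit()


# An in-memory SQLite database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
rows = extract_links(HTML)
store_links(conn, rows)
stored = conn.execute("SELECT title, href FROM links ORDER BY href").fetchall()
print(stored)
```

SQLite is used here only because it ships with Python; the slides do not say which database the author had in mind.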

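Slide 7 mentions threads and queues as a way to speed scrapers up. A minimal sketch of that pattern, using only the standard library, might look like this. The `fetch` function is a placeholder that fabricates a page body instead of touching the network, and `scrape_all`, `worker`, and the example URLs are hypothetical names chosen for this illustration, not the author's code.

```python
import queue
import threading


def fetch(url):
    # Placeholder for a real download (an assumption for this sketch).
    return "<html>%s</html>" % url


def worker(tasks, results):
    """Pull URLs off the task queue until a None sentinel arrives."""
    while True:
        url = tasks.get()
        if url is None:
            tasks.task_done()
            break
        try:
            results.put((url, fetch(url)))
        finally:
            tasks.task_done()


def scrape_all(urls, num_workers=4):
    """Fetch all URLs with a small pool of worker threads."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    tasks.join()
    for t in threads:
        t.join()
    pages = {}
    while not results.empty():
        url, body = results.get()
        pages[url] = body
    return pages


urls = ["http://example.com/%d" % i for i in range(3)]
pages = scrape_all(urls)
print(sorted(pages))
```

Threads help here because scraping is mostly waiting on the network; with a real `fetch`, several downloads can be in flight while one thread is blocked.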