What is a Crawler?
When you are hungry, you prefer to go to a
restaurant that can serve you delicious food of your interest.
Restaurant : Site/Base_url
Food : Data/Information
Interest : Relevant knowledge
How to make a Robot?
● DOM (Document Object Model)
● Library (depending upon language)
How to make one in Ruby? Use Hpricot
● Best suited to simple text extraction
● Clear API
● Fast and better than Rubyfulsoup
● Methods like parent, child, and sibling, as in JS,
make life easier
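To make the parent/child/sibling idea concrete, here is a minimal sketch. It assumes Hpricot may not be installed, so it swaps in REXML (shipped with the standard Ruby distribution) as a stand-in; the sample page, the constant MENU_PAGE, and the helper describe_node are made up for illustration and are not part of Hpricot.

```ruby
require "rexml/document"

# Hypothetical, well-formed sample page (REXML needs valid XML).
MENU_PAGE = <<~PAGE
  <html>
    <body>
      <h1>Site title</h1>
      <ul id="menu">
        <li>Home</li>
        <li>Blog</li>
        <li>About</li>
      </ul>
      <p>Footer text</p>
    </body>
  </html>
PAGE

# Hypothetical helper: finds the first node matching +xpath+ and
# reports the names of its parent, its children, and its siblings.
def describe_node(markup, xpath)
  doc  = REXML::Document.new(markup)
  node = REXML::XPath.first(doc, xpath)
  {
    parent:           node.parent.name,                   # enclosing element
    children:         node.elements.to_a.map(&:name),     # direct child elements
    previous_sibling: node.previous_element&.name,        # element just before
    next_sibling:     node.next_element&.name             # element just after
  }
end

p describe_node(MENU_PAGE, "//ul[@id='menu']")
```

Once you stand on one node, everything around it is one method call away. That is why these methods save so much effort while scraping.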
Is something missing?
What do you think?
Is it really easy?
What makes scraping fast and efficient?
Firebug integrates with Firefox to put a wealth of web
development tools at your fingertips while you browse.
You can edit, debug, and monitor CSS, HTML, and
JavaScript live in any web page.
● Firebug (http://www.getfirebug.com/)
● It makes life easier. Do learn to use it.
Enough...where is the code??
● Build: doc = Hpricot(open(url_name))
● To walk through the DOM: (doc/"#header")
● More: (doc/".love_class"), (doc/"a/ul/li")
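The three lookups above (by id, by class, by path) can be tried without a network connection. This is only a sketch: it uses REXML from the standard Ruby distribution as a stand-in for Hpricot, and the XPath strings are assumed equivalents of the CSS-style selectors on the slide. SELECTOR_PAGE and texts_at are hypothetical names, and the sample markup is invented.

```ruby
require "rexml/document"

# Hypothetical sample page containing the ids, classes and paths
# used on the slide.
SELECTOR_PAGE = <<~PAGE
  <html>
    <body>
      <div id="header">Welcome</div>
      <p class="love_class">Ruby</p>
      <p class="love_class">Scraping</p>
      <a href="/menu"><ul><li>One</li><li>Two</li></ul></a>
    </body>
  </html>
PAGE

# Hypothetical helper: returns the text of every node matching +xpath+.
def texts_at(markup, xpath)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, xpath).map(&:text)
end

# Assumed XPath counterparts of the slide's selectors:
p texts_at(SELECTOR_PAGE, "//*[@id='header']")         # like "#header"
p texts_at(SELECTOR_PAGE, "//*[@class='love_class']")  # like ".love_class"
p texts_at(SELECTOR_PAGE, "//a/ul/li")                 # like "a/ul/li"
```

With Hpricot itself the same searches would be written as on the slide. The point is just that each search returns a collection of matching nodes you can loop over.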