Introduction to a Parallel Distributed Crawler with Dynamic HTML Scraping

Introducing a parallel distributed crawler that supports dynamic HTML scraping.
Presented at Sapporo Ruby Kaigi 02 (札幌Ruby会議02).


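  1. Slide 1: Sinatra endpoint

         require 'rubygems'
         require 'sinatra'

         # Receive the scraped data for one page and reply with the next URL to visit.
         post '/' do
           url  = params[:url]    # page the browser has just scraped
           data = params[:data]   # scraped content, sent as a JSON string
           store(url, data)       # store and process are not defined on the slide
           next_url = process(url)
           next_url               # the block's return value becomes the response body
         end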
  2. Slide 2: Greasemonkey user script

         // ==UserScript==
         // @name      greasi_scraper
         // @namespace http://libelabo.jp/
         // @include   http://images.google.co.jp/*
         // @require   http://ajax.googleapis.com/ajax/libs/jquery/1.3.1/jquery.min.js
         // ==/UserScript==

         // POST the scraped data to the server, then navigate to the URL it returns.
         function postData(data) {
           // Form-encode the current page URL and the scraped data (as JSON).
           var postData = $.param({url: location.href, data: JSON.stringify(data)});
           GM_xmlhttpRequest({
             method: "POST",
             url: "http://libelabo.jp/greasi/",
             headers: {'Content-type': 'application/x-www-form-urlencoded'},
             data: postData,
             // The response body is the next URL to crawl.
             onload: function(xhr) { location.href = xhr.responseText; }
           });
         }
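
The Sinatra handler on slide 1 calls `store` and `process`, but the slides shown here do not define them. The following is a minimal sketch of what such helpers might look like, assuming an in-memory store and a shared URL queue. The names `PAGES`, `FRONTIER`, `SEEN`, `LOCK`, and `enqueue` are hypothetical, as is the choice to return an empty string once the queue is empty; none of this is taken from the original talk.

```ruby
require 'thread'

# Hypothetical in-memory stand-ins for the helpers used by the Sinatra handler.
PAGES    = {}          # url => scraped data (the JSON string posted by the browser)
FRONTIER = Queue.new   # URLs waiting to be visited; Queue is thread-safe
SEEN     = {}          # URLs that have already been queued
LOCK     = Mutex.new   # guards PAGES and SEEN

# Add a URL to the frontier unless it was queued before.
# Returns true if the URL was added, false if it was a duplicate.
def enqueue(url)
  LOCK.synchronize do
    return false if SEEN[url]
    SEEN[url] = true
  end
  FRONTIER << url
  true
end

# Save the data scraped from one page.
def store(url, data)
  LOCK.synchronize { PAGES[url] = data }
end

# Hand out the next URL to crawl. The argument (the page just scraped)
# is ignored in this sketch. Returns '' when nothing is left (an assumption).
def process(_url)
  FRONTIER.pop(true)   # non-blocking pop raises ThreadError on an empty queue
rescue ThreadError
  ''
end
```

With helpers along these lines, several browsers running the user script could post to the same server and each receive a different URL from the shared queue. The user script as shown does not check for an empty response, so a real client would need to handle that case.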
