Introduction to a Parallel Distributed Crawler with Dynamic HTML Scraping

Introducing a parallel distributed crawler that supports dynamic HTML scraping
Sapporo RubyKaigi 02

  1. require 'rubygems'
     require 'sinatra'

     # Receives scraped data for one page and answers with the next URL to visit.
     # store and process are not defined on the slide.
     post '/' do
       url = params[:url]
       data = params[:data]
       store(url, data)
       next_url = process(url)
       next_url
     end

  2. // ==UserScript==
     // @name      greasi_scraper
     // @namespace http://libelabo.jp/
     // @include   http://images.google.co.jp/*
     // @require   http://ajax.googleapis.com/ajax/libs/jquery/1.3.1/jquery.min.js
     // ==/UserScript==

     // Posts the scraped data together with the current page URL,
     // then navigates to the URL the server returns.
     function postData(data) {
       var postData = $.param({url: location.href, data: JSON.stringify(data)});
       GM_xmlhttpRequest({
         method: "POST",
         url: "http://libelabo.jp/greasi/",
         headers: {'Content-type': 'application/x-www-form-urlencoded'},
         data: postData,
         onload: function(xhr) { location.href = xhr.responseText; }
       });
     }
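
The slides call `store(url, data)` and `process(url)` without showing them. The following is only a minimal sketch of what such helpers might look like, assuming an in-memory queue shared by all connected browsers. The class name `CrawlState` and the methods `enqueue` and `result_for` are hypothetical and do not come from the slides.

```ruby
require 'set'

# Hypothetical helper (not from the slides): a thread-safe, in-memory
# URL queue plus a result store that the Sinatra handler could delegate to.
class CrawlState
  def initialize(seed_urls = [])
    @lock    = Mutex.new
    @pending = seed_urls.dup          # URLs waiting to be handed out
    @seen    = Set.new(seed_urls)     # every URL ever queued
    @results = {}                     # url => scraped data
  end

  # Record the data a browser scraped from +url+.
  def store(url, data)
    @lock.synchronize { @results[url] = data }
  end

  # Queue newly discovered URLs, skipping any that were queued before.
  def enqueue(urls)
    @lock.synchronize do
      urls.each { |u| @pending << u if @seen.add?(u) }
    end
  end

  # Hand out the next URL to visit; returns "" when the queue is empty.
  # The argument mirrors the slide's process(url) call and is unused here.
  def process(_url)
    @lock.synchronize { @pending.shift.to_s }
  end

  # Look up the stored data for +url+ (nil if nothing was stored).
  def result_for(url)
    @lock.synchronize { @results[url] }
  end
end
```

Under this assumption, the handler's `store` and `process` would forward to a single shared `CrawlState` instance, so each browser running the user script receives a different URL from the same queue. What the script should do when the queue is empty is left open here.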
