Introduction to a Parallel Distributed Crawler with Dynamic HTML Scraping


Introducing a parallel distributed crawler that supports dynamic HTML scraping
Sapporo RubyKaigi 02

Published in: Technology

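  1. Slide 1 (Ruby, Sinatra): the server endpoint that receives scraped data and replies with the next URL.

         require 'rubygems'
         require 'sinatra'

         # Each browser POSTs the URL it just scraped plus the extracted data.
         post '/' do
           url = params[:url]
           data = params[:data]
           store(url, data)          # store is not defined on the slide
           next_url = process(url)   # process is not defined on the slide
           next_url                  # the response body is the next URL to visit
         end

  2. Slide 2 (JavaScript, Greasemonkey user script): the browser-side scraper that posts its data and then navigates to the URL the server returns.

         // ==UserScript==
         // @name      greasi_scraper
         // @namespace http://libelabo.jp/
         // @include   http://images.google.co.jp/*
         // @require   http://ajax.googleapis.com/ajax/libs/jquery/1.3.1/jquery.min.js
         // ==/UserScript==

         function postData(data) {
           // Form-encode the current page URL and the scraped data as JSON.
           var postData = $.param({url: location.href, data: JSON.stringify(data)});
           GM_xmlhttpRequest({
             method: "POST",
             url: "http://libelabo.jp/greasi/",
             headers: {'Content-type':'application/x-www-form-urlencoded'},
             data: postData,
             // The response body is the next URL to scrape, so navigate to it.
             onload: function(xhr){ location.href = xhr.responseText }
           });
         }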
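
The slides call `store` and `process` without showing them. The following is only a minimal sketch of what such helpers might look like, assuming an in-memory queue of URLs shared by all browsers. The class name `CrawlState`, its methods' internals, and the `example.com` URLs are hypothetical and do not come from the talk.

```ruby
# Hypothetical in-memory stand-in for the store/process helpers
# that the Sinatra route on slide 1 calls but does not define.
class CrawlState
  attr_reader :results, :done

  def initialize(pending_urls)
    @pending = pending_urls.dup  # URLs still waiting to be handed out
    @done    = []                # URLs that a browser has reported back
    @results = {}                # scraped data keyed by URL
    @lock    = Mutex.new         # several browsers may post at once
  end

  # Keep the data a browser posted for the given URL.
  def store(url, data)
    @lock.synchronize { @results[url] = data }
  end

  # Mark url as finished and hand out the next pending URL.
  # Returns nil when nothing is left (an assumption of this sketch).
  def process(url)
    @lock.synchronize do
      @done << url
      @pending.shift
    end
  end
end
```

Under these assumptions, each browser that posts a result receives a different pending URL, which is one way the work could be spread across many browsers in parallel. A route like the one on slide 1 could delegate to a single shared `CrawlState` instance.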
