An introduction to ScraperWiki (for non-developers)

A simple introduction to ScraperWiki, aimed at non-developers

Usage Rights

CC Attribution License

    An introduction to ScraperWiki (for non-developers): Presentation Transcript

    • Summer School "Data journalism e visualizzazione grafica dei dati", 29 July 2011 – Flavon (TN). An introduction to ScraperWiki for non-developers. Maurizio Napolitano <napo@fbk.eu>
    • Description in the name: SCRAPER WIKI. Source: http://www.modot.org/central/major_projects/July2006photos.htm Source: http://www.commoncraft.com/video/wikis
    • Wiki, like Wikipedia. Scraper, like ???: a scraper extracts data from content.
    • Legal aspect: Scraper sites may violate copyright law. Even taking content from an open content site can be a copyright violation, if done in a way which does not respect the license. For instance, the GNU Free Documentation License (GFDL) and Creative Commons ShareAlike (CC-BY-SA) licenses require that a republisher inform readers of the license conditions, and give credit to the original author. Source: http://en.wikipedia.org/wiki/Scraper_site
    • ... then ScraperWiki is ... https://scraperwiki.com/ A place to share scrapers … and data :)
    • ScraperWiki legal aspect. Use: 6. You agree that, in using the ScraperWiki site and services, you will not interfere with the legal rights [...] Intellectual Property: 9. Subject to the following paragraphs, the source code of the ScraperWiki site, and all other copyrightable materials that form a part of it is released under the GNU Affero General Public License. 10. All scraping code hosted on the site is licensed under the GNU General Public License. You hereby license all scraping code you create using ScraperWiki under the same licence. 11. You agree to assert no additional intellectual property rights, including copyright and database right, in any scraped data other than those which subsisted in the relevant web sites before the running of the relevant scraper and which were held by you at that time. 12. You grant us a non-exclusive, worldwide, licence to use any data that you store on our site, for the purposes of administering the site. Source: https://scraperwiki.com/terms_and_conditions/
    • ScraperWiki legal aspect (in short). USE: 6. You agree [..] you will not interfere with the legal rights [...] INTELLECTUAL PROPERTY: 9. […] the source code of the ScraperWiki [..] is released under the GNU Affero General Public License. 10. All scraping code […] is licensed under the GNU General Public License. 11. You agree to assert no additional intellectual property rights [...] 12. You grant us a non-exclusive, worldwide, licence to use any data that you store on our site, for the purposes of administering the site.
    • HOW TO CREATE A SCRAPER?
    • The non-developers
    • The technical approach: http://unstats.un.org/unsd/demographic/products/socind/education.htm
    • Behind the page: HTML code
    • Where is the data? There is a structure behind it!!!
    • The algorithm!!! Download the web page. Read the information. Find the right position. Extract the data. Create a CSV file: data1;data2;data3 [...] dataN1;dataN2;dataN3
    • Example: Python code. https://scraperwiki.com/docs/python/python_intro_tutorial/
    • … and everything runs in the cloud!!!
    • The code in the cloud: https://scraperwiki.com/scrapers/mlb_rosters/
    • Sharing & ReUse
    • Enjoy!!! httpS://scraperwiki.com/
    • Thanks! An introduction to ScraperWiki for non-developers by Maurizio Napolitano <napo@fbk.eu> is licensed under a Creative Commons Attribution 3.0 Unported License. Created for Summer School "Data journalism e visualizzazione grafica dei dati", 29 July 2011 – Flavon (TN)
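
The algorithm slide (download the page, read it, find the right position, extract the data, create a CSV file) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the ScraperWiki tutorial: it uses the Python standard library rather than any ScraperWiki API, the `SAMPLE_HTML` page is a made-up stand-in for a downloaded page, and the names `TableParser` and `scrape_to_csv` are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical stand-in for a downloaded page. A real scraper would fetch
# the HTML first, for example with urllib.request.urlopen(url).read().
SAMPLE_HTML = """
<html><body>
<table>
  <tr><td>data1</td><td>data2</td><td>data3</td></tr>
  <tr><td>dataN1</td><td>dataN2</td><td>dataN3</td></tr>
</table>
</body></html>
"""


class TableParser(HTMLParser):
    """Find table cells in the HTML structure and collect their text by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside <tr>
        self._cell = None  # text pieces of the open cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text inside a cell is kept; everything else is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_to_csv(html):
    """Extract every table row from html and return it as ';'-separated CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, delimiter=";", lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


print(scrape_to_csv(SAMPLE_HTML))
```

The semicolon delimiter mirrors the `data1;data2;data3` layout shown on the slide. Real pages are rarely this tidy, so the "find the right position" step usually needs extra conditions (for instance, checking a table's attributes) before cells are collected.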