Multilingual scraping from Dutch government data


  1. Multilingual Scraping from Open Dutch Government Data. Open Data Day Hackathon Ireland, DERI & 091 labs, Galway, 4 Dec 2010. Tobias Wunner
  2. Dutch open government data: 3 websites, same data, but multilingual
  3. Dutch Spending Data: JavaScript website; pixel graphic in PDF
  4. Dutch Spending Data: website; pixel graphic in PDF. DIFFICULT!
  5. Scrape multilingual concepts: 367 concepts (24 Excel files), arranged in a concept hierarchy. Example: “Long-term interest rate”@en / “Lange Rente”@nl has the super concept “International items”@en / “Internationale conjunctur”@nl
  6. References
     [1] Open Data Day Galway with results: http://www.opendataday.org/wiki/City_Events#Galway
     [2] Multilingual scraper for Dutch Government Data: http://scraperwiki.com/scrapers/cpbnl-multilingual-terminology/
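
The slides do not show the scraper code itself. As a rough illustration of the idea on the "Scrape multilingual concepts" slide, here is a minimal Python sketch of one way to pair English and Dutch labels and record the super concept relation. Everything in it is an assumption made for illustration: the `Concept` class, the `build_concepts` function, and the premise that both language versions list the same concepts in the same order with indentation levels encoding the hierarchy. Reading the actual Excel files is left out; the rows are given inline.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """Hypothetical container: language-tagged labels plus an optional super concept."""
    labels: dict = field(default_factory=dict)  # e.g. {"en": "...", "nl": "..."}
    parent: Optional["Concept"] = None

    def tagged(self, lang):
        # Render a label in the "text"@lang notation used on the slides.
        return f'"{self.labels[lang]}"@{lang}'


def build_concepts(rows_en, rows_nl):
    """Align two tables of (indent_level, label) rows into linked concepts.

    Assumption: both tables describe the same concepts in the same order,
    and a deeper indent level means "child of the nearest shallower row".
    """
    if len(rows_en) != len(rows_nl):
        raise ValueError("language versions differ in length")
    concepts = []
    ancestors = []  # stack of (level, concept) for the current branch
    for (level, en), (level_nl, nl) in zip(rows_en, rows_nl):
        if level != level_nl:
            raise ValueError("language versions differ in structure")
        # Drop ancestors that are not shallower than the current row.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        parent = ancestors[-1][1] if ancestors else None
        concept = Concept({"en": en, "nl": nl}, parent)
        concepts.append(concept)
        ancestors.append((level, concept))
    return concepts


# Example rows taken from the slide; the indent levels are assumed.
rows_en = [(0, "International items"), (1, "Long-term interest rate")]
rows_nl = [(0, "Internationale conjunctur"), (1, "Lange Rente")]

concepts = build_concepts(rows_en, rows_nl)
child = concepts[1]
print(child.tagged("en"), "has super concept", child.parent.tagged("en"))
```

Matching rows by position is the simplest choice when the sources mirror each other across languages; if the real files differ in layout, a shared key such as a row code would be a safer way to align them.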
