Piggy Bank


07-0113 - SWT Presentation 01
  1. PIGGY BANK: Experience the Semantic Web Inside Your Web Browser. Qurat ul Ain, 07-0113
  2. Road Map
     - Vision behind Piggy Bank
     - Piggy Bank Introduction
     - Screen Scrapers
     - Piggy Bank Approach
     - Semantic Bank
     - User Experiences
     - Demo: VacancyGuide Screen Scraper
     - Implementation
     - Architecture
     - Limitations and Recommendations
     - Conclusion
     - Acknowledgments
     - References
     2/2/2010 Piggy Bank
  3. Vision behind Piggy Bank
     - The Semantic Web envisions a Web wherein information is offered free of presentation, allowing effective exchange across web sites and across web pages.
     - But without substantial Semantic Web content, few tools will be written to consume it; without many such tools, there is little appeal in publishing Semantic Web content.
  4. Piggy Bank
     - A Mozilla Firefox extension.
     - Turns the browser into a mashup platform by letting users extract data from different web sites and mix them together.
     - Wherever Semantic Web content is not available, Piggy Bank can invoke screen scrapers to re-structure the information within web pages into Semantic Web format.
  5. Screen Scrapers
     - When a web page does not include the "pure" information itself, Piggy Bank needs to invoke software code to extract it. Each web page requires a different piece of code to perform this translation; that piece of code is called a "screen scraper".
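To make the idea concrete, here is a minimal sketch of what a screen scraper does: walk a page's markup and emit RDF-style (subject, predicate, object) triples. The HTML class names, the `urn:vacancy:` subjects, and the `example.org` predicate URIs are all invented for illustration; they are not Piggy Bank's actual vocabulary.

```python
from html.parser import HTMLParser

class VacancyScraper(HTMLParser):
    """Toy screen scraper: turns <span class="title">/<span class="city">
    pairs into RDF-style (subject, predicate, object) triples.
    All class names and URIs here are illustrative."""
    def __init__(self):
        super().__init__()
        self.triples = []
        self._field = None   # which property the next text node fills
        self._count = 0      # running id for each vacancy found

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("title", "city"):
            self._field = cls
            if cls == "title":
                self._count += 1

    def handle_data(self, data):
        if self._field:
            subj = f"urn:vacancy:{self._count}"
            pred = f"http://example.org/vocab#{self._field}"
            self.triples.append((subj, pred, data.strip()))
            self._field = None

html = ('<span class="title">Java Developer</span>'
        '<span class="city">Boston</span>')
scraper = VacancyScraper()
scraper.feed(html)
# scraper.triples now holds two triples describing vacancy 1
```

A real scraper would of course run against the live DOM and a site-specific layout, but the shape of the work is the same: presentation-bound markup in, presentation-free triples out.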
  6. Available Scrapers
     - FlickrPhoto Scraper
     - Generic Web Page Scraper
     - LinkedIn Scraper
     - Ontoworld Upcoming Events Scraper
     - Orkut Friends Scraper
     - Presidential Candidates 2008 Scraper
     - VacancyGuide Screen Scraper
  7. Approach
  8. Approach
     - Piggy Bank is a tool that lets Web users extract pure information items from within web pages and save them in Semantic Web format (RDF).
     - Piggy Bank then lets users make use of these items right inside the same web browser. These items, collected from different sites, can now be browsed, searched, sorted, and organized together, regardless of their origins and types.
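The payoff of a shared format is that items from unrelated sites become one collection. A small sketch, with invented item fields and site names, of how items from two different origins can be sorted and faceted together once they share a common shape:

```python
# Illustrative only: items collected from two different (fictional) sites.
# Once both are in a common RDF-like shape, origin no longer matters.
items = [
    {"origin": "flickr.example.com",    "type": "Photo",   "title": "Sunset"},
    {"origin": "vacancies.example.com", "type": "Vacancy", "title": "Analyst"},
    {"origin": "flickr.example.com",    "type": "Photo",   "title": "Harbor"},
]

# One sorted view across both sources, regardless of where items came from.
merged = sorted(items, key=lambda it: it["title"])

# Narrowing by type, in the spirit of faceted browsing.
photos = [it for it in merged if it["type"] == "Photo"]
```

This is the essence of the "mix them together" claim: sorting, searching, and faceting operate on the collected items, not on the source pages.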
  9. Semantic Bank
     - Extends the Semantic Web experience further.
     - A common repository of RDF to which a community of Piggy Bank users can contribute, to share the information they have collected.
  10. User Experiences
  11. Collecting Information (screenshot)
  12. Sharing Information (screenshot)
  13. Installation of Scrapers
     - Piggy Bank supports easy and safe installation of scrapers through the use of RDF.
     - To install a scraper, the user only needs to save its metadata into his/her Piggy Bank and then "activate" it.
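A rough sketch of what "save its metadata and activate it" could look like. The field names (`urlPattern`, `codeUrl`, `activated`) stand in for the RDF properties a scraper record would carry and are invented for illustration, as is the `vacancyguide.example` domain:

```python
import re

# Hypothetical scraper metadata record; field names are illustrative
# stand-ins for the RDF properties Piggy Bank would store.
scraper_metadata = {
    "name": "VacancyGuide Screen Scraper",
    "urlPattern": r"^http://www\.vacancyguide\.example/.*",
    "codeUrl": "http://example.org/scrapers/vacancyguide.js",
    "activated": False,
}

def activate(metadata):
    """Mark an installed scraper as active so it may run."""
    metadata["activated"] = True
    return metadata

def scraper_for(url, installed):
    """Return the first activated scraper whose URL pattern matches."""
    for meta in installed:
        if meta["activated"] and re.match(meta["urlPattern"], url):
            return meta
    return None

installed = [activate(scraper_metadata)]
match = scraper_for("http://www.vacancyguide.example/jobs/42", installed)
```

Keeping the code URL separate from the metadata is what makes installation "safe": the user reviews and saves a declarative record, and only activation permits the referenced code to run.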
  14. Demo: VacancyGuide Screen Scraper
  15. http://simile.mit.edu/piggy-bank/screencasts/apartments.swf
  16. Implementation
  17. Piggy Bank
     - Piggy Bank is implemented as an extension to the web browser rather than as a stand-alone application:
       - Users keep the rich and familiar user interface of the web browser.
       - Users experience the benefits of Semantic Web technologies without much cost.
       - Users can enjoy RDF-enabled sites while still improving their experience on scrapable sites.
  18.
     - Java was selected as Piggy Bank's core implementation language; it gave the developers the benefit of the Java-based RDF libraries as well as many other facilities, such as parsing RSS feeds.
     - They made the act of collecting information items as lightweight as possible: first, a status-bar icon indicates that a web page is scrapable; second, collecting takes a single click on that same icon.
  19.
     - Piggy Bank uses any combination of the following three methods for collection:
       - Links from the current web page to Web resources in other formats are retrieved and parsed into RDF.
       - Available XSL transformations are applied to the current web page's DOM (Document Object Model).
       - Available JavaScript code is run on the current web page's DOM, retrieving other web pages to process if necessary.
     - A servlet capable of generating a DHTML-based user interface lets users browse the collected information as web pages.
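The three collection methods above can be sketched as a simple dispatcher that tries each strategy in turn and pools the resulting triples. The handler functions here are placeholders returning dummy triples, and the `page` dictionary fields are invented for illustration; real Piggy Bank works against the browser's DOM:

```python
# Sketch of the three collection strategies, pooled into one result.
# All handlers are placeholders; field names are illustrative.
def collect(page):
    triples = []
    # 1. Follow links from the page to resources in other formats (RDF, RSS).
    for link in page.get("alternate_links", []):
        triples.extend(parse_linked_resource(link))
    # 2. Apply any available XSL transformation to the page's DOM.
    if page.get("xslt"):
        triples.extend(apply_xslt(page["xslt"], page["dom"]))
    # 3. Run any available JavaScript scraper against the DOM.
    if page.get("scraper"):
        triples.extend(run_scraper(page["scraper"], page["dom"]))
    return triples

def parse_linked_resource(link):        # placeholder parser
    return [(link, "rdf:type", "ex:LinkedData")]

def apply_xslt(xslt, dom):              # placeholder transform
    return [(dom, "ex:collectedVia", "xslt")]

def run_scraper(scraper, dom):          # placeholder scraper run
    return [(dom, "ex:collectedVia", "javascript")]

page = {"alternate_links": ["feed.rdf"], "scraper": "vacancy.js", "dom": "page1"}
triples = collect(page)
```

Because the methods all emit the same triple shape, a page can use any combination of them and the results merge into one collection.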
  20. Architecture
  21. (Architecture diagram)
  22. Semantic Bank
     - Semantic Bank shares a very similar architecture with the Java part of Piggy Bank; both make use of the same servlet to serve the faceted-browsing user interface.
     - It gives each of its subscribed members a separate profile for persisting data, while keeping another profile in which "published" information from all members is mixed together.
     - Semantic Bank listens for HTTP POSTs sent by a user's Piggy Bank to upload his/her data. All uploaded data goes into that user's profile on the Semantic Bank, and items marked as public are copied to the common profile.
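The profile logic described above can be sketched in a few lines. The class and field names here (`handle_post`, `public`, `title`) are invented for illustration, not Semantic Bank's actual API; the point is just the split between per-member storage and the shared "published" pool:

```python
# Sketch of Semantic Bank's upload handling: every uploaded item lands in
# the uploader's own profile, and items marked public are also copied
# into the common, mixed-together profile. Names are illustrative.
class SemanticBank:
    def __init__(self):
        self.profiles = {}   # per-member persistent stores
        self.common = []     # "published" information from all members

    def handle_post(self, user, items):
        """Process one HTTP POST body from a user's Piggy Bank."""
        profile = self.profiles.setdefault(user, [])
        for item in items:
            profile.append(item)
            if item.get("public"):
                self.common.append(item)

bank = SemanticBank()
bank.handle_post("alice", [
    {"title": "Sunset",      "public": True},
    {"title": "Draft notes", "public": False},
])
```

Keeping private items out of the common profile while still persisting them per user is what lets members use the bank both as personal backup and as a publishing channel.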
  23. Limitations and Recommendations
  24.
     - Needs more scrapers and more fresh, inventive ideas.
     - Needs to be adopted widely to make a real and significant change on the Web.
  25. Copyright Issues
     - How will copyright laws adapt to the nature of the information being redistributed?
     - Will scraper writers be held responsible for illegal use, on a massive scale, of the information their scrapers produce?
     - Is a Semantic Bank responsible for checking information published to it for copyright infringement?
  26.
     - To remain in control of their information, publishers might be expected to publish Semantic Web information themselves, eliminating the need for scraping their sites.
     - They might include copyright information in every item they publish and hold Piggy Bank and Semantic Bank responsible for keeping that information intact as the items are moved about.
     - In this way, the evolution of the Semantic Web need not be stopped.
  27. Conclusion
  28.
     - In adopting Piggy Bank, users immediately gain flexibility in how they use existing Web information, without ever leaving their familiar web browser.
     - Through Piggy Bank, as users consume Web information, they automatically produce Semantic Web information.
     - Through Semantic Bank, the information they have collected, once published, merges together smoothly, giving rise to higher-order patterns and structures. This is how the Semantic Web might emerge from the Web.
  29. Scraping the Web
     - Piggy Bank empowers Web users, giving them control over the information they encounter. Even where web sites do not publish Semantic Web information directly, users can still extract the data using scrapers.
     - It introduces a means for users to integrate information from multiple sources on the Web, at their own choosing.
  30. Information Wants to Be Free!
     - The system goes beyond collecting Semantic Web information: it also enables users to publish the collected information back out to the Web.
  31. Acknowledgments
  32. SIMILE Project
     - This work was conducted by the SIMILE Project, a collaborative effort between the MIT Libraries, the Haystack group at MIT CSAIL, and the World Wide Web Consortium.
  33. References
  34.
     - http://simile.mit.edu/
     - http://simile.mit.edu/wiki/Piggy_Bank
     - http://simile.mit.edu/papers/iswc05.pdf
     - http://simile.mit.edu/piggy-bank/screencasts/apartments.swf
  35. Thank you, folks!