Content Markup / Plazi



  1. Interactive Content Extraction (semi- and fully automated). Guido Sautter, KIT, guido.sautter@kit.edu. Workpackage 7: Biodiversity Literature Access and Data Mining. ViBRANT Virtual Biodiversity
  2. Who we are & what we do
       • Prof. Dr. Klemens Böhm
           • Head of Database & Information Systems Group
           • Computer Science Department @ KIT
           • Data Analytics, Citizen Science and Crowdsourcing, Query Processing in Databases
       • Guido Sautter
           • Researcher / PhD Student
           • Computer Science Department @ KIT
           • Semantic Markup of & Data Extraction from Legacy Documents, Interactive NLP, ... GoldenGATE
  3. What we will do in ViBRANT
       • Community-contributed bibliography [month 12]:
           • “Link as you browse!” (an unlinked reference? link it)
           • “Parse & correct as you browse!” (as you encounter the need)
       • Markup Modules [month 24]:
           • “Markup & correct as you browse!” (same idea as above)
           • “No time? Then note it!” (add note to public TODO list)
       • Advanced Search [month 35]:
           • “Browse, don’t google!” (generate rich in-text search links)
           • ... from marked / annotated parts of documents
  4. Engaging Citizen Scientists
       • Make existing parsing facilities for bibliographic references available via Scratchpads to help with the shared bibliography.
  5. Assisting Taxonomists in Scratchpads
       • Make parsing functionality for taxonomic names available.