Semantic data mining of literature

  • Led the way in distance education 40 years ago; now 200,000+ students, many of them mature, plus CPD. NLP expertise in our own group, and semantic web experts in KMi. Through the BBC, close involvement with popular science on radio and TV. Most relevant to this audience is another OU + NHM collaboration: iSpot.
  • We process text: extract key words and concepts, then format them into XML for export. We are not a scanning service and not an OCR service.
  • Data mining to look for patterns. These might be patterns of errors, e.g. the ae ligature in BCA. Context helps resolve problems like Homo -> Homa. Validate and populate with existing resources, so our approach is sustainable after ViBRANT completes.
  • Scratchpads in the first instance. But because we are using a modular approach and delivering the tools as web services, they could be accessed from any other biodiversity resource.
  • As you can see, our work package is BLAND, not ViBRANT. So back to David Morse for the discussion and your questions.
  • Semantic data mining of literature

    1. Semantic data mining of literature
       David (Dauvit) King, The Open University, [email_address]
       Workpackage 7: Biodiversity literature access and data mining
       ViBRANT Virtual Biodiversity
    2. Who we are & what we do
       • University specialising in distance education
       • Experience with:
           • natural language processing
           • semantic web
           • citizen science
       • Your place to share nature: iSpot
    3. What we will do in ViBRANT
       • Deliver automated and semi-automated data extraction and disambiguation from scanned literature
    4. How we are doing it
       • Apply data mining techniques to scanned literature
       • Determine context for disambiguation
       • Validate using existing biodiversity resources
    5. Who are our users & how will they engage?
       • Taxonomists and citizen scientists via Scratchpads…
       • and potentially anyone interested in biodiversity using a recognised resource
    6. As this is the last of the Biodiversity literature access and data mining (BLAND) lightning talks, back to David Morse for the WP7 discussion and your questions.
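
The validation step on slide 4 (checking extracted names against existing biodiversity resources, so that an OCR slip such as "Homa" resolves to "Homo") can be illustrated with a minimal Python sketch. This is not the ViBRANT implementation: the name list KNOWN_GENERA, the function validate_genus and the 0.7 cutoff are all assumptions made for illustration, and fuzzy string matching from the standard library's difflib module stands in for whatever validation the work package actually uses.

```python
import difflib

# Hypothetical reference list. In practice this would be drawn from an
# existing biodiversity name resource rather than hard-coded.
KNOWN_GENERA = ["Homo", "Pan", "Gorilla", "Pongo"]


def validate_genus(token, known=KNOWN_GENERA, cutoff=0.7):
    """Return the closest known genus for an OCR token, or None.

    An exact match is returned as-is. Otherwise the token is compared
    with every known name and the best match at or above `cutoff`
    (a similarity ratio between 0 and 1) is returned.
    """
    if token in known:
        return token
    matches = difflib.get_close_matches(token, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for token in ["Homo", "Homa", "Xyz"]:
        print(token, "->", validate_genus(token))
```

A token with no sufficiently close match returns None, which a semi-automated workflow could use to flag the name for a human to review rather than guessing.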
