SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online documents


  1. SemTechBiz 2012, San Francisco, June 4th 2012
     Domeo: a web-based tool for semantic annotation of online documents
     http://www.annotationframework.org/
     Paolo Ciccarese, PhD http://www.paolociccarese.info/ paolo.ciccarese@gmail.com
     Mass General Hospital, Harvard Medical School
  2. About Me
     • Assistant in Neurology at Mass General Hospital
     • Research faculty at Harvard Medical School
     • Author of 30+ scientific publications
     • Senior software and knowledge engineer
     • Member of W3C HCLS Interest Group
     • Co-chair of the W3C Open Annotation Community Group
     http://www.paolociccarese.info/ Paolo Ciccarese, PhD SemTechBiz 2012, June 4th 2012
  3. As (biomedical) scientists…
     • We deal with an increasing amount of digital resources: documents, images, videos, datasets, vocabularies, databases, software…
       – About 150-200 articles a week
       – 10 min/article ≈ 34 hours/week?
       – How can we manage it?
     http://www.ncbi.nlm.nih.gov/pubmed/
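The reading-load estimate on slide 3 can be checked with a quick calculation. This is a minimal sketch: the function name is hypothetical, and the inputs are the slide's own figures (150-200 articles a week, 10 minutes each).

```python
def weekly_reading_hours(articles_per_week: int, minutes_per_article: float) -> float:
    """Hours per week spent reading at a fixed time per article."""
    return articles_per_week * minutes_per_article / 60


# The slide's range of 150-200 articles at 10 minutes each.
low = weekly_reading_hours(150, 10)
high = weekly_reading_hours(200, 10)
print(f"{low:.1f} to {high:.1f} hours/week")
```

The upper end of the range comes close to the slide's ≈ 34 hours/week, which is most of a working week spent on reading alone.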
  4. … we commonly use annotation
     • We annotate prints, HTML and PDFs
     • We bookmark/tag web pages…
     • … and publications (citations/references)
     • We comment on web pages, blogs, forums and emails
     • We tweet…
     • …
  5. Are we efficient and effective?
     • Can we integrate our annotations?
     • Can we leverage machine computation?
     • Can we share it easily with our colleagues?
     • Can we capitalize on the work of colleagues?
     • Can we integrate it with other resources?
     • Can we easily observe science evolution?
     • Can we easily detect the up-to-date science?
     • Can we discover valuable resources?
  6. A ‘semantic’ view of a publication
     Semantic Web Applications in Neuromedicine (SWAN) project [2007]
     Diagram labels: classic publication; scientific discourse; ‘semantic’ representation
     http://tinyurl.com/cgyna2m
  7. graph representation
  8. SWAN Creation/Curation Process
  9. How do we empower ‘Joe Scientist’?
     • Even simple linking tasks are not ‘standardized’, hard to share and not easy to perform
     http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850
     antibodyregistry.org
  10. Enable manual annotation of digital resources
      • Visually and effectively annotate (or better, semantically annotate) any digital resource and resource fragment while performing our regular browsing/reading activities
      http://www.ncbi.nlm.nih.gov/pubmed/19822029 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/ ≈
  11. Leverage text mining and community curation
      • Run text mining and entity recognition algorithms on scientific documents and persist the results in a standard format
      • Benefit from crowdsourcing by supporting curation of manual and automatic annotation
  12. Enable semantic tagging (ontologies)
      http://purl.obolibrary.org/obo/PR_000004168
      Label: ‘amyloid beta A4 protein’
      Exact synonyms: ‘APP’, ‘amyloidogenic glycoprotein’, …
      Related synonyms: ‘A4’, ‘ABPP’
      Is a: http://purl.obolibrary.org/obo/PR_000000001
        Label: ‘protein’
        Definition: ‘An amino acid chain that…’
      Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO
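The ontology term on slide 12 can be pictured as a small record with a URI, a label, synonyms and a parent. The sketch below is illustrative only: the `OntologyTerm` class and its `matches` method are hypothetical names, not part of Domeo or the Protein Ontology tooling, and only the values shown on the slide are used.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OntologyTerm:
    """Hypothetical in-memory view of an ontology term."""
    uri: str
    label: str
    exact_synonyms: List[str] = field(default_factory=list)
    related_synonyms: List[str] = field(default_factory=list)
    is_a: Optional["OntologyTerm"] = None

    def matches(self, text: str) -> bool:
        """True if text equals the label or any synonym, ignoring case."""
        names = [self.label, *self.exact_synonyms, *self.related_synonyms]
        return text.lower() in (name.lower() for name in names)


# Values copied from the slide; the synonym lists there are truncated.
protein = OntologyTerm(
    uri="http://purl.obolibrary.org/obo/PR_000000001",
    label="protein",
)
app = OntologyTerm(
    uri="http://purl.obolibrary.org/obo/PR_000004168",
    label="amyloid beta A4 protein",
    exact_synonyms=["APP", "amyloidogenic glycoprotein"],
    related_synonyms=["A4", "ABPP"],
    is_a=protein,
)
```

Tagging a text span with `app.uri` rather than the string "APP" is what makes the tag semantic: every synonym resolves to the same identifier, and the `is_a` link lets a search for "protein" reach it.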
  13. APPs for the Semantic Resources Project, May 2010
  14. Zooming in
      APPs for the Semantic Resources Project, May 2010
  15. …and more
      • Share the annotation in a common format
      • Efficiently search (inference, rules) the annotation
      • Reuse/integrate the annotation
      • Exercise access control
      • Subscribe to feeds related to topics of interest
        – Proteins, Cells, Authors, Papers…
      • Retrieve additional content (mashups)
      • Find new resources
      • Find collaborators
  16. Annotation Ontology (AO)
      • OWL vocabulary for representing and sharing annotation of digital resources and their fragments
      • Not only for biomedicine!
      Ciccarese et al., 2011. An open annotation ontology for science on web 3.0. http://www.jbiomedsem.com/content/2/S2/S4
      http://purl.org/ao/home (Website/Wiki)
  17. AO Overview
      AO allows annotating:
      • Resources: Documents (HTML, PDF, Word, Excel), Images, Databases, Web Services... (and their fragments)
      Specifying (or not) an:
      • Annotation Type: one of the already available types (errata, highlight, qualifiers...) or types that users define
      With (or without) a:
      • Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies…
      Tracing:
      • Provenance: who created what, when, with which software, with what expectations…
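The four parts listed in the AO overview (resource, annotation type, topic, provenance) can be sketched as a plain data structure. This is an assumption-laden illustration, not the AO vocabulary itself: the class names, field names and example.org URIs are hypothetical, and the optional fields mirror the slide's "(or not)" and "(or without)".

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Provenance:
    created_by: str        # who created the annotation
    created_on: datetime   # when
    created_with: str      # which software


@dataclass
class Annotation:
    resource: str                          # URI of the annotated resource
    provenance: Provenance
    fragment: Optional[str] = None         # exact text of a fragment, if any
    annotation_type: Optional[str] = None  # e.g. "highlight", "qualifier"
    topics: List[str] = field(default_factory=list)  # free text or URIs

    def summary(self) -> str:
        """One-line description of what is annotated, with what, by whom."""
        kind = self.annotation_type or "untyped annotation"
        where = (
            f'"{self.fragment}" in {self.resource}'
            if self.fragment
            else self.resource
        )
        about = ", ".join(self.topics) if self.topics else "no topic"
        return f"{kind} on {where} about {about} (by {self.provenance.created_by})"


# Hypothetical example: a qualifier on the text "APP" in some document.
joe = Provenance("http://example.org/people/joe", datetime(2012, 6, 4), "Domeo")
note = Annotation(
    resource="http://example.org/paper.html",
    provenance=joe,
    fragment="APP",
    annotation_type="qualifier",
    topics=["http://purl.obolibrary.org/obo/PR_000004168"],
)
print(note.summary())
```

Keeping provenance mandatory while type, topic and fragment stay optional follows the slide's wording, where only the tracing part carries no "or not".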
  18. Annotating a document
      AlzSWAN: http://tinyurl.com/18r
  19. Annotating a document fragment
      Protein Ontology – PRO: http://purl.org/obo/owl/PRO
  20. Diagram labels: HyQue triples; Experiments; Workflows
      SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/
  21. Annotation Ontology Network
      Diagram labels: Biotea; The Living Document Project
  22. Open Annotation Community Group
      • Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group
        – Website http://www.w3.org/community/openannotation/
        – Core Model http://www.openannotation.org/spec/core/
        – Extensions http://www.openannotation.org/spec/extension/
  23. • DOMEO Annotation Toolkit is a web application for producing and sharing manual, semi-automatic and automatic annotation
      Ciccarese et al., 2012. Open semantic annotation of scientific publications using DOMEO. http://www.jbiomedsem.com/content/3/S1/S1
      http://annotationframework.org
  24. DOMEO: Document Metadata Organizer
  25. Semantic Tags or Qualifiers [1]
  26. Semantic Tags or Qualifiers [2]
  27. Semantic Tags or Qualifiers [3]
  28. Domeo and the NCBO Annotator
      http://www.bioontology.org/annotator-service
      • Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by the BioPortal
  29. Running NCBO Annotator
      Additional text mining services will be listed here
  30. NCBO Annotator Results in Domeo
      List of recognized entities
  31. Results Curation
      Customizable
  32. Cumulative Results Curation
      • One item only
      • All instances with the same text match
      • All instances independently of the text match
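The three curation scopes on slide 32 can be sketched as a single function that applies one accept/reject decision to a growing set of text mining results. Everything here is hypothetical (the `Scope`, `Result` and `curate` names, and the data); in particular, reading "independently of the text match" as "every result tagged with the same term, whatever text was matched" is an interpretation of the slide, not confirmed behavior of Domeo.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Scope(Enum):
    ONE_ITEM = "one item only"
    SAME_TEXT = "all instances with the same text match"
    SAME_TERM = "all instances independently of the text match"


@dataclass
class Result:
    text: str                       # text matched in the document
    term: str                       # URI of the recognized entity
    accepted: Optional[bool] = None  # None means not yet curated


def curate(results: List[Result], index: int, accepted: bool, scope: Scope) -> int:
    """Apply one curation decision at the given scope; return how many results it touched."""
    chosen = results[index]
    touched = 0
    for i, result in enumerate(results):
        if scope is Scope.ONE_ITEM:
            hit = i == index
        elif scope is Scope.SAME_TEXT:
            hit = result.term == chosen.term and result.text == chosen.text
        else:  # Scope.SAME_TERM
            hit = result.term == chosen.term
        if hit:
            result.accepted = accepted
            touched += 1
    return touched


def sample() -> List[Result]:
    """Hypothetical annotator output: three mentions of one term, one of another."""
    app = "http://purl.obolibrary.org/obo/PR_000004168"
    other = "http://example.org/terms/other"
    return [
        Result("APP", app),
        Result("APP", app),
        Result("amyloid beta A4 protein", app),
        Result("tau", other),
    ]
```

With the sample data, rejecting the first result touches one, two or three results as the scope widens, which is the saving the cumulative options offer a curator facing many repeated matches.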
  33. Serialization in AO/RDF (Share)
  34. UIMA, Clerezza and AO
      http://www.slideshare.net/paolociccarese/domeo-and-text-mining
      Diagram labels: Text Mining; Text Mining Results (AO RDF); Curated Results (AO RDF); Publishing; Applications: Evaluating Performance, Comparing Algorithms, Learning, …
  35. SemTechBiz 2012, San Francisco, June 4th 2012
      Thank you!
      Paolo Ciccarese, PhD http://www.paolociccarese.info/ paolo.ciccarese@gmail.com
      Mass General Hospital, Harvard Medical School
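To give a feel for what sharing an annotation as AO/RDF might involve, the sketch below writes one annotation as N-Triples using only the standard library. It rests on several assumptions: the `ao:` and `pav:` property names are recalled from the AO and PAV vocabularies and should be checked against the published ontologies, the example.org URIs are hypothetical, and the actual serialization produced by Domeo is not shown in these slides.

```python
AO = "http://purl.org/ao/"      # Annotation Ontology namespace (assumed)
PAV = "http://purl.org/pav/"    # provenance vocabulary namespace (assumed)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Format one N-Triples statement; obj is a URI unless literal is True."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    else:
        rendered = f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered} ."


def serialize(annotation: str, resource: str, topic: str,
              creator: str, created_on: str) -> str:
    """Return a small set of triples describing one annotation."""
    return "\n".join([
        triple(annotation, RDF_TYPE, AO + "Annotation"),
        triple(annotation, AO + "annotatesResource", resource),  # property name assumed
        triple(annotation, AO + "hasTopic", topic),              # property name assumed
        triple(annotation, PAV + "createdBy", creator),          # property name assumed
        triple(annotation, PAV + "createdOn", created_on, literal=True),
    ])


# Hypothetical annotation and document URIs; the topic URI is from slide 12.
rdf = serialize(
    annotation="http://example.org/annotations/1",
    resource="http://example.org/paper.html",
    topic="http://purl.obolibrary.org/obo/PR_000004168",
    creator="http://example.org/people/joe",
    created_on="2012-06-04",
)
print(rdf)
```

A real serializer would use an RDF library, type the date literal, and describe the annotated fragment as well; the point here is only that resource, topic and provenance each become separate, linkable statements.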

Editor's Notes

  • The topic can be an antibody (NIF Antibody registry)