SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online documents


Slides for the SemTechBiz 2012 tutorial "Leveraging the Semantic Web with Drupal 7", in which Domeo serves as a use case.

Published in: Technology, Education
  • The topic can be an antibody (NIF Antibody registry)
  • Transcript of "SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online documents"

    1. Domeo: a web-based tool for semantic annotation of online documents
        SemTechBiz 2012, San Francisco, June 4th 2012
        http://www.annotationframework.org/
        Paolo Ciccarese, PhD
        http://www.paolociccarese.info/
        paolo.ciccarese@gmail.com
        Mass General Hospital, Harvard Medical School
    2. About Me
        • Assistant in Neurology at Mass General Hospital
        • Research faculty at Harvard Medical School
        • Author of 30+ scientific publications
        • Senior software and knowledge engineer
        • Member of W3C HCLS Interest Group
        • Co-chair of the W3C Open Annotation Community Group
        http://www.paolociccarese.info/
        Paolo Ciccarese, PhD, SemTechBiz 2012, June 4th 2012
    3. As (biomedical) scientists…
        • We deal with an increasing amount of digital resources: documents, images, videos, datasets, vocabularies, databases, software…
            – About 150-200 articles a week
            – 10 mins/article ≈ 34 hours/week?
            – How can we manage it?
        http://www.ncbi.nlm.nih.gov/pubmed/
    4. … we commonly use annotation
        • We annotate prints, HTML and PDFs
        • We bookmark/tag web pages…
        • … and publications (citations/references)
        • We comment on web pages, blogs, forums and emails
        • We tweet…
        • …
    5. Are we efficient and effective?
        • Can we integrate our annotations?
        • Can we leverage machine computation?
        • Can we share it easily with our colleagues?
        • Can we capitalize on the work of colleagues?
        • Can we integrate it with other resources?
        • Can we easily observe science evolution?
        • Can we easily detect the up-to-date science?
        • Can we discover valuable resources?
    6. A ‘semantic’ view of a publication
        Semantic Web Applications in Neuromedicine (SWAN) project [2007]
        [Figure labels: classic publication; scientific discourse; ‘semantic’ representation]
        http://tinyurl.com/cgyna2m
    7. graph representation
    8. SWAN Creation/Curation Process
    9. How do we empower ‘Joe Scientist’?
        • Even simple linking tasks are not ‘standardized’, hard to share and not easy to perform
        http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850
        antibodyregistry.org
    10. Enable manual annotation of digital resources
        • Visually and effectively annotate (or better, semantically annotate) any digital resource or resource fragment while performing our regular browsing/reading activities
        http://www.ncbi.nlm.nih.gov/pubmed/19822029
        http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/
    11. Leverage text mining and community curation
        • Run text mining and entity recognition algorithms on scientific documents and persist the results in a standard format
        • Benefit from crowdsourcing by supporting curation of manual and automatic annotation
    12. Enable semantic tagging (ontologies)
        http://purl.obolibrary.org/obo/PR_000004168
            Label: ‘amyloid beta A4 protein’
            Exact synonyms: ‘APP’, ‘amyloidogenic glycoprotein’, …
            Related synonyms: ‘A4’, ‘ABPP’
            Is a: http://purl.obolibrary.org/obo/PR_000000001
                Label: ‘protein’
                Definition: ‘An amino acid chain that…’
        Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO
    13. APPs for the Semantic Resources Project, May 2010
    14. Zooming in
        APPs for the Semantic Resources Project, May 2010
    15. …and more
        • Share the annotation in a common format
        • Efficiently search (inference, rules) the annotation
        • Reuse/integrate the annotation
        • Exercise access control
        • Subscribe to feeds related to topics of interest
            – Proteins, Cells, Authors, Papers…
        • Retrieve additional content (mashups)
        • Find new resources
        • Find collaborators
    16. Annotation Ontology (AO)
        • OWL vocabulary for representing and sharing annotation of digital resources and their fragments
        • Not only for biomedicine!
        Ciccarese et al., 2011. An open annotation ontology for science on web 3.0. http://www.jbiomedsem.com/content/2/S2/S4
        http://purl.org/ao/home (Website/Wiki)
    17. AO Overview
        AO supports annotating:
        • Resources: documents (HTML, PDF, Word, Excel), images, databases, web services... (and their fragments)
        Specifying (or not) an:
        • Annotation Type: one of the already available types (errata, highlight, qualifiers...) or types that users define
        With (or without) a:
        • Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies…
        Tracing:
        • Provenance: who created what, when, with which software, with what expectations…
    18. Annotating a document
        AlzSWAN: http://tinyurl.com/18r
    19. Annotating a document fragment
        Protein Ontology – PRO: http://purl.org/obo/owl/PRO
    20. [Figure labels: HyQue triples; Experiments; Workflows]
        SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/
    21. Annotation Ontology Network
        [Figure labels: Biotea; The Living Document Project]
    22. Open Annotation Community Group
        • Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group
            – Website: http://www.w3.org/community/openannotation/
            – Core Model: http://www.openannotation.org/spec/core/
            – Extensions: http://www.openannotation.org/spec/extension/
    23. DOMEO Annotation Toolkit
        • DOMEO Annotation Toolkit is a web application for producing and sharing manual, semi-automatic and automatic annotation
        Ciccarese et al., 2012. Open semantic annotation of scientific publications using DOMEO. http://www.jbiomedsem.com/content/3/S1/S1
        http://annotationframework.org
    24. DOMEO: Document Metadata Organizer
    25. Semantic Tags or Qualifiers [1]
    26. Semantic Tags or Qualifiers [2]
    27. Semantic Tags or Qualifiers [3]
    28. Domeo and the NCBO Annotator
        http://www.bioontology.org/annotator-service
        • Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by BioPortal
    29. Running NCBO Annotator
        [Callout: Additional text mining services will be listed here]
    30. NCBO Annotator Results in Domeo
        [Callout: List of recognized entities]
    31. Results Curation
        [Callout: Customizable]
    32. Cumulative Results Curation
        • One item only
        • All instances with the same text match
        • All instances independently of the text match
    33. Serialization in AO/RDF (Share)
    34. UIMA, Clerezza and AO
        http://www.slideshare.net/paolociccarese/domeo-and-text-mining
        [Figure labels: Text Mining Results (AO RDF); Curated Text Mining Results (AO RDF); Applications; Publishing; Evaluating Performance; Comparing Algorithms; Learning; …]
    35. Thank you!
        SemTechBiz 2012, San Francisco, June 4th 2012
        Paolo Ciccarese, PhD
        http://www.paolociccarese.info/
        paolo.ciccarese@gmail.com
        Mass General Hospital, Harvard Medical School
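
The AO Overview slide describes an annotation as a target resource (or a fragment of it), an optional annotation type, an optional topic, and provenance. A minimal Python sketch of that shape follows. The class and field names (`Annotation`, `Provenance`, `fragment`, and so on) are illustrative assumptions, not terms of the AO vocabulary. The two URIs are taken from the slides; the fragment text, creator, and tool values are placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Provenance:
    """Who created the annotation, with which software, and when."""
    created_by: str
    created_with: str
    created_on: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Annotation:
    """Hypothetical stand-in for an AO annotation; field names are assumptions."""
    resource: str                          # URI of the annotated document
    provenance: Provenance
    fragment: Optional[str] = None         # selected text, if a fragment is targeted
    annotation_type: Optional[str] = None  # e.g. "qualifier", "highlight", "erratum"
    topic: Optional[str] = None            # free text or an ontology term URI

    def targets_fragment(self) -> bool:
        """True when the annotation points at part of the resource."""
        return self.fragment is not None


# A qualifier tagging a (placeholder) text span with a Protein Ontology term.
qualifier = Annotation(
    resource="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/",
    provenance=Provenance(created_by="example-user", created_with="example-tool"),
    fragment="APP",
    annotation_type="qualifier",
    topic="http://purl.obolibrary.org/obo/PR_000004168",
)
```

Type, topic, and fragment are all optional here because the slide says each can be specified "or not"; only the target resource and provenance are required in this sketch.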

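
The Cumulative Results Curation slide lists three scopes for applying a curation decision to text mining results: one item only, all instances with the same text match, and all instances independently of the text match. The sketch below is one possible reading of those scopes, assuming the third means every result tagged with the same ontology term. The `Result` record, `Scope` enum, and `select` function are hypothetical names, not Domeo's API, and the sample offsets are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Scope(Enum):
    ONE_ITEM = "one item only"
    SAME_TEXT = "all instances with the same text match"
    ANY_TEXT = "all instances independently of the text match"


@dataclass(frozen=True)
class Result:
    """One recognized entity: position, matched text, and assigned term."""
    offset: int
    text: str
    term: str


def select(results: List[Result], picked: Result, scope: Scope) -> List[Result]:
    """Return the results that a curation decision on `picked` applies to."""
    if scope is Scope.ONE_ITEM:
        return [r for r in results if r == picked]
    # Both remaining scopes are limited to results carrying the same term.
    same_term = [r for r in results if r.term == picked.term]
    if scope is Scope.SAME_TEXT:
        return [r for r in same_term if r.text == picked.text]
    return same_term


results = [
    Result(10, "APP", "PR_000004168"),
    Result(52, "APP", "PR_000004168"),
    Result(90, "amyloid beta A4 protein", "PR_000004168"),
    Result(130, "protein", "PR_000000001"),
]
```

With this sample, a decision on the first `APP` result reaches one, two, or three results depending on the scope, which is the widening the three bullets describe.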