SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online documents

Slides for the SemTechBiz 2012 tutorial "Leveraging the Semantic Web with Drupal 7", in which Domeo is used as a use case.

  • The topic can be an antibody (NIF Antibody registry)

Presentation Transcript

  • SemTechBiz 2012, San Francisco, June 4th 2012 • Domeo: a web-based tool for semantic annotation of online documents http://www.annotationframework.org/ • Paolo Ciccarese, PhD http://www.paolociccarese.info/ paolo.ciccarese@gmail.com • Mass General Hospital, Harvard Medical School
  • About Me • Assistant in Neurology at Mass General Hospital • Research faculty at Harvard Medical School • Author of 30+ scientific publications • Senior software and knowledge engineer • Member of W3C HCLS Interest Group • Co-chair of the W3C Open Annotation Community Group http://www.paolociccarese.info/ Paolo Ciccarese, PhD SemTechBiz 2012, June 4th 2012
  • As (biomedical) scientists… • We deal with an increasing amount of digital resources: documents, images, videos, datasets, vocabularies, databases, software… – About 150-200 articles a week – 10 min/article ≈ 34 hours/week? – How can we manage it? http://www.ncbi.nlm.nih.gov/pubmed/
  • … we commonly use annotation • We annotate prints, HTML and PDFs • We bookmark/tag web pages… • … and publications (citations/references) • We comment on web pages, blogs, forums and emails • We tweet… • …
  • Are we efficient and effective? • Can we integrate our annotations? • Can we leverage machine computation? • Can we share them easily with our colleagues? • Can we capitalize on the work of colleagues? • Can we integrate them with other resources? • Can we easily observe science evolution? • Can we easily detect the up-to-date science? • Can we discover valuable resources?
  • A ‘semantic’ view of a publication • Semantic Web Applications in Neuromedicine (SWAN) project [2007] • [Figure labels: classic publication; scientific discourse; ‘semantic’ representation] http://tinyurl.com/cgyna2m
  • graph representation
  • SWAN Creation/Curation Process
  • How do we empower ‘Joe Scientist’? • Even simple linking tasks are not ‘standardized’, are hard to share and are not easy to perform http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850 antibodyregistry.org
  • Enable manual annotation of digital resources • Visually and effectively annotate (or, better, semantically annotate) any digital resource and resource fragment while performing our regular browsing/reading activities http://www.ncbi.nlm.nih.gov/pubmed/19822029 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/ ≈
  • Leverage text mining and community curation • Run text mining and entity recognition algorithms on scientific documents and persist the results in a standard format • Benefit from crowdsourcing by supporting curation of manual and automatic annotation
  • Enable semantic tagging (ontologies) • http://purl.obolibrary.org/obo/PR_000004168 – Label: ‘amyloid beta A4 protein’ – Exact synonyms: ‘APP’, ‘amyloidogenic glycoprotein’, … – Related synonyms: ‘A4’, ‘ABPP’ – Is a: http://purl.obolibrary.org/obo/PR_000000001 (Label: ‘protein’; Definition: ‘An amino acid chain that…’) • Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO
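
The term record on this slide can be sketched as a small data structure. This is a minimal illustration in Python, not Domeo's or PRO's actual data model: the class name `OntologyTerm`, its field names and the `names()` helper are hypothetical, and only values shown on the slide are filled in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OntologyTerm:
    """Hypothetical container for the fields shown on the slide."""
    uri: str
    label: str
    exact_synonyms: List[str] = field(default_factory=list)
    related_synonyms: List[str] = field(default_factory=list)
    is_a: Optional["OntologyTerm"] = None  # parent term, if known

    def names(self) -> List[str]:
        """All strings that may refer to this term in running text."""
        return [self.label, *self.exact_synonyms, *self.related_synonyms]


# Values copied from the slide; the slide elides further synonyms with "…".
protein = OntologyTerm(
    uri="http://purl.obolibrary.org/obo/PR_000000001",
    label="protein",
)
app = OntologyTerm(
    uri="http://purl.obolibrary.org/obo/PR_000004168",
    label="amyloid beta A4 protein",
    exact_synonyms=["APP", "amyloidogenic glycoprotein"],
    related_synonyms=["A4", "ABPP"],
    is_a=protein,
)
```

Tagging a text fragment with `app.uri` rather than with the string "APP" is what makes the tag semantic: every synonym resolves to the same identifier.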
  • APPs for the Semantic Resources Project, May 2010
  • Zooming in • APPs for the Semantic Resources Project, May 2010
  • …and more • Share the annotation in a common format • Efficiently search (inference, rules) the annotation • Reuse/integrate the annotation • Exercise access control • Subscribe to feeds related to topics of interest – Proteins, Cells, Authors, Papers… • Retrieve additional content (mashups) • Find new resources • Find collaborators
  • Annotation Ontology (AO) • OWL vocabulary for representing and sharing annotation of digital resources and their fragments • Not only for biomedicine! • Ciccarese et al., 2011, "An open annotation ontology for science on web 3.0", http://www.jbiomedsem.com/content/2/S2/S4 • http://purl.org/ao/home (Website/Wiki)
  • AO Overview • AO allows annotating Resources: documents (HTML, PDF, Word, Excel), images, databases, web services... (and their fragments) • Specifying (or not) an Annotation Type: one of the already available types (errata, highlight, qualifiers...) or types that users define • With (or without) a Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies… • Tracing Provenance: who created what, when, with which software, with what expectations…
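
The four facets listed in the overview (resource, optional annotation type, optional topic, provenance) can be sketched as a plain record. This is an illustrative Python sketch only, not the AO vocabulary itself: the class and field names are hypothetical, and the person URI, timestamp and software name are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    created_by: str    # who (URI of the annotator)
    created_on: str    # when (ISO 8601 timestamp)
    created_with: str  # which software


@dataclass(frozen=True)
class Annotation:
    resource: str                          # URI of the annotated resource
    provenance: Provenance                 # always traced
    fragment: Optional[str] = None         # annotated text, if not the whole resource
    annotation_type: Optional[str] = None  # e.g. "highlight", "qualifier"
    topic: Optional[str] = None            # free text or a URI

    def describe(self) -> str:
        """One-line summary listing only the facets that were specified."""
        parts = [f"resource={self.resource}"]
        if self.fragment is not None:
            parts.append(f"fragment={self.fragment!r}")
        if self.annotation_type is not None:
            parts.append(f"type={self.annotation_type}")
        if self.topic is not None:
            parts.append(f"topic={self.topic}")
        parts.append(f"by={self.provenance.created_by}")
        return "; ".join(parts)


# Placeholder provenance values (assumptions, not from the slides).
joe = Provenance(
    created_by="http://example.org/people/joe",
    created_on="2012-06-04T10:00:00Z",
    created_with="example-annotator",
)

# A qualifier on a (hypothetical) text fragment of an article cited in the slides.
qualifier = Annotation(
    resource="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/",
    provenance=joe,
    fragment="APP",
    annotation_type="qualifier",
    topic="http://purl.obolibrary.org/obo/PR_000004168",
)
print(qualifier.describe())
```

Making type and topic optional mirrors the "or not" and "or without" wording of the slide: a bare highlight and a fully qualified semantic tag share the same shape.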
  • Annotating a document • AlzSWAN: http://tinyurl.com/18r
  • Annotating a document fragment • Protein Ontology (PRO): http://purl.org/obo/owl/PRO
  • HyQue triples • Experiments • Workflows • SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/
  • Annotation Ontology Network • Biotea • The Living Document Project
  • Open Annotation Community Group • Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group – Website http://www.w3.org/community/openannotation/ – Core Model http://www.openannotation.org/spec/core/ – Extensions http://www.openannotation.org/spec/extension/
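
As a rough illustration of the Open Annotation idea, an annotation relates a body (what is said) to a target (what it is said about). The Python sketch below builds such a record as a JSON-LD-style dictionary. The `oa:Annotation`, `oa:hasBody` and `oa:hasTarget` terms come from the Open Annotation model, but the namespace URI used here is an assumption that differs between drafts and should be checked against the Core Model link above; `make_annotation` is a hypothetical helper.

```python
import json

# Assumed namespace; verify against the Core Model specification.
OA = "http://www.w3.org/ns/oa#"


def make_annotation(target_uri: str, body_uri: str) -> dict:
    """Relate a body (e.g. an ontology term) to a target (e.g. an article)."""
    return {
        "@type": OA + "Annotation",
        OA + "hasTarget": {"@id": target_uri},
        OA + "hasBody": {"@id": body_uri},
    }


annotation = make_annotation(
    target_uri="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/",
    body_uri="http://purl.obolibrary.org/obo/PR_000004168",
)
print(json.dumps(annotation, indent=2))
```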
  • DOMEO Annotation Toolkit is a web application for producing and sharing manual, semi-automatic and automatic annotation • Ciccarese et al., 2012, "Open semantic annotation of scientific publications using DOMEO", http://www.jbiomedsem.com/content/3/S1/S1 • http://annotationframework.org
  • DOMEO: Document Metadata Organizer
  • Semantic Tags or Qualifiers [1]
  • Semantic Tags or Qualifiers [2]
  • Semantic Tags or Qualifiers [3]
  • Domeo and the NCBO Annotator http://www.bioontology.org/annotator-service • Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by the BioPortal
  • Running NCBO Annotator • Additional text mining services will be listed here
  • NCBO Annotator Results in Domeo • List of recognized entities
  • Results Curation • Customizable
  • Cumulative Results Curation • One item only • All instances with the same text match • All instances independently of the text match
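
The three curation scopes on this slide can be sketched as follows. This is one reading of the slide, not Domeo's implementation: it assumes that "same text match" means the same term recognized from the same matched string, and that "independently of the text match" means every instance of the same term whatever string matched. All names (`Scope`, `Result`, `curate`, `sample`) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

APP_URI = "http://purl.obolibrary.org/obo/PR_000004168"
PROTEIN_URI = "http://purl.obolibrary.org/obo/PR_000000001"


class Scope(Enum):
    ONE_ITEM = 1   # only the selected result
    SAME_TEXT = 2  # same term, same matched text
    SAME_TERM = 3  # same term, any matched text


@dataclass
class Result:
    matched_text: str
    term_uri: str
    status: str = "pending"


def curate(results: List[Result], index: int, status: str, scope: Scope) -> int:
    """Apply a curation status to the chosen result and return how many changed."""
    chosen = results[index]
    changed = 0
    for i, result in enumerate(results):
        if scope is Scope.ONE_ITEM:
            hit = i == index
        elif scope is Scope.SAME_TEXT:
            hit = (result.term_uri == chosen.term_uri
                   and result.matched_text == chosen.matched_text)
        else:
            hit = result.term_uri == chosen.term_uri
        if hit:
            result.status = status
            changed += 1
    return changed


def sample() -> List[Result]:
    """A made-up list of text mining results for demonstration."""
    return [
        Result("APP", APP_URI),
        Result("amyloid beta A4 protein", APP_URI),
        Result("APP", APP_URI),
        Result("protein", PROTEIN_URI),
    ]


results = sample()
curate(results, 0, "rejected", Scope.SAME_TEXT)
print([r.status for r in results])
```

Under these assumptions, rejecting the first "APP" match with the `SAME_TEXT` scope also rejects the second "APP" match, while the match on the full label stays pending.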
  • Serialization in AO/RDF (Share)
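
To give a feel for what an RDF serialization of an annotation looks like, here is a minimal Python sketch that emits N-Triples using only the standard library. It is not Domeo's serializer. The AO namespace `http://purl.org/ao/` and the property names `annotatesResource` and `hasTopic` are written from memory of the AO publications and should be treated as assumptions to verify against http://purl.org/ao/home; the annotation URI is a made-up placeholder.

```python
# Assumed AO core namespace and term names; verify before relying on them.
AO = "http://purl.org/ao/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_ntriples(annotation_uri: str, resource_uri: str, topic_uri: str) -> str:
    """Serialize one annotation as three N-Triples statements."""
    triples = [
        (annotation_uri, RDF_TYPE, AO + "Annotation"),
        (annotation_uri, AO + "annotatesResource", resource_uri),
        (annotation_uri, AO + "hasTopic", topic_uri),
    ]
    # N-Triples: one statement per line, IRIs in angle brackets, ending with " ."
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


rdf = to_ntriples(
    annotation_uri="http://example.org/annotations/1",  # placeholder
    resource_uri="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2874257/",
    topic_uri="http://purl.obolibrary.org/obo/PR_000004168",
)
print(rdf)
```

A real serialization would also carry the provenance and the fragment selector; they are left out here to keep the sketch short.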
  • UIMA, Clerezza and AO • [Figure labels: Text Mining Results (AO RDF); Curated Text Mining Results (AO RDF); Applications: Evaluating Performance, Comparing Algorithms, Learning, …; Publishing] • http://www.slideshare.net/paolociccarese/domeo-and-text-mining
  • SemTechBiz 2012, San Francisco, June 4th 2012 • Thank you! • Paolo Ciccarese, PhD http://www.paolociccarese.info/ paolo.ciccarese@gmail.com • Mass General Hospital, Harvard Medical School