Isf vivo2013


A talk about the merger and refactoring of the eagle-i and VIVO ontologies presented by myself, Brian Lowe, Janos Hajagos, and Erich Bremer at the VIVO2013 conference in St. Louis

Usage Rights

CC Attribution License

  • The process of integrating the eagle-i and VIVO ontologies, refactoring them, and modularizing the ISF posed a set of interesting challenges and constraints.
  • There is a trade-off between aggregating by content coverage and a pattern-driven organization (for example, keeping certain types of axioms in one place, imports, etc.). For instance, the profile module needs to be generic. Vocabulary as information model: consider Person and social security number. The axiom that a person has a social security number is not an axiom that exists in a “dictionary”. This is the difference between informational and definitional axioms: informational axioms are about a subset of the entity, e.g. people are not defined by their social security number.
  • Using vCard to bring together two different representations. Both eagle-i and VIVO had representations of contact information, but most of it was done with data properties and string values, some of which were not structured. The move to the new vCard/FOAF representation imposes more structure and requires much more use of classes and object properties. The data migration is not yet done for the applications. The ISF now includes the general idea of a contact, which can take multiple forms: an agent can have a contact that is a FOAF profile (more web based) or a vCard (more standard). vCard is a well-established IETF standard for exchanging contact-related information, while FOAF is a commonly used RDF vocabulary for contact-like information that focuses on web presence rather than on physical addresses and communication, as vCard does. The ISF adopts both vCard and FOAF. vCard had an existing RDF mapping at the beginning of the project, but the W3C recently published a new RDF mapping for version 4 of vCard. That mapping is still in draft status, but we are moving to it for the final release of the ISF.
  • This shows the use cases for URIs that don’t fall under the typical OWL class/individual modeling of data. There is a need for an agreed-upon set of codes, concepts, types, etc. of things in addition to classes and individuals. It is also just another perspective on the domain: there is frequently a need to talk about a whole set (an OWL class) as if it were a single primitive thing (an instance), and SKOS is a formalization of this idea.
  • Here we have added the punning (if needed). This diagram shows that we make a distinction between the “ontology” on the left side and the “vocabulary” on the right. This distinction doesn’t mean that the sets of URIs on the two sides are disjoint: certain URIs might exist as classes in the ontology and as individuals in the vocabulary. This is the punning: the same URI has two different type assertions (class type vs. individual type). The “PhD degree” is an individual that can be referenced in a “position” instance to indicate that the position is related to PhD degrees in some way, but it doesn’t imply that there is a specific instance of a PhD degree that belongs to some agent related by the position. If an agent later obtains an actual instance of a PhD degree, a new URI will be created and asserted to be an instance of the “PhD degree” class from the ontology (the punning of the “PhD degree” URI).
  • Here is an ICD example. The concept scheme class stands for the vocabulary (MeSH or ICD9), and each entry is a SKOS concept: the concept ICD 327.3 exists in the ICD9 scheme. Next comes the notation, which is a value of an actual datatype (such as umls-aui); the concept is coded with that notation. SKOS gives you object properties to relate concepts: closeMatch and exactMatch. When the same AUI or CUI exists, we have an exact match (the *UI identifiers are LUI, SUI, CUI, and AUI). The idea is that by using skos:exactMatch or skos:closeMatch we can walk between ontologies and still relate back to the ISF.
  • The increasing complexity of the ontology merging process created more impetus to keep track of changes and to document and validate them. To this end, we developed a Protégé plugin that better supports this new process. When we were at the most detailed stage, we wanted to mark, for each class, whether its axioms had been migrated or not. Yellow meant reviewed; green meant axiom migration was complete.
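The punning described in the notes above can be sketched in a few lines of plain Python, using a set of tuples as a stand-in for an RDF graph. This is only an illustration under stated assumptions: every `ex:` identifier and the helper names (`types_of`, `instances_of`, `is_punned`) are hypothetical and are not taken from the ISF, and typing the vocabulary side as `skos:Concept` is one possible way to express "individual type".

```python
# Minimal sketch of punning: one URI is typed both as a class and as an individual.
# All "ex:" names are hypothetical; the graph is a plain set of (s, p, o) tuples.
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
SKOS_CONCEPT = "skos:Concept"

PHD = "ex:PhDDegree"  # the punned URI

triples = {
    (PHD, RDF_TYPE, OWL_CLASS),         # ontology side: PhD degree as a class
    (PHD, RDF_TYPE, SKOS_CONCEPT),      # vocabulary side: PhD degree as an individual
    ("ex:position42", RDF_TYPE, "ex:Position"),
    # The position points at the concept; no concrete degree is implied.
    ("ex:position42", "ex:relatedDegreeType", PHD),
    # Later, an agent obtains an actual degree: a new URI, instance of the class.
    ("ex:degree7", RDF_TYPE, PHD),
    ("ex:alice", "ex:holdsDegree", "ex:degree7"),
}


def types_of(node, graph):
    """Return every type asserted for node."""
    return {o for s, p, o in graph if s == node and p == RDF_TYPE}


def instances_of(cls, graph):
    """Return every node asserted to be an instance of cls."""
    return {s for s, p, o in graph if p == RDF_TYPE and o == cls}


def is_punned(node, graph):
    """True when node carries both a class type and an individual type."""
    node_types = types_of(node, graph)
    return OWL_CLASS in node_types and SKOS_CONCEPT in node_types
```

Here `is_punned(PHD, triples)` holds, while `instances_of(PHD, triples)` returns only the concrete degree URI, which mirrors the distinction between referencing the "PhD degree" concept and holding an actual degree.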

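The idea in the notes above of walking between vocabularies through SKOS mapping properties can be sketched as follows. This is an illustrative sketch, not the CTSAconnect implementation: the concept and scheme identifiers are made up, the graph is a plain list of tuples, and the function names are hypothetical. It relies on `skos:exactMatch` being symmetric and transitive, and on `skos:closeMatch` being symmetric but not transitive, so only exact matches are followed across several hops.

```python
from collections import deque

# Hypothetical concepts in three schemes; identifiers are invented for illustration.
EXACT = "skos:exactMatch"
CLOSE = "skos:closeMatch"
IN_SCHEME = "skos:inScheme"

triples = [
    ("ex:icd9-A", IN_SCHEME, "ex:ICD9CM"),
    ("ex:mesh-A", IN_SCHEME, "ex:MeSH"),
    ("ex:mesh-B", IN_SCHEME, "ex:MeSH"),
    ("ex:ncit-A", IN_SCHEME, "ex:NCIt"),
    ("ex:icd9-A", EXACT, "ex:mesh-A"),
    ("ex:mesh-A", EXACT, "ex:ncit-A"),
    ("ex:icd9-A", CLOSE, "ex:mesh-B"),
]


def neighbours(concept, predicate):
    """Concepts linked to concept by predicate, read in both directions (symmetric)."""
    linked = set()
    for s, p, o in triples:
        if p != predicate:
            continue
        if s == concept:
            linked.add(o)
        elif o == concept:
            linked.add(s)
    return linked


def exact_matches(concept):
    """Follow exactMatch links transitively (breadth-first), excluding the start."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, EXACT):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(concept)
    return seen


def scheme_of(concept):
    """Return the concept scheme a concept belongs to, or None if unknown."""
    for s, p, o in triples:
        if s == concept and p == IN_SCHEME:
            return o
    return None


def matches_in_scheme(concept, scheme):
    """Exact matches of concept that live in the given scheme."""
    return {c for c in exact_matches(concept) if scheme_of(c) == scheme}
```

Starting from the NCIt concept, `matches_in_scheme("ex:ncit-A", "ex:ICD9CM")` reaches the ICD9CM concept through the intermediate MeSH concept, while the close match `ex:mesh-B` is reachable only as a direct neighbour.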
Isf vivo2013 Presentation Transcript

  • Integrated Semantic Framework: launching the next generation VIVO ontology. Erich Bremer, Jon Corson-Rikert, Melissa Haendel, Janos Hajagos and Brian Lowe. @ontowonka
  • www.ctsaconnect.org CTSAconnect Reveal Connections. Realize Potential. People and Resources: techniques, training, protocols, affiliation, roles, grants, credentials, genes, anatomy, manufacturer, publications, disease
  • CTSAConnect Project: Connecting people and resources
    Needs:
      • Identify potential collaborators, relevant resources, and expertise across scientific disciplines
      • Assemble translational teams of scientists to address specific research questions
    Goal is to create a semantic representation of clinician and basic science researcher expertise to enable:
      • More effective linking of information about clinicians and basic science researchers
      • Computation and publication of clinical expertise data as Linked Data (LD) for use in other applications
  • Integrated Semantic Framework Ontology (ISF) suite
      • Merge the eagle-i and VIVO ontologies into one single ontology suite (the ISF)
      • Extend their coverage to include representation of clinical encounters
      • Modularize the ISF such that it can be made available in a set of files that can be reused independently
    [Diagram labels: eagle-i Resources; VIVO People; Coordination; Semantic; Clinical activities]
  • ISF Content and modularization
      • eagle-i: Research resources
      • VIVO: Person profiling
      • CTSA ShareCenter: Discussions, requests, share documents
      • ISF: Contact, Organizations, Affiliations, Services, Events, Clinical Expertise, Reagents, Organisms, Credentials
  • Original Ontologies (8/15/2013)
    eagle-i resource ontology:
      • BFO as upper ontology
      • Has OBO Foundry principles as guiding design principles
      • Aimed at driving an application as well as developing an interoperable core domain ontology
      • Active application and ontology development and live data
    VIVO ontology:
      • No upper ontology
      • Adopts ontologies already in wide use across the Linked Data community, such as FOAF and BIBO
      • Aimed mostly at supporting data validation and data entry through the VIVO application and at producing Linked Data
      • Active application and ontology development and live data
    Somewhat unconventional scenario: usually one creates ontologies from scratch or reuses existing ontologies without the above constraints.
  • A first approach
    Goals:
      • Identify overlapping and duplicated entities in the eagle-i and VIVO ontologies
      • Avoid severe disruptions in application compatibility
      • Make minor incremental additions to the ISF and push significant changes back to the source ontologies
    Good for:
      • Referencing existing entities while developing new ISF-specific modules
      • Performing initial alignments on classes in some portion of the overlapping hierarchies
    Limits:
      • Lengthy process of identifying necessary alignments and implementing changes in the source ontologies
      • With no disruption to the applications, development was slow and low impact
  • Preferred approach
      • Implement the refactoring and merging disconnected from application and data constraints
      • Impact on the application and data migration assessed after refactoring
      • Better balance of impact on apps and data migration versus total redesign of approach
      • Refactoring of source files based on content coverage
  • What the ISF means for VIVO
      • Fewer object properties: hasRole, hasAttendeeRole, hasClinicalRole, hasPartnerRole, hasOrganizerRole, hasOutreachProviderRole, hasTeacherRole, …
  • Original Motivation
  • New Approach
      • Consistently query for type of related resource
      • Configure application behavior for property and class combinations, not properties alone
    [Diagram labels: Person; hasRole; Teacher Role (“teaching activities”); hasRole; Presenter Role (“presentations”)]
  • Reified Relationships [Diagram labels: Thing; vivo:Relationship; Thing; time interval]
  • Reified Relationships [Diagram labels: foaf:Person; vivo:authorInAuthorship; vivo:Authorship; author rank; vivo:linkedInformationResource; bibo:Document]
  • Reified Relationships [Diagram labels: foaf:Person; vivo:relates; vivo:Authorship; author rank; vivo:relates; bibo:Document]
  • Reified Relationships [Diagram labels: Thing; vivo:relates; time interval; vivo:relates; Thing] AdvisingRelationship, Authorship, AwardedDegree, AwardReceipt, Editorship, Grant, IssuedCredential, Position
  • Relationships and Roles Together [Diagram labels: Person; Person; Mentoring Relationship; mentor role; mentee role; Mentoring Process]
  • Fewer Person Subclasses
      • Most Person subclasses removed from ISF
      • Retained in VIVO application until 1.7+
      • Person: FacultyMember, NonFacultyAcademic, Postdoc, Librarian, Archivist, …
  • Fewer Person Subclasses
      • Better to query for people by positions and roles
      • Person: FacultyMember, NonFacultyAcademic, Postdoc, Librarian, Archivist, …
  • Three examples of merging and refactoring
      • Tackle an open design/representation issue by proposing a new design pattern (position of a person over time)
      • Reference/incorporation of external vocabularies or taxonomies
      • Merging two different design approaches (Person and Contacts) using an existing standard (vCard)
  • Merging person and contact representation
  • eagle-i ontology browser with ISF
  • Inclusion of external vocabularies in the ISF
  • Including vocabularies in ISF [Diagram labels: ISF ontology; Vocabularies]
  • Using external vocabularies together
  • Protégé refactoring plugin
      • Annotation view shows axioms as approved or pending approval.
      • Module view shows pending axiom changes per module and can save the changes with a log comment and generate the spreadsheet summary.
  • UMLS Integration: The Unified Medical Language System (UMLS) Metathesaurus is a long-term project of the National Library of Medicine
  • ICD9CM, MeSH, and NCIt are made available in the initial CTSAconnect release. Additional vocabularies: RxNorm, SNOMED CT, CPT
  • Expressed as a SKOS Vocabulary
  • Cross vocabulary concept mapping
  • Connect to the full UMLS
  • So what?
      • Now that eagle-i and VIVO are “on the same page,” future development can leverage better consensus and ontologically rigorous solutions
      • CTSAs have a new research profiling data standard for exchange
      • Applications such as VIVO, eagle-i, LOKI, Profiles, SciVal, and SciENcv are working on generating ISF-compliant data
      • We can profile people based on a much larger diversity of their activities and products of research
      • There is still a lot of work to do: this was a short-term project, and the ISF could be better generalized for other use cases
  • ISF/VIVO Ontology Working Group
      • Extending and building on the ISF going forward
          • Collaborating with other VIVO working groups to ensure the ISF evolves as it needs to
          • Synergies with ShareCenter, eagle-i, Plumage from UCSF
      • Engaging as a community with SciENcv, CASRAI, euroCRIS LOD group, CTSA Ontology Affinity Group, and others
      • Biweekly calls, mailing list, documentation: all on wiki.duraspace.org/display/vivo
  • ISF/ShareCenter Drupal Integration
    1) Mapping Drupal node fields to the corresponding ISF predicates
    2) Integration issue #1: Drupal creates its own URIs for mappings
    3) Importing custom RDF augmentation into the Drupal RDF store (ARC2)
    4) Integration issue #2: the ARC2 store is wiped clean on each re-indexing
  • Team
    CTSA 10-001: 100928SB23 PROJECT #: 00921-0001
      • OHSU: Melissa Haendel, Carlo Torniai, Nicole Vasilevsky, Shahim Essaid, Eric Orwoll
      • Cornell University: Jon Corson-Rikert, Dean Krafft, Brian Lowe
      • University of Florida: Mike Conlon, Chris Barnes, Nicholas Rejack
      • Stony Brook University: Moises Eisenberg, Erich Bremer, Janos Hajagos
      • Harvard University: Daniela Bourges-Waldegg, Sophia Cheng
      • ShareCenter: Chris Kelleher, Will Corbett, Ranjit Das, Ben Sharma
      • University at Buffalo: Barry Smith, Dagobert Soergel
    Resources
      • CTSAconnect project: ctsaconnect.org
      • ISF ontology source: http://code.google.com/p/connect-isf/
      • ISF 1.0 Documentation: http://connect-isf.googlecode.com/svn/release/2013-07-31/isf-1.0-documentation.pdf
      • New Duraspace Ontology Working Group: https://wiki.duraspace.org/pages/viewpage.action?pageId=34656953
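The "Reified Relationships" and "New Approach" slides above replace many specific properties with one generic property and rely on the type of the related resource. A minimal Python sketch of that query pattern follows. It is an assumption-laden illustration, not VIVO code: the graph is a plain list of tuples, all `ex:` identifiers and the `ex:authorRank` property are hypothetical, the `vivo:` prefix on `Position` is assumed, and `related_via` is an invented helper name. Only `vivo:relates`, `vivo:Authorship`, `foaf:Person`, and `bibo:Document` come from the slides.

```python
# Sketch: one generic property (vivo:relates) plus type checks on both ends.
RDF_TYPE = "rdf:type"
RELATES = "vivo:relates"

triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:paper1", RDF_TYPE, "bibo:Document"),
    ("ex:deptX", RDF_TYPE, "foaf:Organization"),
    # Reified authorship: the relationship node relates both participants.
    ("ex:authorship1", RDF_TYPE, "vivo:Authorship"),
    ("ex:authorship1", RELATES, "ex:alice"),
    ("ex:authorship1", RELATES, "ex:paper1"),
    ("ex:authorship1", "ex:authorRank", 1),  # hypothetical property name
    # Reified position (prefix assumed): same generic property, different types.
    ("ex:position1", RDF_TYPE, "vivo:Position"),
    ("ex:position1", RELATES, "ex:alice"),
    ("ex:position1", RELATES, "ex:deptX"),
]


def has_type(node, cls):
    """True when the graph asserts that node is an instance of cls."""
    return (node, RDF_TYPE, cls) in triples


def related_via(subject, relationship_cls, target_cls):
    """Resources of target_cls linked to subject through a relationship_cls node."""
    results = []
    for rel, p, o in triples:
        if p != RELATES or o != subject or not has_type(rel, relationship_cls):
            continue
        # Collect the other participants of this relationship, filtered by type.
        for rel2, p2, other in triples:
            if rel2 == rel and p2 == RELATES and other != subject:
                if has_type(other, target_cls):
                    results.append(other)
    return results
```

With this shape, `related_via("ex:alice", "vivo:Authorship", "bibo:Document")` finds the person's documents and `related_via("ex:alice", "vivo:Position", "foaf:Organization")` finds the organizations where the person holds positions, so behavior is configured per property and class combination rather than per property.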