Controlled vocabularies and VIVO

Presentation Transcript

  • Controlled vocabularies and VIVO
    Paul Albert, paa2013@med.cornell.edu
    Weill Cornell Medical College
  • The problem: “We’ve seen 959 ways to refer to Proceedings of the National Academy of Sciences.” (Google Scholar Development Team) Slide aside: “Oh, my stomach!”
  • “The main intent of the Semantic Web is to give machines much better access to information resources so they can be information intermediaries in support of humans.” (Michael Uschold)
  • Let’s Define Our Terms
    Term                    Explicit list   Hierarchy   Associative relationships   Grammar
    controlled vocabulary   ✓
    taxonomy                ✓               ✓
    thesaurus               ✓               ✓           ✓
    ontology                ✓               ✓           ✓                           ✓
  • Warning: Pursuit of controlled vocabulary tends to expose source systems for the quagmires they are.
  • Which controlled vocabulary should I use?
  • Selecting controlled vocabularies: when snobbery is a virtue
  • “Desiderata” for Controlled Medical Vocabularies
    J. J. Cimino, “Desiderata for Controlled Medical Vocabularies in the Twenty-First Century,” Methods of Information in Medicine (1998). © F. K. Schattauer Verlagsgesellschaft mbH. Department of Medical Informatics, Columbia University, New York, USA.
    Abstract: Builders of medical informatics applications need controlled medical vocabularies to support their applications and it is to their advantage to use available standards. In order to do so, however, these standards need to address the requirements of their intended users. Over the past decade, medical informatics researchers have begun to articulate some of these requirements. This paper brings together some of the common themes which have been described, including: vocabulary content, concept orientation, concept permanence, nonsemantic concept identifiers, polyhierarchy, formal definitions, rejection of "not elsewhere classified" terms, multiple granularities, multiple consistent views, context representation, graceful evolution, and recognized redundancy. Standards developers are beginning to recognize and address these desiderata and adapt their offerings to meet them.
    Keywords: Controlled Medical Terminology, Vocabulary, Standards, Review
  • “Desiderata” for Controlled Medical Vocabularies
    1. Content – formal editorial policy and methodology; provide breadth and depth; don’t just add terms
    2. Concept orientation – exactly one meaning per concept and exactly one concept per meaning
    3. Concept permanence – old concepts can’t be deleted; names can be changed as long as the meaning doesn’t change
  • “Desiderata” for Controlled Medical Vocabularies
    4. Nonsemantic identifiers – use a meaningless integer
    5. Polyhierarchy – employ multiple hierarchies to support the need for tree walking and inferencing
    6. Formal definitions – structured descriptions that invoke relationships within the terminology
  • “Desiderata” for Controlled Medical Vocabularies
    7. Reject “not elsewhere classified” – terminology changes induce semantic drift
    8. Graceful evolution – fix mistakes; account for changes in medical knowledge
    9. Recognize redundancy – redundant expressions are inevitable, but redundant concepts are bad
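
Three of the desiderata above (concept orientation, concept permanence, nonsemantic identifiers) translate fairly directly into data-structure rules. The following is a minimal sketch under stated assumptions, not anything taken from VIVO or from Cimino's paper: the `Concept` and `Vocabulary` classes and all of their methods are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: int                 # nonsemantic identifier: a meaningless integer
    preferred_name: str
    synonyms: set = field(default_factory=set)
    retired: bool = False           # concept permanence: retire, never delete


class Vocabulary:
    """Hypothetical toy vocabulary enforcing a few desiderata."""

    def __init__(self):
        self._concepts = {}         # concept_id -> Concept
        self._next_id = 1

    @staticmethod
    def _names(concept):
        return {concept.preferred_name.casefold()} | {
            s.casefold() for s in concept.synonyms
        }

    def add(self, preferred_name, synonyms=()):
        # Concept orientation: a name may denote only one concept.
        new_names = {n.casefold() for n in (preferred_name, *synonyms)}
        for concept in self._concepts.values():
            if new_names & self._names(concept):
                raise ValueError(
                    f"name already denotes concept {concept.concept_id}"
                )
        concept = Concept(self._next_id, preferred_name, set(synonyms))
        self._concepts[concept.concept_id] = concept
        self._next_id += 1
        return concept.concept_id

    def rename(self, concept_id, new_name):
        # Names may change while the identifier (and meaning) stays put.
        concept = self._concepts[concept_id]
        for other in self._concepts.values():
            if other is not concept and new_name.casefold() in self._names(other):
                raise ValueError("name already denotes another concept")
        concept.synonyms.add(concept.preferred_name)
        concept.preferred_name = new_name

    def retire(self, concept_id):
        # There is deliberately no delete(): old concepts stay resolvable.
        self._concepts[concept_id].retired = True

    def lookup(self, name):
        key = name.casefold()
        for concept in self._concepts.values():
            if key in self._names(concept):
                return concept.concept_id
        return None
```

The identifier is assigned by a counter and carries no meaning, so renaming or retiring a concept never invalidates data that points at it.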
  • Is Roz Chast’s ice cream ontology desiderata compliant?
    Compliant: Reject “Not Elsewhere Classified”; Recognize Redundancy
    Unclear or Non-Compliant: Content; Concept Permanence; Graceful Evolution; Concept Orientation; Nonsemantic Concept Identifiers; Polyhierarchy; Formal Definitions
  • What is the license of the controlled vocabulary?
    – Are the ontology codes copyrighted, and can they be used in an open source application?
    – Need to account for the possibility that the data is reused for a commercial interest
  • Externally maintained vocabularies are more sustainable. Who will maintain and host the vocabulary?
  • Controlled vocabularies used in VIVO
  • The Ontology Team is considering serving vocabularies for select domains: “The VIVO community might be able to build services to serve controlled vocabularies for organizations and journals.”
  • Food and Agriculture Organization (FAO) geopolitical ontology (ships with the VIVO application)
    – master reference for geopolitical information in multiple languages
    – provides relations among territories (land borders, group membership, etc.)
    – tracks historical changes
  • Academic degrees (ships with the VIVO application)
  • As of version 1.4, VIVO allows users to look up terms from UMLS and GEMET
  • GEMET: controlled vocabulary for environmental topics
    administration; agriculture; air; animal husbandry; biology; building; chemistry; climate; disasters, accidents, risk; economics; energy; environmental policy; fishery; food, drinking water; forestry; general; geography; human health; industry; information; legislation; materials; military aspects; natural areas, landscape, ecosystems; natural dynamics; noise, vibrations; physics; pollution; radiations; research; resources; social aspects, population; soil; space; tourism; trade, services; transport; urban environment, urban stress; waste; water
  • Vocabularies actively being considered for VIVO
    – colleges and universities
    – journals: open source status (VIVOONT-433)
    – languages (VIVOONT-250): model write, speak, proficiency
    – others?
  • – one promising option for organizations
  • Modeling medical terms in VIVO
  • Types of Specialty: All Specialties; Board-Certified Specialties; Board-Certified Subspecialties
  • Types of Medical Expertise: Feigned < Clinical < Research
    Examples on the slide: GLG-20s masquerading as doctors for comic effect; performed 100+ ECGs; board-certified in Cardiology; invented a better ECG
  • We use Intelligent Medical Objects (IMO)’s interface terminology
    – Maps medical expertise terms to SNOMED CT
    – Useful for returning relevant results to patients searching for a doctor
    – Enables the physician to enter more arcane areas of expertise (e.g., Asian American Community Health)
    – A commercial application
  • Physician Admin View: Search for “chemotherapy” in IMO
  • Physician Admin View: Search for “that” yields many terms not in SNOMED CT.
  • Expertise exists in POPS. Board certification data exists in POPS, Intellicred.
  • Export from Physicians Profile System contains specialty and expertise
  • Board Certifications Problem #1: No indication of certifying board. At least 13 certifications, including geriatric medicine, pain medicine, and urology, are given by at least one ABMS board.
  • Board Certifications Problem #2: Names of certifications are ambiguous. Colon and rectal surgery is listed in the following alternate ways:
    – Surgery, Colon and Rectal
    – Colon-Rectal Surgery
    – Colorectal Surgery
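
One common way to tame variants like these is an authority table that maps every entry term seen in the source system to a single preferred label. The sketch below is illustrative only: `AUTHORITY`, `preferred_label`, and the choice of "Colon and Rectal Surgery" as the preferred label are assumptions, not part of POPS, Intellicred, or VIVO.

```python
# Hypothetical authority table: entry term (normalized) -> preferred label.
PREFERRED = "Colon and Rectal Surgery"   # assumed preferred label

AUTHORITY = {
    "colon and rectal surgery": PREFERRED,
    "surgery, colon and rectal": PREFERRED,
    "colon-rectal surgery": PREFERRED,
    "colorectal surgery": PREFERRED,
}


def preferred_label(raw_label):
    """Return the preferred label for a raw source-system label, or None.

    Normalization is deliberately light: collapse runs of whitespace and
    ignore case. Anything not in the table is reported as unmapped (None)
    rather than guessed at.
    """
    key = " ".join(raw_label.split()).casefold()
    return AUTHORITY.get(key)
```

Returning `None` for unknown labels keeps unmapped certifications visible for manual review instead of silently assigning them to the wrong concept.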
  • Board Certifications Problem #3: No given date of certification.
  • Board Certifications Problem #4: Which source vocabulary? [Figure labels: Prior to 1970; 1970-1979]
  • The National Uniform Claim Committee (NUCC) maintains a list of health care provider taxonomy codes, but this list seems to be exclusively for non-MDs.
  • Change in number of ABMS Subspecialties/Specialties [Bar chart by period, from pre-1970 onward; data labels as extracted: 145, 84, 66, 74, 20, 10]
  • The following 135 board certifications in our system are not recognized by ABMS. Terms visible on the slide:
    Cosmetic Dentistry; Cosmetic Dermatology; Cosmetic Surgery; Critical Care Neurology; Dermatology, General; Ear, Nose, and Throat, Pediatric; Echocardiography; Electrodiagnostic Medicine; Emergency Neurology; Endocrinology; Facial Plastic and Reconstructive Surgery; Facial Plastic Surgery; Family Psychology; Fetal Cardiology; Foot and Ankle Surgery; Foot Surgery; Gastroenterology Pathology; Gastrointestinal Pathology; Gastrointestinal Surgery; General Anesthesiology; General Cardiology; General Dentistry; General Dermatology; General Internal Medicine; General Neurology; General Neurosurgery; General Obstetrics and Gynecology; General Ophthalmology; General Pediatrics; General Psychiatry; General Surgery; General Urology; Genetics, Medical; Geriatric Cardiology; Geriatric Dermatology;
    Geriatric Psychotherapy; Gynecologic Endocrinology; Gynecologic Pathology; Gynecology; Hand Surgery; Heart Surgery; Hematology/Oncology; Hepatobiliary Surgery; Hepatology; High Risk Obstetrics; Hospitalist; Immunopathology; Infant Psychiatry; Intensive Care; Internal Medicine, General; International Medicine; International Travel Medicine; Interventional Neuroradiology; Interventional Oncology; Interventional Pain Management; Interventional Radiology; Invasive Cardiology; Laboratory Medicine; Laryngology; Liver Pathology; Maternal-Fetal Medicine; Medical Genetics; Molecular Genetics; Molecular Hematopathology; Molecular Infectious Disease; Molecular Pathology; Musculoskeletal Oncology; Musculoskeletal Radiology; Neonatal Neurology; Neonatal Surgery; Neonatal Thoracic Surgery; Neonatology;
    Neuro Critical Care; Neuro Radiology; Neuro-Ophthalmology; Neuro-Pathology; Nutrition; Oral and Maxillofacial Pathology; Oral and Maxillofacial Surgery; Orthodontics; Orthopedic Surgery; Orthopedics; Pain Medicine/Pain Management; Pathology; Pediatric Allergy and Immunology; Pediatric Behavior and Development; Pediatric Dentistry; Pediatric Neurological Surgery; Pediatric Neurology; Pediatric Neurosurgery; Pediatric Orthopedic Surgery; Pediatric Orthopedics; Periodontics; Plastic and Reconstructive Surgery; Psychology; Pulmonary Disease Medicine; Radiology; Radiology, Vascular/Interventional; Reproductive Endocrinology; Surgery, Critical Care; Surgery, Hand; Surgery, Oral and Maxillofacial; Thoracic Surgery; Vascular and Interventional
  • Weill Game Plan for Board Certifications
    – Explore ingest from Intellicred (fewer certifications, less variability, may include certifying agency?)
    – Explore external vocabularies
    – Failing that, create our own
  • Medical Expertise and Non-Certified Specialties
  • How does a term of local clinical expertise map to UMLS using Stony Brook’s API?
    Source: expertise terms from the Weill Cornell Physician Profile System (n = 2578). Examples are given as Weill → UMLS.
    Identical: 53% of terms from the source system correspond exactly to some representation in UMLS
      – Polycystic Ovary Syndrome
      – Anaphylaxis
      – Aortic Dissection
      – Chemoembolization
      – Dental Implant
      – Echocardiogram
    Equivalent preserving original meaning: 34% of terms from the source system have some equivalent in UMLS that is lexically different but semantically identical
      – Biopsy of Skin → Skin biopsy
      – Aneurysm of Popliteal Artery → Aneurysm Popliteal
      – Charcot-Marie-Tooth Disease → Charcot-Marie-Tooth
      – Cirrhosis of Liver → Cirrhosis
      – Coarctation of the Aorta → Coarctation
    Union of two concepts: 5% of terms from the source system can be represented by the joining (not intersection) of two concepts in UMLS
      – Billing and Coding → Billing | Coding
      – Bone and Mineral Metabolism → Bone Metabolism | Mineral Metabolism
      – Bladder and Prostate Cancer → Bladder Cancer | Prostate Cancer
    Subtype: 3% of terms from the source system can only be represented as a subtype of a concept in UMLS
      – Bipolar 1 Disorder → Bipolar Disorder
      – FAA Medical Exam → Medical Exam
    Compound term: 2% of terms from the source system can only be represented as a combination of terms from UMLS
      – Asian American Community Health → Asian American | Community Health
      – Endoscopic Ultrasound of Esophagus → Endoscopic Ultrasound | Esophagus
      – Chronic Pelvic Pain In Female → Chronic Pelvic Pain | Female
      – Bronchoscopy With Biopsy → …
    No or unclear equivalent: 3% of terms from the source system lack or have an unclear equivalent in UMLS
      – In Vitro Fertilization Counseling → … Fertilization | Counseling (unclear)
      – Adjustable Band → Band (unclear)
      – Bowel-Sparing Strictureplasty → (no equivalent)
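
To make the first few mapping categories concrete, here is a naive sketch that sorts a local term into "identical", "equivalent", "union", or "unmatched" against a small in-memory list of labels. It does not call UMLS or the Stony Brook API, and it is not the method used to produce the percentages on the slide; `classify`, `token_key`, `STOPWORDS`, and `TOY_LABELS` are all hypothetical names and data chosen for illustration.

```python
import re

STOPWORDS = {"of", "the", "and", "in", "with"}   # assumed, deliberately tiny


def token_key(label):
    """Order-insensitive key: lowercase word tokens minus stopwords."""
    tokens = re.findall(r"[a-z0-9]+", label.lower())
    return frozenset(t for t in tokens if t not in STOPWORDS)


def classify(term, concept_labels):
    exact = {label.casefold() for label in concept_labels}
    # Identical: the term matches a label exactly (ignoring case).
    if term.casefold() in exact:
        return "identical"
    # Equivalent: same words in a different order or with filler words.
    if token_key(term) in {token_key(label) for label in concept_labels}:
        return "equivalent"
    # Union: "A and B" where A and B are each labels in their own right.
    parts = [p.strip() for p in re.split(r"\s+and\s+", term, flags=re.IGNORECASE)]
    if len(parts) > 1 and all(p.casefold() in exact for p in parts):
        return "union"
    return "unmatched"


# Toy stand-in for a terminology; real lookups would go through a service.
TOY_LABELS = ["Anaphylaxis", "Skin biopsy", "Billing", "Coding"]
```

A heuristic this simple misses cases such as "Bladder and Prostate Cancer", where the shared head word has to be distributed across both halves, which is one reason the subtype and compound categories tend to need human review.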
  • Pre-coordination vs. post-coordination
    Pre-coordination
      Definition: Terms combined by a developer to denote a specific concept and its attributes more precisely.
      Benefits: Users who are not totally familiar with a controlled vocabulary and its structure.
      Examples: avian hypersensitivity pneumonitis; carrier sense multiple access
    Post-coordination
      Definition: Terms combined at the time of search and retrieval using Boolean or other operators.
      Benefits: Lazy or “busy” developers.
      Examples: avian AND hypersensitivity AND pneumonitis; carrier sense AND multiple access
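
The difference can be sketched with two toy indexes over the same three records. In the pre-coordinated index the compound term was assigned up front; in the post-coordinated index only elementary terms were assigned and the searcher ANDs them together at query time. The record numbers, term assignments, and function names below are invented for illustration.

```python
# Pre-coordinated: an indexer combined terms into one compound term up front.
PRE_COORDINATED = {
    1: {"avian hypersensitivity pneumonitis"},
    2: {"hypersensitivity"},
    3: {"carrier sense multiple access"},
}

# Post-coordinated: only elementary terms are assigned to each record.
POST_COORDINATED = {
    1: {"avian", "hypersensitivity", "pneumonitis"},
    2: {"avian", "hypersensitivity"},
    3: {"carrier sense", "multiple access"},
}


def pre_coordinated_lookup(term):
    """Find records tagged with exactly this compound term."""
    return {rec for rec, terms in PRE_COORDINATED.items() if term in terms}


def post_coordinated_lookup(*terms):
    """Boolean AND: find records tagged with every one of the given terms."""
    wanted = set(terms)
    return {rec for rec, assigned in POST_COORDINATED.items() if wanted <= assigned}
```

For example, `post_coordinated_lookup("avian", "hypersensitivity")` also returns record 2, which shows the trade-off: post-coordination is flexible, but the combination carries no guarantee that the terms relate to each other the way the compound term implies.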
  • How do we semantically model post-coordinated terms?
    1. Do not mess with post-coordination. User adds term from lookup service. That’s it. (Existing method.)
    2. User adds term from lookup service. Machine makes basic inferences based on similarity. (Everything is a "related term.")
    3. User adds term from lookup service. Administrator models terms.
    4. User adds term from lookup service. User interface enables and guides end user.
  • Option #3: User adds term from lookup service. Administrator models terms. Can we build on others’ work?
    – The International Health Terminology Standards Development Organization (IHTSDO) in Denmark is working to develop and promote SNOMED to support sharing of modelling.
    – IMO, our terminology service, may help model coordinated terms.
  • Need for post-coordination is widespread. For example, many global health terms require coordination.
  • UMLS’s rapid growth is somewhat at odds with desiderata compliance [Chart: number of Strings and Concepts in UMLS by year, 1999 to 2011; y-axis from 0 to 12,000,000]
  • Cimino’s Critique of Terminologies: Desiderata Adherence
    Terminology   Cov   Conc   Perm   ID    Hier   Def   NEC   Evol   Redun
    ICD           +     -      -      -     +/-    -     -     -      -
    CPT           -     +      +      +     -      -     -     +      -
    DRG           -     +      +      +     -      -     -     +      -
    NDC           +     +      -      -     -      -     +     -      -
    RxNorm        +     +      +      +     +      +     +     +      +
    LOINC         +     +      +      +     +/-    +     +     +      +
    Nursing       +     +      +/-    +     +/-    -     -     +/-    +/-
    SNOMED        +     +      +      +     +      +/-   +     +      +
    MeSH          +     +/-    +      +     +/-    -     -     +      -
    UMLS          +     +      +      +     +/-    -     n/a   +      -
    Cov: Content coverage; Conc: Concept oriented; Perm: Concept permanence; ID: Meaningless identifiers; Hier: Multiple hierarchy; Def: Formal definitions; NEC: Rejected “Not Elsewhere Classified”; Evol: Graceful evolution; Redun: Detect redundancy
  • Why SNOMED CT may be better than UMLS at representing medical terms. UMLS has:
    – No formal conceptual model (near-synonymy)
    – No hierarchy
    – Lots of redundancy
    – Lots of ambiguity
  • “UMLS is good for helping you find terms in a specific terminology because all many-to-one term-to-concept mappings expand the synonyms you can match against. I recommend you use UMLS to find terms from a very limited set of terminologies - maybe SNOMED plus LOINC plus RxNorm, for example.” (Jim Cimino)
  • Proposed Role of SKOS
    Classes
      skos:Concept
        snomedct:Procedure
        snomedct:Disorder
        rxnorm:Drug
        ...
    Properties
      skos:related
        snomedct:equivalentTo
        ...
      skos:broader
      skos:narrower
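
Reading the indentation on this slide as subclass and subproperty relationships (an assumption), the proposal can be sketched as a handful of triples plus a small subclass walk. Only the prefixed names that appear on the slide come from the source; the `ex:` resources, the `TRIPLES` set, and both helper functions are hypothetical, and plain Python tuples stand in for a real RDF store.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
SUBPROPERTY_OF = "rdfs:subPropertyOf"

TRIPLES = {
    # Vocabulary-specific classes hang off skos:Concept (assumed reading).
    ("snomedct:Procedure", SUBCLASS_OF, "skos:Concept"),
    ("snomedct:Disorder", SUBCLASS_OF, "skos:Concept"),
    ("rxnorm:Drug", SUBCLASS_OF, "skos:Concept"),
    # Vocabulary-specific properties refine skos:related (assumed reading).
    ("snomedct:equivalentTo", SUBPROPERTY_OF, "skos:related"),
    # Hypothetical instance data for illustration only.
    ("ex:colonoscopy", RDF_TYPE, "snomedct:Procedure"),
    ("ex:colorectalCancer", RDF_TYPE, "snomedct:Disorder"),
    ("ex:colonoscopy", "skos:related", "ex:colorectalCancer"),
}


def superclasses(cls, triples):
    """All classes reachable from cls by following rdfs:subClassOf."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_instance_of(resource, cls, triples):
    """True if resource has type cls directly or through a subclass chain."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    return any(d == cls or cls in superclasses(d, triples) for d in direct)
```

Because every vocabulary-specific class rolls up to `skos:Concept`, generic code can treat a SNOMED CT procedure and an RxNorm drug uniformly while the more specific type remains available.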
  • Read More
    – Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
    – “Desiderata” for Controlled Medical Vocabularies
  • Practice Robot Courtesy with Local Extensions: Use classes/properties that are subclasses/subproperties of existing classes/properties in VIVO’s core ontology.