From billing codes to expertise: mining, representing and sharing clinical research profiles in the Linked Data Cloud
Carlo Torniai
Shahim Essaid, Chris Barnes, Stephen Williams, Janos Hajagos, Nicole Vasilevsky, Melissa Haendel
CTSAConnect Project
Needs:
   – Identify potential collaborators, relevant resources, and
     expertise across scientific disciplines
   – Assemble translational teams of scientists to address specific
     research questions
Approach:
   Create a semantic representation of clinician and basic science
   researcher expertise to enable
   – more effective linking of information about clinicians and basic
     science researchers
   – publication of expertise data as Linked Data (LD) for use in
     other applications

www.ctsaconnect.org                                    CTSAconnect
                                               Reveal Connections. Realize Potential.
Integrating VIVO and eagle-i



  VIVO is an ontology-driven application . . . for collecting and
   displaying information about people
  eagle-i is an ontology-driven application . . . for collecting and
   searching research resources
  Both publish Linked Data. Neither addresses clinical expertise




8/23/2012
Extending eagle-i and VIVO to represent clinical expertise

[Figure labels: Semantic, VIVO, eagle-i, Clinical activities]

Researcher Characterization
•    Organizational affiliations
•    Grant and project participation
•    Activities
      –   Teaching courses
      –   Mentoring students
      –   (Co)-authoring publications
•    Research resources
      –   Reagents
      –   Biospecimens
      –   Animal models
      –   Instruments
      –   Techniques

Clinician Characterization
•    Training and credentials
•    Clinical research topic
•    Specialization inferred from EHR
      –   Procedures
      –   Diagnoses
      –   Prescriptions



          CTSAconnect will produce a single Integrated Semantic Framework that includes
          clinical expertise


ISF Clinical module




  ARG: Agents, Resources, Grants ontology
  CM: Clinical module
  IAO: Information Artifact Ontology
  OBI: Ontology for Biomedical Investigations
  OGMS: Ontology for General Medical Science
  FOAF: Friend of a Friend vocabulary
  BFO: Basic Formal Ontology


ISF Clinical module: encounter

  ARG: Agents, Resources, Grants ontology
  CM: Clinical module
  OGMS: Ontology for General Medical Science
  FOAF: Friend of a Friend vocabulary




ISF Clinical module: encounter output




  CM: Clinical module
  OBI: Ontology for Biomedical Investigations
  OGMS: Ontology for General Medical Science



Collecting and publishing clinical expertise as represented by encounter

   Step 1: Aggregate Clinical Data
   Step 2: Map Data to ISF
   Step 3: Compute Expertise
   Step 4: Publish Linked Data




Aggregate clinical data



  Provider ID   ICD Code Value   Code Count   Unique Patient Count   Code Label
  1234567       552.00           1            1                      Unilateral or unspecified femoral hernia with obstruction (ICD9CM 552.00)
  1234567       553.02           8            6                      Bilateral femoral hernia without mention of obstruction or gangrene (ICD9CM 553.02)
  1234567       555.1            4            1                      Regional enteritis of large intestine (ICD9CM 555.1)
  1234568       745.12           10           5                      Corrected transposition of great vessels (ICD9CM 745.12)
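
The aggregation in Step 1 can be sketched as follows. This is only an illustration, not the project's code: it assumes raw billing records arrive as (provider ID, patient ID, ICD code) tuples, and the patient IDs and the `aggregate` helper are hypothetical names invented for the example.

```python
from collections import defaultdict


def aggregate(records):
    """Return {(provider_id, icd_code): (code_count, unique_patient_count)}."""
    counts = defaultdict(int)      # how many times each provider used each code
    patients = defaultdict(set)    # which patients each provider used it for
    for provider_id, patient_id, icd_code in records:
        key = (provider_id, icd_code)
        counts[key] += 1
        patients[key].add(patient_id)
    return {key: (counts[key], len(patients[key])) for key in counts}


# Hypothetical raw records: (provider ID, patient ID, ICD-9-CM code).
records = [
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p2", "552.00"),
]
summary = aggregate(records)
```

With these made-up records, the two entries in `summary` have the same counts as the 555.1 and 552.00 rows of the table above.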



Map data to ISF

   [Figure: Aggregated Clinical Data (the provider and ICD code table from Step 1) + ISF -> Java scripts (OWL API) -> RDF triples]

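
The slides name Java scripts and the OWL API for this step but do not show the code. As a rough stand-in, the sketch below uses plain Python to turn one aggregated row into N-Triples lines. The `example.org` namespace, the property names, and the `row_to_ntriples` helper are all assumptions made for illustration; they are not the real ISF IRIs.

```python
# Hypothetical namespace: the real ISF IRIs are not given in the slides.
BASE = "http://example.org/ctsaconnect/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def row_to_ntriples(provider_id, icd_code, code_count, unique_patients):
    """Serialize one aggregated row as a list of N-Triples statements."""
    subject = f"<{BASE}encounter-summary/{provider_id}/{icd_code}>"
    return [
        f"{subject} <{BASE}hasProvider> <{BASE}provider/{provider_id}> .",
        f'{subject} <{BASE}hasDiagnosisCode> "{icd_code}" .',
        f'{subject} <{BASE}codeCount> "{code_count}"^^<{XSD_INTEGER}> .',
        f'{subject} <{BASE}uniquePatientCount> "{unique_patients}"^^<{XSD_INTEGER}> .',
    ]


# Second row of the aggregated table: provider 1234567, code 553.02.
triples = row_to_ntriples("1234567", "553.02", 8, 6)
```

Each string in `triples` is one subject, predicate, object statement that a triple store could load; a real pipeline would take its class and property IRIs from the ISF ontology instead of inventing them.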

Compute Expertise




• The Unified Medical Language System (UMLS) aggregates Medical
  Subject Headings (MeSH) and other terminologies by linking them
  to UMLS concept unique identifiers (CUIs)
• UMLS CUIs will be used to map ICD9 and CPT codes to MeSH
• Expertise indexed by MeSH will enable meaningful connections
  between clinicians, basic researchers, and biomedical knowledge
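
The CUI-based mapping described above amounts to a two-hop lookup, which might look like the sketch below. It assumes the UMLS content has already been loaded into two dictionaries; the `icd9_to_mesh` helper and every identifier in the sample tables are placeholders, not real UMLS CUIs or MeSH descriptors.

```python
def icd9_to_mesh(icd9_code, icd9_to_cui, cui_to_mesh):
    """Map an ICD-9 code to MeSH descriptors through shared UMLS CUIs."""
    mesh = set()
    for cui in icd9_to_cui.get(icd9_code, ()):
        # A CUI may link to several MeSH descriptors, or to none.
        mesh.update(cui_to_mesh.get(cui, ()))
    return sorted(mesh)


# Placeholder identifiers, NOT real UMLS or MeSH values.
icd9_to_cui = {"755.12": ["CUI_A"], "555.1": ["CUI_B", "CUI_C"]}
cui_to_mesh = {
    "CUI_A": ["MESH_1"],
    "CUI_B": ["MESH_2"],
    "CUI_C": ["MESH_2", "MESH_3"],
}

mesh_for_code = icd9_to_mesh("555.1", icd9_to_cui, cui_to_mesh)
```

Collecting results in a set means a MeSH descriptor reached through more than one CUI is counted once, and a code with no CUI simply maps to an empty list.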



Compute Expertise: Mapping ICD9 to MeSH




Compute Expertise: weighting


                                  • Provider X has 500 patients
                                  • S/he has used Syndactyly
                                    (ICD9: 755.12) for 30 unique
                                    patients 75 times

                                  Percentage of patients with
                                  code: 30/500*100 = 6%
                                  Code frequency: 75/30 = 2.5
                                  Code weight: 6 * 2.5 = 15
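
The weighting arithmetic above can be written as a small function. This is a sketch of the calculation exactly as stated on the slide; the function and parameter names are assumptions.

```python
def code_weight(total_patients, unique_patients_with_code, code_count):
    """Return (percentage of patients with code, code frequency, code weight)."""
    # Share of the provider's patients who received this code, in percent.
    percentage = 100 * unique_patients_with_code / total_patients
    # Average number of times the code was used per patient who received it.
    frequency = code_count / unique_patients_with_code
    return percentage, frequency, percentage * frequency


# Provider X: 500 patients, Syndactyly (ICD9: 755.12) used 75 times for 30 unique patients.
percentage, frequency, weight = code_weight(500, 30, 75)
```

For Provider X this reproduces the slide's values of 6, 2.5, and 15. Because the unique patient count cancels out, the weight equals 100 * code count / total patients, so under this formula it depends only on how often the code was used relative to the size of the provider's patient panel.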



Publish Linked Data




   [Figure: Linked Data cloud -> Triple Stores -> several means to access and query data (SPARQL endpoints, other APIs, …)]


Sample encounter data published as LOD



   [Figure: a health care encounter instance, labeled with its Instance URI, Annotations and Properties, and Inferred Types]




Querying the data
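
The query shown on this slide is not reproduced in the text. As a hedged illustration of what querying the published data could look like, the sketch below builds a SPARQL SELECT query and the GET URL defined by the SPARQL protocol (a `query` parameter on the endpoint URL). The endpoint address, the `ex:` namespace, and the property names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"      # hypothetical SPARQL endpoint
BASE = "http://example.org/ctsaconnect/"    # hypothetical namespace, not the ISF's


def providers_for_code_query(icd_code):
    """Build a SPARQL query listing providers who used a code, most frequent first."""
    return (
        f"PREFIX ex: <{BASE}>\n"
        "SELECT ?provider ?count WHERE {\n"
        "  ?summary ex:hasProvider ?provider ;\n"
        f'           ex:hasDiagnosisCode "{icd_code}" ;\n'
        "           ex:codeCount ?count .\n"
        "}\n"
        "ORDER BY DESC(?count)"
    )


query = providers_for_code_query("745.12")
# URL for a SPARQL protocol GET request; urlencode escapes the query text.
url = ENDPOINT + "?" + urlencode({"query": query})
```

The resulting `url` could be fetched with any HTTP client once a real endpoint and the actual ISF property IRIs are substituted.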




Beyond expertise




• In addition to enhancing the linkage between clinical and basic
  science expertise, encounter data represented using the ISF and
  published as Linked Data will enable integration with multiple
  datasets, which could be used in a variety of ways to discover
  useful clinical associations and patterns
Information
  CTSAconnect project: ctsaconnect.org
  CTSAconnect ontology source: http://code.google.com/p/connect-isf/
  The clinical module can be directly accessed at http://bit.ly/clinical-isf
  Linked Data generation code: http://bit.ly/isf-lod-code
  eagle-i federated search: eagle-i.net
  VIVO integrated search: vivosearch.org
  CTSA ShareCenter: ctsasharecenter.org

  Carlo Torniai: torniai@ohsu.edu
  Shahim Essaid: essaids@ohsu.edu
  Chris Barnes: cpb@ufl.edu
  Janos Hajagos: janos.hajagos@stonybrook.edu
  Stephen V Williams: swilliams@ctrip.ufl.edu
  Nicole Vasilevsky: vasilevs@ohsu.edu
  Melissa Haendel: haendel@ohsu.edu

CTSA 10-001: 100928SB23
PROJECT #: 00921-0001


Editor's Notes

  • #18 Need a nice picture here illustrating the concept. It would be great to have an actual example of connecting ISF expertise data with other data (I can use some SPARQL queries). This requires clear semantics, which is why we need RDF and OWL.