CDISC - Healthcare, meet the Semantic Web


Presented at CDISC 2009 in Baltimore, this talk explores what the Semantic Web can bring to healthcare. Can it be deployed right now? With ease? CDISC sets standards for the exchange of clinical trial data. Once deployed, these standards remove much of the redundancy and paper processing that characterize a typical trial today. CDISC's membership includes government regulators such as the US FDA, all the major drug companies, and their IT vendors.

  • There are green field demos already
  • Like “goodness”
    NOT HTML ... Querying like DB querying, not page fetching
    A run-through on Baltimore becomes linkable data about Baltimore
    web didn’t make hyperlinks or protocols or page layout
    SEM WEB: ONE MORE WEB THING ... the power, the scale was link anywhere
  • Nodes and Literals ... Codes would break down
    Detailed discussion of semi-structured (not going to get into this aspect)
    Observable (OBX|4|CE|30949-2^Vaccination adverse event outcome^LN|1|005^required hospitalization^NIP|)
    NIP = National Immunization Program within the Centers for Disease Control and Prevention (CDC)
    One standard code is 30949-2.
    For the astute: better if code became URI.
  • Beyond an isolated set of patients
  • FOLLOW THE LINKS: Typical Report: chase type (Ontology) in a world of (EMR) particulars. Stanford Drug Ontology gives compounds that treat conditions. RxNorm relates compounds to branded drugs. Hoot72 Clinical Data has branded drugs.
    From: HCLS == the W3C Health Care and Life Sciences Interest Group
    Patient Data is secure - “intranet” LINK OUT
    RxNorm is not yet an ontology but has a Web API, so it can be represented as a SPARQL endpoint
    Simplified to fit. Ingredient = consists of to ingredient to brand name etc.
  • À la Semantic Web: pretty loose definitions. Philosophy.
    Dumb AI
    from obese to desoxyn ... we need entities
    Middleware - format gets out of the way. IT gets out of the reformatting business.
  • Growing in number ... Billions of triples, ready to be leveraged, all these URIs.
    Gen purpose (demographics) and PubMed, Drug Bank, GeneID, Diseasome
    Arrows representing linking out to another conceptual scheme
    1. STANFORD and BIO-MED guys big ...

    CDASH ODM (machine readable)
    CDISC SDTM and other terminology goes through an extensive process of definition, development, and review before it is declared ready for release. Terminology that has completed this process is tagged as "Production," and now includes some 50 SDTM codelists with about 2,200 terms covering demographics, interventions, findings, events, trial design, units, frequency, and ECG terminology. This terminology is maintained and distributed as part of NCI Thesaurus.
  • CDC Example.
    We will have standard and local ontologies, standard and local queries
    There may be several different ways to express the same concept. Human users may be able to recognise that these are essentially the same, but the rules for doing so must be made explicit to be usable by computer. -- Why is Terminology hard?, Alan Rector
  • ME to learn of CDISC work. See how to leverage all the work.
    FDA: “Improve Interoperability: The Target EA establishes enterprise-wide standards that promote platform and vendor independence, enabling greater interoperability across disparate applications, both internal and external”
    DOCUMENT model vs GRAPH model
    Trial == Snapshot. Extrapolated from individual observations (weight gain etc)
    ... Here at the Drug Information Association (DIA), you can see a “live” implementation of the interoperability that is possible between Electronic Health Record (EHR) systems and Electronic Data Capture (EDC) systems used for clinical research, which leverages the Integrating the Healthcare Enterprise’s (IHE) Retrieve Form for Data Capture (RFD) integration profile along with CDISC’s ODM and CDASH standards
    Contrast to RDIF XFORMs.
    The big picture ... Concept and Concrete, Users and Contributors in one web
    Trial Recruitment, Drug Safety, Outcomes research
  • HL7 holds our health data
    HL7 everywhere means v2. Small V3.
  • Of course, more structured than your average tweet
    Pick out message type, patient name, contact relationship, body weight observation
  • Unload the Truck
  • What we've done
    Key is automatic i.e. requirement
    Mapping is on the site.
    Moving beyond rough logs.
  • But don't just want a script EMR
  • Get Real
    The Integration Control Number (ICN) - ASTM E1714-95 standard for a universal health identifier.
    Like the efforts in the showcase to interop EMRs
  • Already done for us - or at least we know it works
    Note Multiple VistAs
    HL7 is triggered. Data there and WHEN it is there.
  • open use docs first
  • MUMPS (Massachusetts General Hospital Utility Multi-Programming System)
    EMR NOT LEFT OUT OF THE PICTURE, not just an “old” aside.
    Looking in the code, you could see ...
  • Very early on this.
  • More than the two ways here
  • CDISC - Healthcare, meet the Semantic Web

    1. 1. Hoot72 Health-Care, meet the Semantic Web CDISC 2009
    2. 2. • “Demonstrate the power the Semantic Web brings to Health-Care and how easy it is to deploy today.” • Incubate: open source, docs • Not Green Field - 40+ years of Health IT
    3. 3. : Just Another Format? • Technology: Web Stack++ • Reuse: HTTP, URIs, not HTML • + RDF, OWL, SPARQL • Get Link docs -> Query Link data • One more reuse: Link ANYWHERE • Begone CD-ROM: no islands • WW: new adds on, reuse, open
    4. 4. A Linkable Patient • [Graph diagram: a node of type Patient with a personName (familyName Doe, givenName John, middleName Fitzgerald); an observation about the patient with Code 30949-2 in CodingSystem LN and an observationValue with Code 005 in CodingSystem NIP] • Identifiers and Time not shown • URI:
    5. 5. Now Just Ask ... All Patients with adverse outcome from vaccine ... SELECT DISTINCT ?givenName ?familyName WHERE { ?patient hoot72:personName [ hoot72:givenName ?givenName ; hoot72:familyName ?familyName ] . [ hoot72:about ?patient ; ?assert [ hoot72:nameOfCodingSystem "LN" ; hoot72:simpleIdentifier "30949-2" ] ] }
    6. 6. Move out and up • Question: Patients taking “Weight Loss Drugs” • Patient Web: very particular • Patient drugs as NDC codes: DESOXYN TABLETS (00074337701) ... • Too big a gap?
    7. 7. Ontologies Link! • [Diagram: Stanford Drug Ontology (StanDrug: C0025611, Name Methamphetamine, May Treat Obese) SameAs RxNorm (RxNorm:6816, Name Methamphetamine, Ingredient of Desoxyn 5MG Tablet, NDC: 00074337701) SameAs Hoot72 Patient Medication Graph (Patient Joe, NDC: 00074337701)] • * Dotted: composite of links to save space • ** W3C HCLS Example
    8. 8. The Ontologies? • “an implementable model of the entities that need to be understood in common in order for some group of software systems and their users to function and communicate at the level required for a set of tasks” -- Alan Rector • “Shared Knowledge” for Machines • Links, hierarchies, equivalence ... • The “middleware” of the Semantic Web • OWL (WOL) - Web Ontology Language
    9. 9. More Every Day ... URIs for SDTM “Finding” ...
    10. 10. Not just “Standards” • [Diagram: Local Code (CodingSystem Local, Code 182253, Text MRSA Culture) SameAs LOINC Code (CodingSystem LN, Code 13317-3)] • Enable standard, off-the-shelf queries • Definition is incremental
    11. 11. CDISC: it’s the content • Roadmap: “The separation of content standards from the means of transporting that content” • Terms: to OWL and Endpoints • “BRIDGing” in OWL • Trials as queryable Graphs (vs docs)
    12. 12. Many Users, Contributors • [Diagram: Patient, Researcher, Doctor, Informatics, Insurance and Manager around Linked Health Data] • One Semantic Web for Health-Care
    13. 13. But ... “Patient Gap” • “Trapped”, “Silo’ed” • Ontologies Left Waiting • EMRs Hold Back
    14. 14. Approach: Mine
    15. 15. Enabler: the Silo’s chat • “HL7 version 2 is a major breakthrough and market success. More than 93% hospitals in US are using this standard” - Health Level Horizon (HLH) Project • [Chart of HL7 versions: 2.1, 2.2, 2.3, 2.3.1, 2.4, 2.5, 3.0] Source: Neotool, V3 vs V2
    16. 16. HL7 “tweet” ... MSH|^~&|REGADT|MCM|IFENG||199112311501||ADT^A04^ADT_A01|000001|P|2.4||| EVN|A04|199901101500|199901101400|01||199901101410 PID|||191919^^GENHOS^MR~371-66-9256^^^USSSA^SS|253763|MASSIE^JAMES^A|| 19560129|M|||171 ZOBERLEIN^^ISHPEMING^MI^49849^""^||(900)485-5344| (900)485-5344||S^^HL70002|C^^HL70006|10199925^^^GENHOS^AN|371-66-9256|| NK1|1|MASSIE^ELLEN|SPOUSE^^HL70063|171 ZOBERLEIN^^ISHPEMING^MI^49849^""^ |(900)485-5344|(900)545-1234~(900)545-1200|EC1^FIRST EMERGENCY CONTACT^HL70131 NK1|2|MASSIE^MARYLOU|MOTHER^^HL70063|300 ZOBERLEIN^^ISHPEMING^MI^49849^""^ |(900)485-5344|(900)545-1234~(900)545-1200|EC2^SECOND EMERGENCY CONTACT^HL70131 NK1|3 NK1|4|||123 INDUSTRY WAY^^ISHPEMING^MI^49849^""^||(900)545-1200| EM^EMPLOYER^HL70131|19940605||PROGRAMMER|||ACME SOFTWARE COMPANY PV1||O|O/R||||0148^ADDISON,JAMES|0148^ADDISON,JAMES||AMB||||||| 0148^ADDISON,JAMES|S|1400|A|||||||||||||||||||GENHOS|||||199501101410| PV2||||||||199901101400|||||||||||||||||||||||||199901101400 ROL||AD|CP^^HL70443|0148^ADDISON,JAMES OBX||NM|3141-9^BODY WEIGHT^LN||62|kg|||||F James was admitted ... his wife is his emergency contact ... hereʼs his weight ...
    17. 17. What if? OR|20010331605||ORU^R01|20010422GA03|T|2.3.1|||AL| 725^^^^MR||Doe^John^Fitzgerald^JR^^^L||20001007| M||2106-3^White^HL70005|123 Peachtree St^APT 3B^Atlanta^GA^30210^^M^^GA067||(678) 555-1212^^PRN| |||||||||Peachtree Clinic|101 Main Street^^Atlanta^GA^38765^^O^^GA121|(404) 554-9097^^WPN|101 Main Street^^Atlanta^GA^38765^^O^^GA121| Unload to a query-happy Graph
    18. 18. Observation • PID|||1234^^^^SR~1234-12^^^^LR~00725^^^^MR||Doe^John^Fitzgerald^JR^^^L| ... • OBX|4|CE|30949-2^Vaccination adverse event outcome^LN|1|005^required hospitalization^NIP| • [Graph diagram: a node of type Patient with a personName (familyName Doe, givenName John, middleName Fitzgerald); an observation about the patient with Code 30949-2 in CodingSystem LN and an observationValue with Code 005 in CodingSystem NIP] • Identifiers and Time not shown
    19. 19. Hoot72 HL7 Mapper • [Diagram labels: HL7 Message Definitions, HL7 Messages, Hoot72 Mapper, Clinical Data Repository, Hoot72 Ontology]
    20. 20. Which Represents ... CDR/S EMR Research HL7... URL SPARQL Personal Ontology Report Represent Produce
    21. 21. Reality: from Vets • Concrete EMR - VistA • VA: Largest U.S. Care Provider • 128 VistAs, federated, 14+ Million in MPI - ICNs • Available under FOIA • The Proof • Mapper Subscribes for HL7 (30/120 packs) • Maintains a CDR/S for VistA (1 or more)
    22. 22. Austin already calls DOD/CDC HL7... MPI/HDR VistA HealthEVet Report Represent Produce
    23. 23. Our Austin • Interfaces under/not doc’ed • first document (a graph!), example code • open = clear (vs standard) documents • Demographics, Patient Identity, >1 EMR • ADT/VQQ messages (1->40) • Now Vitals, Adverse Reaction ...
    24. 24. Approach: Wrap
    25. 25. Every EMR, an EndPoint • EMR links to the cloud, natively • Mini-Austin: MPI only (old VA Approach) • Lucky: MUMPS repositories • Network-Format ala Semantic Web • VistA’s FileMan (no scale to test) • If only you could SPARQL them ...
    26. 26. FMQL: SPARQL-like SELECT ?name ?diagnosis ?age ?history FILE "PATIENT" WHERE {?r "NAME" ?name ; "DIAGNOSIS" ?d . ?d "DIAGNOSIS" ?diagnosis ; "AGE AT ONSET" ?age ; "HISTORY" ?history } • Specification in progress • Initial goal: limited Patient, meta data dumps
    27. 27. Summary • Semantic Web growing in Health-Care • But a “Patient Gap” • Different ways to bridge • CDISC can drive it forward • More:
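
The idea on slides 17 to 19 (unload HL7 v2 messages into a graph, then query the graph) can be illustrated with a minimal Python sketch. It is not the Hoot72 mapper: the function names, the `patient/...` node identifiers, the `"\r"` segment separator and the `observation` predicate are assumptions made for illustration. The other predicate names echo the SPARQL query on slide 5, and the message is the fragment from slide 18 with its elided part left out.

```python
# Sketch only: turn PID and OBX segments into (subject, predicate, object) triples.
# Node identifiers and the "observation" predicate are hypothetical.

MESSAGE = "\r".join([
    "PID|||1234^^^^SR~1234-12^^^^LR~00725^^^^MR||Doe^John^Fitzgerald^JR^^^L|",
    "OBX|4|CE|30949-2^Vaccination adverse event outcome^LN|1|005^required hospitalization^NIP|",
])


def first_three(field):
    """Split an HL7 field on '^' and return its first three components, padded."""
    return (field.split("^") + ["", "", ""])[:3]


def hl7_to_triples(message):
    """Map the PID and OBX segments of one message to a list of triples."""
    triples = []
    patient = None
    for segment in filter(None, message.split("\r")):
        fields = segment.split("|")
        if fields[0] == "PID":
            # PID-3 repeats ('~'); pick the medical record number (type MR).
            ids = [rep.split("^") for rep in fields[3].split("~")]
            mrn = next(c[0] for c in ids if len(c) > 4 and c[4] == "MR")
            patient = "patient/" + mrn
            name = patient + "/name"
            family, given, middle = first_three(fields[5])  # PID-5
            triples += [
                (patient, "type", "Patient"),
                (patient, "personName", name),
                (name, "familyName", family),
                (name, "givenName", given),
                (name, "middleName", middle),
            ]
        elif fields[0] == "OBX" and patient is not None:
            obs = f"{patient}/obs/{fields[1]}"  # OBX-1 is the set ID
            code, _text, system = first_three(fields[3])      # OBX-3
            v_code, _v_text, v_system = first_three(fields[5])  # OBX-5
            triples += [
                (obs, "about", patient),
                (obs, "observation", obs + "/code"),
                (obs + "/code", "simpleIdentifier", code),
                (obs + "/code", "nameOfCodingSystem", system),
                (obs, "observationValue", obs + "/value"),
                (obs + "/value", "simpleIdentifier", v_code),
                (obs + "/value", "nameOfCodingSystem", v_system),
            ]
    return triples


def patients_with_code(triples, system, code):
    """Names of patients who have an assertion carrying the given code."""
    facts = set(triples)
    coded = {s for (s, p, o) in facts
             if p == "simpleIdentifier" and o == code
             and (s, "nameOfCodingSystem", system) in facts}
    about = {s: o for (s, p, o) in facts if p == "about"}
    names = {s: o for (s, p, o) in facts if p == "personName"}
    given = {s: o for (s, p, o) in facts if p == "givenName"}
    family = {s: o for (s, p, o) in facts if p == "familyName"}
    found = set()
    for (obs, _pred, node) in facts:  # any predicate, like ?assert on slide 5
        name = names.get(about.get(obs))
        if node in coded and name is not None:
            found.add((given[name], family[name]))
    return sorted(found)


print(patients_with_code(hl7_to_triples(MESSAGE), "LN", "30949-2"))  # [('John', 'Doe')]
```

A real deployment would presumably emit RDF with URIs and answer the question through a SPARQL endpoint; plain tuples are used here only to keep the sketch self-contained.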