Towards semantic interoperability of electronic health records

Towards semantic interoperability of electronic health records Document Transcript

IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, VOL. 16, NO. 3, MAY 2012

Toward Semantic Interoperability of Electronic Health Records

Idoia Berges, Jesús Bermúdez, and Arantza Illarramendi

Abstract: Although the goal of achieving semantic interoperability of electronic health records (EHRs) is pursued by many researchers, it has not been accomplished yet. In this paper, we present a proposal that smooths the way toward the achievement of that goal. In particular, our study focuses on medical diagnoses statements. In summary, the main contributions of our ontology-based proposal are the following: first, it includes a canonical ontology whose EHR-related terms focus on semantic aspects. As a result, their descriptions are independent of languages and technology aspects used in different organizations to represent EHRs. Moreover, those terms are related to their corresponding codes in well-known medical terminologies. Second, it deals with modules that allow obtaining rich ontological representations of EHR information managed by proprietary models of health information systems. The features of one specific module are shown as reference. Third, it considers the necessary mapping axioms between ontological terms enhanced with so-called path mappings. This feature smooths out structural differences between heterogeneous EHR representations, allowing proper alignment of information.

Index Terms: Electronic health record (EHR), ontology, semantic interoperability.

I. INTRODUCTION

In 2009, the European Community presented a long-term research and deployment roadmap that provides the key steps for achieving semantic interoperability in the area of healthcare [1].
The incorporation some years ago of electronic health records (EHRs) to the healthcare institutions may be seen as the first step toward the achievement of the goal, since, apart from local advantages over manual records such as avoiding legibility problems, they favor a fast exchange of clinical data between different organizations. However, the fact that most healthcare institutions have developed their health information systems in an autonomous way has resulted in a proliferation of heterogeneous health information systems, each one with its own proprietary model for representing and storing EHR information, which hinders the task of interoperating with each other.

Manuscript received June 6, 2011; revised November 9, 2011; accepted December 13, 2011. Date of publication December 30, 2011; date of current version May 4, 2012. This work was supported by the Spanish Ministry of Education and Science under Project TIN2010-21387-C02-01. The work of I. Berges was supported by a grant of the Basque Government (Programa de Formación de Investigadores del Departamento de Educación, Universidades e Investigación).

The authors are with the Department of Languages and Information Systems, University of the Basque Country, Donostia-San Sebastián, 20018 Spain (e-mail: idoia.berges@ehu.es; jesus.bermudez@ehu.es; a.illarramendi@ehu.es).

Digital Object Identifier 10.1109/TITB.2011.2180917

In many areas, the adoption of knowledge representation standards stands out as the most usual approach to solve interoperability problems. This happens also in the healthcare area, where some standards such as openEHR [2], ISO 13606 [3], and HL7-CDA [4] are under development for this purpose. All three follow a dual model-based methodology for representing EHR information: the reference model defines basic structures such as list, table, etc., while the archetype model defines knowledge elements (such as respiration rate) by using and constraining the elements of the reference model.
Although the idea of using a standard may seem suitable for the considered goal, we think that interoperability does not mean having a unique representation but a semantically acknowledgeable equivalent one. This would relieve healthcare institutions from being forced to use one standard in the representation of their knowledge. Moreover, since several standards are being developed for the same purpose, the interoperability problem will remain unsolved unless these standards merge into a single one. Currently, much research is being done on the latter issue [5].

In this paper, we present a proposal to move toward the notion of full semantic interoperability of heterogeneous EHRs, which states that when one particular system receives some EHR information from another healthcare institution, the received information can be seamlessly integrated into its underlying repository because the differences in the language, in the representation of the information, and in the storing systems do not cause any misunderstanding [1]. Two general approaches for interoperability among systems are described in [6]: using a canonical model to which the particular models are linked, or aligning the particular models two by two. The proposal presented in this paper follows the former approach. More precisely, it is an ontology-based approach where OWL2 [7] ontologies are used as representation models. In general, ontologies have been considered relevant for several purposes such as: enabling reuse of domain knowledge, allowing the analysis of domain knowledge, and sharing common understanding of the meaning of information [8].
Our approach benefits from the latter advantage and additionally it provides the following ones.

1) It favors the notion of semantic interoperability: The use of a formal ontology as canonical conceptual model allows us to focus on aspects that are independent of the languages or technologies used to describe EHRs.
2) It favors the notion of extensibility to different models: The framework comprises two kinds of ontologies that represent the definitions of clinical terms that appear in EHRs at different levels of abstraction. The canonical ontology contains ontological definitions of EHR statements, and the application ontologies contain specializations of the definitions of the canonical ontology according to the standards mentioned previously or according to proprietary models of healthcare institutions.
3) It decreases the need for human intervention: The framework relies on a reasoning mechanism that, using axioms stated in the ontology, infers knowledge that allows the discovery of more relationships among the heterogeneous models used by the different health information systems.

1089-7771/$31.00 © 2012 IEEE

When dealing with ontologies, one relevant aspect is the features of the terms that are part of them. In our scenario those terms are related to EHRs. Different kinds of information can be found in an EHR. OpenEHR divides this information into five subtypes [9], and we have also adopted that division in the definition of our canonical ontology: Observations comprise the data that can be measured in an objective way, such as the age of a patient, his respiration rate, etc. Evaluations represent the evidence obtained from observations, for example, the diagnosis of an illness. Instructions represent actions to be performed in the future, such as the prescription of a medicine or the request of a laboratory test. Actions are used to model the information recorded due to the execution of an instruction. Finally, there is one last type to record administrative events such as admission or discharge information. In this paper, we focus on just one type of evaluation, namely the diagnoses, but ideas similar to those explained here for diagnoses could also be applied to the other types of information. Moreover, the terms of the application ontologies are obtained from the particular health information systems and then linked to the terms in the canonical ontology by using ontology mappings.

A certain number of works related to ours can be found at present.
With regard to the benefits of taking semantics into account, some works discuss the convenience of using semantic technologies in several healthcare-related issues. In [10], the handicaps for widespread adoption of semantic technologies within a care records system are pinpointed. In [11], the challenges to be addressed in order to be able to use the so-called Smart Internet to enable reforms on healthcare information systems are discussed. Finally, in [12], the triplespace paradigm is suggested as semantic middleware to support pervasive access to electronic patient summaries. The studies mentioned next also rely on semantic technologies for interchanging data, as opposed to other formats such as XML, which are structure based. More specifically, related to the topic of facilitating semantic interoperability between heterogeneous health information systems, the following works deal only with the interoperability between standard-based health information systems: [13] provides a solution to achieve semantic interoperability between systems that have been developed under the HL7 reference model, which requires that the source system has some prior knowledge about the target system. In [14], ontology mappings are proposed between pairs of archetype-based models. In [15], a model-driven engineering approach that transforms archetypes of the ISO 13606 standard into OWL models is presented. Finally, the authors in [16] describe an approach to translate from the Archetype Definition Language (ADL [17]) to OWL; they also present some techniques to map archetypes to formal ontologies and show the convenience of using semantic rules on the resulting representation in order to guide the execution of primary care guidelines.

Fig. 1. Global architecture of the solution.
In this paper, we present a wider approach since, apart from the interoperability of standard-based systems, we deal also with interoperability considering proprietary models.

Some other works that tackle the problem of semantic interoperability of EHRs from a different perspective are the following. In [18], a semantic conceptualization model for an EHR system is presented. This work, still at an early stage, is more oriented toward the accessibility, use, and management of the EHR at a local level, but it also aims at providing a base in order to solve the interoperability problem from a semantic point of view. In [19], the hypothesis is raised that semantic technologies are potential bridging technologies between the EHRs and medical terminologies (as well as a possible representation of the combined semantics of systems to be integrated), and some experimental study is made on this issue. We also promote the connection between the semantic representation of EHR statements and their codes in well-known terminologies. Finally, in [20], the authors discuss how advanced middleware, such as enterprise middleware bus, and semantic web services can assist in solving interoperability issues between eHealth systems.

The rest of the paper is organized as follows. In Section II, the global architecture of the framework is presented, and extensive details about the canonical ontology and the auxiliary modules DB2OntoModule and MappingModule are given. The feasibility of the solution is shown in Section III, and finally, conclusions are discussed in Section IV.

II. GLOBAL ARCHITECTURE

Fig. 1 shows the three-layered architecture of the solution. The lower layer contains the particular underlying repository of each healthcare institution, where the information of the EHRs is stored. Associated with each kind of underlying repository, there is some kind of file (e.g., database schema, set of ADL files) where information about the structures in the repository can be found.
Then, the middle layer contains one application ontology for each information system, built on top of the underlying repository. These application ontologies are created semiautomatically from the underlying repositories by some auxiliary modules (e.g., DB2OntoModule and ADL2OntoModule), or imported from an ontology repository, and describe semantically each underlying repository. Moreover, they are linked to their corresponding repositories by some Σ links that specify how to transfer information from each of the representations to the other. Finally, the upper layer contains one canonical ontology. This ontology will contain the necessary classes and properties to represent the different types of information that can be found in an EHR and is linked to each of the application ontologies by some integration mappings defined by a MappingModule. Each particular healthcare institution will have only a partial view of the global framework, since with our proposal there is no need for that institution to know anything but its underlying repository, its application ontology, and the canonical ontology.

The proposed framework allows one healthcare institution to interpret on the fly clinical statements sent by another one, even when they use proprietary formats. We support our claim on the following techniques.

1) Logic-based descriptions: Representations of diagnoses considered by particular health information systems, described using standards as well as proprietary models, are expressed in our approach by using OWL2 ontology axioms. Moreover, terms in those axioms are related to canonical ontology terms that focus their descriptions on language- and technology-independent aspects. This approach increases the opportunities of solving the interoperability issue since it relies mainly on semantic aspects.
2) Automated reasoning: All ontology descriptions, as well as the mappings among elements of the ontologies, are expressed in the same formalism, OWL2. This uniform representation allows the use of well-known reasoners in order to derive new statements from the existing ones.
Furthermore, the mismatch problem is avoided and automatic integration is facilitated.
3) Transfer mechanism: A process, guided by the previous two items, is implemented to transform a particular clinical statement from a healthcare institution into a corresponding clinical statement for another healthcare institution. So-called path mappings play a crucial role during the transfer process, smoothing out the structural differences between EHR representations.

Finally, we are aware that the messiness of real-world EHRs may sometimes hinder the task of fitting them into the presented proposal, but in our opinion this does not invalidate the advantages it can provide in many situations.

In the following sections, the canonical ontology, the DB2OntoModule, and the MappingModule are described thoroughly.

A. Canonical Ontology: Representing Diagnoses in OWL

The canonical ontology contains the necessary classes and properties to represent the different types of information that can be found in an EHR. Following openEHR's classification of EHR entries, we have defined in the ontology five classes to represent the general categories: Observation, Evaluation, Instruction, Action, and Admin. Moreover, these five classes have been specialized to represent more specific types of entries. As we pointed out in Section I, in this paper, we deal with diagnoses, which are a special case of evaluations.

A diagnosis is defined as the act of identifying a disease from its signs and symptoms, as well as the decision reached by that act.¹ For this reason, in addition to representing a diagnosis as a subclass of Evaluation, its definition is enhanced with two properties: hasFinding, to indicate the conclusion reached by the physician about what is happening to the patient, and hasObs, to indicate the information about the observation(s) which lead to that assessment.²

Specific diagnoses are defined as subclasses of the class Diagnosis.
For example, the evidence obtained as a result of an ECG can be described by specializing the range restrictions of the properties hasFinding and hasObs. For instance, the observation that leads to an ECG diagnosis is an ECG recording, which is made up of several components:³ some of the components refer to information about the heart's electrical axis (i.e., the general direction of the heart's depolarization wavefront), while the others refer to information about the entire ECG.

One advantage of working in the medical area is the existence of medical terminologies, such as SNOMED [21] and LOINC [22]. These terminologies cover most areas of clinical information and provide a consistent way to identify medical terms univocally, which can be very helpful at the time of gathering and exchanging clinical results. Our system takes advantage of these terminologies to enhance the definition of the classes in the canonical ontology. Thus, whenever possible, each term in the ontology is related to its corresponding code in those terminologies.

The use of terminological codes in the definitions of the classes in the ontology increases the chances of achieving a successful communication.

Finally, since building a canonical ontology is not an easy task, we think that the efforts that are being made to define archetypes in openEHR could be reused to achieve that task.

¹ http://www.merriam-webster.com/dictionary/diagnosis
² For the presentation we prefer a logic notation instead of the more verbose RDF/XML syntax.
³ For the sake of brevity, in this example only some components of an ECG are taken into account. Please refer to [2] for the whole set of components.
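The class hierarchy described in this subsection can be pictured with a small sketch. The following Python fragment is only an illustration under stated assumptions: it is not the authors' OWL2 ontology, the `OntClass` structure and the helper functions are hypothetical, and the range class `Finding` is an assumed placeholder. The class names, the properties hasFinding, hasObs, and comp, and the two LOINC codes are the ones mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OntClass:
    """Hypothetical, simplified stand-in for an OWL2 class description."""
    name: str
    parents: List[str] = field(default_factory=list)
    restrictions: Dict[str, str] = field(default_factory=dict)  # property -> range class
    loinc: Optional[str] = None


def build_fragment() -> Dict[str, OntClass]:
    classes: Dict[str, OntClass] = {}

    def add(name, parents=(), restrictions=None, loinc=None):
        classes[name] = OntClass(name, list(parents), dict(restrictions or {}), loinc)

    # The five general categories adopted from openEHR.
    for category in ["Observation", "Evaluation", "Instruction", "Action", "Admin"]:
        add(category)
    # "Finding" is an assumed placeholder for the range of hasFinding.
    add("Finding")
    add("Diagnosis", ["Evaluation"], {"hasFinding": "Finding", "hasObs": "Observation"})
    add("P-Axis", loinc="8626-4")
    add("ECGRecording", ["Observation"], {"comp": "P-Axis"}, loinc="34534-8")
    # A specific diagnosis narrows the range of hasObs.
    add("ECGDiagnosis", ["Diagnosis"], {"hasObs": "ECGRecording"})
    return classes


def ancestors(classes: Dict[str, OntClass], name: str) -> List[str]:
    """All superclasses of `name`, nearest first."""
    seen: List[str] = []
    stack = list(classes[name].parents)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(classes[parent].parents)
    return seen


def effective_range(classes: Dict[str, OntClass], name: str, prop: str) -> Optional[str]:
    """Range restriction for `prop` found closest to `name` in the hierarchy."""
    queue = [name]
    while queue:
        current = queue.pop(0)
        if prop in classes[current].restrictions:
            return classes[current].restrictions[prop]
        queue.extend(classes[current].parents)
    return None


fragment = build_fragment()
```

In this sketch, ECGDiagnosis inherits hasFinding from Diagnosis while overriding the range of hasObs, which mirrors the idea of specializing range restrictions for specific diagnoses.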
B. DB2OntoModule

Taking into account the widespread use of relational databases to store EHR records, we show in this section the main features of the module DB2OntoModule.⁴ This module takes as input a database schema and, after applying a set of rules based on schema features, it obtains the ontological representation of that relational database (i.e., the application ontology of that system). In the specialized literature, many approaches for translating relational structures into more expressive formalisms can be found: object models, description logics, and Semantic Web technologies. Some of them follow the so-called declarative approach, which first converts the relational structure into a declarative language; the result is then modified by the user to declare additional knowledge about the database (e.g., [23]). Our proposal also uses the declarative approach, but its novel contribution lies in the large number of schema properties that it considers, allowing us to make more knowledge explicit, and in the fact that it associates with the obtained classes their corresponding codes that appear in well-known medical terminologies.

In order for the DB2OntoModule to accomplish the last feature, it deals with an element called "Terminology Manager" or, in short, "TM", which has an associated function of the form getX(conceptName), where X is the name of a terminology (LOINC, SNOMED, or any other) and conceptName is the name of the relation or attribute whose terminological code is to be retrieved. For example, in the case of a relation BloodPressure(id, systolic, diastolic) the TM would contain entries that map the relation and its coded attributes to their terminological codes.

Concerning schema features, the DB2OntoModule works as follows:

Relations: Relations of the relational schema are translated into OWL2 classes.
Moreover, if for a given relation R, TM.getLOINC(R)=LC (LC being a particular LOINC code), a new axiom R ≡ ∃loinc.{'LC'} is added to the ontology (analogously for other terminological codes).

Attributes: Two options arise: 1) If for a given attribute a in R (R.a) TM.getLOINC(R.a) returns some code LC, then a new class A is created (if there is no other class which already has that code). Moreover, the axioms A ≡ ∃loinc.{'LC'} and A ⊑ ∃value.getType(a) are added. Finally, if attribute a is compulsory in R, the axiom R ⊑ ∃hasA.A is added. 2) If there is no code for R.a in TM, a property a is created in the ontology, where Domain(a)=R and Range(a)=getType(a). Moreover, if attribute a is compulsory in R, the axiom R ⊑ ∃a is added.

Integrity constraints: An integrity constraint such as R.a>30 adds a new axiom of type R ⊑ ∃hasA.(A and ∃value[>30]) if R.a is in TM, and a new axiom of type R ⊑ ∃a[>30] otherwise.

⁴ Other modules, such as the ADL2Onto module, would be used to perform the translations between other sources and the ontology.

Once the previous steps are accomplished, the next one involves enriching the obtained descriptions by using several types of information, such as inclusion, exclusion, and functional dependences:

Inclusion dependences: Three different situations are considered (see a previous work [24] from our group for more details): 1) dependences between key (R.K) and nonkey (S.x) attributes, which indicate the existence of a foreign key of type S.x ⊆ R.K. These dependences are reflected by defining an association between the ontology classes obtained from those relations (S ⊑ ∃x.R); 2) dependences between the keys of two relations (R.K ⊆ R'.K'); and 3) dependences between a subset of a key and a key (R.subK ⊆ R'.K'), which also have the corresponding reflection.

Exclusion dependences: An exclusion dependence between the keys of two relations (R.K ∩ R'.K' = ∅) creates a new axiom of the form R ⊑ ¬R' in the ontology.
In addition, if there is no class in the ontology that subsumes both R and R', such a new class S is created and the axioms R ⊑ S and R' ⊑ S are added.

Functional dependences: If a functional dependence of the form R.X → R.y is detected, with X and y being a nonkey attribute set and a nonkey attribute, respectively, a new class X is created. Moreover, two new properties hasX and hasY are defined and the axioms R ⊑ ∃hasX.X and X ⊑ ∃hasY.getType(y) are added to the ontology.

Furthermore, the ontology can be enriched by using domain information for attribute values, for example, in the case of properties expressed by enumerating attribute values. For an attribute R.a whose possible values are either A1 or A2, if both have a corresponding code in the TM, classes A1 and A2 are created in the ontology. Moreover, one general class to group those two classes is created (e.g., A0) and axioms A1 ⊑ A0, A2 ⊑ A0, A1 ⊑ ¬A2, and R ⊑ ∀a.A0 are added. However, in the case where A1 and A2 have no terminological code in the TM, class A0 is created as an enumeration of two individuals a1 and a2, and axiom R ⊑ ∀a.A0 is added too.

All the previous types of considerations are applied in the following sequence: first, inclusion dependences; then, when the input relational schema is not in second or third normal form, functional dependences are used to create new classes; next, exclusion dependences are exploited; and last, integrity constraints and domain information for attribute values are considered. Finally, once the DB2OntoModule has performed the aforementioned steps, a candidate ontology has been created. However, we feel that it is advisable to allow the health system administrator to modify the ontology in a flexible way. For example, some common changes could be substituting ⊑ relationships with ≡ relationships, modifying the names of the terms that have been created, or adding some missing terminological code.
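The rules for relations and attributes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DB2OntoModule implementation: the `TerminologyManager` class, the `translate_relation` function, the "Relation.attribute" key convention, and the uncoded attribute `note` are hypothetical, axioms are emitted as plain strings rather than OWL2 objects, and the check for an already existing class with the same code is omitted. The LOINC code 8626-4 for P-Axis is the one quoted in the text.

```python
class TerminologyManager:
    """Hypothetical stand-in for the TM: maps concept names to LOINC codes."""

    def __init__(self, loinc_codes):
        self._loinc = dict(loinc_codes)

    def get_loinc(self, concept_name):
        # Returns None when the concept has no registered code.
        return self._loinc.get(concept_name)


def translate_relation(relation, attributes, tm):
    """Apply the 'Relations' and 'Attributes' rules to one relation.

    attributes: list of (attribute_name, xsd_type, compulsory) tuples.
    Returns the generated axioms as strings in the paper's logic notation.
    """
    axioms = []
    code = tm.get_loinc(relation)
    if code is not None:
        axioms.append(f"{relation} ≡ ∃loinc.{{'{code}'}}")
    for name, xsd_type, compulsory in attributes:
        attr_code = tm.get_loinc(f"{relation}.{name}")
        if attr_code is not None:
            # Option 1: the attribute has a code, so it becomes a class.
            axioms.append(f"{name} ≡ ∃loinc.{{'{attr_code}'}}")
            axioms.append(f"{name} ⊑ ∃value.{xsd_type}")
            if compulsory:
                axioms.append(f"{relation} ⊑ ∃has{name}.{name}")
        else:
            # Option 2: no code, so the attribute becomes a property.
            axioms.append(f"Domain({name}) = {relation}")
            axioms.append(f"Range({name}) = {xsd_type}")
            if compulsory:
                axioms.append(f"{relation} ⊑ ∃{name}")
    return axioms


tm = TerminologyManager({"ECGAxis.P-Axis": "8626-4"})
axioms = translate_relation(
    "ECGAxis",
    [("note", "xsd:string", True), ("P-Axis", "xsd:int", True)],
    tm,
)
```

The dependence-based enrichment steps (inclusion, exclusion, and functional dependences) would add further axioms on top of this output in the sequence given above.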
These changes can be done manually using any well-known ontology editor.

DB2OntoModule at Work: For example, a particular registration for an ECG diagnosis may consist of four relational tables according to the following schema (all attributes are considered compulsory) and the following inclusion dependences between nonkey and key attributes:

Moreover, let us consider the bogus case where the attribute finding of the ECGDiagnosis table must be either "normal ECG" or "abnormal ECG." As a result of applying the initial steps for transforming the schema to ontology elements, four new classes are created in the ontology: ECGDiagnosis, ECGObservation, ECGAxis, and ECGGlobal, each with its respective LOINC code. Moreover, since the compulsory attribute P-Axis, whose type is "int," also has a LOINC code at the TM, axioms P-Axis ≡ ∃loinc.'8626-4', ECGAxis ⊑ ∃hasP-Axis.P-Axis and P-Axis ⊑ ∃value.xsd:int are created (same process for the other attributes). Then, the rules for inclusion dependences are applied and, for example, from the inclusion dependence ECGObservation.axis ⊆ ECGAxis.code, axiom ECGObservation ⊑ ∃axis.ECGAxis is created. Moreover, information about the allowed values for the finding attribute is considered and a new class ECGFinding is created as superclass of two other classes NormalECG and AbnormalECG. Finally, manual changes are applied. For example, we have chosen to substitute some of the subclass relationships with equivalence relationships, so the created ontology has, among others, the following axioms:⁵

The second task of the DB2OntoModule is to create the Σ links that indicate how to transfer the information from the database to the ontology that has been created from it (and vice versa). This task was previously tackled by our research group, so we refer the readers to [25] for further technical details.

⁵ Throughout the paper, namespaces a: and b: will be used to refer to terms in the application ontologies of two particular systems A and B. Moreover, namespace c: or no namespace is used to indicate the terms in the canonical ontology.

C. MappingModule

Once an application ontology of one particular system has been generated by the corresponding translator module, it must be integrated with the canonical ontology, and the mappings between the terms of that application ontology and the canonical ontology must be created. A MappingModule has been implemented for this purpose. Wide research has been done in the specialized literature about ontology mapping (e.g., [26]), so working on new techniques for that same issue is out of the scope of our study. Thus, our MappingModule takes a pragmatic approach and receives as input a set of basic mapping axioms specifically defined by the system administrator (for example, to state that the property a:loinc is equivalent to the property c:loinc). Then, it incorporates these basic mappings into the ontologies and, with the help of a reasoner, it creates an integration mapping that relates the terms of the application ontology with those of the canonical ontology.

However, our module presents a distinguishing feature, since it considers mappings between ontology paths, which are rarely considered in other works. In order to be aware of the importance of discovering path mappings, let us compare the definitions of classes c:ECGRecording and a:ECGObservation in Sections II-A and II-B, respectively. Both share the same LOINC code (34534-8), so their semantics are the same. Looking at the description of c:ECGRecording, it can be seen that any individual belonging to that class will be directly related to an individual of the class c:P-Axis via the property c:comp (assume the same intuition for the other components).
However, in the case of the descriptions in the application ontology of system A, it turns out that classes a:ECGObservation and a:P-Axis are not directly related, but indirectly: first a:ECGObservation is related to the class a:ECGAxis via the property a:hasAxis and then the class a:ECGAxis is related to the class a:P-Axis via the property a:hasP-Axis. Then, it could be stated that there is a simple path between classes c:ECGRecording and c:P-Axis, while there is a composite path between classes a:ECGObservation and a:P-Axis.

Intuitively, those two paths could be regarded as equivalent, since their only difference is from the structural point of view, caused by the heterogeneous origin of the ontologies, not from a semantic point of view. Let us show how our module deals with that aspect.

Definition 1: An ontology path is a regular expression of the form A.(p.[B])+ where A, B represent class names and p represents property names, all from the same ontology.

Let us denote equivalences between paths with the symbol ≡p. For instance, the aforementioned example is represented as

a:ECGObservation.a:hasAxis[a:ECGAxis].a:hasP-Axis[a:P-Axis] ≡p c:ECGRecording.c:comp[c:P-Axis]

Although in this example an equivalence path mapping has been presented, a corresponding idea is valid for subclass path mappings ⊑p and superclass path mappings ⊒p. In order to determine path mappings, first path mapping candidates are searched.

Definition 2: Let PathC = C0.p1[C1]. ... .pn[Cn] and PathD = D0.q1[D1]. ... .qm[Dm] be two ontology paths. A path mapping candidate exists between PathC and PathD if any of the following statements holds:
1) C0 ⊑ D0 and Cn ⊑ Dm (represented as PathC ⊑∼ PathD)
2) C0 ⊒ D0 and Cn ⊒ Dm (represented as PathC ⊒∼ PathD)
Moreover, if PathC ⊑∼ PathD and PathC ⊒∼ PathD then PathC ≡∼ PathD.

A path mapping candidate becomes a proper path mapping when the semantics of both paths is found to be the same. Path mappings are useful at the time of transforming individuals from one ontology so that they meet the requirements of the target ontology. The implementation of path mappings is done by using SWRL [27] rules. SWRL increases the expressivity of OWL and, thus, allows us to model more domain knowledge than can be achieved by using OWL on its own. Moreover, since SWRL can be tightly integrated with OWL, there is no impedance mismatch between the modeling language and the rules language: SWRL rules can use directly the classes, properties, and individuals defined in the OWL model. For example, the path mappings shown before would be implemented using the following rules (one in each way)

As looking for all the candidate path mappings between two large ontologies might be a hard task considering both time and resources, a threshold can be established to indicate the maximum length of the paths to be searched.
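The bounded search for path mapping candidates can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the MappingModule's algorithm: ontologies are reduced to hypothetical (class, property, class) edges, only equivalence candidates are considered, and class equivalences are assumed to be already known (for instance, from shared LOINC codes) and passed in as a dictionary. The class and property names come from the ECG example in the text.

```python
def enumerate_paths(edges, max_len):
    """List ontology paths with at most `max_len` properties.

    edges: iterable of (source_class, property, target_class).
    A path is a tuple (C0, p1, C1, ..., pn, Cn).
    """
    edges = list(edges)
    paths = []
    if max_len < 1:
        return paths

    def extend(path, visited):
        last = path[-1]
        for source, prop, target in edges:
            if source == last and target not in visited:
                new_path = path + (prop, target)
                paths.append(new_path)
                # (len - 1) // 2 is the number of properties in the path.
                if (len(new_path) - 1) // 2 < max_len:
                    extend(new_path, visited | {target})

    for start in sorted({source for source, _, _ in edges}):
        extend((start,), {start})
    return paths


def equivalence_candidates(app_paths, canon_paths, equiv):
    """Pairs of paths whose first and last classes are known to be equivalent.

    equiv: dict mapping an application class to its equivalent canonical class.
    """
    candidates = []
    for app_path in app_paths:
        for canon_path in canon_paths:
            if (equiv.get(app_path[0]) == canon_path[0]
                    and equiv.get(app_path[-1]) == canon_path[-1]):
                candidates.append((app_path, canon_path))
    return candidates


app_edges = [
    ("a:ECGObservation", "a:hasAxis", "a:ECGAxis"),
    ("a:ECGAxis", "a:hasP-Axis", "a:P-Axis"),
]
canon_edges = [("c:ECGRecording", "c:comp", "c:P-Axis")]
equiv = {"a:ECGObservation": "c:ECGRecording", "a:P-Axis": "c:P-Axis"}

found = equivalence_candidates(
    enumerate_paths(app_edges, 2), enumerate_paths(canon_edges, 2), equiv
)
missed = equivalence_candidates(
    enumerate_paths(app_edges, 1), enumerate_paths(canon_edges, 1), equiv
)
```

With a threshold of 2 the composite path of system A is paired with the simple canonical path, whereas a threshold of 1 misses it, which illustrates the trade-off between search cost and coverage that the threshold introduces.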
Some other heuristics could be applied too to discover candidate path mappings efficiently.

So, to sum up, the integration mapping that is generated between an application ontology and the canonical ontology can be defined as follows.

Definition 3: An integration mapping is a structure I = ⟨O, G, M⟩ where O is a set of OWL2 axioms that comprises the application ontology corresponding to a healthcare institution, G is the set of OWL2 axioms for the canonical ontology, and M is a set of mapping axioms of any of the following forms:
1) Co ⊑ Expg, Co ⊒ Expg or Co ≡ Expg, where Co is a class name from O, and Expg is an OWL2 class expression that uses only terms from G.
2) po ⊑ pg or po ⊒ pg, where po is a property name from O, and pg is a property name from G.
3) sameAs(io, ig), where io is the name of an individual from O, and ig is the name of an individual from G.
4) Patho ⊑p Pathg, Patho ⊒p Pathg or Patho ≡p Pathg, where Patho is an ontology path in O and Pathg is an ontology path in G.

The result of the engineering process of producing the set M of mapping axioms is the key for the interoperability of two different health information systems.

III. FRAMEWORK AT WORK

The main contribution of our proposal is the capability of one system B of interpreting information sent by another system A on the fly, without prior peer-to-peer agreement on the semantics and syntax of the interchanged data. In this example, let us suppose that the database schema of system A is the one presented in Section II-B. Moreover, in the case of system B, let us consider that it follows the HL7 standard and that different representations are used to represent ECG information depending on the result of the ECG (e.g., ECGNormalDiag for normal ECG results, ECGAbnormalDiag when abnormalities have been detected).
The work of the ADL2OntoModule and MappingModule led to the following axioms with respect to the application ontology of system B:

[Axioms not reproduced in this transcript.]

where p1 = b:ECGNormalDiag.b:component[b:P-Ax] and p2 = c:ECGDiagnosis.c:hasObs[c:ECGRecording].c:comp[c:P-Axis].

Moreover, let us suppose that system A wants to send to system B the following information about the ECGDiagnosis whose code is ecg01:
σcode=ecg01(ECGDiagnosis) = (ecg01, Normal ECG, r01)
σcode=r01(ECGRecording) = (r01, ax01, gl01)
σcode=ax01(ECGAxis) = (ax01, 27, 88, 49)
σcode=gl01(ECGGlobal) = (gl01, 138, 390, 39, 112, 62)

Finally, assume that some of the mapping axioms between the application ontology of system A and the canonical ontology are the following:

[Mapping axioms not reproduced in this transcript.]

The process that needs to be carried out is composed of several steps.

Step 1 (Classification of the information in the application ontology): In this step, the information to be sent is converted into statements about individuals generated for the application ontology of system A. For example, the main individual a:ecg01
  • 7. will be an instance of the class a:ECGDiagnosis. This is a straightforward process thanks to the Σ links created by the DB2OntoModule between the storage system of system A and its application ontology. Among others, the following OWL statements (represented as triples) will be created:

[Triples not reproduced in this transcript.]

Step 2 (Enrichment of the local information at the application ontology): In this step, implicit information about the individuals that can be inferred from the application ontology of system A is made explicit with the help of a reasoner. For example, in this step each individual inherits a terminology code from its corresponding class:

[Triples not reproduced in this transcript.]

Step 3 (Classification of the information in the canonical ontology): At this point, thanks to the equivalence, subsumption, and path mappings that have been defined by the MappingModule, and with the help of a reasoner, the individuals are classified as instances of the concepts of the canonical ontology. For example, given that a:ECGObservation ≡ ∃loinc.34534-8 and c:ECGRecording ≡ ∃loinc.34534-8, it is reasonable to expect that the MappingModule will infer the equivalence mapping a:ECGObservation ≡ c:ECGRecording. Then, as the assertional box of the application ontology of system A contains the triple (a:r01 rdf:type a:ECGObservation), the new triple (a:r01 rdf:type c:ECGRecording) is inferred. Moreover, since the triples (a:r01 rdf:type a:ECGObservation), (a:r01 a:hasAxis a:ax01), and (a:ax01 a:hasP-Axis a:pax01) exist, path rule R1 is fired and the triple (a:r01 c:comp a:pax01) is generated. The remaining new triples, some of which are shown next, can be figured out accordingly:

[Triples not reproduced in this transcript.]

Step 4 (Recognition at the receiver's ontology): The triples generated so far are sent to system B and, thanks to the ontological mappings defined for this ontology by the MappingModule, the individuals will be recognized as instances of the classes of its application ontology.
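The two Step 3 inferences that contribute to the triples sent to system B can be illustrated with a small Python sketch. It is a hand-written stand-in for a reasoner, not the framework's actual machinery: the function names are hypothetical, and the body of path rule R1 is an assumption reconstructed from the premises and the conclusion quoted in Step 3, since the SWRL rule itself is not available here.

```python
RDF_TYPE = "rdf:type"


def apply_equivalence(triples, source_cls, target_cls):
    """For every (x rdf:type source_cls), derive (x rdf:type target_cls).
    Covers only one direction of an equivalence mapping."""
    return {(s, RDF_TYPE, target_cls)
            for (s, p, o) in triples
            if p == RDF_TYPE and o == source_cls}


def apply_path_rule(triples, start_cls, props, target_prop):
    """Follow the property chain `props` from each instance of `start_cls`
    and derive (start, target_prop, end) for every end point reached."""
    frontier = {(s, s) for (s, p, o) in triples
                if p == RDF_TYPE and o == start_cls}
    for prop in props:
        frontier = {(start, o)
                    for (start, node) in frontier
                    for (s, p, o) in triples
                    if s == node and p == prop}
    return {(start, target_prop, end) for (start, end) in frontier}


# Triples quoted in Step 3 for the individual a:r01.
abox = {
    ("a:r01", RDF_TYPE, "a:ECGObservation"),
    ("a:r01", "a:hasAxis", "a:ax01"),
    ("a:ax01", "a:hasP-Axis", "a:pax01"),
}

# Equivalence mapping a:ECGObservation = c:ECGRecording.
typed = apply_equivalence(abox, "a:ECGObservation", "c:ECGRecording")

# Assumed shape of R1: ECGObservation(x), hasAxis(x, y), hasP-Axis(y, z) -> comp(x, z).
linked = apply_path_rule(abox, "a:ECGObservation",
                         ["a:hasAxis", "a:hasP-Axis"], "c:comp")

to_send = abox | typed | linked
```

In this sketch `typed` holds the triple (a:r01 rdf:type c:ECGRecording) and `linked` holds (a:r01 c:comp a:pax01), matching the two inferences described in Step 3.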
For example, because of the triple (a:f01 c:snomed 102593009), the mapping b:snomed ≡ c:snomed, and the definition of class b:ECGNormalFind, f01 is classified as an individual of class b:ECGNormalFind. Then, because of the definition of class b:ECGDiagnosis, the main individual a:ecg01 is classified as an individual of class b:ECGNormalDiag:

[Inferred triples not reproduced in this transcript.]

Fig. 2. Excerpt of the generated HL7 entry.

Step 5 (Storage at the receiver's system): At this point, it is straightforward to store the information in the underlying repository of system B thanks to the Σ links that indicate how to transform the collection of triples into a suitable HL7 document (see Fig. 2). Note that, since the main individual ecg01 has been recognized as of class b:ECGNormalDiag, it is possible to choose from the HL7 entry templates of system B the one that represents only information about normal ECG results, even though the sender's system had only one table for storing all kinds of ECG diagnoses.

IV. CONCLUSION AND DISCUSSION

We have presented a semantic-based framework that allows the interoperability of medical diagnoses between health information systems, including those that were not developed following EHR standards. The feasibility of the idea has been shown through an example. To sum up, the main features of the framework presented in this paper are the following: 1) it is extensible to both standard and proprietary models, since any healthcare institution could create its own application ontology and relate it to the terms of the canonical ontology via an integration mapping.
Two modules are provided to facilitate this technically demanding task: one module that facilitates obtaining the definitions of the application ontology from a particular underlying system, and another module that facilitates linking definitions of the application ontology to definitions of the canonical ontology; 2) it uses a formal ontology as canonical conceptual model, which allows us to focus on semantic aspects that are independent of the languages or technologies used to describe EHRs. As a result, it is not based on peer-to-peer transformations but on the semantic acknowledgment of one instance of a class in the source
  • 8. ontology as an instance of another class in the target ontology; 3) the features of any specific system remain unknown to the other systems in the framework. Acknowledging and using the canonical ontology as a shared model is enough; 4) reasoning plays a major role in several parts of the framework, which decreases the need for human intervention.

However, there are still some challenges, such as those regarding scalability, that need to be addressed for this approach to be widely accepted. In the case of the DB2OntoModule, the existence of the terminology manager TM is assumed. The fact that a particular term of a database has a corresponding terminological code in the TM allows a more precise definition of that term in the application ontology. We are aware that database systems may not provide such a set of correspondences, so syntactic and semantic similarity measures (such as Levenshtein distance⁶ or WordNet⁷-based similarity) between the terms in the database and those in the terminologies would have to be applied in order to obtain a set of candidate codes. Moreover, relational databases whose schema can be consulted have been chosen as underlying repositories. In the real world, data can be far messier and come from unstructured or semistructured sources. In general, the less structured the source is, the more difficult the construction of the ontology will be. In the case of unstructured sources, machine learning and text mining algorithms could be used to create an ontology from input documents. For semistructured data in XML, XQuery⁸ and XPath⁹ could be used for the extraction of relevant information, and fuzzy extensions of those languages could be used to enhance that extraction. Another technique that could be applied to semistructured sources is ILP [28].
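The syntactic similarity step mentioned above for obtaining candidate codes can be sketched with a plain Levenshtein distance. This is a minimal illustration under stated assumptions: the terminology labels, the placeholder codes `X-001` and `X-002`, the function names, and the distance threshold are all hypothetical, and a real terminology manager would combine this with semantic measures.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute),
    computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def candidate_codes(term, terminology, max_distance=3):
    """Return the codes whose label is within `max_distance` edits of
    `term`, closest first. Comparison is case-insensitive."""
    term = term.lower()
    scored = [(levenshtein(term, label.lower()), code)
              for label, code in terminology.items()]
    return [code for dist, code in sorted(scored) if dist <= max_distance]


# Hypothetical terminology excerpt with placeholder codes.
terminology = {"ECG axis": "X-001", "Heart rate": "X-002"}
candidates = candidate_codes("ECGAxis", terminology)
```

Here the database term `ECGAxis` differs from the label `ECG axis` only by a missing space once case is ignored, so only the first placeholder code is proposed as a candidate.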
With respect to the core task of building an agreed canonical ontology, efforts devoted to classifications on standards (e.g., openEHR) or terminology taxonomies (e.g., SNOMED-CT) can be exploited and oriented toward the design of such an ontology. Finally, challenges concerning mappings between the application and canonical ontologies are diverse (e.g., variable granularity of the information, different types of data, etc.). As stated in Section II-C, extensive work has already been done in this area, so the definition of a new approach is out of the scope of this paper. However, we have presented a novel contribution concerning mapping issues: the definition of the notion of path mappings and their implementation using SWRL rules. Additionally, we suggest that specific systems voluntarily publish the integration mappings between their application ontology and the canonical ontology so that other systems can benefit from this knowledge when creating their own integration mapping.

⁶ http://www.levenshtein.net/
⁷ wordnet.princeton.edu/
⁸ http://www.w3.org/TR/xquery/
⁹ http://www.w3.org/TR/xpath20/

REFERENCES

[1] V. N. Stroetman, Ed., D. Kalra, P. Lewalle, A. Rector, J. M. Rodrigues, K. A. Stroetmann, G. Surjan, B. Ustun, M. Virtanen, and P. E. Zanstra, "Semantic interoperability for better health and safer healthcare," Eur. Commiss., Luxembourg, BE, Tech. Rep. KK-80-09-453-EN-C, Jan. 2009. ISBN-978-92-79-11139-6.
[2] openEHR. (2011). [Online]. Available: http://www.openehr.org
[3] Electronic Health Record Communication Part 1: Reference Model, ISO 13606-1, 2008.
[4] HL7-CDA. (2011). [Online]. Available: http://www.hl7.org
[5] P. Schloeffel, T. Beale, G. Hayworth, S. Heard, and H. Leslie, "The relationship between CEN 13606, HL7 and openEHR," presented at the Health Informat. Conf., Sydney, Australia, 2006.
[6] V. Kashyap and A. P. Sheth, "Semantic and schematic similarities between database objects: A context based approach," Very Large Databases J., vol. 5, no. 4, pp. 276–304, 1996.
[7] OWL2 Web Ontology Language. World Wide Web Consortium. (2009). [Online]. Available: http://www.w3.org/TR/owl2-overview/
[8] M. Uschold and M. Gruninger, "Ontologies: Principles, methods and applications," Knowl. Eng. Rev., vol. 11, pp. 93–136, 1996.
[9] T. Beale and S. Heard, "An ontology-based model of clinical information," in Proc. 12th World Congr. Health (Med.) Informat. - Build. Sustainable Health, Brisbane, Australia, 2007, pp. 760–764.
[10] C. Wroe, "Is semantic web technology ready for healthcare?," paper presented at the 3rd Eur. Semant. Web Conf., Budva, Montenegro, Jun. 2006.
[11] J. H. Weber-Jahnke and J. Williams, "The smart internet as a catalyst for health care reform," in Smart Internet - Current Research and Future Applications, 2010, pp. 27–48.
[12] R. Krummenacher, E. P. B. Simperl, D. Cerizza, E. D. Valle, L. J. B. Nixon, and D. Foxvog, "Enabling the European patient summary through triplespaces," Comput. Methods Programs Biomed., vol. 95, no. 2-S1, pp. 33–43, 2009.
[13] O. Kilic and A. Dogac, "Achieving clinical statement interoperability using R-MIM and archetype-based semantic transformations," IEEE Trans. Inf. Technol. Biomed., vol. 13, no. 4, pp. 467–477, 2009.
[14] V. Bicer, O. Kilic, A. Dogac, and G. B. Laleci, "Archetype-based semantic interoperability of web service messages in the health care domain," Int. J. Semant. Web Inf. Syst., vol. 1, no. 4, pp. 1–22, 2005.
[15] C. Martínez-Costa, M. M. Tortosa, and J. T. Fernández-Breis, "An approach for the semantic interoperability of ISO EN 13606 and openEHR archetypes," J. Biomed. Informat., vol. 43, no. 5, pp. 736–746, 2010.
[16] L. Lezcano, M.-Á. Sicilia, and C. Rodríguez-Solano, "Integrating reasoning and clinical archetypes using OWL ontologies and SWRL rules," J. Biomed. Informat., vol. 44, no. 2, pp. 343–353, 2011.
[17] The openEHR Foundation. Archetype Definition Language. (2007). [Online]. Available: http://www.openehr.org/releases/1.0.2/architecture/am/adl.pdf
[18] B. Prados-Suarez, C. Molina, M. Prados, and C. Peña, "On the use of an ontology to improve the interoperability and accesibility of the electronical health records (ehr)," in Proc. Int. Workshop Semant. Interoperabil., Rome, Italy, Jan. 2011, pp. 73–81.
[19] R. Hedayat, "Semantic web technologies in the quest for compatible distributed health records," Department of Information Technology, Uppsala Univ., Uppsala, Sweden, White Paper, Mar. 2010.
[20] L. González, G. Llambías, and P. Pazos, "Towards an e-health integration platform to support social security services," presented at the 6th Int. Policy Res. Conf. Soc. Security, Luxembourg, Sep. 2010.
[21] SNOMED. (2011). [Online]. Available: http://www.ihtsdo.org/snomed-ct/
[22] LOINC. (2011). [Online]. Available: http://loinc.org/
[23] P.-A. Champin, G.-J. Houben, and P. Thiran, "Cross: An OWL wrapper for reasoning on relational databases," in Conceptual Modeling - ER 2007 (Lecture Notes in Computer Science Series), vol. 4801, C. Parent, K.-D. Schewe, V. C. Storey, and B. Thalheim, Eds. New York: Springer-Verlag, 2007, pp. 502–517.
[24] J. M. Blanco, A. Illarramendi, and A. Goñi, "Building a federated relational database system: An approach using a knowledge-based system," Int. J. Cooperat. Inf. Syst., vol. 3, no. 4, pp. 415–456, 1994.
[25] J. M. Blanco, A. Goñi, and A. Illarramendi, "Mapping among knowledge bases and data repositories: Precise definition of its syntax and semantics," Inf. Syst., vol. 24, no. 4, pp. 275–301, 1999.
[26] J. Euzenat and P. Shvaiko, Ontology Matching. New York: Springer-Verlag, 2007.
[27] SWRL. (2011). [Online]. Available: http://www.w3.org/Submission/SWRL/
[28] S. Muggleton, "Inductive logic programming," New Generat. Comput., vol. 8, no. 4, pp. 295–318, 1991.

Authors' photographs and biographies not available at the time of publication.