Clinical Document Data Warehouse (CD2W)
Josep Vilalta Marzo (a), Diego Kaminker (b), Josep M. Picas Vidal (c), M. Lluisa Bernard Antoranz (d), Cristina Siles (c), Rafael Rosa Prat (a)

(a) Vico Open Modeling S.L., (b) Kern Information Technology SRL, (c) Hospital de la Santa Creu i Sant Pau, (d) Institut Català de la Salut




Abstract

This paper shows the development of the Clinical Document Data Warehouse project (CD2W) and its implementation using CDA R2 documents (Clinical Document Architecture Release 2).
The main objective of this project was to prototype a web based portal allowing access to a clinical document repository and a data warehouse with clinical information about patients from several medical organizations in Barcelona, Spain (a major hospital and 4 primary care centers), giving access to clinical patient information for primary care and leveraging the same standardized information to populate the data warehouse (secondary use).
The project was developed during the first half of 2009 under the general direction of the Hospital de la Santa Creu i Sant Pau (HSCSP).
Due to the prototype nature of the project, the scope was limited to patients with Congestive Heart Disease (CHD) who consented to the use of their information for clinical research by the HSCSP, and to the current vocabularies used by the providers (ICD-9 and other local terminologies).
The data warehouse was developed using HL7 RIM basic concepts.
The project mission was to improve patient care through the use of global standards (CDA R2 documents), open technology, low exploitation cost and ease of use.
During this project, we used the SCRUM agile methodology, allowing a scalable, progressive and incremental software development process.

Introduction – Business Case

Solving the questions arising when a patient shows several clinical issues is one of the greatest challenges for healthcare providers.
Fast clinical decision making at the healthcare location generates the need to have all relevant and up-to-date clinical information available to support the process.
Usually, we don't have efficient filtering that flags the critical issues on which to focus our attention. We encounter scenarios with data overload, demanding a big effort to synthesize useful information, or with sparse data, demanding the use of imagination to connect the data and reach any conclusion.
Another problem is that usually our only available clinical information source is the organization supporting the healthcare provider. Information generated by other channels where the patient was attended is usually brought by the patient in several paper formats. Consolidation of relevant and up-to-date information from distinct healthcare organizations under different authorities is a pending issue, waiting for an agreement between the healthcare authorities and harmonization of different database schemes.
Nowadays, there are several ongoing projects with the shared goal of consolidating clinical information, both at the national level in Spain and at the autonomous community level.
These projects also share a great level of complexity and high costs to achieve their goals. This small project, CD2W, aspires to help in achieving this consolidation of clinical information with a focus on easing the task for the healthcare providers: organizations administering huge databases that are difficult to integrate and have no reference information model available.

Materials and Methods

Hospital de la Santa Creu i Sant Pau is a high complexity hospital which dates back six centuries, making it the oldest hospital in Spain. Healthcare is centered on Barcelona but extends to the rest of Catalonia. The center plays a prominent role in Spain and is internationally renowned.

The hospital has distinguished itself in the healthcare provided in many fields, making it a reference centre in several specialties. The center attends over 34,000 admissions each year and more than 150,000 emergencies. Approximately 300,000 people are visited at the ambulatory services annually and the Day Hospital attends over 60,000 users. There are 71 day hospital beds, 634 hospitalization beds and 19 surgical rooms.
Teaching and training programmes at the Hospital de la Santa Creu i Sant Pau cover many levels, comprising the UAB (Universitat Autonoma de Barcelona) Faculty of Medicine Teaching Unit, the University School of Nursing, participation in the State Residency Programmes to train specialists, master's and doctorate courses, continuing education, etc. In the field of research, Hospital de la Santa Creu i Sant Pau is one of the most prominent centers in Spain, as can be appreciated from the volume of papers published and their impact factor, the number and quality of projects which receive funding and the grants awarded. The Hospital de la Santa Creu i Sant Pau is governed by the Patronat de la Fundació de Gestió Sanitària (FGSHSCSP), a board with representatives from the Regional Catalan Government (Generalitat de Catalunya), the City Hall of Barcelona and the Archbishopric of Barcelona.

The goals for this project were:
1. Define a scheme to integrate information from disjoint information platforms.
2. Implement a process for periodical data and document exchange with minimal workload implications for the primary care centers.
3. Generate a clinical data store enabling the coexistence of clinical documents and relational data in a longitudinal patient healthcare record.
4. Create a simple user interface to ease fast queries to the relevant clinical information about a patient.
5. Restrict access to clinical information to professionals authorized by the participating organizations (HSP and ICS).
6. Use open technology for design, development and implementation of the data warehouse, minimizing exploitation costs, and use global interoperability standards to enable universal access.
7. Evaluate the impact of normalizing data from different organizations and code systems.
8. Evaluate the usability and added value of CD2W to the physicians for healthcare decisions.
9. Evaluate the results of this prototype to study the possible extension of access to the patients or to other professionals.
10. Evaluate the results for a new project with a broader scope.

Implementation, Methodology and Tools

The project has five main design aspects:
1. Clinical Datawarehouse Design
2. Standard Document Design
3. Datawarehouse Population Process Design
4. User Interface Design
5. Technological Implementation
Let's review them:

1. Clinical Datawarehouse Design

The model for CD2W was derived from the RIM base classes (role, entity, act, participation) [3] following a three step methodology:
a. Development of a conceptual framework, to be discussed with the CD2W stakeholders, including which were the measures or facts, and how the information should be classified (dimensions).

             Figure 1 – CD2W conceptual model

b. Then, on a more technical level, a domain analysis model was generated.

             Figure 2 – CD2W Domain Analysis

c. And finally, the derivation of a datawarehouse model to store facts, dimensions and supporting standard clinical documents. RIM-wise, facts are acts, and main dimensions are entities and roles and their attributes.

2. Standard Document Design

The documents were the 'lingua franca' between the disparate systems used by the participating organizations (primary care centers, Hospital EHR and discharge system) [1].

We designed two different clinical document templates, one for the evolution note from the primary care centers and one for the discharge note from the Hospital discharge system.

The templates shared the same information at the header level, but differed in their section contents.
Since this data warehouse was intended for secondary use, patient and physician information was de-identified [2] (names, identifications and addresses were removed or replaced).

The information generated by the local provider applications was transformed into clinical documents using an XSL, and these standard documents were processed and stored in the CD2W.

We tested our mapping, process and query interface with 16125 clinical documents from the hospital and primary care centers.
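
As an illustration of the de-identification applied to the standard documents (names, identifications and addresses removed or replaced), the following is a minimal sketch in Python rather than the XSL used by the project. The element names (`name`, `address`, `patient_id`, `physician_id`), the `deidentify` function and the HMAC-based pseudonym are all assumptions introduced here for illustration; the paper does not describe the actual schema or replacement technique.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical element names for a "shallow" encounter XML (not from the paper).
REMOVE = {"name", "address"}                  # dropped entirely
PSEUDONYMIZE = {"patient_id", "physician_id"}  # replaced by a keyed pseudonym


def deidentify(xml_text: str, secret: str) -> str:
    """Return the XML with identifying elements removed or replaced."""
    root = ET.fromstring(xml_text)
    # Materialize the traversal first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in REMOVE:
                parent.remove(child)
            elif child.tag in PSEUDONYMIZE and child.text:
                digest = hmac.new(secret.encode("utf-8"),
                                  child.text.encode("utf-8"),
                                  hashlib.sha256).hexdigest()
                # Same input and secret always give the same pseudonym,
                # so encounters of one patient can still be linked.
                child.text = digest[:16]
    return ET.tostring(root, encoding="unicode")


sample = (
    "<encounter><patient><patient_id>12345</patient_id>"
    "<name>Jane Doe</name><address>Some street</address></patient>"
    "<diagnosis code='428.0'/></encounter>"
)
clean = deidentify(sample, secret="local-secret")
```

A keyed pseudonym is used in this sketch so that documents for the same patient remain linkable in the warehouse without exposing the original identifier; whether CD2W preserved such linkage this way is not stated in the paper.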
In order to guide the development team, a table was built with the main required elements from each primary care center and their location inside the standard clinical document.

               Figure 4 – Transformation Table

3. Data warehouse population process

The process to populate the data warehouse included several steps.
         [At the primary care center]
    1.   Select encounters for patients suffering from CHD with signed consents.
    2.   Create a basic, shallow XML for each primary care center encounter.
         [At the data processing center]
    3.   Create a CDA conformant document using the defined mapping for the center, for each instance.
    4.   Populate the CD2W database with the information from each document header, and the document itself.

This process was triggered periodically by the primary care centers and HSP.

4. User interface design

The user interface design was use-case based.
Identified use cases were:

A. App Administration

Application setup and parametrization
    −    Define healthcare agent
    −    Define healthcare agent type
    −    Define healthcare agent role
    −    Define app parameters
    −    Define service catalog
    −    Define service

B. Data Generation Process

Use cases related to the DW population
    −    Login
    −    User validation
    −    Batch load
    −    Stat Processing
    −    Document Processing
    −    Update audit log

C. Queries

Data access and retrieval from the users
    −    Retrieve app parameters
    −    Query control panel (figure 3)
    −    Query Patient Monitor (figure 4)
    −    Query Encounter Monitor
    −    Query Conditions Monitor
    −    Query Healthcare Centers
    −    Query Healthcare Professionals
    −    Query Healthcare Services
    −    Query Stats (figure 5,6)
    −    Query audit logs
    −    Browse CDA R2 document (figure 7)

The following figures illustrate the main user interface forms for CD2W:


Figure 3 – CD2W Control Panel
Figure 4 – CD2W Patient Query
Figure 5 – Services for one patient
Figure 6 – Temporal series for a patient
Figure 7 – CDA R2 Document for an encounter

5. Technological Implementation

The system was implemented using an open source Apache application server, a MySQL database, a PHP development environment and the amCharts data analysis and graphing component.

Evaluation/Assessment/Lessons Learned

The HL7 Reference Information Model was very useful to aid modeling our specific domain and to generate a scheme to integrate all participating applications into an information model.

The process for exchange could be established, but we needed to educate the participating centers on the use of CDA R2 and the rationale for asking for each piece of information. Nevertheless, we needed to bridge their data model to CDA R2 by providing them with a customized XSL for each center.
Coexistence of documents with the relational information needed to explore the embedded information was possible, although we need to test with larger volumes.
The user interface was sufficient for our pilot users from three centers to meet their data exploration and verification needs. The use of open technologies and standards was a key factor in minimizing development time. Our normalization efforts were not completed; we ended up using ICD-9 and internal ICS vocabularies.

Future Plans

Extension of this tool to make it accessible to patients and other professionals will be studied, but current policies make it very difficult to gain access to patient data, even for this approved project. We also want to explore using native XML open source databases. A project with a greater scope will be studied, but the whole approach and the generated model are suitable for other domains.

Acknowledgements

Spain Ministry of Health for Scholarship FIS Dossier P105/230.

Bibliography

[1] HL7 Clinical Document Architecture, Release 2
Robert H Dolin et al
JAMIA 2006;13:30-39 doi:10.1197/jamia.M1888

[2] Ley Orgánica 15/1999, de 13 de diciembre, de Protección de
Datos de Carácter Personal
(BOE núm. 298, de 14-12-1999, pp. 43088-43099)

[3] The Design of the HL7 RIM-based Sharing
Components for Clinical Information Systems
Wei-Yi Yang, Li-Hui Lee, Hsiao-Li Gien, Hsing-Yi Chu, Yi-Ting
Chou, and Der-Ming Liou
World Academy of Science, Engineering and Technology 53 2009



Contact
Josep Vilalta Marzo
Francesc Layret 24, Badalona, Barcelona, Spain
Email: jvilalta@vico.org
Improving the Usability of HL7 Information Models by Automatic Filtering

Antonio Villegas and Antoni Olivé
Services and Information Systems Engineering Department
Universitat Politècnica de Catalunya
Barcelona, Spain
Email: {avillegas, olive}@essi.upc.edu

Josep Vilalta
HL7 Education & e-Learning Services
HL7 Spain (Health Level Seven International)
Barcelona, Spain
Email: jvilalta@vico.org



   Abstract: The amount of knowledge represented in the Health Level 7 International (HL7) information models is very large. The sheer size of those models makes them very useful for the communities for which they are developed. However, the size of the models and their overall organization makes it difficult to manually extract knowledge from them.
   We propose to extract that knowledge by using a novel filtering method that we have developed. Our method is based on the concept of class interest as a combination of class importance and class closeness. The application of our method automatically obtains a filtered information model of the whole HL7 models according to the user preferences. We show that the use of a prototype tool that implements that method and produces such filtered model improves the usability of the HL7 models due to its high precision and low computational time.
   Keywords: Usability, Health Level Seven International, HL7, Models, Filtering, UML

I. INTRODUCTION
   The Health Level Seven International (HL7) is a not-for-profit, ANSI-accredited standards developing organization dedicated to providing a comprehensive framework and related standards for the exchange, integration, sharing, and retrieval of electronic health information that supports clinical practice and the management, delivery and evaluation of health services [1].
   HL7 develops specifications, the most widely used being a messaging standard that enables disparate healthcare applications to exchange key sets of clinical and administrative data. The HL7 standard specifications are unified by shared reference models of the healthcare and technical domains [2], [3].
   The amount of knowledge represented in the HL7 information models is very large and continuously improved. The sheer size of those models makes them very useful to the communities for which they were developed: HL7 international affiliates with more than fifty HL7 active working groups (Structured Documents, Clinical Decision Support, Clinical Genomics...), large integrated healthcare delivery networks, government agencies and other organizations that use those models for the development of their enterprise information architecture of health systems [4], [5].
   However, the size of HL7 information models and their organization makes it very difficult for those communities to manually extract knowledge from them. This problem is shared by other large models [6].
   Currently, there is a lack of computer support to make those models usable for the goal of knowledge extraction. In this paper, we propose to extract that knowledge by using a novel filtering method that we have developed, and we show that the use of our prototype implementation of that method improves the usability of HL7 information models.
   The structure of the paper is as follows. Section II introduces the HL7 models and describes the main UML constructs used to build them. Section III describes the concept of class importance and references the methods that can be used to compute it. Section IV describes the concept of class interest with respect to a filter set of classes and explains how to compute it. Section V presents our model filtering method. Section VI evaluates the use of the method in the context of the HL7 models. Finally, Section VII summarizes the conclusions and points out future work.

II. HL7 INFORMATION MODELS
Types of Models
   The HL7 information models comprise three types of models. Each of the model types is based on the UML, although the concrete notation used differs depending on the model type. Also, the models differ from each other in terms of their information content, scope, and intended use. The following types of information models are defined:
  •   Reference Information Model (RIM) - The RIM is the information model that encompasses the HL7 domain of interest as a whole. The RIM is a coherent, shared information model that is the source for the data content of all HL7 interoperability artifacts: V2.x messages and XML clinical documents CDA R2 [3].
  •   Domain Message Information Model (D-MIM) - A D-MIM is a refined subset of the RIM that includes a set of classes, attributes and relationships that can be used to create messages and structured clinical documents for a particular domain (a particular area of interest in healthcare). There are predefined D-MIMs for a set of over 15 universal domains, such as Accounting and Billing, Care Provision, Claims and Reimbursement, and so on.
  •   Refined Message Information Model (R-MIM) - The R-MIM is a subset of a D-MIM that is used to express the information content for a message/document or set of messages/documents with annotations and refinements that are message/document specific. The content of an R-MIM is drawn from the D-MIM for the specific domain in which the R-MIM is used.

Structure of the HL7 Information Models
   The RIM, D-MIM and R-MIM models can be analyzed as if they were built using, in a particular way, a small subset of the constructs provided by the UML [7]. Figure 1 illustrates the main UML constructs used, with a very small fragment of the RIM and of one D-MIM. RIM comprises six backbone classes: Act, Participation, Entity, Role, ActRelationship and RoleLink. Figure 1 shows the first four of these classes. Each one has a number of attributes with a defined multiplicity. Surprisingly, there are only eight main associations between the RIM classes, all of them binary and with their corresponding multiplicities. Figure 1 shows four of these associations.

Figure 1.   Sample of HL7 RIM refinements related to ActAppointment class

   Each of the RIM classes has many subclasses, although only a few of them are explicitly shown in the diagrams of the HL7 RIM specification. There are many specialization/generalization relationships (called IsA relationships, e.g. Organization IsA Entity) in the HL7 models. The number of RIM classes and subclasses is over 2,500. Figure 1 shows seven subclasses of four of the backbone RIM classes and seven IsA relationships.
   D-MIM models refine the RIM in three ways:
   1) The participants of one of the eight main associations defined between RIM classes are refined in the subclasses. This is the refinement most often used in the HL7 models. Note that it is not allowed to add new associations.
   2) The multiplicities of an association defined between RIM classes are strengthened in the subclasses.
   3) The multiplicity of an attribute of a RIM class is strengthened in a subclass. An optional attribute in a RIM class can be made mandatory or not allowed in a subclass. Note that it is not allowed to add new attributes.
   R-MIM models refine D-MIM models in the same way. In all cases, the three kinds of refinements can be expressed using UML constructs.
   Figure 1 shows a few refinements related to the ActAppointment class. The instances of this class are appointments (a particular kind of Act). There may be several kinds of participations in an appointment. Figure 1 shows only two of them: PerformerOfActAppointment and SubjectOfActAppointment. To indicate that when the act is an appointment then the participations must be instances of PerformerOfActAppointment or of SubjectOfActAppointment, we redefine the association Participation-Act as shown in the figure. Note that redefinition is a UML construct, which is very useful in situations like this one. The redefinition of the association Role-Participation is similar. The overall semantics of these redefinitions is that the performer of an appointment is a Person that plays the role AssignedPerson, and that the subject of an appointment is a Person that plays the role Patient.
   Sometimes, the UML redefinition construct does not allow the graphical representation of the strengthening of association multiplicities. In these cases, the redefinition must be formally captured by OCL invariants. For example, in Figure 1, the refinement of act in SubjectOfActAppointment also implies that an instance of ActAppointment is associated with a non-empty set (1..*) of SubjectOfActAppointment. However, this cannot be expressed graphically, and an OCL invariant must be used instead [8].
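
To make the two constraints on ActAppointment concrete, the following Python sketch checks them on in-memory objects: the redefinition (every participation of an appointment is a PerformerOfActAppointment or a SubjectOfActAppointment) and the multiplicity strengthening (at least one SubjectOfActAppointment). It is only an illustration under stated assumptions: the class names follow Figure 1, but the Python classes, the `participations` attribute and the `violations` function are hypothetical and are not part of HL7 or of the authors' tool, and the check is written in Python rather than OCL.

```python
class Participation:
    """Stand-in for the RIM Participation backbone class."""


class SubjectOfActAppointment(Participation):
    """Participation of the subject (a Patient) in an appointment."""


class PerformerOfActAppointment(Participation):
    """Participation of the performer (an AssignedPerson) in an appointment."""


class ActAppointment:
    """Stand-in for the ActAppointment subclass of Act."""

    def __init__(self, participations):
        self.participations = list(participations)


def violations(appointment):
    """Return a list of violated constraints (empty if the instance is valid)."""
    problems = []
    allowed = (SubjectOfActAppointment, PerformerOfActAppointment)
    # Redefinition of Participation-Act: only the two refined kinds are allowed.
    if not all(isinstance(p, allowed) for p in appointment.participations):
        problems.append("participation must be a Performer or Subject of ActAppointment")
    # Multiplicity strengthening (1..*): at least one subject is required.
    if not any(isinstance(p, SubjectOfActAppointment) for p in appointment.participations):
        problems.append("ActAppointment requires at least one SubjectOfActAppointment")
    return problems


valid = ActAppointment([SubjectOfActAppointment(), PerformerOfActAppointment()])
no_subject = ActAppointment([PerformerOfActAppointment()])
```

Here `violations(valid)` is empty, while `violations(no_subject)` reports the missing subject, which is the case that cannot be drawn graphically and calls for an invariant.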
   Figure 1 also shows the redefinitions of associations player-playedRole and scoper-scopedRole between Entity and Role. The player and the scoper of an AssignedPerson and of Patient must be a Person and an Organization, respectively.

III. IMPORTANCE OF HL7 CLASSES
   Our filtering method is based on the concept of class importance. The importance of a class is a real number that measures the relative importance of that class in a model. We will see in the next section that we use that importance to select which classes are shown to the users.
   There exist different kinds of methods to compute the importance of classes in the literature. The simplest family of methods is that based on occurrence counting [9]–[11], where the importance of a class is equal to the number of characteristics the class has represented in the model. These methods are class centered in the sense that the importance of a class depends only on the information the class has. Therefore, the more information about a class, the more important it will be.
   Another family of methods are those based on link analysis [11], [12], where the importance of a class is defined as a combination of the importance of the classes that are connected to it with associations and/or IsA relationships. Such recursive definition results in an equation system and indicates that the more important the classes connected to a class are, the more important such class will be. In these methods the importance is shared through connections, changing from a class centered philosophy to a more interconnected approach of the importance. Iterative methods are required to solve the importance equation system, which increases the computational cost of this kind of methods.
   Finally, there are some methods that even use the information about the existing instances of the classes and the associations of the model. Therefore, the importance they compute takes into account the structural part of the model but also the data that the classes instantiate. The problem with this family of instance-dependent methods [13], [14] is that without instances the method cannot be used.
   As an example, Table I shows the 10 most important classes of the HL7 models¹ computed using the CEntityRank importance algorithm (see 3.6 of [15]). To compute this importance, the method takes into account the classes, the IsA relationships between them, the attributes and their multiplicities, the associations and their multiplicities, the association redefinitions and the OCL invariants.

Table I
TOP-10 MOST IMPORTANT CLASSES.

Rank   Class                  Importance
 1     Act                       7.51
 2     Role                      5.11
 3     ActRelationship           4.03
 4     Participation             3.67
 5     Entity                    3.5
 6     Observation               2.64
 7     InfrastructureRoot        1.81
 8     Organization              1.72
 9     RoleLink                  1.59
 10    FinancialTransaction      1.54

   The filtering method described in the next sections can be used in connection with any of the existing methods for computing the importance of classes.

IV. INTEREST OF HL7 CLASSES
   The importance of a class is an absolute metric that depends only on the whole set of HL7 models. The metric is useful when a user wants to know which are the most important classes, but it is of little use when the user is interested in a specific subset of classes, independently from their importance. What is needed then is a metric that measures the interest of a class with respect to such set, that we call filter set.
   A filter set FS of classes is a non-empty set of classes from the HL7 models. The filter set comprises the minimum set of classes in which a user is interested at a particular moment. For example, if the user wants to see what is the knowledge the models have about classes Patient and ActAppointment, then she defines FS = {Patient, ActAppointment}. We will see in the next section that starting from this filter set, our filtering method retrieves the knowledge represented in the models about Patient and ActAppointment that is likely to be of more interest to the user.
   Additionally, it is possible to define a set of classes not to be considered in the filtering method. We call such set the rejection set RS.
   Intuitively, the interest to a user of a class c with respect to a filter set FS should take into account both the absolute importance of c (as explained in the previous section) and a closeness measure of c with regard to the classes in FS. For this reason, we define:
   The in-depth study of the computation of the importance
of classes is beyond the scope of this paper. A review of                        Φ(c, FS) = α × Ψ(c) + (1 − α) × Ω(c, FS)             (1)
methods to compute the importance taking into account                        where Φ(c, FS) is the interest of class c with respect to
different levels of knowledge is given in [15].                           FS, Ψ(c) the absolute importance of class c, and Ω(c, F S)
   1 The results have been obtained taking into account the RIM and the
                                                                          is the closeness of class c with respect to FS.
following D-MIM models: Laboratory, Account and Billing, Scheduling          Note that α is a balancing parameter in the range [0,1]
and Medical Records [1].                                                  to set the preference between closeness and importance for
the retrieved knowledge. An α > 0.5 favors importance over
closeness, while an α < 0.5 does the opposite. The
default α value is set to 0.5 and can be modified by the user.
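
   As an illustration, the linear combination in (1) can be sketched in Python
as follows. The function name `interest` and the sample scores are assumptions
made for this example only; in particular, the sketch assumes that Ψ and Ω are
already expressed on comparable scales, which the text does not specify.

```python
def interest(psi: float, omega: float, alpha: float = 0.5) -> float:
    """Equation (1): weighted sum of importance (psi) and closeness (omega)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * psi + (1.0 - alpha) * omega


# Hypothetical scores for one candidate class (not taken from the HL7 models).
psi, omega = 0.2, 1.0
for alpha in (0.0, 0.5, 1.0):
    print(f"alpha={alpha:.1f}  interest={interest(psi, omega, alpha):.2f}")
```

   With α = 0 the interest reduces to the closeness, and with α = 1 it reduces
to the importance; intermediate values interpolate between the two.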
   There may be several ways to compute the closeness
Ω(c, FS) of class c with respect to the classes of FS.
Intuitively, the closeness of class c should be directly related
to the inverse of the distance of c to the filter set FS. For
this reason, we define:
                 Ω(c, FS) = |FS| / Σ_{c′ ∈ FS} d(c, c′)                 (2)

   where |FS| is the number of classes of FS and d(c, c′)
is the minimum distance between a class c and a class
c′ belonging to the filter set FS. Intuitively, those classes
that are closer to more classes of FS will have a greater
closeness Ω(c, FS).
   We assume that a pair of classes c, c′ are directly
connected to each other if there is a direct association (or
redefinition of association) between them or if one class is
a direct subclass of the other. For these cases, d(c, c′) = 1.
Otherwise, when c, c′ are not directly connected, d(c, c′)
is defined as the length of the shortest path between them
traversing associations and/or ascending/descending through
class hierarchies.

Figure 2.   Method Overview.
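
   The distance d and the closeness of (2) can be sketched as a breadth-first
search over an undirected graph whose edges stand for associations,
redefinitions and IsA relationships. This is a minimal sketch, not the
prototype tool: the names `distances_from` and `closeness` are hypothetical,
and the four-class graph below is a toy example whose edges were chosen for
illustration rather than taken from the HL7 models. Returning 0 for an
unreachable class is also an assumption of the sketch.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search: hop count from source to every reachable class."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(graph, c, filter_set):
    """Equation (2): |FS| divided by the sum of d(c, c') over c' in FS."""
    if c in filter_set:
        raise ValueError("closeness is only computed for classes outside FS")
    total = 0
    for member in filter_set:
        d = distances_from(graph, member).get(c)
        if d is None:  # assumption: an unreachable class gets closeness 0
            return 0.0
        total += d
    return len(filter_set) / total


# Toy graph (illustrative edges only), stored as undirected adjacency sets.
edges = [
    ("Organization", "Patient"),
    ("Patient", "SubjectOfActAppointment"),
    ("SubjectOfActAppointment", "ActAppointment"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

fs = {"Patient", "ActAppointment"}
print(closeness(graph, "SubjectOfActAppointment", fs))  # 2 / (1 + 1)
print(closeness(graph, "Organization", fs))             # 2 / (1 + 3)
```

   The search runs once per class of the filter set, so only the distances
from the members of FS are needed, not those between every pair of classes.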
   As an example, Table II shows the top-10 classes with the greatest value
of interest when the user defines FS = {Patient, ActAppointment} and α = 0.5.
   Results in Table II indicate that the top-10 includes classes that are
directly connected to all members of the filter set FS = {Patient,
ActAppointment}, as in the case of SubjectOfActAppointment
(Ω(SubjectOfActAppointment, FS) = 1.0), but also classes that are not directly
connected to any class of FS (although they are still close to them).

                   V. FILTERING HL7 INFORMATION MODELS
   We have developed a method for filtering large models, and we have used
the HL7 models as a case study for developing and experimenting with the
method and its associated tool. The method consists of four consecutive steps.
The characteristics of each step are detailed below. Figure 2 presents an
overview of the method and steps.
   Intuitively, from a small subset of classes selected by the user the
method automatically obtains a filtered information model with knowledge of
interest.

Step 1: Setting the User Preferences
   The first step of the method consists of preparing the required
information to filter the HL7 information models according to the user
preferences. Basically, the user focuses on a set of classes (filter set) she
is interested in and our method surrounds them with additional knowledge from
the HL7 models. Therefore, it is mandatory for the user to select a non-empty
initial filter set FS. An example […] to obtain knowledge about patients and
appointments in HL7 can be FS = {Patient, ActAppointment}.
   In the same way, the user can specify a rejection set RS (may be empty)
with those classes that have no interest to her.
   In addition to the filter set, the user can decide the amount of knowledge
she wants to obtain by indicating the maximum number of additional classes
(Cmax) the method has to include in the filtered information model.
   Apart from that, the user has the possibility to choose which importance
method (see Section III) she wants to be used in the following step. Also, she
can include her preference between closeness and importance by setting a value
for the balancing parameter α (see (1) in Section IV).
   Note that RS, Cmax, the importance method and the parameter α have default
values (RS = ∅, […] the default importance method is CEntityRank and α = 0.5)
and therefore are all optional.
   The user interaction is required only in this […]
                                                  Table II
                 MOST INTERESTING CLASSES WITH REGARD TO FS = {Patient, ActAppointment}.

              Rank   Class (c)                            Ψ(c)   d(c, Patient)   d(c, ActAppointment)   Ω(c, FS)    Φ(c, FS)
               1     SubjectOfActAppointment              0.11        1                   1                1.0       0.7003
               2     Organization                         1.72        1                   3                0.5       0.3552
               3     Person                               1.22        1                   3                0.5       0.3537
               4     ServiceDeliveryLocation              0.79        2                   2                0.5       0.3524
               5     AssignedPerson                       0.72        2                   2                0.5       0.3522
               6     ManufacturedDevice                   0.55        2                   2                0.5       0.3517
               7     LocationOfActAppointment             0.26        3                   1                0.5       0.3508
               8     ReusableDeviceOfActAppointment       0.19        3                   1                0.5       0.3506
               9     SubjectOfAccountEvent                0.13        1                   3                0.5       0.3504
               10    AuthorOfActAppointment               0.12        3                   1                0.5       0.3503
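
   Given interest values such as those in the last column of Table II, the
selection of the top-ranked classes can be sketched as below. The function
name `select_interest_set` is hypothetical, and breaking ties by class name is
an assumption of this sketch, made only so that the output is reproducible.
Classes of the filter set FS and of the rejection set RS are skipped, and at
most Cmax classes are returned.

```python
def select_interest_set(interest_by_class, filter_set, c_max, rejection_set=frozenset()):
    """Return up to c_max class names ranked by decreasing interest."""
    candidates = [
        (name, phi)
        for name, phi in interest_by_class.items()
        if name not in filter_set and name not in rejection_set
    ]
    # Sort by interest (descending); ties fall back to the class name
    # (an assumption of this sketch).
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in candidates[:c_max]]


# Interest values copied from the first rows of Table II.
phi = {
    "SubjectOfActAppointment": 0.7003,
    "Organization": 0.3552,
    "Person": 0.3537,
    "ServiceDeliveryLocation": 0.3524,
    "AssignedPerson": 0.3522,
}
fs = {"Patient", "ActAppointment"}

print(select_interest_set(phi, fs, c_max=3))
print(select_interest_set(phi, fs, c_max=3, rejection_set={"Organization"}))
```

   Rejecting Organization in the second call lets the next class in the
ranking, ServiceDeliveryLocation, enter the result.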



   On the other hand, to compute the closeness Ω(c, FS) of an HL7 class with
regard to the filter set FS it is required to know the minimum distances
between classes in the HL7 models (see (2) in Section IV). However, it is only
necessary to compute the distance from each class in the filter set to any
class out of FS, which requires a lower computational cost. Note that the
method computes the closeness only for those classes that are out of the
filter set.

Step 3: Select Interest Set
   The third step of the method consists in computing the interest (Φ) for
each class out of the FS. As previously shown in (1) of Section IV, the
interest Φ(c, FS) of a candidate class c to be included in the output model is
a linear combination of the importance Ψ(c) and the closeness Ω(c, FS) taking
into account the balancing parameter α.
   Note that if a non-empty rejection set RS was defined in the first step of
our method, the classes included in such set will not be considered for the
final result, nor will their interest Φ be computed.
   The interest Φ produces a sorted ranking of HL7 classes and the method
selects the top classes of that ranking until reaching the limit Cmax
specified in the first step. We call such set of classes the Interest Set. The
second column of Table II shows the classes that belong to the Interest Set
according to FS = {Patient, ActAppointment} when Cmax = 10.
   If two or more classes have the same interest, our method is
non-deterministic: it might select any of those. Some enhancements can be done
to try to avoid selecting classes in a random manner, like prioritizing the
classes with a higher value of closeness or importance (or any other measure)
in case of ties.

Step 4: Compute Filtered Information Model
   Finally, the last step of the method obtains the Interest Set of classes
from the previous step and puts it together with the classes of the filter set
FS in order to create a filtered information model with the classes of both
sets.
   The main goal of this step consists in filtering information from the
whole HL7 information models involving classes in the filtered model. To
achieve this goal, the method explores the associations, redefinitions of
associations, and generalization/specialization relationships in the HL7
information models that are defined between those classes and includes them in
the filtered model to obtain a connected model. The filtered information model
for FS = {Patient, ActAppointment} and the previous Interest Set is shown in
Figure 3.
   Our method also takes into account associations that are specified between
superclasses of classes included in the filtered information model, and brings
them down to connect such subclasses. An example of that behaviour is the
association between Participation and ActAppointment in Figure 3. Such
association is originally defined between Participation and Act (see Figure
1). Given that ActAppointment is a subclass of Act, the association is
descended to the context of ActAppointment to indicate that there exists the
connection with Participation although Act was not included in the Interest
Set.
   When descending an association, it may happen that the association would
be repeated. Figure 3 shows the association between Participation and
ActAppointment. Note that Participation is not a member of the Interest Set
(see Table II). However, Participation has been included in the filtered
information model as an auxiliary class (marked in Figure 3 with a light grey
color). The rationale is that otherwise the association would have to be
descended between ActAppointment and each of the five subclasses
(SubjectOfActAppointment, AuthorOfActAppointment,
ReusableDeviceOfActAppointment, LocationOfActAppointment and
SubjectOfAccountEvent) of Participation present in the Interest Set, which is
not a UML-compliant situation.

Figure 3.   Filtered Information Model for FS = {Patient, ActAppointment}.

   To avoid repeated associations our method finds the lowest common parent
(LCP) for the previous subclasses, which in this case is Participation,
includes it in the filtered information model as an auxiliary class, and
descends the
association to such LCP class. The same situation occurs for
RoleClassAssociative and RoleChoice, which are LCP classes included as
auxiliary in the filtered information model of Figure 3.
   Besides, if there are two classes in the filtered information model such
that one is an indirect subclass of the other in the HL7 models, our method
creates an IsA relationship between them in the filtered information model
(marked as indirect) to indicate such knowledge. Figure 3 shows that the five
subclasses of Participation and the four of RoleClassAssociative are indirect
subclasses by marking those IsA relationships in a light gray color. For the
case of RoleChoice, its subclasses are directly connected to it by means of
IsA relationships (marked with ordinary black color).
   Finally, the filtered information model presented in Figure 3 shows
information about two HL7 domains: the Scheduling domain and the Account and
Billing domain. By using our filtering method, a user that wanted to know
about patients and appointments discovers that patients are also related to
account events. This way, the user can easily compose another filter set like
FS = {Patient, SubjectOfAccountEvent} to get more knowledge about them in a
new iteration of our method.

                             VI. EVALUATION
   Our filtering method and prototype tool provide support to the task of
extracting knowledge from the HL7 models, which has normally been done
manually or with little support.
   Finding a measure that reflects the ability of our method to satisfy the
user is a complicated task. However, there exists related work [16], [17]
about some measurable quantities in the field of information retrieval that
can be applied to our context:
   •   The ability of the method to withhold non-relevant knowledge
       (precision)
   •   The interval between the request being made and the answer being given
       (time)

Precision Analysis
   A correct method must retrieve the relevant knowledge according to the
user preferences. The precision of a method is defined as the percentage of
relevant knowledge presented to the user.
   In our context, we use the concept of precision applied to HL7 universal
domains (specified with D-MIMs). Each domain contains a main class which is
the central point of knowledge to the users interested in such domain. The
other classes presented in the domain make up the relevant knowledge related
to the main class.
   HL7 professionals interested in a particular domain decide about the
knowledge to incorporate in it through ballots. Thus, a common situation for a
user is to focus on the main class of a domain and to navigate through the
D-MIM to understand its related knowledge.
   To know the precision of our method, we simulate the generation of a D-MIM
from its main class. We define a single-class filter set with such class and
set Cmax to the size of the domain. This way, we will obtain a filtered
information model with the same number of classes as such domain.
   In one iteration of our method, we obtain two groups of classes within the
resulting filtered information model: the relevant classes to the user, that
is, the ones that were originally defined in the D-MIM by experts, and the
non-relevant ones. The precision of the result is defined as the fraction of
the relevant classes over the total Cmax.

Figure 4.   Precision analysis for HL7 domains. [Two panels plot Precision (%)
against Iterations for the Medical Records, Scheduling, Account and Billing,
and Laboratory domains. The left panel, "Precision", covers iterations 0 to
30; the right panel, "Precision (Zoom Iterations 1−5)", covers iterations 1 to
5.]

   To refine the obtained result, the non-relevant classes are included in a
rejection set RS and the method is executed
again taking into account RS. It is expected that the filtered information
result of this step will have a greater precision.
   In this manner, at each iteration the classes that are non-relevant to the
user are rejected, and we know that in a finite number of steps our filtering
method will obtain all the classes of the original domain. The smaller the
number of required iterations until getting such domain, the better the
method.
   Figure 4 shows the number of iterations needed to reach the maximum
precision for four of the HL7 domains. Note that the right side of Figure 4
zooms in on the first five iterations. The test reveals that to reach more
than 80% of the relevant classes of a domain, only three iterations are
required.

Time Analysis
   It is clear that a good method does not only require precision, but it
also needs to present the results in an acceptable time according to the user.
   To find the time spent by our method it is only necessary to record the
time lapse between the request of knowledge, i.e. once a filter set FS has
been indicated by the user, and the receipt of the filtered information model.
   It is expected that as we increase the size of the filter set, the time
will increase linearly. Our method computes the distances from each class in
the filter set to all the rest of classes. This computation requires the same
time (on average) for each class in the filter set. Therefore, the more
classes we have in a filter set, the more time our method spends in computing
distances.
   In our experimentation, we set our prototype tool to apply the filtering
method several times with an increasing number of classes in the filter set.
The average results for sizes from a single-class filter set up to a 40-class
filter set are presented in Figure 5.
   According to the expected use of our method, having a filter set FS of 40
classes is not a common situation (although possible). Sizes of filter sets up
to 10 classes are more realistic, in which case the average time does not
exceed one second.

[Plot for Figure 5: "Average Time", with Time (s) on the vertical axis.]

                            VII. CONCLUSIONS
   HL7 information models are very large. The wealth of knowledge they
contain makes them very useful to their potential target audience. However,
the size and the organization of these models make it difficult to manually
extract knowledge from them. This task is basic for the improvement of
services provided by HL7 affiliates, vendors and other organizations that use
those models for the development of health systems.
   What is needed is a tool that makes HL7 models more usable for that task.
We have presented a method that makes it easier to automatically extract
knowledge from the HL7 models. Input to our method is the set of classes the
                               0            5              10                  15                 20                   25                 30            35               40
                                                                                                                                                                                                         user is interested in. The method computes the interest of
                                                                                                                                                                                                         each class with respect to that set as a combination of its
                                                                                     Filter Set Size
                                                                                                                                                                                                         importance and closeness. Finally the method selects the
                                                                                                                                                                                                         most interesting classes from that models, including their
                       Figure 5.                       Time analysis for different sizes of F S.                                                                                                         defined knowledge in the original models (e.g. associations,
redefinition of associations, IsA relationships).
   The experiments we have done clearly show that the
proposed method and its associated tool provide an easier
way to extract knowledge from the models. Concretely, our
prototype tool recovers more than 80% of the knowledge
of a D-MIM in three iterations, with an average time per
iteration that, for common uses, does not exceed one second.
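As an illustration only, the selection step summarized above (interest as a combination of importance and closeness, followed by picking the most interesting classes) could be sketched as follows. The weighted-mean formula, the `alpha` weight, the function names, the decision to skip classes already in the filter set, and the numeric scores are all assumptions made for this sketch; they are not the definitions used by the method or its tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassScore:
    name: str
    importance: float  # assumed to be normalized to [0, 1]
    closeness: float   # assumed closeness to the filter set, in [0, 1]


def interest(score: ClassScore, alpha: float = 0.5) -> float:
    """Hypothetical combination: weighted mean of importance and closeness."""
    return alpha * score.importance + (1.0 - alpha) * score.closeness


def select_most_interesting(scores, filter_set, k, alpha=0.5):
    """Return the names of the k highest-interest classes outside the filter set.

    Ties are broken alphabetically so the result is deterministic.
    """
    candidates = [s for s in scores if s.name not in filter_set]
    ranked = sorted(candidates, key=lambda s: (-interest(s, alpha), s.name))
    return [s.name for s in ranked[:k]]


# Made-up scores for a few RIM-style class names, for illustration only.
scores = [
    ClassScore("Act", 0.9, 0.4),
    ClassScore("Observation", 0.5, 0.9),
    ClassScore("Entity", 0.7, 0.2),
    ClassScore("Role", 0.6, 0.9),
    ClassScore("Patient", 0.4, 1.0),
]

# The user is interested in "Patient"; ask for the two most interesting others.
selected = select_most_interesting(scores, filter_set={"Patient"}, k=2)
print(selected)
```

In this sketch a larger `alpha` favors globally important classes, while a smaller one favors classes close to the user's filter set.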
   We plan to continue our work along three directions. The
first is to include all HL7 models in our tool to give full
support to all HL7 communities. Currently we have four
D-MIMs. Experimentation with the full set of models will
allow us to improve the method.
   We also plan to experiment with the latest definition
and nomenclature of HL7 models published by HL7
International. Basically, it specifies a new level on top of the
RIM that consists of a domain analysis model (DAM)
to describe business processes and use cases, and a localized
information model (LIM) at the bottom of the model types
to adapt the R-MIMs to locale-specific requirements for
structure and terminology. Taking these two new models
into account is a challenge that will improve our work.
   Finally, another research area to explore consists in
generating traceability links from the elements in the filtered
model to the original models, so that it is easy to find
out the origin of each element. Keeping such backward
links improves the integration of different models in an
interoperability context. Also, our method and tool
implementing traceability could be used as an aid in the design
of implementation guides for HL7 interoperability artifacts
(HL7 V3 messaging and CDA R2 documents).

                     ACKNOWLEDGMENT
   The authors want to thank Diego Kaminker, HL7 Education
WG co-chair and HL7 International Mentoring Committee
co-chair, Carles Gallego, current Chair of HL7 Spain, and
Dr. Joan Guanyabens, former Chair of HL7 Spain, for their
collaboration.
   We would also like to thank the people of the GMC
group for their useful comments on previous drafts of this
paper. This work has been partly supported by the Ministerio
de Ciencia y Tecnología under the project TIN2008-00444,
Grupo Consolidado.

2010 last papers

  • 1. Clinical Document Data Warehouse (CD2W)
    Josep Vilalta Marzo (a), Diego Kaminker (b), Josep M. Picas Vidal (c), M. Lluisa Bernard Antoranz (d), Cristina Siles (c), Rafael Rosa Prat (a)
    (a) Vico Open Modeling S.L., (b) Kern Information Technology SRL, (c) Hospital de la Santa Creu i Sant Pau, (d) Institut Català de la Salut

    Abstract
    This paper shows the development of the Clinical Document Data Warehouse project (CD2W) and its implementation using CDA R2 documents (Clinical Document Architecture Release 2).
    The main objective of this project was to prototype a web-based portal allowing access to a clinical document repository and a data warehouse with clinical information about patients from several medical organizations in Barcelona, Spain (a major hospital and 4 primary care centers), giving access to clinical patient information for primary care and leveraging the same standardized information to populate the data warehouse (secondary use).
    The project was developed during the first half of 2009 under the general direction of the Hospital de la Santa Creu i Sant Pau (HSCSP).
    Due to the prototype nature of the project, the scope was limited to patients with Congestive Heart Disease (CHD) who consented to the use of their information for clinical research by the HSCSP, and to the vocabularies currently used by the providers (ICD9 and other local terminologies).
    The data warehouse was developed using HL7 RIM basic concepts.
    The project mission was to improve patient care through the use of global standards, open technology, low exploitation cost, ease of use and CDA R2 documents.
    During this project, we used the SCRUM agile methodology, allowing a scalable, progressive and incremental software development process.

    Introduction – Business Case
    Solving the questions arising when a patient shows several clinical issues is one of the greatest challenges for healthcare providers.
    Fast clinical decision making at the healthcare location generates the need to have all relevant and up-to-date clinical information available to support the process.
    Usually, we don't have efficient filtering that flags the critical issues on which to focus our attention. We encounter scenarios with data overload, demanding a big effort to synthesize useful information, or with sparse data, demanding the use of imagination to connect the pieces and reach any conclusion.
    Another problem is that usually our only available source of clinical information is the organization supporting the healthcare provider. Information generated by other channels where the patient was attended is usually brought by the patient in several paper formats. Consolidation of relevant and up-to-date information from distinct healthcare organizations under different authorities is a pending issue, waiting for an agreement from the healthcare authorities and for the harmonization of different database schemes.
    Nowadays, there are several on-going projects with the shared goal of consolidating clinical information, both at the national level in Spain and at the autonomous community level.
    These projects also share a great level of complexity and cost to achieve their goals. This small project, CD2W, aspires to help in achieving this consolidation of clinical information with a focus on easing the task for the healthcare providers, organizations administering huge databases that are difficult to integrate and that have no reference information model available.

    Materials and Methods
    Hospital de la Santa Creu i Sant Pau is a high complexity hospital which dates back six centuries, making it the oldest hospital in Spain. Healthcare is centered on Barcelona but extends to the rest of Catalonia. The center plays a prominent role in Spain and is internationally renowned.
    The hospital has distinguished itself in the healthcare provided in many fields, making it a reference centre in several specialties. The center attends over 34,000 admissions each year and more than 150,000 emergencies. Approximately 300,000 people are visited at the ambulatory services annually and the Day Hospital attends over 60,000 users. There are 71 day hospital beds, 634 hospitalization beds and 19 surgical rooms.
    Teaching and training programmes at the Hospital de la Santa Creu i Sant Pau cover many levels, comprising the UAB (Universitat Autonoma de Barcelona) Faculty of Medicine Teaching Unit, the University School of Nursing, participation in the State Residency Programmes to train specialists, masters and doctorate courses, continuing education, etc. In the field of research, Hospital de la Santa Creu i Sant Pau is one of the most prominent centers in Spain, as can be appreciated from the volume of papers published and their impact factor, the number and quality of projects which receive funding and the grants awarded. The Hospital de la Santa Creu i Sant Pau is governed by the Patronat de la Fundació de Gestió Sanitària (FGSHSCSP), a board with representatives from the Regional Catalan Government (Generalitat de Catalunya), the City Hall of Barcelona and the Archbishopric of Barcelona.
    The goals for this project were:
    1. Define a scheme to integrate information from disjoint information platforms.
    2. Implement a process for periodical data and document exchange with minimal workload implications for the primary care centers.
    3. Generate a clinical data store enabling the coexistence of clinical documents and relational data in a longitudinal patient healthcare record.
  • 2. 4. Create a simple user interface to ease fast queries to the relevant clinical information about a patient.
    5. Restrict access to clinical information to professionals authorized by the participating organizations (HSP and ICS).
    6. Use open technology for the design, development and implementation of the data warehouse, minimizing exploitation costs, and use global interoperability standards to enable universal access.
    7. Evaluate the impact of normalizing data from different organizations and code systems.
    8. Evaluate the usability and added value of CD2W for physicians making healthcare decisions.
    9. Evaluate the results of this prototype to study the possible extension of access to patients or to other professionals.
    10. Evaluate the results for a new project with a broader scope.

    Implementation, Methodology and Tools
    The project has five main design aspects:
    1. Clinical Datawarehouse Design
    2. Standard Document Design
    3. Datawarehouse Population Process Design
    4. User Interface Design
    5. Technological implementation.
    Let's review them:

    1. Clinical Datawarehouse Design
    The model for CD2W was derived from the RIM base classes (role, entity, act, participation) [3] following a three-step methodology:
    a. Development of a conceptual framework, to be discussed with the CD2W stakeholders, including which were the measures or facts, and how the information should be classified (dimensions).
    Figure 1 – CD2W conceptual model
    b. Then, on a more technical level, a domain analysis model was generated.
    Figure 2 – CD2W Domain Analysis
    c. And finally, the derivation of a datawarehouse model to store facts, dimensions and supporting standard clinical documents. RIM-wise, facts are acts, and the main dimensions are entities and roles and their attributes.

    2. Standard Document Design
    The documents were the 'lingua franca' between the disparate systems used by the participating organizations (primary care centers, hospital EHR and discharge system) [1].
    We designed two different clinical document templates, one for the evolution note from the primary care centers and one for the discharge note from the hospital discharge system.
    The templates shared the same information at the header level, but differed in their section contents.
    Since this data warehouse was intended for secondary use, patient and physician information was de-identified [2] (names, identifications and addresses were removed or replaced).
    The information generated by the local provider applications was transformed into clinical documents using an XSL, and these standard documents were processed and stored into the CD2W.
    We tested our mapping, process and query interface with 16125 clinical documents from the hospital and primary care centers.
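    The de-identification described for the standard documents (names and addresses removed, identifications replaced) can be illustrated with a minimal Python sketch. It is not the project's code: the paper used an XSL per center, and the sample document, the `pseudonym` helper and the salted-hash scheme below are assumptions made only for illustration. The element names (`recordTarget`, `patientRole`, `id`, `addr`, `patient`, `name`) and the `urn:hl7-org:v3` namespace are those of a CDA R2 header.

```python
import hashlib
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}  # CDA R2 documents use the HL7 v3 namespace

# Toy document with made-up values (not taken from the project).
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <recordTarget>
    <patientRole>
      <id root="2.999.1" extension="12345"/>
      <addr><city>Barcelona</city></addr>
      <patient><name><given>Ana</given><family>Example</family></name></patient>
    </patientRole>
  </recordTarget>
</ClinicalDocument>"""


def pseudonym(value, salt):
    """Hypothetical replacement scheme: a salted hash of the original identifier."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def deidentify(xml_text, salt="demo-salt"):
    """Replace patient identifiers and drop names and addresses."""
    root = ET.fromstring(xml_text)
    for role in root.findall(".//hl7:patientRole", NS):
        for ident in role.findall("hl7:id", NS):
            if "extension" in ident.attrib:
                ident.set("extension", pseudonym(ident.get("extension"), salt))
        for addr in role.findall("hl7:addr", NS):
            role.remove(addr)
        for patient in role.findall("hl7:patient", NS):
            for name in patient.findall("hl7:name", NS):
                patient.remove(name)
    return ET.tostring(root, encoding="unicode")


clean = deidentify(SAMPLE)
```

    Using the same salt on every run keeps the pseudonym stable, so documents for one patient can still be linked in the warehouse without exposing the original identifier; whether the project did this is not stated in the paper.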
  • 3. In order to guide the development team, a table was built with the main required elements from each primary care center and their location inside the standard clinical document.
    Figure 4 – Transformation Table

    3. Data warehouse population process
    The process to populate the data warehouse included several steps.
    [At the primary care center]
    1. Select encounters for patients suffering from CHD with signed consents.
    2. Create a basic, shallow XML for each primary care center encounter.
    [At the data processing center]
    3. Create a CDA conformant document for each instance, using the mapping defined for the center.
    4. Populate the CD2W database with the information from each document header, and the document itself.
    This process was triggered periodically by the primary care centers and HSP.

    4. User interface design
    The user interface design was use-case based. The identified use cases were:
    A. App Administration
    Application setup and parametrization
    − Define healthcare agent
    − Define healthcare agent type
    − Define healthcare agent role
    − Define app parameters
    − Define service catalog
    − Define service
    B. Data Generation Process
    Use cases related to the DW population
    − Login
    − User validation
    − Batch load
    − Stat Processing
    − Document Processing
    − Update audit log
    C. Queries
    Data access and retrieval from the users
    − Retrieve app parameters
    − Query control panel (figure 3)
    − Query Patient Monitor (figure 4)
    − Query Encounter Monitor
    − Query Conditions Monitor
    − Query Healthcare Centers
    − Query Healthcare Professionals
    − Query Healthcare Services
    − Query Stats (figure 5,6)
    − Query audit logs
    − Browse CDA R2 document (figure 7)
    The following figures illustrate the main user interface forms for CD2W:
    Figure 3 – CD2W Control Panel
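    Step 4 of the population process (store the header information of each document together with the document itself) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the paper used MySQL and PHP, while this sketch uses Python with an in-memory SQLite database, and the table name `cda_document`, its columns and the sample values are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}  # namespace of CDA R2 documents

# Toy CDA header with made-up identifiers (illustrative only).
DOC = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <id root="2.999.2" extension="doc-001"/>
  <code code="18842-5" codeSystem="2.16.840.1.113883.6.1"/>
  <effectiveTime value="20090315"/>
  <recordTarget><patientRole><id root="2.999.1" extension="p-42"/></patientRole></recordTarget>
</ClinicalDocument>"""


def header_fields(xml_text):
    """Pull a few header attributes out of a CDA document."""
    root = ET.fromstring(xml_text)

    def attr(path, name):
        node = root.find(path, NS)
        return node.get(name) if node is not None else None

    return {
        "doc_id": attr("hl7:id", "extension"),
        "doc_code": attr("hl7:code", "code"),
        "effective_time": attr("hl7:effectiveTime", "value"),
        "patient_id": attr("hl7:recordTarget/hl7:patientRole/hl7:id", "extension"),
    }


def load_document(conn, xml_text):
    """Store the relational header row and the whole document side by side."""
    fields = header_fields(xml_text)
    conn.execute(
        "INSERT OR REPLACE INTO cda_document "
        "(doc_id, doc_code, effective_time, patient_id, body) "
        "VALUES (:doc_id, :doc_code, :effective_time, :patient_id, :body)",
        {**fields, "body": xml_text},
    )
    return fields["doc_id"]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cda_document (doc_id TEXT PRIMARY KEY, doc_code TEXT, "
    "effective_time TEXT, patient_id TEXT, body TEXT)"
)
load_document(conn, DOC)
load_document(conn, DOC)  # a repeated periodic load replaces the row instead of duplicating it
rows = conn.execute(
    "SELECT doc_id, patient_id, effective_time FROM cda_document"
).fetchall()
```

    Keeping the full XML in the `body` column next to the extracted header columns mirrors the coexistence of documents and relational data that the project goals describe; keying on the document identifier is one simple way to make the periodic batch load repeatable.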
  • 4. Figure 4 – CD2W Patient Query
    Figure 5 – Services for one patient
    Figure 6 – Temporal series for a patient
    Figure 7 – CDA R2 Document for an encounter

    5. Technological Implementation
    The system was implemented using an open source Apache application server, a MySQL database, the PHP development environment and the amCharts data analysis and graphing component.

    Evaluation/Assessment/Lessons Learned
    The HL7 Reference Information Model was very useful to aid in modeling our specific domain and to generate a scheme to integrate all participating applications into an information model.
    The process for exchange could be established, but we needed to educate the participating centers on the use of CDA R2 and on the rationale for asking for each piece of information. Nevertheless, we needed to bridge their data model to CDA R2 by providing them with a customized XSL for each center.
    Coexistence of the documents with the relational information needed to explore the embedded information was possible, although we need to test with more volume.
    The user interface was sufficient for our pilot users from three centers to meet their data exploration and verification needs.
    The use of open technologies and standards was a key factor in minimizing development time.
    Our normalizing efforts were not finished; we ended up using ICD-9 and internal ICS vocabularies.

    Future Plans
    Extension of this tool to make it accessible to patients and other professionals will be studied, but current policies make it very difficult to gain access to patient data, even for this approved project.
    We also want to explore using native XML open source databases.
    A project with a greater scope will be studied, but the whole approach and the generated model are suitable for other domains.
  • 5. Acknowledgements
    Spain Ministry of Health for Scholarship FIS Dossier P105/230.

    Bibliography
    [1] Robert H. Dolin et al. HL7 Clinical Document Architecture, Release 2. JAMIA 2006;13:30-39. doi:10.1197/jamia.M1888
    [2] Ley Orgánica 15/1999, de 13 de diciembre, de Protección de Datos de Carácter Personal (BOE núm. 298, de 14-12-1999, pp. 43088-43099)
    [3] Wei-Yi Yang, Li-Hui Lee, Hsiao-Li Gien, Hsing-Yi Chu, Yi-Ting Chou, and Der-Ming Liou. The Design of the HL7 RIM-based Sharing Components for Clinical Information Systems. World Academy of Science, Engineering and Technology 53, 2009

    Contact
    Josep Vilalta Marzo
    Francesc Layret 24, Badalona, Barcelona, Spain
    Email: jvilalta@vico.org
  • 6. Improving the Usability of HL7 Information Models by Automatic Filtering
    Antonio Villegas and Antoni Olivé, Services and Information Systems Engineering Department, Universitat Politècnica de Catalunya, Barcelona, Spain. Email: {avillegas, olive}@essi.upc.edu
    Josep Vilalta, HL7 Education & e-Learning Services, HL7 Spain (Health Level Seven International), Barcelona, Spain. Email: jvilalta@vico.org

    Abstract—The amount of knowledge represented in the Health Level 7 International (HL7) information models is very large. The sheer size of those models makes them very useful for the communities for which they are developed. However, the size of the models and their overall organization makes it difficult to manually extract knowledge from them.
    We propose to extract that knowledge by using a novel filtering method that we have developed. Our method is based on the concept of class interest as a combination of class importance and class closeness. The application of our method automatically obtains a filtered information model of the whole HL7 models according to the user preferences. We show that the use of a prototype tool that implements that method and produces such filtered model improves the usability of the HL7 models due to its high precision and low computational time.
    Keywords-Usability, Health Level Seven International, HL7, Models, Filtering, UML

    I. INTRODUCTION
    The Health Level Seven International (HL7) is a not-for-profit, ANSI-accredited standards developing organization dedicated to providing a comprehensive framework and related standards for the exchange, integration, sharing, and retrieval of electronic health information that supports clinical practice and the management, delivery and evaluation of health services [1].
    HL7 develops specifications, the most widely used being a messaging standard that enables disparate healthcare applications to exchange key sets of clinical and administrative data. The HL7 standard specifications are unified by shared reference models of the healthcare and technical domains [2], [3].
    The amount of knowledge represented in the HL7 information models is very large and continuously improved. The sheer size of those models makes them very useful to the communities for which they were developed: HL7 international affiliates with more than fifty HL7 active working groups (Structured Documents, Clinical Decision Support, Clinical Genomics...), large integrated healthcare delivery networks, government agencies and other organizations that use those models for the development of their enterprise information architecture of health systems [4], [5].
    However, the size of HL7 information models and their organization makes it very difficult for those communities to manually extract knowledge from them. This problem is shared by other large models [6].
    Currently, there is a lack of computer support to make those models usable for the goal of knowledge extraction. In this paper, we propose to extract that knowledge by using a novel filtering method that we have developed, and we show that the use of our prototype implementation of that method improves the usability of HL7 information models.
    The structure of the paper is as follows. Section II introduces the HL7 models and describes the main UML constructs used to build them. Section III describes the concept of class importance and references the methods that can be used to compute it. Section IV describes the concept of class interest with respect to a filter set of classes and explains how to compute it. Section V presents our model filtering method. Section VI evaluates the use of the method in the context of the HL7 models. Finally, Section VII summarizes the conclusions and points out future work.

    II. HL7 INFORMATION MODELS
    Types of Models
    The HL7 information models comprise three types of models. Each of the model types is based on the UML, although the concrete notation used differs depending on the model type. Also, the models differ from each other in terms of their information content, scope, and intended use. The following types of information models are defined:
    • Reference Information Model (RIM) - The RIM is the information model that encompasses the HL7 domain of interest as a whole. The RIM is a coherent, shared information model that is the source for the data content of all HL7 interoperability artifacts: V2.x messages and XML clinical documents CDA R2 [3].
    • Domain Message Information Model (D-MIM) - A D-MIM is a refined subset of the RIM that includes a set of classes, attributes and relationships that can be used to create messages and structured clinical documents for a particular domain (a particular area of interest in healthcare). There are predefined D-MIMs for a set of over 15 universal domains, such as Accounting and Billing, Care Provision, Claims and Reimbursement, and so on.
  • 7. Figure 1. Sample of HL7 RIM refinements related to ActAppointment class
    • Refined Message Information Model (R-MIM) - The R-MIM is a subset of a D-MIM that is used to express the information content for a message/document or set of messages/documents with annotations and refinements that are message/document specific. The content of an R-MIM is drawn from the D-MIM for the specific domain in which the R-MIM is used.

    Structure of the HL7 Information Models
    The RIM, D-MIM and R-MIM models can be analyzed as if they were built using, in a particular way, a small subset of the constructs provided by the UML [7]. Figure 1 illustrates the main UML constructs used with a very small fragment of the RIM and of one D-MIM. The RIM comprises six backbone classes: Act, Participation, Entity, Role, ActRelationship and RoleLink. Figure 1 shows the first four of these classes. Each one has a number of attributes with a defined multiplicity. Surprisingly, there are only eight main associations between the RIM classes, all of them binary and with their corresponding multiplicities. Figure 1 shows four of these associations.
    Each of the RIM classes has many subclasses, although only a few of them are explicitly shown in the diagrams of the HL7 RIM specification. There are many specialization/generalization relationships (called IsA relationships, e.g. Organization IsA Entity) in the HL7 models. The number of RIM classes and subclasses is over 2,500. Figure 1 shows seven subclasses of four of the backbone RIM classes and seven IsA relationships.
    D-MIM models refine the RIM in three ways:
    1) The participants of one of the eight main associations defined between RIM classes are refined in the subclasses. This is the refinement most often used in the HL7 models. Note that it is not allowed to add new associations.
    2) The multiplicities of an association defined between RIM classes are strengthened in the subclasses.
    3) The multiplicity of an attribute of a RIM class is strengthened in a subclass. An optional attribute in a RIM class can be made mandatory or not allowed in a subclass. Note that it is not allowed to add new attributes.
    R-MIM models refine D-MIM models in the same way. In all cases, the three kinds of refinements can be expressed using UML constructs.
    Figure 1 shows a few refinements related to the ActAppointment class. The instances of this class are appointments (a particular kind of Act). There may be several kinds of participations in an appointment. Figure 1 shows only two of them: PerformerOfActAppointment and SubjectOfActAppointment. To indicate that when the act is an appointment the participations must be instances of PerformerOfActAppointment or of SubjectOfActAppointment, we redefine the association Participation-Act as shown in the figure. Note that redefinition is a UML construct, which is very useful in situations like this one. The redefinition of the association Role-Participation is similar. The overall semantics of these redefinitions is that the performer of an appointment is a Person that plays the role AssignedPerson, and that the subject of an appointment is a Person that plays the role Patient.
    Sometimes, the UML redefinition construct does not allow the graphical representation of the strengthening of association multiplicities. In these cases, the redefinition must be formally captured by OCL invariants. For example, in Figure 1, the refinement of act in SubjectOfActAppointment also implies that an instance of ActAppointment is associated with a non-empty set (1..*) of SubjectOfActAppointment. However, this cannot be expressed graphically, and an OCL invariant must be used instead [8].
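    The multiplicity constraint mentioned above (every ActAppointment has at least one SubjectOfActAppointment) can be paraphrased as an executable check. The paper expresses it as an OCL invariant over the UML model; the Python sketch below only mimics its meaning on toy objects, and the class layout and the `has_subject` helper are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Participation:
    kind: str  # e.g. "SubjectOfActAppointment" or "PerformerOfActAppointment"


@dataclass
class ActAppointment:
    participations: list = field(default_factory=list)


def has_subject(appointment):
    """True when the 1..* multiplicity towards SubjectOfActAppointment holds."""
    return any(p.kind == "SubjectOfActAppointment" for p in appointment.participations)


valid = ActAppointment([Participation("SubjectOfActAppointment"),
                        Participation("PerformerOfActAppointment")])
invalid = ActAppointment([Participation("PerformerOfActAppointment")])
```

    Here `has_subject(valid)` holds and `has_subject(invalid)` does not, which is the situation the invariant is meant to rule out.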
  • 8. Figure 1 also shows the redefinitions of associations player-playedRole and scoper-scopedRole between Entity and Role. The player and the scoper of an AssignedPerson and of Patient must be a Person and an Organization, respectively.

    III. IMPORTANCE OF HL7 CLASSES
    Our filtering method is based on the concept of class importance. The importance of a class is a real number that measures the relative importance of that class in a model. We will see in the next section that we use that importance to select which classes are shown to the users.
    There exist different kinds of methods to compute the importance of classes in the literature. The simplest family of methods is that based on occurrence counting [9]–[11], where the importance of a class is equal to the number of characteristics the class has represented in the model. These methods are class centered in the sense that the importance of a class depends only on the information the class has. Therefore, the more information about a class, the more important it will be.
    Another family of methods are those based on link analysis [11], [12], where the importance of a class is defined as a combination of the importance of the classes that are connected to it with associations and/or IsA relationships. Such recursive definition results in an equation system and indicates that the more important the classes connected to a class are, the more important such class will be. In these methods the importance is shared through connections, changing from a class centered philosophy to a more interconnected approach of the importance. Iterative methods are required to solve the importance equation system, which increases the computational cost of this kind of methods.
    Finally, there are some methods that even use the information about the existing instances of the classes and the associations of the model. Therefore, the importance they compute takes into account the structural part of the model but also the data that the classes instantiate. The problem with this family of instance-dependent methods [13], [14] is that without instances the method cannot be used.
    As an example, Table I shows the 10 most important classes of the HL7 models (1) computed using the CEntityRank importance algorithm (see 3.6 of [15]). To compute this importance, the method takes into account the classes, the IsA relationships between them, the attributes and their multiplicities, the associations and their multiplicities, the association redefinitions and the OCL invariants.
    (1) The results have been obtained taking into account the RIM and the following D-MIM models: Laboratory, Account and Billing, Scheduling and Medical Records [1].

    Table I. TOP-10 MOST IMPORTANT CLASSES.
    Rank  Class                 Importance
    1     Act                   7.51
    2     Role                  5.11
    3     ActRelationship       4.03
    4     Participation         3.67
    5     Entity                3.5
    6     Observation           2.64
    7     InfrastructureRoot    1.81
    8     Organization          1.72
    9     RoleLink              1.59
    10    FinancialTransaction  1.54

    The in-depth study of the computation of the importance of classes is beyond the scope of this paper. A review of methods to compute the importance taking into account different levels of knowledge is given in [15].
    The filtering method described in the next sections can be used in connection with any of the existing methods for computing the importance of classes.

    IV. INTEREST OF HL7 CLASSES
    The importance of a class is an absolute metric that depends only on the whole set of HL7 models. The metric is useful when a user wants to know which are the most important classes, but it is of little use when the user is interested in a specific subset of classes, independently from their importance. What is needed then is a metric that measures the interest of a class with respect to such set, that we call filter set.
    A filter set FS of classes is a non-empty set of classes from the HL7 models. The filter set comprises the minimum set of classes in which a user is interested at a particular moment. For example, if the user wants to see what is the knowledge the models have about classes Patient and ActAppointment, then she defines FS = {Patient, ActAppointment}. We will see in the next section that starting from this filter set, our filtering method retrieves the knowledge represented in the models about Patient and ActAppointment that is likely to be of more interest to the user.
    Additionally, it is possible to define a set of classes not to be considered in the filtering method. We call such set the rejection set RS.
    Intuitively, the interest to a user of a class c with respect to a filter set FS should take into account both the absolute importance of c (as explained in the previous section) and a closeness measure of c with regard to the classes in FS. For this reason, we define:

    \Phi(c, FS) = \alpha \times \Psi(c) + (1 - \alpha) \times \Omega(c, FS) \qquad (1)

    where Φ(c, FS) is the interest of class c with respect to FS, Ψ(c) the absolute importance of class c, and Ω(c, FS) is the closeness of class c with respect to FS.
    Note that α is a balancing parameter in the range [0,1] to set the preference between closeness and importance for
  • 9. the retrieved knowledge. An α > 0.5 benefits importance against closeness while an α < 0.5 does the opposite. The default α value is set to 0.5 and can be modified by the user.
    There may be several ways to compute the closeness Ω(c, FS) of class c with respect to the classes of FS. Intuitively, the closeness of class c should be directly related to the inverse of the distance of c to the filter set FS. For this reason, we define:

    \Omega(c, FS) = \frac{|FS|}{\sum_{c' \in FS} d(c, c')} \qquad (2)

    where |FS| is the number of classes of FS and d(c, c') is the minimum distance between a class c and a class c' belonging to the filter set FS. Intuitively, those classes that are closer to more classes of FS will have a greater closeness Ω(c, FS).
    We assume that a pair of classes c, c' are directly connected to each other if there is a direct association (or redefinition of association) between them or if one class is a direct subclass of the other. For these cases, d(c, c') = 1. Otherwise, when c, c' are not directly connected, d(c, c') is defined as the length of the shortest path between them traversing associations and/or ascending/descending through class hierarchies.
    As an example, Table II shows the top-10 classes with a greater value of interest when the user defines FS = {Patient, ActAppointment} and α = 0.5. Results in Table II indicate that included within the top-10 there are classes that are directly connected to all members of the filter set FS = {Patient, ActAppointment}, as in the case of SubjectOfActAppointment (Ω(SubjectOfActAppointment, FS) = 1.0), but also classes that are not directly connected to any class of FS (although they are closer).

    V. FILTERING HL7 INFORMATION MODELS
    We have developed a method for filtering large models, and we have used the HL7 models as a case study for developing and experimenting with the method and its associated tool. The method consists of four consecutive steps. The characteristics of each step are detailed below. Figure 2 presents an overview of the method and steps. Intuitively, from a small subset of classes selected by the user the method automatically obtains a filtered information model with knowledge of interest.
    Figure 2. Method Overview.

    Step 1: Setting the User Preferences
    The first step of the method consists of preparing the required information to filter the HL7 information models according to the user preferences. Basically, the user focuses on a set of classes (filter set) she is interested in and our method surrounds them with additional knowledge from the HL7 models. Therefore, it is mandatory for the user to select a non-empty initial filter set FS. An example to obtain knowledge about patients and appointments in HL7 can be FS = {Patient, ActAppointment}.
    In the same way, the user can specify a rejection set RS (may be empty) with those classes that are of no interest to her.
    In addition to the filter set, the user can decide the amount of knowledge she wants to obtain by indicating the number of additional classes (Cmax) the method has to include in the filtered information model.
    Apart from that, the user has the possibility to choose which importance method (see Section III) is to be used in the following step. Also, she can include her preference between closeness and importance by setting a value for the balancing parameter α (see (1) in Section IV).
    Note that RS, Cmax, the importance method and the parameter α have default values (RS = ∅, Cmax = […], the default importance method is CEntityRank and α = 0.5) and therefore are all optional. The user interaction is required only in this step.
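    Equations (1) and (2), together with the selection of the top Cmax classes outside the filter set, can be sketched in Python as follows. This is a minimal sketch, not the authors' tool: the four-edge graph is a hypothetical fragment chosen so that the distances match three rows of Table II, all function names are assumptions, and ties are broken by class name (the paper leaves tie handling open).

    One observation about the numbers, inferred from the table rather than stated in the paper: with α = 0.5 and Ψ as printed, (1) does not reproduce the Φ column of Table II. The printed Φ values are matched to about four decimals when Ψ is read as a percentage (0.11 becomes 0.0011) and α = 0.3, so the sketch uses that reading.

```python
from collections import deque

# Hypothetical fragment of the HL7 class graph (associations and IsA links
# treated alike as undirected edges of length 1).
EDGES = [
    ("Patient", "SubjectOfActAppointment"),
    ("SubjectOfActAppointment", "ActAppointment"),
    ("Organization", "Patient"),
    ("LocationOfActAppointment", "ActAppointment"),
]


def build_graph(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distances_from(graph, source):
    """Breadth-first search: shortest path length from source to every class."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def closeness(graph, c, filter_set):
    """Equation (2): |FS| divided by the sum of distances from c to each class in FS.

    Assumes c lies outside FS and is reachable from every class of FS.
    """
    total = sum(distances_from(graph, f)[c] for f in filter_set)
    return len(filter_set) / total


def interest(psi, omega, alpha=0.5):
    """Equation (1): linear combination of importance and closeness."""
    return alpha * psi + (1 - alpha) * omega


def interest_set(graph, importance, filter_set, c_max, alpha=0.5,
                 rejection_set=frozenset()):
    """Rank classes outside FS and RS by interest and keep the top c_max."""
    candidates = [c for c in graph
                  if c not in filter_set and c not in rejection_set]
    scored = {c: interest(importance.get(c, 0.0),
                          closeness(graph, c, filter_set), alpha)
              for c in candidates}
    # Ties are broken by class name; this is a choice made for the sketch.
    return sorted(scored, key=lambda c: (-scored[c], c))[:c_max]


GRAPH = build_graph(EDGES)
FS = {"Patient", "ActAppointment"}
# Importance values from Table II, read as percentages (an assumption).
PSI = {"SubjectOfActAppointment": 0.0011,
       "Organization": 0.0172,
       "LocationOfActAppointment": 0.0026}
top = interest_set(GRAPH, PSI, FS, c_max=2, alpha=0.3)
```

    Running one breadth-first search per filter class, rather than one per candidate class, follows the remark in the paper that only the distances from each class of FS to the classes outside FS are needed.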
Table II
MOST INTERESTING CLASSES WITH REGARD TO FS = {Patient, ActAppointment}

Rank  Class (c)                       Ψ(c)  d(c, Patient)  d(c, ActAppointment)  Ω(c, FS)  Φ(c, FS)
1     SubjectOfActAppointment         0.11  1              1                     1.0       0.7003
2     Organization                    1.72  1              3                     0.5       0.3552
3     Person                          1.22  1              3                     0.5       0.3537
4     ServiceDeliveryLocation         0.79  2              2                     0.5       0.3524
5     AssignedPerson                  0.72  2              2                     0.5       0.3522
6     ManufacturedDevice              0.55  2              2                     0.5       0.3517
7     LocationOfActAppointment        0.26  3              1                     0.5       0.3508
8     ReusableDeviceOfActAppointment  0.19  3              1                     0.5       0.3506
9     SubjectOfAccountEvent           0.13  1              3                     0.5       0.3504
10    AuthorOfActAppointment          0.12  3              1                     0.5       0.3503

On the other hand, to compute the closeness Ω(c, FS) of an HL7 class with regard to the filter set FS, it is necessary to know the minimum distances between classes in the HL7 models (see (2) in Section IV). However, only the distance from each class in the filter set to any class outside FS has to be computed, which has a lower computational cost. Note that the method computes the closeness only for those classes that are outside the filter set.

Step 3: Select Interest Set

The third step of the method consists of computing the interest (Φ) of each class outside FS. As previously shown in (1) of Section IV, the interest Φ(c, FS) of a candidate class c to be included in the output model is a linear combination of the importance Ψ(c) and the closeness Ω(c, FS), taking into account the balancing parameter α. Note that if a non-empty rejection set RS was defined in the first step of our method, the classes included in that set will not be considered for the final result, nor will their interest Φ be computed.

The interest Φ produces a sorted ranking of HL7 classes, and the method selects the top classes of that ranking until reaching the limit Cmax specified in the first step. We call this set of classes the Interest Set. The second column of Table II shows the classes that belong to the Interest Set for FS = {Patient, ActAppointment} when Cmax = 10.

If two or more classes obtain the same interest, our method is non-deterministic: it might select any of them. Some enhancements can be made to avoid selecting classes in a random manner, such as prioritizing the classes with a higher value of closeness or importance (or any other measure) in case of ties.

Step 4: Compute Filtered Information Model

Finally, the last step of the method takes the Interest Set of classes from the previous step and puts it together with the classes of the filter set FS in order to create a filtered information model with the classes of both sets.

The main goal of this step is to filter, from the whole set of HL7 information models, the information that involves classes in the filtered model. To achieve this goal, the method explores the associations, redefinitions of associations, and generalization/specialization relationships that are defined between those classes in the HL7 information models, and includes them in the filtered model to obtain a connected model. The filtered information model for FS = {Patient, ActAppointment} and the previous Interest Set is shown in Figure 3.

Our method also takes into account associations that are specified between superclasses of classes included in the filtered information model, and brings them down to connect such subclasses. An example of that behaviour is the association between Participation and ActAppointment in Figure 3. This association is originally defined between Participation and Act (see Figure 1). Given that ActAppointment is a subclass of Act, the association is descended to the context of ActAppointment to indicate that the connection with Participation exists although Act was not included in the Interest Set.

When descending an association, there exists the case in which the association could be repeated. Figure 3 shows the association between Participation and ActAppointment. Note that Participation is not a member of the Interest Set (see Table II). However, Participation has been included in the filtered information model as an auxiliary class (marked in Figure 3 with a light gray color). The rationale is that otherwise the association would have to be descended between ActAppointment and each of the five subclasses of Participation present in the Interest Set (SubjectOfActAppointment, AuthorOfActAppointment, ReusableDeviceOfActAppointment, LocationOfActAppointment and SubjectOfAccountEvent), which is not a UML-compliant situation.

To avoid repeated associations, our method finds the lowest common parent (LCP) of the previous subclasses, which in this case is Participation, includes it in the filtered information model as an auxiliary class, and descends the
association to such LCP class. The same situation occurs for RoleClassAssociative and RoleChoice, which are LCP classes included as auxiliary classes in the filtered information model of Figure 3.

Figure 3. Filtered Information Model for FS = {Patient, ActAppointment}.

Besides, if there are two classes in the filtered information model such that one is an indirect subclass of the other in the HL7 models, our method creates an IsA relationship between them in the filtered information model (marked as indirect) to indicate such knowledge. Figure 3 shows that the five subclasses of Participation and the four of RoleClassAssociative are indirect subclasses by marking those IsA relationships in a light gray color. In the case of RoleChoice, its subclasses are directly connected to it by means of IsA relationships (marked in ordinary black).

Finally, the filtered information model presented in Figure 3 shows information about two HL7 domains: the Scheduling domain and the Account and Billing domain. By using our filtering method, a user who wanted to know about patients and appointments discovers that patients are also related to account events. This way, the user can easily compose another filter set, such as FS = {Patient, SubjectOfAccountEvent}, to get more knowledge about them in a new iteration of our method.

VI. EVALUATION

Our filtering method and prototype tool provide support for the task of extracting knowledge from the HL7 models, which has normally been done manually or with little support.

Finding a measure that reflects the ability of our method to satisfy the user is a complicated task. However, there exists related work [16], [17] about some measurable quantities in the field of information retrieval that can be applied to our context:

• The ability of the method to withhold non-relevant knowledge (precision)
• The interval between the request being made and the answer being given (time)

Precision Analysis

A correct method must retrieve the relevant knowledge according to the user preferences. The precision of a method is defined as the percentage of relevant knowledge presented to the user.

In our context, we use the concept of precision applied to HL7 universal domains (specified with D-MIMs). Each domain contains a main class, which is the central point of knowledge for the users interested in that domain. The other classes present in the domain make up the relevant knowledge related to the main class.

HL7 professionals interested in a particular domain decide through ballots which knowledge to incorporate in it. Thus, a common situation for a user is to focus on the main class of a domain and to navigate through the D-MIM to understand its related knowledge.

To measure the precision of our method, we simulate the generation of a D-MIM from its main class. We define a single-class filter set with that class and set Cmax to the size of the domain. This way, we obtain a filtered information model with the same number of classes as the domain.

In one iteration of our method, we obtain two groups of classes within the resulting filtered information model: the classes relevant to the user, that is, the ones that were originally defined in the D-MIM by experts, and the non-relevant ones. The precision of the result is defined as the fraction of relevant classes over the total Cmax.

To refine the obtained result, the non-relevant classes are included in a rejection set RS and the method is executed
again taking into account RS. It is expected that the filtered information model resulting from this step will have a greater precision. In this manner, at each iteration the classes that are not relevant to the user are rejected, and we know that in a finite number of steps our filtering method will obtain all the classes of the original domain. The smaller the number of iterations required to obtain such domain, the better the method.

Figure 4 shows the number of iterations needed to reach the maximum precision for four of the HL7 domains. Note that the right side of Figure 4 zooms in on the first five iterations. The test reveals that only three iterations are required to reach more than 80% of the relevant classes of a domain.

Figure 4. Precision analysis for HL7 domains. [Two panels: "Precision" and "Precision (Zoom Iterations 1-5)"; x-axis: Iterations; y-axis: Precision (%); series: Medical Records, Scheduling, Account and Billing, Laboratory.]

Time Analysis

It is clear that a good method does not only require precision; it also needs to present the results within a time that is acceptable to the user.

To find the time spent by our method, it is only necessary to record the time lapse between the request of knowledge, i.e. once a filter set FS has been indicated by the user, and the receipt of the filtered information model.

It is expected that as we increase the size of the filter set, the time will increase linearly. Our method computes the distances from each class in the filter set to all the remaining classes. This computation requires the same time (on average) for each class in the filter set. Therefore, the more classes we have in a filter set, the more time our method spends computing distances.

In our experimentation, we set our prototype tool to apply the filtering method several times with an increasing number of classes in the filter set. The average results for sizes from a single-class filter set up to a 40-class filter set are presented in Figure 5.

Figure 5. Time analysis for different sizes of FS. [Plot "Average Time"; x-axis: Filter Set Size (0 to 40); y-axis: Time (s).]

According to the expected use of our method, having a filter set FS of 40 classes is not a common situation (although possible). Filter sets of up to 10 classes are more realistic, in which case the average time does not exceed one second.

VII. CONCLUSIONS

HL7 information models are very large. The wealth of knowledge they contain makes them very useful to their potential target audience. However, the size and the organization of these models make it difficult to manually extract knowledge from them. This task is basic for the improvement of services provided by HL7 affiliates, vendors and other organizations that use those models for the development of health systems.

What is needed is a tool that makes HL7 models more usable for that task. We have presented a method that makes it easier to automatically extract knowledge from the HL7 models. The input to our method is the set of classes the user is interested in. The method computes the interest of each class with respect to that set as a combination of its importance and closeness. Finally, the method selects the most interesting classes from those models, including their defined knowledge in the original models (e.g. associations,
redefinition of associations, IsA relationships).

The experiments we have done clearly show that the proposed method and its associated tool provide an easier way to extract knowledge from the models. Concretely, our prototype tool recovers more than 80% of the knowledge of a D-MIM in three iterations, with an average time per iteration that for common uses does not exceed one second.

We plan to continue our work along three directions. The first is to include all HL7 models in our tool to give full support to all HL7 communities. Currently we have four D-MIMs. Experimentation with the full set of models will allow us to improve the method.

We also plan to experiment with the latest definition and nomenclature of HL7 models published by HL7 International. Basically, it specifies a new level on top of the RIM model that consists of a domain analysis model (DAM) to describe business processes and use cases, and a localized information model (LIM) at the bottom of the model types to adapt the R-MIMs to locale-specific requirements for structure and terminology. Taking these two new models into account is a challenge that will improve our work.

Finally, another research area to explore consists of generating traceability links from the elements in the filtered model to the original models, so that it is easy to find out the origin of each element. Keeping such backward links improves the integration of different models in an interoperability context. Also, our method and tool, once they implement traceability, could be used as an aid in the design of implementation guides for HL7 interoperability artifacts (HL7 V3 messaging and CDA R2 documents).

ACKNOWLEDGMENT

The authors want to thank Diego Kaminker, HL7 Education WG co-chair and HL7 International Mentoring Committee co-chair, Carles Gallego, current Chair of HL7 Spain, and Dr. Joan Guanyabens, former Chair of HL7 Spain, for their collaboration.

We would also like to thank the people of the GMC group for their useful comments on previous drafts of this paper. This work has been partly supported by the Ministerio de Ciencia y Tecnología under the project TIN2008-00444, Grupo Consolidado.

REFERENCES

[1] Health Level Seven International, “HL7 web,” Feb. 2010. [Online]. Available: http://www.hl7.org

[2] R. Dolin, L. Alschuler, C. Beebe, P. Biron, S. Boyer, D. Essin, E. Kimber, T. Lincoln, and J. Mattison, “The HL7 clinical document architecture,” Journal of the American Medical Informatics Association, vol. 8, no. 6, pp. 552–569, 2001.

[3] R. Dolin, L. Alschuler, S. Boyer, C. Beebe, F. Behlen, P. Biron, and A. Shabo, “HL7 clinical document architecture, release 2,” Journal of the American Medical Informatics Association, vol. 13, no. 1, pp. 30–39, 2006.

[4] J. Conesa, V. C. Storey, and V. Sugumaran, “Usability of upper level ontologies: The case of researchcyc,” Data & Knowledge Engineering, vol. 69, no. 4, pp. 343–356, 2010.

[5] A. Danko, R. Kennedy, R. Haskell, I. Androwich, P. Button, C. Correia, S. Grobe, M. Harris, S. Matney, and D. Russler, “Modeling nursing interventions in the act class of HL7 RIM Version 3,” Journal of Biomedical Informatics, vol. 36, no. 4-5, pp. 294–303, 2003.

[6] J. Lyman, S. Pelletier, K. Scully, J. Boyd, J. Dalton, S. Tropello, and C. Egyhazy, “Applying the HL7 reference information model to a clinical data warehouse,” in IEEE International Conference on Systems, Man and Cybernetics, vol. 5, 2003, pp. 4249–4255.

[7] OMG, Unified Modeling Language: Superstructure, version 2.1.1, Object Management Group, February 2007.

[8] OMG, Object Constraint Language, version 2.0, Object Management Group, May 2006.

[9] S. Castano, V. De Antonellis, M. G. Fugini, and B. Pernici, “Conceptual schema analysis: techniques and applications,” ACM Transactions on Database Systems, vol. 23, no. 3, pp. 286–333, 1998.

[10] D. L. Moody and A. Flitman, “A methodology for clustering entity relationship models - a human information processing approach,” in Conceptual Modeling - ER 1999, 18th International Conference on Conceptual Modeling, ser. Lecture Notes in Computer Science, vol. 1728. Springer, 1999, pp. 114–130.

[11] Y. Tzitzikas, D. Kotzinos, and Y. Theoharis, “On Ranking RDF Schema Elements (and its Application in Visualization),” Journal of Universal Computer Science, vol. 13, no. 12, pp. 1854–1880, 2007.

[12] Y. Tzitzikas and J.-L. Hainaut, “How to tame a very large ER diagram (using link analysis and force-directed drawing algorithms),” in Conceptual Modeling - ER 2005, 24th International Conference on Conceptual Modeling, ser. Lecture Notes in Computer Science, vol. 3716. Springer, 2005, pp. 144–159.

[13] C. Yu and H. V. Jagadish, “Schema summarization,” in VLDB 2006, 32nd International Conference on Very Large Data Bases, 2006, pp. 319–330.

[14] X. Yang, C. M. Procopiuc, and D. Srivastava, “Summarizing relational databases,” in VLDB 2009, 35th International Conference on Very Large Data Bases, 2009, pp. 634–645.

[15] A. Villegas and A. Olivé, “On computing the importance of entity types in large conceptual schemas,” in Advances in Conceptual Modeling - Challenging Perspectives, ER 2009 Workshops, ser. Lecture Notes in Computer Science, vol. 5833. Springer, 2009, pp. 22–32.

[16] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. Addison Wesley, 1999.

[17] C. Van Rijsbergen, “Information Retrieval,” Cataloging & Classification Quarterly, vol. 22, no. 3, 1996. [Online]. Available: http://www.dcs.gla.ac.uk/Keith/Preface.html