CEIB: R&D services in bioimaging oriented to integration of environments with EHR on Cloud Computing

Maria de la Iglesia Vayá #*1, Jose Maria Salinas #*2, Rosa Valenzuela #3, Fernando Gomez #4, Luis Martí-Bonmatí #5
# CEIB, Valencian Health Service, Spain
1 firstname.lastname@example.org, 2 email@example.com, 3 firstname.lastname@example.org, 4 email@example.com, 5 firstname.lastname@example.org
* These authors have contributed equally to this work.

Abstract: The cloud-based system for bioimaging management and knowledge extraction proposed in this article (R & D Cloud CEIB) builds on the services offered by the centralization of bioimaging in the Valencian Medical Imaging Biobank (GIMC, from its Spanish name). It manages a bioimaging bank and extracts knowledge from it, delivering that knowledge to the Electronic Health Record (EHR) as expert services with high added value. This brings the results of R & D to the patient and improves the quality of the information the EHR contains. R & D Cloud CEIB has four general modules: search engine (SE), clinical trials manager (GEBID), anonymizer (ANON) and Bioimage Knowledge Engine (BIKE). BIKE is the central module; through its submodules it analyzes images and generates knowledge that is provided to the EHR through services. The technology used in R & D Cloud CEIB is based entirely on open source.

I. INTRODUCTION

Bioimaging has become one of the most innovative multidisciplinary fields of medical research, given the important role it plays in the diagnosis of disease. New needs and improved technology require us to look for the best proposals to promote diagnosis based on medical imaging, with the help of innovative technologies and signal analysis and through the optimization of our platforms. That is why display systems for medical imaging diagnosis, among others, are considered cutting-edge medical devices that should incorporate advanced analysis techniques to meet the expectations that society has of public services. One such technique is the standardization of imaging biomarkers, which adds value to imaging by providing objective measures to identify, quantify and monitor underlying pathophysiologic processes that a subjective observer cannot detect.

In the Valencian Community, the Health Information System of the Valencian Health Service (AVS) is a particularly large portfolio that offers an assortment of highly specialized solutions. The centralization of bioimaging through the Valencian Medical Imaging Biobank (GIMC) will support R & D for the scientific community through the implementation of logic services and the retrieval of specific bioimaging sets. To exploit the valuable information stored in the GIMC, a logical bus of R & D services in the cloud, based on open source technologies, was designed; we call it R & D Cloud CEIB. It is intended to provide highly specialized, excellent services in biomedical imaging as a portfolio in this area, through the service-oriented architecture (SOA) that is being implemented in the AVS.

R & D Cloud CEIB includes services that enable data mining on the information recorded by traditional medical bioimaging capture devices in the headers of the DICOM standard format; advanced bioimaging postprocessing techniques through open source libraries (FSL, etc.); and different diagnostic tools, among them the bioimaging classifier based on the optimal selection of visual biomarkers, which is discussed later on.

The article continues with an overview of the centralized imaging system in the AVS (GIMC) and a short description of the proposed system, R & D Cloud CEIB, defining each of its modules to give a better understanding of the overall system. The bioimaging classifier and its general features are then presented. The last section presents the conclusions and future work and outlines the main objectives of R & D Cloud CEIB.

II. SYSTEM OVERVIEW

In the field of bioimaging, centralized storage systems are a reference within the strategic framework of the AVS and of the European Community (Euro-Bioimaging, whose vision is "to provide a clear path of access to imaging technologies for every biomedical scientist in Europe"), which is creating a Europe-wide plan for infrastructures of this type that are harmonized and coordinated among all the nodes involved. As a result of the creation of the GIMC, images of patients from the entire population of the Valencian Community, gathered through departmental picture archiving and
communication systems (PACS), will form the knowledge base of the future scientific community in our society through the R & D services presented here.

Fig. 1 Centralized imaging system of AVS.

The architecture of R & D Cloud CEIB consists of the following elements: bioimaging bank, scientific community, search engine, anonymizer, clinical trials manager and knowledge engine. The following sections describe each of these elements.

Fig. 2 Structure of R & D Cloud CEIB

A. Bioimaging bank (GIMC)

The Valencian Medical Imaging Biobank (GIMC) is the system in charge of the centralized storage of all the bioimaging of the AVS. Its sources are all the images generated in the different health centers across the Valencian Community, obtained through a synchronized copy of their internal PACS. The GIMC comprises three blocks: the storage, which manages the optimized storage of all the images collected in DICOM format; the database, which manages the DICOM headers of the received images in a relational database (storage and index); and the application server, which abstracts these two blocks behind an application layer that facilitates the management of the biobank image information. The GIMC provides access to all of the AVS corporate applications, such as Orion Clinic (specialized care management), Abucasis (primary care management) and other applications, through DICOM web access services (WADO). The GIMC is also the storage basis for the retrieval systems (search engines) implemented in R & D Cloud CEIB, which provide indexed image blocks to the knowledge engine and the clinical trials manager.

Fig. 3 Centralized imaging system of AVS (GIMC) with TELVENT solution.

B. Scientific community (SC)

One of the main goals of the proposed system is to offer the scientific community a basis for clinical trials built from subsets of images from the GIMC. The scientific community can make structured requests to the GIMC. The system provides the scientific community with tools to manage this information, as well as a set of advanced bioimaging postprocessing tools.

C. Anonymizer (ANON)

In compliance with the Data Protection Act, all images supplied from the GIMC must be provided in anonymized form, so that the patient's identity is always protected. The system allows different levels of anonymization, from the alteration of the text information in the DICOM headers up to image-level deformation of the parts that could identify the patient (especially in neuroimaging obtained by magnetic resonance). The anonymizer includes a restricted module that stores the information needed to reverse the anonymization process, so that, if the specific needs of a trial require more information about a patient under study, the system can provide it.

D. R & D Bioimage trials manager system (GEBID)

The R & D bioimaging trials manager system provides the scientific community with a platform to help manage information from clinical trials. GEBID is based on a customized instance of XNAT (eXtensible Neuroimaging Archive Toolkit). XNAT is an open source platform designed to facilitate the management of image sets and associated data (assessments, reconstructions and any other information). It was initially
designed to work with neuroimaging, but its open data model and customizable XML-based technologies allow the platform to be adapted to any type of bioimaging. XNAT follows a three-tier architecture that includes a data archive, a user interface and a middleware engine. Data can be incorporated into the platform in different ways, such as XML files, web forms, or DICOM transfers from image capture devices or image viewers such as OsiriX. Among its most important features are personalized secure access to information, quality control of data and image information, classification and storage of data, custom searches, communication with the systems that generate bioimaging, programmable process flows using scripts (pipelines), the incorporation of intermediate results and conclusions into the study, and the logging of all actions taken, for quality control and monitoring. All these features make XNAT an ideal platform for the management of clinical trials.

Fig. 4 Bioimage trials manager system (GEBID)

E. Bioimaging Knowledge Engine (BIKE)

The knowledge engine of R & D Cloud CEIB (BIKE) consists of four modules: bioimaging postprocessing aid, definition and quantification of biomarkers, study of DICOM headers, and bioimaging classifier. These modules are described in the following sections.

1. Module of bioimaging postprocessing aid (BIKE-Postprocessing)

The digital processing of data obtained by medical image acquisition equipment can extract information that goes beyond the simple observation of images on film or on the monitors of diagnostic services. Digital bioimaging processing makes it possible to delineate the anatomy of the area under study precisely and to obtain functional, and even molecular, information. With this service, BIKE equips the system with a set of tools based on open source graphics libraries (FSL) that support bioimaging postprocessing in clinical trials through GEBID. These tools may be used individually or chained sequentially through process management applications such as the LONI Pipeline. Given the complexity of many of the postprocessing techniques required to calculate results, the system will leverage the cloud architecture to enhance parallel processing. BIKE-Postprocessing serves as the basis for all the bioimaging analysis needed in other BIKE modules, such as the bioimaging classifier and the module for defining and quantifying biomarkers.

2. Module of defining and quantifying biomarkers (BIKE-Image)

Imaging biomarkers are objective features extracted from medical images that relate to normal biological processes, diseases or therapeutic responses. In recent years it has been shown that imaging biomarkers complement traditional radiologic diagnosis with useful information to establish the presence of an alteration or lesion, to measure its biological status, to define its natural history and progress, to stratify abnormal phenotypes and to evaluate the effects of treatment. Developing an imaging biomarker requires a series of steps designed to validate its relationship to the reality studied and to check its reliability, both clinical and technical. The BIKE-Image module provides all the tools needed to carry these steps out effectively, from simple measurements of size or shape to the implementation of complex models. This facilitates the definition of proofs of concept and mechanism; standardized and optimized acquisition of anatomical, functional and molecular images; analysis of data using computer models; adequate visualization of the results; appropriate statistical measures; and tests of principle, efficacy and effectiveness. For these processes the BIKE-Image module relies on the tools provided by BIKE-Postprocessing.

3. Module of study of DICOM headers (BIKE-Datamining)

DICOM is the standard format in medical imaging. At file level, in addition to the image information obtained from the radiological procedure, this format includes in its header text information such as patient demographics, clinical information, image quality control data, technical data about the capture device and image type, and many other attributes. The BIKE-Datamining module provides tools to exploit the information in the DICOM headers to create dashboards for analyzing indicators of quality, radiation, etc. These data can be used to generate specialized statistical reports that allow processes to be quantified and controlled, and to produce corporate reports in a structured format using DICOM-SR.

4. Bioimaging classifier module (BIKE-Classifier)

Starting from the definition of a clinical decision support system, an image decision support system (SADI) is a computer system that provides specific knowledge for interpreting medical images for the purposes of diagnosis, prognosis, treatment or the management of care processes. The functions of a SADI may include the search for findings associated with the diagnosis or prognosis of the patient, the planning and control of therapies and operations, the quality control of biomedical signals, and anomalous pattern matching in multicenter biobanks. The use of these systems can enhance medical skills in the
management of multiple variables in biomedical care processes and help balance the health service through the optimal use of the available resources and knowledge. As an example of experiences in other communities, a computer-aided diagnosis (CAD) system for mammography is already implemented in some hospitals in the Castilla-La Mancha community. It processes mammography images and generates an analysis that indicates the possible lesions, thus helping the radiologist with the diagnosis.

BIKE includes among its modules the BIKE-Classifier, a classification system that supports multi-class classification into a number of existing diagnostic groups. This classification is based on an optimal selection of biomarkers and visual characteristics. This feature selection (FS) can be performed using mutual information. FS is a combinatorial problem of high computational complexity, so FS methods must be oriented to finding suboptimal solutions in a feasible number of iterations. The BIKE-Classifier uses BIKE-Image to extract biomarkers and other quantified visual indicators, and BIKE-Postprocessing to extract visual features that are not based on biomarkers.

The system performs supervised learning. In supervised classification, models or algorithms learn from a set of instances or cases. Each instance is a vector of features labeled with a class variable. Formally, the problem of supervised classification is to assign a value of the class variable to a new instance; a classifier can be viewed as a function that assigns a class to each instance. In the first phase there are two sets of instances, a training or learning set (for designing the classifier) and a test or validation set (for checking the classification), which are used to build a model or rule for classification. In the second phase the actual classification takes place: objects or samples whose class is unknown are assigned to a class. The methods used so far are neural networks, Bayesian classifiers and support vector machines. In our case, we propose the use of support vector machines (SVM). The input to this process is the optimal selection of biomarkers and visual markers that allow classification into the diagnostic groups in the system.

Fig. 5 Bioimaging Knowledge Engine (BIKE)

III. CONCLUSIONS

The GIMC generated within the AVS is an ideal data source for analyzing images acquired with all bioimaging modalities. The proposed system, R & D Cloud CEIB, will make these bioimages available to the scientific community through different tools: the search engine (SE), the clinical trials manager (GEBID) and the knowledge engine (BIKE). BIKE provides services for data mining on DICOM headers (BIKE-Datamining), image postprocessing (BIKE-Postprocessing), definition and quantification of biomarkers (BIKE-Image) and classification (BIKE-Classifier). The main goal of R & D Cloud CEIB is for all the knowledge acquired in the system to reach the patient through the publication of web services available to the patient's electronic health record (EHR) system.

ACKNOWLEDGMENT

The authors would like to thank the people from Quantification Quirón for their feedback.

REFERENCES

FSL Group - http://www.fmrib.ox.ac.uk/fsl/
M. Jenkinson, C.F. Beckmann, T.E.J. Behrens, M.W. Woolrich, and S.M. Smith. FSL. NeuroImage, 2011. In press.
M.W. Woolrich, S. Jbabdi, B. Patenaude, M. Chappell, S. Makni, T. Behrens, C. Beckmann, M. Jenkinson, S.M. Smith. Bayesian analysis of neuroimaging data in FSL. NeuroImage, 45:S173-186, 2009.
S.M. Smith, M. Jenkinson, M.W. Woolrich, C.F. Beckmann, T.E.J. Behrens, H. Johansen-Berg, P.R. Bannister, M. De Luca, I. Drobnjak, D.E. Flitney, R. Niazy, J. Saunders, J. Vickers, Y. Zhang, N. De Stefano, J.M. Brady, and P.M. Matthews. Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage, 23(S1):208-219, 2004.
caBIG Community Website - https://cabig.nci.nih.gov/
Rex, D.E., Ma, J.Q., and Toga, A.W. (2003). "The LONI Pipeline Processing Environment." Neuroimage, 19(3), 1033-48.
Dinov ID, Lozev K, Petrosyan P, Liu Z, Eggert P, Pierce J, Zamanyan A, Chakrapani S, Van Horn JD, Parker DS, Magsipoc R, Leung K, Gutman B, Woods RP, Toga AW. (2010). "Neuroimaging Study Designs, Computational Analyses and Data Provenance Using the LONI Pipeline." PLoS ONE 5(9): e13070. doi:10.1371/journal.pone.0013070.
XNAT - Open source informatics for biomedical imaging research - http://www.xnat.org
Marcus, D.S., Olsen T., Ramaratnam M., and Buckner, R.L. (2007). The Extensible Neuroimaging Archive Toolkit (XNAT): An informatics platform for managing, exploring, and sharing neuroimaging data. Neuroinformatics 5(1): 11-34.
Postproceso en Imagen Médica: morfologia, funcional y molecular. José Vicente Manjón, Luis Martí-Bonmatí, Montserrat Robles, Bernardo Celda.
Biomarcadores de imagen, imagen cuantitativa y bioingeniería. L. Martí-Bonmatí, A. Alberich-Bayarri, G. García-Martí, R. Sanz Requena, C. Perez Castillo, J.M. Carot Sierra and J.V. Manjón Herrera. Radiología 2011.
DICOM - http://dicom.nema.org/
Downing G, Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints. Clin Pharmacol Therap. 2001;69:89-95.
Schuster DP. The opportunities and challenges of developing imaging biomarkers to study lung function and disease. Am J Respir Crit Care Med. 2007;176:224-30.
Van Beers B, Cuenod CA, Martí-Bonmatí L, Matos C, Niessen W, Padhani A, European Society of Radiology Working Group on Imaging Biomarkers. White paper on Imaging Biomarkers. Insights Imaging. 2010;1:42-5.
C. Campbell, Kernel methods: a survey of current techniques, Neurocomputing 2002, 48, 63-84.
Mammographic masses characterization based on localized texture and dataset fractal analysis using linear, neural and support vector machine classifiers. Mavroforakis ME, Georgiou HV, Dimitropoulos N, Cavouras D, Theodoridis S. Artif Intell Med. 2006 Jun;37(2):145-62.
Clustering technique-based least square support vector machine for EEG signal classification. Siuly, Li Y, Wen PP. Comput Methods Programs Biomed. 2011 Dec;104(3):358-72.
Boyan Bonev, Francisco Escolano and Miguel Cazorla. Feature selection, mutual information, and the classification of high-dimensional patterns: Applications to image classification and microarray data analysis. Pattern Analysis and Applications. Volume 11 Issue 3-4, August 2008.
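
As an illustration of the feature selection step described for the BIKE-Classifier (Section II.E.4), the following minimal Python sketch ranks discretized biomarkers by their mutual information with the diagnostic class and keeps the top k. It is not the authors' implementation: the function names, the biomarker names and the six sample studies are hypothetical, and ranking each feature individually is only one simple suboptimal heuristic for the combinatorial selection problem the paper mentions. The selected features would then be the input to a classifier such as an SVM, which is not shown here.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Estimate I(X; Y) in bits for two equal-length sequences of discrete values."""
    n = len(feature)
    count_x = Counter(feature)
    count_y = Counter(labels)
    count_xy = Counter(zip(feature, labels))
    mi = 0.0
    for (x, y), c_xy in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c_xy / n) * log2(c_xy * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(samples, labels, k):
    """Return the names of the k features with the highest mutual information with the class."""
    names = samples[0].keys()
    scores = {
        name: mutual_information([s[name] for s in samples], labels)
        for name in names
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical, already discretized biomarker values for six studies.
samples = [
    {"lesion_volume": "high", "texture": "coarse", "scanner": "A"},
    {"lesion_volume": "high", "texture": "fine", "scanner": "B"},
    {"lesion_volume": "high", "texture": "coarse", "scanner": "A"},
    {"lesion_volume": "low", "texture": "fine", "scanner": "A"},
    {"lesion_volume": "low", "texture": "fine", "scanner": "B"},
    {"lesion_volume": "low", "texture": "coarse", "scanner": "A"},
]
labels = ["disease", "disease", "disease", "control", "control", "control"]

selected = rank_features(samples, labels, k=2)
print(selected)  # ['lesion_volume', 'texture']
```

In this toy data the hypothetical lesion_volume feature determines the class exactly (1 bit of mutual information), texture is weakly informative, and scanner is independent of the class, so the first two are kept. Continuous biomarkers would first need to be discretized, or a continuous mutual information estimator used instead.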