Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform


Published on

Presentation held at Healthgrid Conference 2011 - Bristol, England

Abstract. This paper presents the architecture of the Virtual Imaging Platform supporting the execution of medical image simulation workflows on multiple computing infrastructures. The system relies on the MOTEUR engine for workflow execution and on the DIRAC pilot-job system for workload management. The jGASW code wrapper is extended to describe applications running on multiple infrastructures and a DIRAC cluster agent that can securely involve personal cluster resources with no administrator intervention is proposed. Grid data management is complemented with local storage used as a failover in case of file transfer errors. Between November 2010 and April 2011 the platform was used by 10 users to run 484 workflow instances representing 10.8 CPU years. Tests show that a small personal cluster can significantly contribute to a simulation running on EGI and that the improved data manager can decrease the job failure rate from 7.7% to 1.5%.

More information:

Published in: Technology, Health & Medicine
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform

  1. 1. VIPVirtual Imaging Platform Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform Rafael FERREIRA DA SILVA1, Sorina CAMARASU-POP1, Baptiste GRENIER3 Vanessa HAMAR2, David MANSET3, Johan MONTAGNAT4, Jérôme REVILLARD3 Javier ROJAS BALDERRAMA4, Andrei TSAREGORODTSEV2, Tristan GLATARD1 1 Université de Lyon, CNRS, INSERM, CREATIS 2 Centre de Physique des Particules de Marseille 3 maatG France 4 CNRS/UNS, I3S lab, MODALIS team Bristol, HealthGrid 2011 VIP ANR-09-COSI-013-02 1
  2. 2. VIPVirtual Imaging Platform Introduction   Simulation Computation: 16h Example: 2D+t ultrasound simulation [O. Bernard] Example: Prostate traitement in protontherapy Computation: 2 months [L. Grevillot, D. Sarrut] 8.5 CPU days US MRI CT PET VIP ANR-09-COSI-013-02 2
  3. 3. VIPVirtual Imaging Platform Introduction   European Grid Infrastructure (EGI)   High throughput for computation and data transfers   Challenges   High latencies   Low reliability VIP ANR-09-COSI-013-02 3
  4. 4. VIPVirtual Imaging Platform Our Goal   Multi-infrastructure workflow execution   Grid resources   Personal clusters (non-grid resources)   Improve data transfer reliability   Local reliable storage VIP ANR-09-COSI-013-02 4
  5. 5. VIPVirtual Imaging Platform VIP Architecture VIP ANR-09-COSI-013-02 5
  6. 6. VIPVirtual Imaging Platform Workflows   Virtual Imaging Platform (VIP)   Workflow Activities Data sources Data sinks Simulated US image of the heart   Core Simulation Workflows   MRI, US, CT and PET VIP ANR-09-COSI-013-02 6
  7. 7. VIPVirtual Imaging Platform Workflow Wrapping   Simulations are described as workflows   Workflows are interpreted and executed by MOTEUR   Core engine workflow interpreter   Data flow evaluation   Production of computational tasks   Workflow activities are described using jGASW   Code wrapper for distributed platforms and local executions   Grid jobs (bash scripts) are generated from processed data   Each jGASW descriptor bundles a unique executable VIP ANR-09-COSI-013-02 7
  8. 8. VIPVirtual Imaging Platform Pilot Jobs   Pull mode   Execution environment verification   Worker node reservation Moteur + jGASW User Jobs Pilot-jobs System gLite WMS X Worker Nodes DIRAC Architecture VIP ANR-09-COSI-013-02 8
  9. 9. VIPVirtual Imaging Platform Multi-infrastructure Execution   Design constraints   No intervention from the cluster administrator   No sharing or generic user accounts   Minimal technical assumptions on the cluster architecture   Setup   The agent is launched by the user on his cluster(s) account(s)   Download of a tar.gz bundle and run setup script   Support to PBS, BQS and SLURM   Execution   Queries WMS for user’s waiting tasks Personal cluster   Submits pilots to the local cluster queue VIP cluster bundle   Embedded data transfer client (VLET)! VIP ANR-09-COSI-013-02 9
  10. 10. VIPVirtual Imaging Platform Multi-infrastructure Execution   Conditions   EGI   134-node cluster limited to 67 pilots running concurrently   3 executions of a PET workflow   Results   Conclusion   Involving a small personal cluster in a simulation has a significant impact on the execution VIP ANR-09-COSI-013-02 10
  11. 11. VIPVirtual Imaging Platform Reliable Data Management   EGI three-tier data management   Challenge: data availability between 80-95%   Storage Elements (SE) – DPM, dCache, STORM, Castor   Logical File Catalog (LFC) – single index space   Current Data Management in VIP   Critical input files are replicated   Files are cached by pilot jobs   Output files are stored on site SE   Jobs error rate: 5-10%   Local Data Manager   Failover storage   Available for users and grid jobs   Overlay of DPM SE VIP ANR-09-COSI-013-02 11
  12. 12. VIPVirtual Imaging Platform Reliable Data Management   Data Management use case Input download Output upload VIP ANR-09-COSI-013-02 12
  13. 13. Impact of the Data Manager VIPVirtual Imaging Platform on Job Reliability   Conditions   EGI, biomed VO (production infrastructure)   Ultrasonic Simulation comprised of 128 jobs   Each job has 5 input files + 1 output file   Failure rate: 1%   Results   Conclusion   Job data transfers failure rate can be significantly decreased in the presence of a failover storage VIP ANR-09-COSI-013-02 13
  14. 14. VIPVirtual Imaging Platform Web Portal   Workflow submission and management   Workflow execution and monitoring VIP ANR-09-COSI-013-02 14
  15. 15. VIPVirtual Imaging Platform Web Portal   Detailed performance information VIP ANR-09-COSI-013-02 15
  16. 16. VIPVirtual Imaging Platform Web Portal   Authentication based on X.509 certificates VIP ANR-09-COSI-013-02 16
  17. 17. VIPVirtual Imaging Platform Web Portal   File transfer operations VIP ANR-09-COSI-013-02 17
  18. 18. VIPVirtual Imaging Platform Platform Usage Statistics   484 workflows executions (Nov 2010 – Apr 2011)   10 (real) users   5 application classes VIP ANR-09-COSI-013-02 18
  19. 19. VIPVirtual Imaging Platform Conclusions   VIP can execute workflows on multi-infrastructures   Extension of jGASW application description   Extension of DIRAC to support personal clusters with no administration intervention and respecting common security rules   Results show that a small personal cluster can significantly contribute to a simulation running on EGI   VIP Data Manager   A first test shows that job failure rate was decreased from 7.7% to 1.5%   Try it!   Requires a certificate registered in the Biomed VO  VIP ANR-09-COSI-013-02 19
  20. 20. VIPVirtual Imaging Platform Future Work   Workflow Execution   Scalability and Reliability   Memory shortage   Workflow checkpointing   Provenance of simulated data VIP ANR-09-COSI-013-02 20
  21. 21. VIPVirtual Imaging Platform Future Work   Simulators and Models   Biological object model sharing for image simulation   See poster FORESTIER et al. at CBMS 2011   Semantic catalog of biological models   3D visualization VIP ANR-09-COSI-013-02 21
  22. 22. VIPVirtual Imaging Platform Future Work   Simulators and Models   Multi-Modality Simulation Workflow   See poster MARION et al. at CBMS 2011 SIMRI (MRI) Sindbad (CT) VIP ANR-09-COSI-013-02 22
  23. 23. VIPVirtual Imaging Platform Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform Questions?! VIP ANR-09-COSI-013-02 23
  24. 24. VIPVirtual Imaging Platform References   Fabrizio Gagliardi, Bob Jones, Fran ̧ois Grey, Marc-Elian Bégin, and Matti Heikkurinen. Building an infrastructure for scientific grid computing: status and goals of the egee project. Phil. Trans. R. Soc. A 15, 363(1833):1729– 1742, aug 2005.   T. Li, S. Camarasu-Pop, T. Glatard, T. Grenier, and H. Benoit-Cattin. Optimization of mean-shift scale parameters on the egee grid. In Studies in health technology and informatics, Proceedings of Healthgrid 2010, volume 159, pages 203–214, 2010.   A. Marion, G. Forestier, H. Benoit-Cattin, S. Camarasu-Pop, P. Clarysse, R. Ferreira da Silva, B. Gibaud, T. Glatard, P. Hugonnard, C. Lartizien, H. Liebgott, J. Tabary, S. Valette, and D. Friboulet. Multi- modality medical image simulation of biological models with the Virtual Imaging Platform (VIP). In IEEE CBMS 2011, Bristol, UK, 2011. submitted.   Tristan Glatard, Johan Montagnat, Diane Lingrand, and Xavier Pennec. Flexible and efficient workflow deployement of data-intensive applications on grids with MOTEUR. International Journal of High Performance Computing Applications (IJHPCA), 22(3):347–360, August 2008.   Javier Rojas Balderrama, Johan Montagnat, and Diane Lingrand. jGASW: A Service-Oriented Frame- work Supporting High Throughput Computing and Non-functional Concerns. In IEEE International Conference on Web Services, ICWS 2010, Miami (FL), USA, July 2010. IEEE Computer Society.   A Tsaregorodtsev, N Brook, A Casajus Ramo, Ph Charpentier, J Closier, G Cowan, R Graciani Diaz, E Lanciotti, Z Mathe, R Nandakumar, S Paterson, V Romanovsky, R Santinelli, M Sapunov, A C Smith, M Seco Miguelez, and A Zhelezov. DIRAC3 . The New Generation of the LHCb Grid Software. Journal of Physics: Conference Series, 219(6):062029, 2009.   L. Grevillot, T. Frisson, D Maneval, N. Zahra, J.N. Badel, and D. Sarrut. Simulation of a 6 mv elekta precise linac photon beam using gate / geant4. Phys Med Biol, 56(4), 2011. VIP ANR-09-COSI-013-02 24
  25. 25. VIPVirtual Imaging Platform References   Silvia Olabarriaga, T. Glatard, and P.T. de Boer. A virtual laboratory for medical image analysis. IEEE T Inf Technol B, 14(4):979–985, 2010.   Vladimir V. Korkhov, Jakub T. Moscicki, and Valeria V. Krzhizhanovskaya. Dynamic workload bal- ancing of parallel applications with user-level scheduling on the grid. Future Generation Computer Systems, 25(1):28 – 34, 2009.   Ivo D. Dinov, John D. Van Horn, Kamen M. Lozev, Rico Magsipoc, Petros Petrosyan, Zhizhong Liu, Allan MacKenzie-Graham, Paul Eggert, Parker Douglas S., and Arthur W. Toga. Efficient, Distributed and Interactive Neuroimaging Data Analysis Using the LONI Pipeline. Frontiers in Neuroinformatics, 3(22):1–10, 2009.   Thierry Delaitre, Tamas Kiss, Ariel Goyeneche, Gabor Terstyanszky, Stephen Winter, and Péter Kacsuk. GEMLCA: Running Legacy Code Applications as Grid services. Journal of Grid Computing, 3(1):75– 90, 2005.   Kyle Chard, Wei Tan, Joshua Boverhof, Ravi Madduri, and Ian Foster. Wrap Scientific Applications as WSRF Grid Services Using gRAVI. In International Conference on Web Services, ICWS’09, Los Angeles (CA), USA, July 2009.   Martin Senger, Peter Rice, Alan Bleasby, Tom Oinn, and Mahmut Uludag. Soaplab2: More Reliable Sesame Door to Bioinformatics Programs. In Bioinformatics Open Source Conference, BOSC’08, Toronto, ON Canada, July 2008.   Douglas Thain, Todd Tannenbaum, and Miron Livny. Condor and the Grid, pages 299–335. John Wiley & Sons, Ltd, 2003.   Vinod Kasam, Jean Salzemann, Marli Botha, Ana Dacosta, Gianluca Degliesposti, Raul Isea, Do- man Kim, Astrid Maass, Colin Kenyon, Giulio Rastelli, Martin Hofmann-Apitius, and Vincent Breton. Wisdom-ii: Screening against multiple targets implicated in malaria using computational grid infrastruc- tures. Malaria Journal, 8(1):88, 2009. VIP ANR-09-COSI-013-02 25
  26. 26. VIPVirtual Imaging Platform Workflow Wrapping   Multi-infrastructure code descriptor (jGASW extension) VIP ANR-09-COSI-013-02 26