Dai Yang, Josef Weidendorfer, Tilman Küstner and Carsten Trinitis
Chair of Computer Architecture
Technical University of Munich (TUM)
Sibylle Ziegler
Klinik und Poliklinik für Nuklearmedizin,
Ludwig-Maximilians-Universität München
14 September 2017
Enabling Application Integrated Proactive Fault Tolerance
ENVELOPE – Efficiency and Reliability: Self-organisation in HPC Systems
ParCo Conferences 2017
http://envelope.itec.kit.edu/
• Complexity of HPC on the way to exascale computing
To hide the complexity of HPC from the application programmer.
• Lack of dynamic behavior in HPC applications
• Increasing degree of heterogeneity
• Efficiency
To increase the efficiency of existing and new HPC applications.
• Reliability
To increase the reliability of the HPC environment.
- Global checkpoint and restart does not scale well enough for exascale
• This work is part of the BMBF project ENVELOPE and is funded by the BMBF under grant number 01IH16010D.
• Computer resources for this project have been provided by the Gauss Centre for
Supercomputing/Leibniz Supercomputing Centre under grant: pr63qi.
(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
Motivation
Background
• Application integrated approach
• In comparison to application transparent, system-level approach
• For both existing and new applications
Basic Idea
• Exchange/expand/shrink application ("malleable" application)
• The application should be able to withdraw from resources by itself
• Incrementally adaptable
• Data-Oriented, SPMD Model (same as MPI)
• PGAS-like
Goals
• Modularized Design, plugin-based, expandable
• Index space abstraction
• A bit of data management – no global array
• Automatic Load-Balancing
• (proactive) Fault Tolerance
• (future) Reactive Fault Tolerance by using In-Memory Checkpointing
LAIK (0) – Design Principles
• Application-integrated
• Typical data types (1D/2D/3D) + (future) any data types
• Typical HPC communication backend:
currently MPI (works with simple OpenMP as well)
LAIK (1) – Design
• Partitioning over index spaces
• Automatic Data (Re-)Balancing by Repartitioning:
• Uniform Distribution per # of Elements or task-wise
• By Element weight
• (future) by Profiling
• Fault Tolerance
• Proactive, via Repartitioning
• (future) Reactive, via local in-memory checkpointing
• Communication Backend:
• Working: MPI
• WIP: Shared Memory
• WIP: Agents for System State Information
• MQTT and TCP
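The difference between a uniform distribution and a distribution by element weight can be illustrated with a minimal sketch in plain C. It does not use LAIK itself: the `Range` type, the `WeightFn` callback, `block_partition` and `row_weight` are hypothetical names chosen for this illustration, and the greedy splitting rule is only one plausible way to balance weights.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: half-open index range [from, to) owned by one task. */
typedef struct { size_t from, to; } Range;

/* Hypothetical weight callback: returns the work estimate for one index. */
typedef double (*WeightFn)(size_t index, const void *user_data);

/*
 * Split the indexes [0, n) into `tasks` consecutive blocks of roughly equal
 * total weight. Passing weight == NULL counts every element as 1, which
 * gives a uniform distribution per number of elements.
 */
void block_partition(size_t n, size_t tasks, WeightFn weight,
                     const void *user_data, Range *out)
{
    assert(tasks > 0 && out != NULL);

    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += weight ? weight(i, user_data) : 1.0;

    size_t start = 0;
    double acc = 0.0;  /* running sum of the weights assigned so far */
    for (size_t t = 0; t < tasks; t++) {
        /* Task t ends once the running sum reaches its share of the total;
         * the last task always takes whatever is left. */
        double target = total * (double)(t + 1) / (double)tasks;
        size_t end = start;
        while (end < n && (t + 1 == tasks || acc < target)) {
            acc += weight ? weight(end, user_data) : 1.0;
            end++;
        }
        out[t].from = start;
        out[t].to = end;
        start = end;
    }
}

/* Example weight: number of non-zeros per matrix row, stored in a plain array. */
double row_weight(size_t index, const void *user_data)
{
    const int *nnz = (const int *)user_data;
    return (double)nnz[index];
}
```

With a sparse matrix, a per-row non-zero count is a natural weight, because a row with many entries costs more to process than a nearly empty one.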
LAIK (2) – At a Glance
• Access Pattern (r/w) and Data Flow (CopyIn/CopyOut) controlled
• Supports coupling of different data containers
• Data consistency through given reduction operations when multiple tasks write to the same data
• Flexible data partitions (malleable) for repartitioning
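How a reduction can restore consistency after several tasks have written to the same indexes can be sketched in plain C. This is a simulation of the idea only, not LAIK code: `reduce_sum` is a hypothetical helper, and it assumes each task keeps a private copy of the container and that the reduction operation is a sum.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Combine per-task private copies of a container with a sum reduction.
 * copies[t][i] is the value task t wrote to index i (0.0 if it wrote nothing).
 * Hypothetical helper for illustration, not a LAIK function.
 */
void reduce_sum(size_t tasks, size_t n, const double *const *copies,
                double *result)
{
    assert(copies != NULL && result != NULL);
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t t = 0; t < tasks; t++)
            sum += copies[t][i];
        result[i] = sum;  /* one consistent value per index after the reduction */
    }
}
```

After the reduction every index holds a single agreed value, regardless of how many tasks contributed to it.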
LAIK (3) – Partitioning
• Types of partitioning and corresponding partitioners
o Master: all data in only one task
o Blocked: every task has a slice of data
o All: everyone has everything
o (future) Halo, Bisection and others
• Switch Partitioning for Data redistribution
• Data flow and consistency are checked and enforced
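The three partitioning types listed above can be pictured as a function from a task number to the index range that task holds. The following plain C sketch uses hypothetical names (`PartType`, `my_range`) and assumes that the master is task 0 and that the blocked slices are of near-equal size; neither detail is stated on the slide.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t from, to; } Range;   /* half-open [from, to) */
typedef enum { PART_MASTER, PART_BLOCK, PART_ALL } PartType;

/* Return the part of the index range [0, n) that `task` holds. */
Range my_range(PartType type, size_t n, size_t tasks, size_t task)
{
    assert(tasks > 0 && task < tasks);
    Range r = {0, 0};
    switch (type) {
    case PART_MASTER:   /* all data in one task (assumed here: task 0) */
        if (task == 0) r.to = n;
        break;
    case PART_BLOCK:    /* every task has one consecutive slice */
        r.from = n * task / tasks;
        r.to   = n * (task + 1) / tasks;
        break;
    case PART_ALL:      /* every task holds the complete range */
        r.to = n;
        break;
    }
    return r;
}
```

Switching from one partitioning to another then amounts to comparing the ranges a task holds before and after, and transferring whatever is missing.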
LAIK (4) – Partitioning and Partitioners
• Different Repartition Methods: continuous and incremental
• Steps:
1. Synchronize Tasks, communicate failed Task Numbers
2. Create a new Group excluding failed tasks
3. Get partitioner, rerun partitioner with this new group -> new balanced indexes
4. Calculate the differences and the required data transfer actions (the transition)
5. For each data container: Execute the transition
6. (optional) remove/migrate old group to new group
7. Update Data/Address space Mapping
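Steps 2 to 4 can be sketched in plain C without LAIK. The names `Move`, `block` and `plan_transition` are hypothetical, and a simple block partitioner stands in for whichever partitioner the application uses. Since the approach is proactive, the sketch assumes a task marked as failed is still able to hand over its data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { size_t from, to; } Range;            /* half-open [from, to) */
typedef struct { size_t src, dst; Range idx; } Move;  /* task ids of the old group */

/* Block p of `parts` near-equal consecutive blocks over [0, n). */
static Range block(size_t n, size_t parts, size_t p)
{
    Range r = { n * p / parts, n * (p + 1) / parts };
    return r;
}

/*
 * Build the group of surviving tasks, rerun the block partitioner on it and
 * list the index ranges that have to move between tasks.
 * failed[t] is true when old task t must be excluded.
 * survivors needs room for old_tasks entries, moves for 2 * old_tasks entries.
 * Returns the number of moves written.
 */
size_t plan_transition(size_t n, size_t old_tasks, const bool *failed,
                       size_t *survivors, Move *moves)
{
    size_t new_tasks = 0;
    for (size_t t = 0; t < old_tasks; t++)
        if (!failed[t]) survivors[new_tasks++] = t;
    assert(new_tasks > 0);

    size_t count = 0;
    for (size_t o = 0; o < old_tasks; o++) {
        Range old_r = block(n, old_tasks, o);
        for (size_t k = 0; k < new_tasks; k++) {
            Range new_r = block(n, new_tasks, k);
            /* The overlap of old and new range is what task o hands to survivor k. */
            size_t from = old_r.from > new_r.from ? old_r.from : new_r.from;
            size_t to   = old_r.to   < new_r.to   ? old_r.to   : new_r.to;
            if (from < to && survivors[k] != o) {
                Move m = { o, survivors[k], { from, to } };
                moves[count++] = m;
            }
        }
    }
    return count;
}
```

For 12 indexes on 4 tasks with task 2 excluded, the plan contains three moves: index 3 goes from task 1 to task 0, indexes 6 and 7 go from task 2 to task 1, and index 8 goes from task 2 to task 3. Data that stays on the same task does not appear in the plan.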
LAIK (5) – Repartitioning
LAIK (6) – Basic API
• Laik_Instance* inst = laik_init_mpi(&argc, &argv);
• Laik_Group* world = laik_world(inst);
• Laik_Space* space = laik_new_space_1d(inst, matrix.rows());
• Laik_Partitioner* part = laik_new_block_partitioner_iw1(getEW, &matrix);
• Laik_Partitioning* p = laik_new_partitioning(world, space, part);
• Laik_Data* result = laik_alloc_1d(world, laik_Float, nRows);
• laik_switchto_new(result, laik_All, LAIK_DF_None);
• laik_switchto_flow(result, LAIK_DF_Init | LAIK_DF_ReduceOut |
LAIK_DF_Sum);
• laik_map_def1(result, (void**) &res, 0);
• Laik_Slice* slc = laik_my_slice(p, sNo);
• laik_switchto_flow(result, LAIK_DF_CopyIn);
• Laik_Group* g2 = laik_new_shrinked_group(g, removeLen, removeList);
• rep = laik_new_reassign_partitioner(g2, getEW, (void*)&matrix);
• laik_migrate_and_repartition(part, g2, rep);
MLEM – Short Introduction
The small animal PET scanner MADPET-II
1152 detectors, 662976 lines of response
Field of view: 140 x 140 x 40 voxels, 784000 voxels in total
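The two totals can be checked with a few lines of C. The sketch assumes that a line of response corresponds to an unordered pair of detectors, which is consistent with the numbers given: 1152 · 1151 / 2 = 662976. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of unordered detector pairs: n choose 2. */
uint64_t detector_pairs(uint64_t detectors)
{
    assert(detectors >= 1);
    return detectors * (detectors - 1) / 2;
}

/* Number of voxels in a regular nx x ny x nz grid. */
uint64_t voxel_count(uint64_t nx, uint64_t ny, uint64_t nz)
{
    return nx * ny * nz;
}
```

Calling `detector_pairs(1152)` yields 662976 and `voxel_count(140, 140, 40)` yields 784000, matching the figures above.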
MLEM Algorithm
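For orientation, one iteration of the standard MLEM update can be written as a short C sketch. It uses a small dense, row-major matrix for readability, whereas the application described here works on a large sparse matrix; the function name `mlem_step` and the scratch buffers are assumptions of this sketch, not part of the MLEM code base.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One MLEM iteration for a dense, row-major system matrix a (rows x cols):
 *   f_j <- f_j / (sum_i a_ij) * sum_i a_ij * g_i / (sum_k a_ik * f_k)
 * g holds the measured counts per row, f the current image estimate.
 * fwd (rows entries) and corr (cols entries) are caller-provided scratch buffers.
 */
void mlem_step(size_t rows, size_t cols, const double *a,
               const double *g, double *f, double *fwd, double *corr)
{
    assert(a && g && f && fwd && corr);

    /* Forward projection: expected counts for the current estimate. */
    for (size_t i = 0; i < rows; i++) {
        fwd[i] = 0.0;
        for (size_t j = 0; j < cols; j++)
            fwd[i] += a[i * cols + j] * f[j];
    }

    /* Back projection of measured / expected, then scaling by the column sum. */
    for (size_t j = 0; j < cols; j++) {
        double norm = 0.0;
        corr[j] = 0.0;
        for (size_t i = 0; i < rows; i++) {
            double aij = a[i * cols + j];
            norm += aij;
            if (fwd[i] > 0.0)
                corr[j] += aij * g[i] / fwd[i];
        }
        if (norm > 0.0)
            f[j] *= corr[j] / norm;
    }
}
```

The forward projection works row by row, so a row-wise partitioning of the matrix lets each task process only its own rows. The back projection sums over rows, so in a distributed setting the per-voxel partial sums would presumably be combined with a sum reduction.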
• Adaptation for matrix partitioning using LAIK
• Improved the mapping algorithm of the sparse matrix to handle multiple independent slices
• Created data containers for all working vectors
• Added a loop to handle multiple slices
• Added a wrapper for handling the repartitioning parameters
• System: CooLMUC 2 - NeXtScale nx360M5, Xeon E5-2697v3 14C 2.6GHz, Infiniband FDR14
• Test input: 12 GB sparse probability matrix, 10 iterations
• Fault simulated by enforcing a shrink after the 6th iteration
Steps Done for Porting MLEM to LAIK
Results (0) – Overview
Results (1) – Overhead of LAIK
Results (2) – Time for Repartitioning
Results (3) – Two Repartitioning Algorithms
• LAIK: A library to increase elasticity in parallel applications
• By adding partitioned index spaces as abstraction
• Repartitioning as central functionality
• Automatic Load-Balancing
• Fault Tolerant
• Modularized and expandable
• Increased elasticity in parallel codes
• Porting MLEM & Results
• Limited effort in application porting required
• Low overhead of LAIK
• LAIK scales at least as well as the original application
Conclusion
Work in Progress
• Porting further applications, e.g. LULESH
• Further Scalability research using >10000 cores on SuperMUC
• Agent system
• Shared memory backend
• Further optimization to reduce communication effort
Proposed
• Solutions to overcome MPI weaknesses
• Local in-memory checkpointing
• Irregular data structures
• Elastic index space size for hierarchical instantiations
Future Work
[1] Alrutz, T., Backhaus, J., and et. al. GASPI: A Partitioned Global Address Space
programming interface. In Facing the Multicore-Challenge III (2013), vol. 7686 of Lecture
notes in computer science, Springer Berlin Heidelberg.
[2] Bergman, K., Borkar, S., and et. al. Exascale computing study: Technology challenges in
achieving exascale systems. DARPA IPTO Office, Tech. Rep 15 (2008).
[3] Forum, M. P. I. MPI: A Message-Passing Interface Standard Version 3.0, 2012.
[4] Furlinger, K., Glass, C., Knüpfer, A., Tao, J., Hünich, D., Idrees, K., Maiterth, M., Mhedheb,
Y., and Zhou, H. DASH: Data structures and algorithms with support for hierarchical locality. In
Euro-Par 2014 Workshops (Porto, Portugal) (2014).
[5] Idrees, K. Effective use of the PGAS paradigm: Driving transformations and self-adaptive
behavior in dash-applications. In Proceedings of the 1st Int. Workshop on Program
Transformation for Programmability in Heterogeneous Architectures (2016).
[6] Kale, L. V., and Krishnan, S. Charm++: a portable concurrent object oriented system based
on c++. In ACM Sigplan Notices (1993), vol. 28, ACM, pp. 91–108.
[7] Küstner, T., Weidendorfer, J., Schirmer, J., Klug, T., Trinitis, C., and Ziegler, S. Parallel
MLEM on multicore architectures. In ICCS 2009: 9th Int. Conf. on Computational Science
(Berlin, Heidelberg, 2009), G. Allen et al., Ed., Springer.
20(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
[8] Nagarajan, A. B., and Mueller, F. Proactive fault tolerance for HPC with Xen virtualization.
In Proceedings of the 21st annual Int. Conf. on Supercomputing (2007).
[9] Nieplocha, J., Palmer, B., Tipparaju, V., Krishnan, M., Trease, H., and Apra, E. ` Advances,
applications and performance of the global arrays shared memory programming toolkit. The
Int. Journal of High Performance Computing Applications 20, 2 (2006).
[10] Pickartz, S., Clauss, C., Lankes, S., Krempel, S., Moschny, T., and Monti, A. Nonintrusive
Migration of MPI Processes in OS-Bypass Networks. In 2016 IEEE Int. Parallel and Distributed
Processing Symposium Workshops (IPDPSW) (2016).
[11] Rafecas, M., Mosler, B., Dietz, M., Pgl, M., Stamatakis, A., McElroy, D. P., and Ziegler, S.
I. Use of a Monte Carlo-based probability matrix for 3-D iterative reconstruction of MADPET-II
data. IEEE Trans. on Nuclear Science 51, 5 (2004).
[12] Saraswat, V., Bloom, B., and et. al. X10 language specification version 2.5.
[13] Shepp, L. A., and Vardi, Y. Maximum likelihood reconstruction for emission tomography.
IEEE Transactions on Medical Imaging 1, 2 (1982), 113–122.
[14] Strul, D., Slates, R. B., Dahlbom, M., Cherry, S. R., and Marsden, P. K. An improved
analytical detector response function model for multilayer small-diameter PET scanners.
Physics in Medicine and Biology 48 (2003), 979–994.
21(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
[15] Treichler, S., Bauer, M., and Aiken, A. Language support for dynamic, hierarchical data
partitioning. In ACM SIGPLAN Notices (2013), vol. 48, ACM, pp. 495–514.
[16] Wang, C., Mueller, F., and et. al. Proactive process-level live migration and back migration
in HPC environments. J. of Parallel and Distributed Comp. 72, 2 (2012).
[17] Weidendorfer, J., Yang, D., and Trinitis, C. Laik: A library for fault tolerant distribution of
global data for parallel applications. In Proceedings of the 27th PARS Workshop (PARS 2017)
(Hagen, 2017), Gesellschaft für Informatik.
[18] Zhou, H., Mhedheb, Y., and et. al. DART-MPI: an mpi-based implementation of a PGAS
runtime system. CoRR abs/1507.01773 (2015).
[19] Zima, H., Chamberlain, B. L., and Callahan, D. Parallel programmability and the Chapel
language. International Journal on HPC Applications, Special Issue on High Productivity
Languages and Models 21, 3 (2007), 291–312.
22(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
• LAIK
https://github.com/envelope-project/laik
• MLEM Project
https://github.com/envelope-project/mlem
• Josef Weidendorfer:
weidendo@in.tum.de
• Dai Yang
d.yang@tum.de
• Tilman Küstner
kuestner@in.tum.de
• Carsten Trinitis
carsten.trinitis@tum.de
Infos

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

Enabling Application Integrated Proactive Fault Tolerance

  • 1. Enabling Application Integrated Proactive Fault Tolerance
      Dai Yang, Josef Weidendorfer, Tilman Küstner and Carsten Trinitis, Chair of Computing Architecture, Technical University of Munich (TUM)
      Sibylle Ziegler, Klinik und Poliklinik für Nuklearmedizin, Ludwig-Maximilians-Universität München
      14 September 2017, ParCo Conferences 2017
      ENVELOPE – Efficiency and Reliability: Selforganisation in HPC Systems
      http://envelope.itec.kit.edu/
  • 2. Motivation
      • Complexity of HPC towards Exascale Computing: to hide the complexity of HPC from the application programmer.
      • Missing dynamics in HPC applications
      • With increasing degree of heterogeneity
      • Efficiency: to increase the efficiency of existing and new HPC applications.
      • Reliability: to increase the reliability of the HPC environment.
          o Global Checkpointing and Restart do not scale well enough for exascale
      • This work is part of the BMBF project ENVELOPE and is funded by BMBF under grant 01IH16010D.
      • Computer resources for this project have been provided by the Gauss Centre for Supercomputing/Leibniz Supercomputing Centre under grant pr63qi.
      (C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
  • 3. Goals
      Background
      • Application-integrated approach
      • In comparison to the application-transparent, system-level approach
      • For both existing and new applications
      Basic Idea
      • Exchange/expand/shrink application ("malleable" application)
      • Application should be able to retreat itself
      • Incrementally adaptable
      • Data-oriented, SPMD model (same as MPI)
      • PGAS-like
  • 4. LAIK (0) – Design Principles
      • Modularized design, plugin-based, expandable
      • Index space abstraction
      • A bit of data management – no global array
      • Automatic Load-Balancing
      • (proactive) Fault Tolerance
      • (future) Reactive Fault Tolerance by using In-Memory Checkpointing
  • 5. LAIK (1) – Design
      • Application-integrated
      • Typical data types (1D/2D/3D) + (future) any data types
      • Typical HPC communication backend: currently MPI (works with simple OpenMP as well)
  • 6. LAIK (2) At a Glance
      • Partitioning over index spaces
      • Automatic data (re-)balancing by repartitioning:
          o Uniform distribution per number of elements or task-wise
          o By element weight
          o (future) By profiling
      • Fault tolerance
          o Proactive, via repartitioning
          o (future) Reactive, via local in-memory checkpointing
      • Communication backend:
          o Working: MPI
          o WIP: Shared memory
      • WIP: Agents for system state information
          o MQTT and TCP
  • 7. LAIK (3) – Partitioning
      • Access pattern (r/w) and data flow (CopyIn/CopyOut) are controlled
      • Supports coupling of different data containers
      • Data consistency is ensured by applying a given reduction operation when multiple tasks write to the same data
      • Flexible (malleable) data partitions for repartitioning
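The consistency rule on this slide (several tasks write the same indexes, and a reduction combines their values) can be sketched in plain C. This is only an illustration under stated assumptions: it uses neither LAIK nor MPI, and the helper name `reduce_sum` and the task-major array layout are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of LAIK: combine the private copies that
 * `ntasks` tasks wrote for the same `n` indexes into one consistent result.
 * Layout assumption: copies[t * n + i] is the value task t wrote for index i. */
void reduce_sum(const double *copies, size_t ntasks, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++) {
        double acc = 0.0;
        for (size_t t = 0; t < ntasks; t++)
            acc += copies[t * n + i];
        out[i] = acc;  /* every task would then see this single value */
    }
}
```

In a distributed run the inner loop would be replaced by a collective reduction in the communication backend; the sum is just one possible reduction operation.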
  • 8. LAIK (4) – Partitioning and Partitioners
      • Types of partitioning and corresponding partitioners
          o Master: all data in only one task
          o Blocked: every task has a slice of data
          o All: everyone has everything
          o (future) Halo, Bisection and others
      • Switch partitioning for data redistribution
      • Data flow and consistency are checked and enforced
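As a rough illustration of a "Blocked" partitioner that balances by element weight, the following sketch splits an index range into contiguous blocks of roughly equal total weight. It is not the LAIK implementation; the function name `block_partition_weighted`, the boundary-array convention, and the assumption of non-negative weights are all made up for this example. Uniform distribution per number of elements is the special case where every weight is equal.

```c
#include <stddef.h>

/* Hypothetical sketch, not the LAIK implementation.
 * Splits indexes [0, n) into `ntasks` contiguous blocks so that each block
 * carries roughly the same total weight (weights assumed non-negative).
 * On return, block t covers [start[t], start[t + 1]); `start` must have
 * room for ntasks + 1 entries. */
void block_partition_weighted(const double *weight, size_t n,
                              size_t ntasks, size_t *start)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += weight[i];

    size_t t = 0;
    double acc = 0.0;
    start[0] = 0;
    for (size_t i = 0; i < n && t + 1 < ntasks; i++) {
        acc += weight[i];
        /* Close block t once the running weight reaches its cumulative share. */
        while (t + 1 < ntasks &&
               acc >= total * (double)(t + 1) / (double)ntasks) {
            start[++t] = i + 1;
        }
    }
    /* Close any remaining boundaries at n so the blocks cover exactly [0, n). */
    while (t < ntasks)
        start[++t] = n;
}
```

For a sparse matrix, a natural weight for row i would be its number of non-zero entries, so that each task receives a similar amount of work rather than a similar number of rows.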
  • 9. LAIK (5) – Repartitioning
      • Different repartitioning methods: continuous and incremental
      • Steps:
          1. Synchronize tasks and communicate the failed task numbers
          2. Create a new group excluding the failed tasks
          3. Get the partitioner and rerun it with the new group -> new balanced indexes
          4. Calculate the differences and the data transfer actions required (the transition)
          5. For each data container: execute the transition
          6. (optional) Remove/migrate the old group to the new group
          7. Update the data/address space mapping
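Steps 2 to 4 above can be sketched without LAIK or MPI. The sketch assumes a uniform block partitioning and represents the transition only as a count of indexes that change owner; the names `block_owner`, `shrink_group` and `count_moved` are hypothetical and do not correspond to LAIK functions.

```c
#include <stddef.h>

/* Hypothetical sketch of steps 2-4, not LAIK code. */

/* Owner of index i when [0, n) is split into `parts` near-equal blocks.
 * Requires parts > 0 and i < n. */
static size_t block_owner(size_t i, size_t n, size_t parts)
{
    size_t base = n / parts, rem = n % parts;
    size_t cut = rem * (base + 1);  /* first `rem` blocks hold base + 1 */
    if (i < cut)
        return i / (base + 1);
    return rem + (i - cut) / base;
}

/* Step 2: build the new group. survivors[k] receives the old task id of the
 * k-th task that is not marked in failed[]. Returns the new group size. */
size_t shrink_group(const int *failed, size_t ntasks, size_t *survivors)
{
    size_t k = 0;
    for (size_t t = 0; t < ntasks; t++)
        if (!failed[t])
            survivors[k++] = t;
    return k;
}

/* Steps 3-4: rerun the block partitioner on the new group and count how many
 * indexes change owner, i.e. how much data the transition has to move. */
size_t count_moved(size_t n, size_t ntasks,
                   const size_t *survivors, size_t nsurvivors)
{
    if (ntasks == 0 || nsurvivors == 0)
        return 0;  /* nothing sensible to compute without tasks */

    size_t moved = 0;
    for (size_t i = 0; i < n; i++) {
        size_t old_owner = block_owner(i, n, ntasks);
        size_t new_owner = survivors[block_owner(i, n, nsurvivors)];
        if (old_owner != new_owner)
            moved++;
    }
    return moved;
}
```

With 12 indexes on 4 tasks and task 3 removed, this rebalancing moves 6 indexes although only 3 of them lived on the removed task, which is the kind of cost a choice of repartitioning method has to weigh.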
  • 10. LAIK (6) – Basic API
      • Laik_Instance* inst = laik_init_mpi(&argc, &argv);
      • Laik_Group* world = laik_world(inst);
      • Laik_Space* space = laik_new_space_1d(inst, matrix.rows());
      • Laik_Partitioner* part = laik_new_block_partitioner_iw1(getEW, &matrix);
      • Laik_Partitioning* p = laik_new_partitioning(world, space, part);
      • Laik_Data* result = laik_alloc_1d(world, laik_Float, nRows);
      • laik_switchto_new(result, laik_All, LAIK_DF_None);
      • laik_switchto_flow(result, LAIK_DF_Init | LAIK_DF_ReduceOut | LAIK_DF_Sum);
      • laik_map_def1(result, (void**) &res, 0);
      • Laik_Slice* slc = laik_my_slice(p, sNo);
      • laik_switchto_flow(result, LAIK_DF_CopyIn);
      • Laik_Group* g2 = laik_new_shrinked_group(g, removeLen, removeList);
      • rep = laik_new_reassign_partitioner(g2, getEW, (void*)&matrix);
      • laik_migrate_and_repartition(part, g2, rep);
  • 11. MLEM – Short Introduction
      • The small animal PET scanner MADPET-II
      • 1152 detectors, 662976 lines of response
      • Field of view 140 x 140 x 40 voxels, total 784000 voxels
  • 12. MLEM Algorithm
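The content of this slide did not survive extraction. For orientation, the standard MLEM update (Shepp and Vardi, 1982) multiplies each voxel estimate f_j by the back-projected ratio of measured to forward-projected counts, normalized by the column sum of the system matrix: f_j <- f_j / (sum_i a_ij) * sum_i (a_ij * g_i / sum_l a_il f_l). The sketch below implements one such iteration. It assumes a small dense row-major matrix for readability, whereas the application described here uses a sparse probability matrix; the function name and buffer layout are hypothetical.

```c
#include <stddef.h>

/* One MLEM iteration on a dense row-major system matrix a (rows x cols).
 * g:    measured counts per line of response (rows entries)
 * f:    current image estimate (cols entries), updated in place
 * fwd:  caller-provided scratch buffer with rows entries
 * corr: caller-provided scratch buffer with cols entries
 * Dense storage is an illustration only, not what the MLEM code uses. */
void mlem_iteration(const double *a, size_t rows, size_t cols,
                    const double *g, double *f,
                    double *fwd, double *corr)
{
    /* Forward projection: fwd_i = sum_j a_ij * f_j */
    for (size_t i = 0; i < rows; i++) {
        fwd[i] = 0.0;
        for (size_t j = 0; j < cols; j++)
            fwd[i] += a[i * cols + j] * f[j];
    }
    /* Back projection of the ratio g_i / fwd_i, plus the column sums. */
    for (size_t j = 0; j < cols; j++) {
        double norm = 0.0;
        corr[j] = 0.0;
        for (size_t i = 0; i < rows; i++) {
            double aij = a[i * cols + j];
            norm += aij;
            if (fwd[i] > 0.0)
                corr[j] += aij * g[i] / fwd[i];
        }
        /* Multiplicative update; voxels seen by no line stay unchanged. */
        if (norm > 0.0)
            f[j] *= corr[j] / norm;
    }
}
```

When the matrix rows are partitioned across tasks, each task can compute its share of the forward projection independently, and the per-voxel correction terms are then combined by a sum reduction.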
  • 13. Steps Done for Porting MLEM to LAIK
      • Adaptation for matrix partitioning using LAIK
          o Improved the mapping algorithm of the sparse matrix to handle multiple independent slices
          o Created data containers for all working vectors
          o Added a loop to handle multiple slices
          o Added a wrapper for handling the repartitioning parameters
      • System: CooLMUC 2 - NeXtScale nx360M5, Xeon E5-2697v3 14C 2.6GHz, Infiniband FDR14
      • Test input: 12 GB sparse probability matrix, 10 iterations
      • Fault simulated by enforcing a shrink after the 6th iteration
  • 14. Results (0) – Overview
  • 15. Results (1) – Overhead of LAIK
  • 16. Results (2) – Time for Repartitioning
  • 17. Results (3) – 2 Repartitioning Algorithms
  • 18. Conclusion
      • LAIK: a library to increase elasticity in parallel applications
          o By adding partitioned index spaces as abstraction
          o Repartitioning as central functionality
          o Automatic load balancing
          o Fault tolerant
          o Modularized and expandable
          o Increased elasticity in parallel codes
      • Porting MLEM & results
          o Limited effort required for porting the application
          o Low overhead of LAIK
          o LAIK scales at least as well as the original application
  • 19. Future Work
      Work in Progress
      • Porting further applications, e.g. LULESH
      • Further scalability research using >10000 cores on SuperMUC
      • Agent system
      • Shared memory backend
      • Further optimization to reduce communication effort
      Proposed
      • Solution to overcome MPI weaknesses
      • Local in-memory checkpointing
      • Non-regular data structures
      • Elastic index space size for hierarchical instantiations
  • 23. Infos
      • LAIK: https://github.com/envelope-project/laik
      • MLEM Project: https://github.com/envelope-project/mlem
      • Josef Weidendorfer: weidendo@in.tum.de
      • Dai Yang: d.yang@tum.de
      • Tilman Küstner: kuestner@in.tum.de
      • Carsten Trinitis: carsten.trinitis@tum.de