SlideShare a Scribd company logo
1 of 34
High Performance Dense Linear System Solver with Soft Error Resilience  Peng Du, Piotr Luszczek, Jack Dongarra November 9, 2011
Agenda ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],November 9, 2011
Soft error ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],November 9, 2011 0 0 1 1 0 1 0 1 1 0 1 1 0 1 0 1
LU based linear solver
Block LU factorization GETF2 TRSM GEMM GETF2 STRSM
General work flow ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Why is soft error hard to handle? ,[object Object],[object Object]
Example: Error propagation Error location (using matlab notation and 1-based index) Error strikes right before panel factorization of (41:200, 41:60),  Case 1 : Error at (35,10), in L area Case 2 : Error at (50,120), in A’ area Note: Pivoting on the left of panel factorization is delayed to the end of error detection and recovery so that error in L area does not move
Case 1: Non-propagating error Error does not propagate in this case
Case 2: Propagating error
Soft error challenge November 9, 2011
Error modeling  (for propagating error) ,[object Object],[object Object],November 9, 2011 L U
Error modeling (for “where”) Define an initial erroneous initial matrix  Input matrix One step of LU If no soft error occurs If soft error occurs at step t
Locate Error Column  j
Recover Ax=b ,[object Object]
Recover Ax=b Given: To Solve:
Recover Ax=b
Recover Ax=b Recall: Therefore:
Recover Ax=b Sherman Morrison
Recover Ax=b
Recover Ax=b
Recover Ax=b Needs protection
How to detect & recovery a soft error in L? ,[object Object],[object Object],[object Object],[object Object],L U
[object Object],[object Object],Checkpointing for L, idea 1 NOT SCALABLE
Checkpointing for L, idea 2 ,[object Object],[object Object],[object Object],SCALABLE
Encoding for L ,[object Object]
Kraken Performance Two 2.6 GHz six-core AMD Opteron processors per node 32x32  MPI processes,  6  threads/(process, core)  6,144  cores used in total
Kraken Performance Two 2.6 GHz six-core AMD Opteron processors per node 64x64  MPI processes,  6  threads/(process, core)  24,576  used cores in total
November 9, 2011
[object Object],November 9, 2011
Locate Error
Locate Error
Locate Error ,[object Object],[object Object],[object Object]
Extra Storage ,[object Object],[object Object],[object Object],[object Object],[object Object]

More Related Content

What's hot

Control system Lab record
Control system Lab record Control system Lab record
Control system Lab record Yuvraj Singh
 
(Kpi summer school 2015) theano tutorial part2
(Kpi summer school 2015) theano tutorial part2(Kpi summer school 2015) theano tutorial part2
(Kpi summer school 2015) theano tutorial part2Serhii Havrylov
 
Fuzzy control design_tutorial
Fuzzy control design_tutorialFuzzy control design_tutorial
Fuzzy control design_tutorialResul Çöteli
 
GRAPHICAL STRUCTURES in our lives
GRAPHICAL STRUCTURES in our livesGRAPHICAL STRUCTURES in our lives
GRAPHICAL STRUCTURES in our livesxryuseix
 
Transfer fn mech. systm 1
Transfer fn mech. systm 1Transfer fn mech. systm 1
Transfer fn mech. systm 1Syed Saeed
 

What's hot (8)

Control system Lab record
Control system Lab record Control system Lab record
Control system Lab record
 
Lec20
Lec20Lec20
Lec20
 
Lec21
Lec21Lec21
Lec21
 
(Kpi summer school 2015) theano tutorial part2
(Kpi summer school 2015) theano tutorial part2(Kpi summer school 2015) theano tutorial part2
(Kpi summer school 2015) theano tutorial part2
 
Exploring Petri Net State Spaces
Exploring Petri Net State SpacesExploring Petri Net State Spaces
Exploring Petri Net State Spaces
 
Fuzzy control design_tutorial
Fuzzy control design_tutorialFuzzy control design_tutorial
Fuzzy control design_tutorial
 
GRAPHICAL STRUCTURES in our lives
GRAPHICAL STRUCTURES in our livesGRAPHICAL STRUCTURES in our lives
GRAPHICAL STRUCTURES in our lives
 
Transfer fn mech. systm 1
Transfer fn mech. systm 1Transfer fn mech. systm 1
Transfer fn mech. systm 1
 

Similar to Ieee cluster talk

Slattery_SIAMCSE15
Slattery_SIAMCSE15Slattery_SIAMCSE15
Slattery_SIAMCSE15Karen Pao
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Universität Rostock
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
k10790 nilesh prajapati control me 6th sem
k10790 nilesh prajapati control me 6th semk10790 nilesh prajapati control me 6th sem
k10790 nilesh prajapati control me 6th semharshprajapati12
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzerbutest
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationGeoffrey Fox
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Computer architecture
Computer architectureComputer architecture
Computer architectureRvishnupriya2
 
Computer architecture
Computer architectureComputer architecture
Computer architecturevishnu973656
 
nand2tetris 舊版投影片 -- 第三章 循序邏輯
nand2tetris 舊版投影片 -- 第三章 循序邏輯nand2tetris 舊版投影片 -- 第三章 循序邏輯
nand2tetris 舊版投影片 -- 第三章 循序邏輯鍾誠 陳鍾誠
 
L1811_PID.pdf
L1811_PID.pdfL1811_PID.pdf
L1811_PID.pdfBaghdad
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDing Li
 
Intelligent soft computing based
Intelligent soft computing basedIntelligent soft computing based
Intelligent soft computing basedijasa
 

Similar to Ieee cluster talk (20)

MATEX @ DAC14
MATEX @ DAC14MATEX @ DAC14
MATEX @ DAC14
 
Slattery_SIAMCSE15
Slattery_SIAMCSE15Slattery_SIAMCSE15
Slattery_SIAMCSE15
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
tracking.ppt
tracking.ppttracking.ppt
tracking.ppt
 
Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
k10790 nilesh prajapati control me 6th sem
k10790 nilesh prajapati control me 6th semk10790 nilesh prajapati control me 6th sem
k10790 nilesh prajapati control me 6th sem
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzer
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
nand2tetris 舊版投影片 -- 第三章 循序邏輯
nand2tetris 舊版投影片 -- 第三章 循序邏輯nand2tetris 舊版投影片 -- 第三章 循序邏輯
nand2tetris 舊版投影片 -- 第三章 循序邏輯
 
L1811_PID_experiment.pdf
L1811_PID_experiment.pdfL1811_PID_experiment.pdf
L1811_PID_experiment.pdf
 
L1811_PID.pdf
L1811_PID.pdfL1811_PID.pdf
L1811_PID.pdf
 
2020 12-2-detr
2020 12-2-detr2020 12-2-detr
2020 12-2-detr
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural network
 
Intelligent soft computing based
Intelligent soft computing basedIntelligent soft computing based
Intelligent soft computing based
 
PG Project
PG ProjectPG Project
PG Project
 
Work
WorkWork
Work
 

Recently uploaded

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

Ieee cluster talk

Editor's Notes

  1. ----- Meeting Notes (9/7/11 15:16) ----- move these ahead
  2. ----- Meeting Notes (9/9/11 14:40) ----- 1. generate the checksum from the matrix which becomes additional columns move 3 before 4 then check for error
  3. ----- Meeting Notes (9/7/11 15:16) ----- move these ahead
  4. ----- Meeting Notes (9/7/11 15:16) ----- show two cases seperately
  5. ----- Meeting Notes (9/7/11 15:16) ----- Show the first equation earilier expend the first equation ----- Meeting Notes (9/9/11 14:40) ----- no AG
  6. ----- Meeting Notes (9/9/11 14:40) ----- does not cost a lot what was done before determine the column
  7. ----- Meeting Notes (9/7/11 15:16) ----- show the sherman morrison
  8. ----- Meeting Notes (9/7/11 15:16) ----- requries the copy of A at the beginning
  9. ----- Meeting Notes (9/7/11 15:16) ----- show what the two rows are
  10. ----- Meeting Notes (9/7/11 15:16) ----- explain why it scales
  11. ----- Meeting Notes (9/7/11 15:16) ----- do this at the end use li not l2 definition of n describe the locating process
  12. ----- Meeting Notes (9/7/11 15:16) ----- Netlib->Scalapack 1.8
  13. ----- Meeting Notes (9/9/11 14:40) ----- wj on the left of s reminds what w is look for a match of wj in w
  14. ----- Meeting Notes (9/7/11 15:16) ----- NxN copy ratio not r leave MxN (generalized for rectanglar matrices)