Protecting from Transient Failures in Cloud
Deployments
Supervised By:
Dr. Mohammad Abdur Rouf (Professor)
Head of Department of Computer Science & Engineering
Dhaka University of Engineering and Technology, Gazipur
Ph.D. (KAIST), M.Sc. Engg. (BUET), B.Sc. Engg. (KU)
Dhaka University of Engineering and Technology, Gazipur
Presented by
Md. Mostafijur Rahman
M.Sc. Engg. Student ID #: 132431 (p)
Department of Computer Science
and Engineering
Outline of the talk
• Introduction
• Related Work
• Application Deployment
- Architecture
- Idea Architecture
- Deployment Model
- Error Recovery Process Dependency Graph
- Error Recovery Basic Flowchart
- Error Recovery Flow Chart
• Implementation Aspects
- Cloud deployment graph
- Execute the Deployment Script
- Experimental Evaluation
- ASP.NET Message Passing
- ASP.NET Message Passing Exception Handling
- ASP.NET Message Passing Console Display
- Microsoft Azure Cloud Dashboard
- Microsoft Azure Cloud Message Routing
• Conclusions and Future Work
• References and Useful Links
Introduction
• Transient failures occur temporarily and last only a short period of time.
• AURA is an OpenStack application deployment tool with error-recovery
improvements. [https://github.com/giagiannis/aura]
• To overcome such failures, the deployment is modeled as a Directed Acyclic
Graph (DAG). The AURA tool traverses the graph in a breadth-first manner and
finds the deployment script that failed.
• The deployment is monitored through a graphical user interface, and only the
minimum number of failed script nodes is re-executed.
• After a failure, wait for a threshold time and retry fetching the resource
before giving up.
• If a fixed number of retries within the threshold time all fail, the failure
is considered permanent.
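The breadth-first search for failed scripts described above can be sketched as follows. This is a minimal illustration and not AURA's actual code: the function name `find_failed_scripts`, the script names, and the status strings are all assumptions made for the example.

```python
from collections import deque


def find_failed_scripts(graph, roots, status):
    """Walk the deployment graph breadth-first and collect failed script nodes."""
    visited = set(roots)
    pending = deque(roots)
    failed = []
    while pending:
        node = pending.popleft()
        if status.get(node) == "failed":
            failed.append(node)
        for successor in graph.get(node, ()):
            if successor not in visited:
                visited.add(successor)
                pending.append(successor)
    return failed


# Hypothetical two-module deployment: each key is a script, and each value
# lists the scripts that may only run after it.
graph = {
    "db/install": ["db/configure"],
    "web/install": ["web/configure"],
    "db/configure": ["web/configure"],
    "web/configure": [],
}
status = {
    "db/install": "done",
    "web/install": "done",
    "db/configure": "failed",
    "web/configure": "pending",
}

print(find_failed_scripts(graph, ["db/install", "web/install"], status))
```

With this input only `db/configure` is reported, so it is the only script that would need to be re-executed.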
Related Work
To protect against transient failures, Wrangler receives application
descriptions in XML format and deploys them to different cloud providers
(Amazon EC2, OpenNebula and Eucalyptus). [1]
Liu provides database transactions and implements the mechanisms needed to
achieve the ACID properties, which protects against transient failures. [1]
Katsuno studies the problem of deployment parallelization in multi-module
applications and protection against transient failures. [1]
Pegasus deploys scientific workloads while protecting against transient
failures. [1]
NPACI Rocks [1] attempts to automate the provisioning of high-performance
computing clusters in order to simplify software installation and version
tracking.
Application Deployment Architecture
AURA Master: Deployed in a dedicated VM in the cloud.
AURA Executors: Responsible for executing the appropriate deployment scripts.
Each software module is deployed by a dedicated AURA Executor.
Application: Consists of multiple modules (methods).
Queue: Stores messages in FIFO order.
REST API: Stores data in JSON format in a file to support the application interface.
Scheduler: Responsible for monitoring the entire deployment process.
Cloud Connector: Provides the interface to the cloud provider.
Web UI: Used to describe new applications, issue new deployment requests, and
obtain real-time monitoring statistics.
Figure 1. AURA Architecture
[Diagram labels: AURA Master (Web UI, REST API, Queue, Scheduler, Cloud Connector); Application with Module (1), Module (2) and Module (3), each with its own AURA Executor; Cloud Provider.]
Idea Application Deployment Architecture
Programming: ASP.NET and SQL Server 2012 database
namespace AURA
{
    // Wrapper class added so that the methods compile; its name is illustrative only.
    public static class Application
    {
        public static void Module1(object parameter) { /* send/receive a message */ }
        public static void Module2(object parameter) { /* send/receive a message */ }
        public static void Module3(object parameter) { /* send/receive a message */ }
        public static void Queue(object parameter) { /* store messages in FIFO order and act on them */ }
        public static void Scheduler(object parameter) { /* retry within the threshold time limit */ }
        public static void data_manage(object parameter) { /* manipulate JSON-format data as the database */ }
        public static void Cloud_connector(object parameter) { /* Cloud_connector => cloud provider => message passing to modules */ }
        public static void Web_UI(object parameter) { /* graphical representation for the user */ }
    }
}
Application Deployment Model
Figure 2 shows three modules, (1), (2) and (3), each of which works as a method.
The horizontal solid lines show message passing from source to
destination points.
The vertical dotted lines mark the time slices.
When a module waits for a message (e.g. points A’, B’), it blocks until the
message is received. [1]
After sending a message at point A, module (1) proceeds with the
script execution between points A and C’. [1]
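The blocking behaviour described above can be sketched with two threads and a FIFO queue. This is only an illustration under stated assumptions: the function name `run_demo`, the message text, and the `threshold_seconds` timeout are invented for the example, and the timeout is added so that the receiver cannot wait forever.

```python
import queue
import threading


def run_demo(threshold_seconds=2.0):
    channel = queue.Queue()  # FIFO channel between the two modules
    received = []

    def module_1():
        # Point A: send the message, then carry on with the rest of the script.
        channel.put("db-ready")

    def module_2():
        # Point A': block until the message arrives or the threshold expires.
        try:
            received.append(channel.get(timeout=threshold_seconds))
        except queue.Empty:
            received.append(None)  # nothing arrived within the threshold

    receiver = threading.Thread(target=module_2)
    sender = threading.Thread(target=module_1)
    receiver.start()  # the receiver starts first, so it has to wait
    sender.start()
    sender.join()
    receiver.join()
    return received[0]


print(run_demo())
```

The sender never waits for the receiver, which mirrors module (1) continuing its script after point A, while the receiver stays blocked at its waiting point until the message is delivered.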
Error Recovery Process
Every module can send messages to the other modules. A message must arrive
within a specific threshold time. Otherwise, the send is re-executed several
times within the threshold time. If it still fails, the failed node's script
is deleted from the layer and a new empty layer is added. [1]
Figure 2 (a). The vertical edges represent the script executions. [1]
The horizontal lines represent message passing between different modules. [2]
Figure 2 (b). The dependency graph, in which a module blocks
and scans the health of its neighbor nodes. [2]
Transient Failure Handling Flowchart
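Figure 4: Error Recovery Basic Flowchart
Start, then Call Service, then check: Success?
• Success? Yes: the call is successful, continue with the next action.
• Success? No: check whether the failure is transient.
• Transient failure? No: permanent call failure.
• Transient failure? Yes: check whether to retry.
• Retry? Yes: delay for the threshold time, then call the service again.
• Retry? No: the call failed.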
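The decisions in the Figure 4 flowchart can be sketched as a small retry helper. This is an illustration under assumptions, not the presented implementation: the exception classes `TransientError` and `PermanentError`, the function `call_with_retry`, and its parameters are hypothetical names, and the `sleep` argument exists only so the delay can be replaced in tests.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


class PermanentError(Exception):
    """Raised once the retry budget is used up."""


def call_with_retry(service, max_retries=3, delay_seconds=1.0, sleep=time.sleep):
    attempts = 0
    while True:
        try:
            return service()  # success: hand the result to the next action
        except TransientError as exc:
            if attempts >= max_retries:
                # No retries left: treat the failure as permanent.
                raise PermanentError("retries exhausted") from exc
            attempts += 1
            sleep(delay_seconds)  # delay for the threshold time, then retry
        # Any other exception is not transient and propagates unchanged.


calls = {"count": 0}


def flaky_service():
    # Fails twice with a transient error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary outage")
    return "ok"


print(call_with_retry(flaky_service, delay_seconds=0.01))
```

A fixed delay is used here to match the flowchart; the backoff strategies compared later in the talk would replace `delay_seconds` with a value that depends on the retry number.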
Error Recovery Flowchart
Start
Input (two parameters): graph T, failed node n
failed ← {n}
healthy ← ∅
While failed ≠ ∅:
    v = POP(failed)
    For each t ∈ depends(T, v):
        If failed(t): failed ← failed ∪ {t}
        Otherwise: healthy ← healthy ∪ {t}
Return healthy
End
Figure 5: Error Recovery Flowchart
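The Figure 5 procedure can be sketched in a few lines. This is a minimal sketch under assumptions: the dictionary `depends` stands in for depends(T, v), the callback `is_failed` stands in for failed(t), and the `seen` set is an addition of this sketch so that a node shared by several dependency paths is examined only once.

```python
from collections import deque


def find_healthy_nodes(depends, is_failed, n):
    """Starting at failed node n, return the nearest healthy dependencies."""
    failed = deque([n])
    healthy = set()
    seen = {n}
    while failed:
        v = failed.popleft()  # v = POP(failed)
        for t in depends.get(v, ()):
            if t in seen:
                continue
            seen.add(t)
            if is_failed(t):
                failed.append(t)  # keep walking back through failed nodes
            else:
                healthy.add(t)  # recovery can restart from this node
    return healthy


# Hypothetical dependency graph: each node maps to the nodes it depends on.
depends = {
    "web": ["app"],
    "app": ["db", "cache"],
    "db": [],
    "cache": [],
}
broken = {"web", "app"}

print(sorted(find_healthy_nodes(depends, lambda t: t in broken, "web")))
```

In this example the search starts at `web`, passes through the failed `app` node, and stops at `db` and `cache`, which are the healthy points to resume from.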
Cloud Deployment Graph
The web server starts before the database server has been configured correctly.
Each Executor runs the necessary deployment scripts.
The deployment status can be monitored through a real-time monitoring UI. [2]
Specifically, the UI shows each script’s logs, a real-time view of the Deployment Graph, and the
status of the respective script executions.
Figure 6. Cloud deployment graph with 1 Web Server and 1 Database Server
Execute the Deployment Script
Green edges represent completed scripts.
Blue edges represent running scripts.
Red edges represent failed executions.
Gray edges represent pending scripts.
The Web Server stores the deployment script.
The DB Server supports the data manipulation activities of the application. [2]
Figure 7. Cloud deployment graph with 1 Web Server and 1 Database Server
[Node labels in the figure: DBServer1, DBServer2, DBServer3, WEBServer1, WEBServer2, WEBServer3, WEBServer4, APPServer1, APPServer2, APPServer3, APPServer4.]
Experimental Evaluation
To recover from a failed module deployment script, a retry policy is applied.
The policies compared are No Backoff, Constant Backoff, Linear Backoff,
Fibonacci Backoff, Quadratic Backoff, Exponential Backoff and Polynomial Backoff.
# Retries  Constant  Linear  Fibonacci  Quadratic  Exponential  Polynomial
1          1s        0s      0s         0s         1s           0s
3          3s        3s      2s         5s         7s           9s
5          5s        10s     7s         30s        31s          100s
10         10s       45s     88s        285s       1023s        2025s
20         20s       190s    10945s     2470s      1048575s     36100s
(All strategy columns give the backoff time.)
Table 1: Retries Backoff Approaches
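The values in Table 1 are consistent with summing a per-retry delay over retries numbered from 0. The sketch below reproduces them under that reading. The per-retry formulas are inferred from the table and are not stated in the talk: 1 for constant, i for linear, the i-th Fibonacci number, i squared for quadratic, 2 to the power i for exponential, and i cubed for polynomial. The names `DELAYS`, `fibonacci` and `total_backoff` are assumptions of this sketch.

```python
def fibonacci(i):
    """Return the i-th Fibonacci number, with F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a


# Delay in seconds before retry number i (i starts at 0); inferred from Table 1.
DELAYS = {
    "constant": lambda i: 1,
    "linear": lambda i: i,
    "fibonacci": fibonacci,
    "quadratic": lambda i: i ** 2,
    "exponential": lambda i: 2 ** i,
    "polynomial": lambda i: i ** 3,
}


def total_backoff(strategy, retries):
    """Total time spent waiting across the given number of retries."""
    delay = DELAYS[strategy]
    return sum(delay(i) for i in range(retries))


for retries in (1, 3, 5, 10, 20):
    row = [total_backoff(name, retries) for name in DELAYS]
    print(retries, row)
```

Read this way, the table reports total waiting time rather than the delay before a single retry, which is why the exponential column reaches 1048575 seconds after 20 retries.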
Conclusion and Future Work
In this work, the AURA deployment tool is used to detect transient failures and
re-execute the failed module scripts.
The strength of the AURA tool is that it re-executes (retries) the minimum
number of failed scripts. The delay of the retry policy depends on the number
of retries. Among the retry strategies, exponential backoff is chosen to
recover from transient failures in an effective and optimal way.
I tested several backoff strategies using C programs and compared the
results. Based on that comparison, I selected exponential backoff to protect
against transient failures.
The AURA (error recovery) tool was developed in the Python programming
language. As future work, I want to implement the re-execution (retry)
strategies for failed message-passing nodes using ASP.NET and a SQL Server
2012 database, and to implement snapshots taken after a certain time.
ASP.NET Message Passing
ASP.NET Message Passing Exception Handling
ASP.NET Message Passing Module 3
ASP.NET Message Passing Console Display
Microsoft Azure Cloud Dashboard
Microsoft Azure Cloud Message Routing
References
[1] Ioannis Giannakopoulos, Ioannis Konstantinou, Dimitrios Tsoumakos and
Nectarios Koziris, “Cloud application deployment with transient failure
recovery,” Journal of Cloud Computing: Advances, Systems and Applications,
2018.
[2] Ioannis Giannakopoulos, Ioannis Konstantinou, Dimitrios Tsoumakos and
Nectarios Koziris, “AURA: Recovering from Transient Failures in Cloud
Deployments,” 2017 17th IEEE/ACM International Symposium on Cluster,
Cloud and Grid Computing.
Questions and Answers
Thanks
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 

Protecting from transient failures in cloud deployments

  • 1. Protecting from Transient Failures in Cloud Deployments
      Supervised by: Dr. Mohammad Abdur Rouf (Professor), Head of Department of Computer Science & Engineering, Dhaka University of Engineering and Technology, Gazipur. Ph.D. (KAIST), M.Sc. Engg. (BUET), B.Sc. Engg. (KU)
      Presented by: Md. Mostafijur Rahman, M.Sc. Engg., Student ID #: 132431 (p), Department of Computer Science and Engineering
  • 2. Outline of the talk
      - Introduction
      - Related Work
      - Application Deployment
          - Architecture
          - Idea Architecture
          - Deployment Model
          - Error Recovery Process Dependency Graph
          - Error Recovery Basic Flowchart
          - Error Recovery Flow Chart
      - Implementation Aspects
          - Cloud deployment graph
          - Execute the Deployment Script
          - Experimental Evaluation
          - ASP.NET Message Passing
          - ASP.NET Message Passing Exception Handling
          - ASP.NET Message Passing Console Display
          - Microsoft Azure Cloud Dashboard
          - Microsoft Azure Cloud Message Routing
      - Conclusions and Future Work
      - References and Useful Links
  • 3. Introduction
      - Transient failures occur temporarily, for a short period of time.
      - AURA is an OpenStack application deployment tool with error-recovery improvements. [https://github.com/giagiannis/aura]
      - To overcome such failures, the deployment is modeled as a Directed Acyclic Graph. The AURA tool traverses the graph breadth-first to find the deployment script that failed.
      - The deployment is monitored through the graphical user interface, and the minimum number of failed script nodes is re-executed.
      - On failure, wait for a threshold time and retry fetching the resource before giving up.
      - After a fixed number of retries within the threshold time, the failure is considered permanent.
  • 4. Related Work
      - To protect against transient failures, Wrangler receives application descriptions in XML format and deploys them to different cloud providers (Amazon EC2, OpenNebula and Eucalyptus). [1]
      - Liu provides database transactions and implements the mechanisms needed to achieve the ACID properties, protecting against transient failures. [1]
      - Katsuno studies the problem of deployment parallelization in multi-module applications and protection against transient failures. [1]
      - Pegasus deploys scientific workloads and protects against transient failures. [1]
      - NPACI Rocks [1] attempts to automate the provisioning of high-performance computing clusters in order to simplify software installation and version tracking.
  • 5. Application Deployment Architecture
      - AURA Master: deployed in a dedicated VM in the cloud.
      - AURA Executors: responsible for executing the appropriate deployment scripts. Each software module is deployed by a dedicated AURA Executor.
      - Application: consists of multiple modules (methods).
      - Queue: stores messages in FIFO order.
      - REST API: data is stored in a file in JSON format to support the application interface.
      - Scheduler: responsible for monitoring the entire deployment process.
      - Cloud Connector: provides the interface to the cloud provider.
      - Web UI: used to describe new applications, issue new deployment requests and obtain real-time monitoring statistics.
      Figure 1. AURA Architecture [diagram labels: Web UI, Queue, REST API, Cloud Connector, Scheduler, AURA Master; Application with Modules (1), (2), (3), one AURA Executor each; Cloud Provider]
  • 6. Idea Application Deployment Architecture
      Programming: ASP.NET & SQL Server 2012 Database
      Namespace AURA {
          Public function Module1(parameter) { send/receive a message }
          Public function Module2(parameter) { send/receive a message }
          Public function Module3(parameter) { send/receive a message }
          Public function Queue(parameter) { store in FIFO order and act }
          Public function Scheduler(parameter) { retry within threshold time limit }
          Public function data_manage(parameter) { manipulate JSON-format data as DB }
          Public function Cloud_connector(parameter) { cloud_connector => Cloud Provider => message passing to modules }
          Public function Web_UI(parameter) { graphical representation to user }
      }
      [Diagram: the same AURA architecture as in Figure 1]
  • 7. Application Deployment Model
      Figure 2 shows three modules, (1), (2) and (3), each of which works as a method. The horizontal solid arrows show message passing from source to destination points. The vertical dotted lines mark the time slices. When a module waits for a message (e.g. at points A’, B’), it blocks until the message is received. [1] After sending a message at point A, module (1) proceeds with script execution between points A and C’. [1]
  • 8. Error Recovery Process
      Every module can send messages to other modules. A message must arrive within a specific threshold time. If it does not, re-execution is attempted several times within the threshold time. If that also fails, the failed node's script is deleted from the layer and a new empty layer is added. [1]
      Figure 2 (a): the vertical edges represent script executions [1]; the horizontal lines represent message passing between modules. [2]
      Figure 2 (b): the dependency graph, in which a module blocks and scans the health of its neighbor nodes. [2]
  • 9. Transient Failure Handling Flowchart
      Figure 4: Error Recovery Basic Flowchart [flowchart labels: Start; Call Service; Success? (Yes / No); Call Successful; Next Action; Transient Failure? (Yes); Retry? (Yes / No); Delay Threshold Time; Call failed; Permanent call fail; 20 <]
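The decisions named in the Figure 4 flowchart (call the service, check whether a failure is transient, delay, retry, and finally give up) can be sketched in Python as below. This is only an illustration under stated assumptions: `TransientError`, `PermanentFailure` and `call_with_retry` are hypothetical names, the cap of 20 retries is a guess based on the "20 <" label in the flowchart, and the exponential delay is one possible policy rather than the presenter's exact implementation.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


class PermanentFailure(Exception):
    """Raised once the retry budget is exhausted."""


def call_with_retry(service, max_retries=20,
                    delay=lambda attempt: 2 ** attempt, sleep=time.sleep):
    """Call service(); on TransientError wait and retry, up to max_retries times."""
    attempt = 0
    while True:
        try:
            return service()              # "Call Successful" -> next action
        except TransientError as exc:     # "Transient Failure?" -> yes
            if attempt >= max_retries:    # "Retry?" -> no
                raise PermanentFailure(
                    f"gave up after {attempt} retries") from exc
            sleep(delay(attempt))         # "Delay Threshold Time"
            attempt += 1
        # Any other exception is not caught here and propagates ("Call failed").
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting, for example `call_with_retry(fetch_resource, sleep=lambda s: None)`, where `fetch_resource` stands for any callable that may raise `TransientError`.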
  • 10. Error Recovery Flowchart
      Figure 5: Error Recovery Flowchart [flowchart steps, in order:]
          Start
          Two parameters: Graph T, Failed Node n
          failed ← {n}
          healthy ← ∅
          While failed ≠ ∅:
              v = POP(failed)
              Have t ∈ depends(T, v)?
                  Have failed(t)?
                      Yes: failed ← failed ∪ {t}
                      No: healthy ← healthy ∪ {t}
          End While
          Return healthy
          End
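The Figure 5 flowchart can be read as a backward walk over the dependency graph, starting from the failed node and collecting the nearest healthy dependencies. The Python sketch below follows that reading. The representation is an assumption: the graph T is a plain dict mapping each node to the nodes it depends on, `has_failed` is a hypothetical health check supplied by the caller, and the `seen` set is an added guard against revisiting nodes that the flowchart does not show.

```python
def find_healthy(depends, has_failed, n):
    """Return the healthy dependencies reachable backwards from failed node n.

    depends    -- dict: node -> iterable of nodes it depends on (graph T)
    has_failed -- callable: node -> True if that node's script has failed
    n          -- the node whose failure was detected
    """
    failed = [n]          # work list of failed nodes still to examine
    healthy = set()       # healthy nodes found so far
    seen = {n}            # guard so each failed node is expanded once
    while failed:
        v = failed.pop()
        for t in depends.get(v, ()):
            if has_failed(t):
                if t not in seen:
                    seen.add(t)
                    failed.append(t)   # keep walking back past t
            else:
                healthy.add(t)         # stop here: t can be trusted
    return healthy


# Illustrative graph with made-up script names.
graph = {
    "web/start": ["web/config", "db/start"],
    "web/config": ["web/install"],
    "db/start": ["db/config"],
}
broken = {"web/start", "web/config"}
restart_points = find_healthy(graph, lambda node: node in broken, "web/start")
```

In this example `restart_points` contains `web/install` and `db/start`: the walk passes through the failed `web/config` node and stops at the first healthy node on each path.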
  • 11. Cloud Deployment Graph
      The web server starts before the database server has been successfully configured. Each Executor runs the necessary deployment scripts. The deployment status is monitored through a real-time monitoring UI. [2] Specifically, the UI shows each script's logs, a real-time view of the deployment graph, and the status of each script execution.
      Figure 6. Cloud deployment graph with 1 Web Server and 1 Database Server
  • 12. Execute the Deployment Script
      - Green edges represent completed scripts.
      - Blue edges represent running scripts.
      - Red edges represent failed executions.
      - Gray edges represent pending scripts.
      The Web Server stores the deployment script. The DB Server handles the application's data manipulation activities. [2]
      Figure 7. Cloud deployment graph with 1 Web Server and 1 Database Server [node labels: DBServer1, DBServer2, DBServer3, WEBServer1, WEBServer2, WEBServer3, WEBServer4, APPServer1, APPServer2, APPServer3, APPServer4]
  • 13. Experimental Evaluation
      To recover from a failed module deployment script, a retry policy is used with one of the following strategies: no backoff, constant backoff, linear backoff, Fibonacci backoff, quadratic backoff, exponential backoff or polynomial backoff.

      # Retries   Constant   Linear   Fibonacci   Quadratic   Exponential   Polynomial
      1           1s         0s       0s          0s          1s            0s
      3           3s         3s       2s          5s          7s            9s
      5           5s         10s      7s          30s         31s           100s
      10          10s        45s      88s         285s        1023s         2025s
      20          20s        190s     10945s      2470s       1048575s      36100s

      Table 1: Retries Backoff Approaches (backoff time per strategy)
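The figures in Table 1 are consistent with reading each cell as the total waiting time after the given number of retries, where retry i (counted from 0) waits 1, i, Fib(i), i², 2^i or i³ seconds depending on the strategy. The slides do not state these formulas, so they are inferred from the numbers and should be treated as an assumption. The names `STRATEGIES` and `total_wait` are hypothetical.

```python
def fib(i):
    """Return the i-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a


# Delay in seconds before retry number i (i starts at 0).
# Formulas inferred from Table 1; "polynomial" matches a cubic delay.
STRATEGIES = {
    "constant": lambda i: 1,
    "linear": lambda i: i,
    "fibonacci": fib,
    "quadratic": lambda i: i ** 2,
    "exponential": lambda i: 2 ** i,
    "polynomial": lambda i: i ** 3,
}


def total_wait(strategy, retries):
    """Total seconds spent waiting across the first `retries` retries."""
    delay = STRATEGIES[strategy]
    return sum(delay(i) for i in range(retries))


def table_row(retries):
    """One row of the table: total wait per strategy for a retry count."""
    return {name: total_wait(name, retries) for name in STRATEGIES}
```

Under this reading, `table_row(10)` gives 10, 45, 88, 285, 1023 and 2025 seconds for the six strategies, matching the 10-retry row of Table 1.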
  • 14. Conclusion and Future Work
      In this work, the AURA deployment tool is used to detect transient failures and re-execute the failed module's script. AURA's strength is that it re-executes (retries) the minimum number of failed scripts. The delay of the retry policy depends on the number of retries. I tested several backoff strategies in C, compared the results, and selected exponential backoff to protect against transient failures in an effective and optimal way.
      The AURA (error recovery) tool was developed in the Python programming language. As future work, I want to implement the message-passing re-execution (retry) strategies for failed nodes using ASP.NET and a SQL Server 2012 database, with snapshots taken after a certain time.
  • 15. ASP.NET Message Passing
  • 16. ASP.NET Message Passing Exception Handling
  • 17. ASP.NET Message Passing Module 3
  • 18. ASP.NET Message Passing Console Display
  • 19. Microsoft Azure Cloud Dashboard
  • 20. Microsoft Azure Cloud Message Routing
  • 21. References
      [1] Ioannis Giannakopoulos, Ioannis Konstantinou, Dimitrios Tsoumakos and Nectarios Koziris, “Cloud application deployment with transient failure recovery,” Journal of Cloud Computing: Advances, Systems and Applications, 2018.
      [2] Ioannis Giannakopoulos, Ioannis Konstantinou, Dimitrios Tsoumakos and Nectarios Koziris, “AURA: Recovering from Transient Failures in Cloud Deployments,” 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
  • 22. Questions and Answers. Thanks.