Software rejuvenation

research project on software rejuvenation

Published in: Education

CHAPTER 1
INTRODUCTION TO SOFTWARE REJUVENATION IN COMPLEX SYSTEM

1.1 Introduction

Industry relies on highly complex system environments, which are prone to software aging. Availability is the critical concern, because aging degrades the system and eventually causes failure. Performance degradation, crash/hang failures, or both may occur due to data corruption, numerical errors and needless exhaustion of system resources; this degradation is known as software aging [1]. As the load increases, an aged system tends to crash. Software rejuvenation [2] addresses this problem: it is a proactive fault management technique that clears system errors and prevents future failures.

This project implements different software rejuvenation techniques driven by variable workload and a timer policy, and optimizes the rejuvenation time dynamically. The rejuvenation time is calculated from the variable workload reported by the system. The system periodically checks the workload and reports it to the rejuvenation manager.

Figure 1.1 shows the architecture of the rejuvenation manager. It consists of an aging detector, which detects the software aging point, and an optimizer, which optimizes the timer value for the point of rejuvenation. The aging detector and the optimizer rely on two components, the variable workload and the timer policy, each of which performs its defined function. The aging detector periodically obtains the values provided by the rejuvenation manager, checks whether the rejuvenation time needs to change for the current workload, and reports the result back to the rejuvenation manager. If a change is needed, the rejuvenation manager lets the optimizer change the time. The optimizer optimizes the time using the workload-based values provided by the aging detector.
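The interaction between the rejuvenation manager, the aging detector and the optimizer described above can be sketched as follows. This is only a minimal illustration: the class names, the relative-change tolerance and the `halve_when_busy` rule are assumptions made for the example, not details given in the report.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgingDetector:
    """Decides whether the rejuvenation time needs to change (assumed rule)."""
    tolerance: float = 0.10      # assumed: relative workload change that matters
    last_workload: float = 1.0   # assumed starting baseline

    def needs_change(self, workload: float) -> bool:
        baseline = max(self.last_workload, 1e-9)
        changed = abs(workload - self.last_workload) / baseline > self.tolerance
        if changed:
            self.last_workload = workload
        return changed


@dataclass
class RejuvenationManager:
    detector: AgingDetector
    optimizer: Callable[[float, float], float]  # (current time, workload) -> new time
    rejuvenation_time: float                    # seconds until the next rejuvenation

    def report_workload(self, workload: float) -> float:
        """Called periodically by the system with the current workload."""
        if self.detector.needs_change(workload):
            self.rejuvenation_time = self.optimizer(self.rejuvenation_time, workload)
        return self.rejuvenation_time


def halve_when_busy(current_time: float, workload: float) -> float:
    # Illustrative optimizer only: shorten the timer under heavy load.
    return current_time / 2 if workload > 0.8 else current_time


manager = RejuvenationManager(AgingDetector(0.10, 0.5), halve_when_busy, 3600.0)
```

Passing the optimizer in as a plain function keeps the detector and the timer rule separate, which mirrors the split between the two components in Figure 1.1.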
Figure 1.1 System architecture used for software rejuvenation in complex system

The performance and dependability of complex systems are analyzed with SPNP (Stochastic Petri Net Package) [2]. The model of a complex system, including arc cardinalities (weights) and guard functions, is constructed in SPNP. A Markov chain is irreducible if every state can be reached from every other state [2]. A CTMC (Continuous Time Markov Chain) [3] is ergodic if it is irreducible and every state is returned to within a finite expected time. SPNP performs the steady-state analysis of the underlying CTMC [3]; a few measures related to the steady state are not considered separately, because their values can be obtained from the steady-state probabilities.

Software rejuvenation can be performed in different ways, namely cold and warm. In the cold OS reboot process, the system is rebooted immediately at the rejuvenation point. The rejuvenation point is the point at which the memory consumption of the system reaches a threshold value, or a predetermined time. When the system consumes a large amount of RAM, the OS must be rebooted, clearing all internal states. The memory may be consumed by applications or error-prone code that runs for a long time, or by the OS itself. In the OS warm reboot process, the kernel state is saved before rebooting, including the states of all applications running on the kernel. The kernel state is saved by creating a complete image of the kernel.

Approaches to Software Rejuvenation
Software rejuvenation can be divided broadly into two approaches, as follows.

Time based approach [4][5][6]: In this approach, rejuvenation is performed without any feedback from the system. Rejuvenation in this case can be based just on elapsed time (periodic rejuvenation) and/or on the instantaneous/cumulative number of jobs on the system.

Time and workload approach [4][5][6]: In this approach, rejuvenation is performed based on information about the system "health". The system is monitored continuously (in practice, at small deterministic intervals) and data is collected on operating system resource usage and system activity. This data is then analyzed to estimate the time to exhaustion of a resource, which may lead to the degradation or crash of a component or of the entire system. The estimate can be based purely on time, or on both time and system workload. The time is optimized for the applied workload and written back as the system rejuvenation time.

1.2 Motivation:
• Current systems are reactive: if a failure occurs, the necessary steps are taken to handle it, but the failure cannot be detected beforehand. Our project aims to detect such failures proactively using Time and workload techniques and to take action before a given node crashes.
• Our project determines whether a node is going to fail based on RAM utilization and then rejuvenates the failing node. It analyzes and identifies failing nodes using both Time and load balancing rejuvenation techniques.

1.3 State of Art Development

A complex system is a form of ubiquitous computing that provides everything as a service. Complex systems are mainly used in business and the IT industry, where they offer a heavily outsourced model of computational resources in which service availability, security and quality are essential features. In complex systems, high service availability is the most important requirement, and it is increasingly demanded in commercial computer and communication systems. In recent years, many research efforts have sought the optimal infrastructure size and configuration that guarantee the desired availability level. Software fault tolerance is often found to be the bottleneck. Software failures are mainly due to certain elusive error conditions that lead to resource exhaustion. Software systems appear to age as error conditions arise and accumulate with operational time, due to elusive faults in system software and application software. Software rejuvenation is a proactive fault management technique aimed at cleaning up the internal states in order to prevent severe crash failures in the future. The simplest way to emulate software rejuvenation is to reboot the system or restart the aging application. It is a cost-effective technique for dealing with software faults that protects not only against hard failures but also against the degradation of application performance over time.

1.3.1 Classification of software faults:

Faults, in both hardware and software, can be classified according to their phase of creation or occurrence, system boundaries (internal or external) and domain (hardware or software). In this section, we limit ourselves to the classification of software faults based on their phase of creation. Some studies have suggested that, since software is not a physical entity, it is not subject to transient physical phenomena (as opposed to hardware), and hence software faults are permanent in nature [1]. Other studies classify software faults as both permanent and transient. Gray [2] categorizes software faults into Bohrbugs and Heisenbugs. Bohrbugs are essentially permanent design faults and hence almost deterministic in nature.
They can be recognized easily and weeded out during the testing and debugging phase (or early deployment phase) of the software life cycle. A software system with Bohrbugs is analogous to a faulty deterministic finite state machine. Heisenbugs, on the other hand, belong to the class of temporary internal faults and are intermittent. They are essentially permanent faults whose activation conditions occur rarely or are not easily reproduced. Hence these faults result in transient failures, i.e., failures which may not occur again if the software is restarted. Heisenbugs are extremely difficult to identify through testing. Hence a piece of software in the operational phase, released after its development and testing phase, is more likely to experience failures caused by Heisenbugs than by Bohrbugs.

Most recent studies on failure data have reported that a large percentage of software failures are transient in nature, caused by phenomena such as overloads or timing and exception errors. The study of failure data from Tandem's fault tolerant computer system indicated that 70% of the failures were transient failures, caused by faults such as race conditions and timing problems. We designate faults attributed to software aging as aging related faults. Aging related faults fall under Bohrbugs or Heisenbugs depending on whether the failure is deterministic (repeatable) or transient [3]. For aging-related bugs, environment diversity can be particularly effective if utilized proactively in the form of software rejuvenation. The rejuvenation operation can be triggered either by time (at deterministic intervals) or by measuring and analyzing data on the condition of a system that undergoes software aging in various workstation environments.

1.3.2 Basic concepts of software aging and software rejuvenation:

Software aging is defined as the state of software that degrades with time. The primary causes of this degradation are the exhaustion of operating system resources, data corruption, and the accumulation of numerical errors, which may eventually lead to performance degradation of the software, crash/hang failure, or both. A typical example of software aging is the progressive increase in memory consumption caused by a memory leak. Since software aging can be observed only during software execution, it is difficult to find aging related problems until the software is deployed and executed in a specific environment. Figure 1.2 describes the chain of threats that leads to an aging related failure in the system.
Figure 1.2 Aging related failure

The accumulation of AR (aging related) errors may lead to an AR failure. Aging effects can also be classified into volatile and non-volatile effects. They are considered volatile if they are removed by re-initialization of the affected system or process, for example via a system reboot. In contrast, non-volatile aging effects still exist after the system or process has been reinitialized. Physical memory fragmentation and OS resource leakage are examples of volatile aging effects. File system and database metadata fragmentation are examples of non-volatile aging effects [4].

The fault tolerance technique used to mitigate the aging effects of a system is known as software rejuvenation. Software rejuvenation is defined as occasionally stopping the running software, cleaning its internal state or its environment, and restarting it. The technique was proposed by Huang; it counteracts the aging phenomenon in a proactive manner by removing the accumulated error conditions and freeing up operating system resources. Garbage collection, flushing operating system kernel tables, and reinitializing internal data structures are some examples of how the internal state or the environment of the software can be cleaned.

There are basically two approaches to software rejuvenation and to finding the optimum rejuvenation schedule: analytic modelling and measurement based rejuvenation. The analytic modelling approach assumes failure and repair time distributions for a system and obtains the optimal rejuvenation schedule that maximizes availability, or minimizes the loss probability or the downtime cost. The measurement based rejuvenation approach monitors resource consumption in a computer system and analyzes that data to determine the point in time at which a resource will be completely exhausted, thereby causing the system to hang or crash.
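The measurement based idea of predicting when a resource will be exhausted can be illustrated with a simple trend fit. The sketch below assumes free memory is sampled at known times and shrinks roughly linearly; the function name and the choice of ordinary least squares are assumptions for illustration, not the estimator used in the report.

```python
from typing import Optional, Sequence


def time_to_exhaustion(times: Sequence[float],
                       free_memory: Sequence[float]) -> Optional[float]:
    """Estimate the time from the last sample until free memory reaches zero.

    Fits free_memory = intercept + slope * time by ordinary least squares.
    Returns None when no downward trend can be fitted.
    """
    n = len(times)
    if n < 2 or n != len(free_memory):
        raise ValueError("need two or more paired samples")
    mean_t = sum(times) / n
    mean_m = sum(free_memory) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        return None  # all samples share one timestamp
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, free_memory))
    slope = sxy / sxx
    if slope >= 0:
        return None  # free memory is not shrinking
    intercept = mean_m - slope * mean_t
    exhaustion_time = -intercept / slope
    return max(exhaustion_time - times[-1], 0.0)


# Samples every 60 s, losing 100 MB per sample (illustrative numbers).
remaining = time_to_exhaustion([0, 60, 120, 180], [1000, 900, 800, 700])
```

With these illustrative samples the fitted line reaches zero at 600 s, so about 420 s remain after the last sample. Real resource usage is rarely this linear, so a production estimator would need a more robust trend test.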
[Figure 1.2 diagram labels: Software, AR Bugs, Aging Factors, System Internal Environment, AR Error, AR Failure; arrows: Activates, Propagates]

Measurement based rejuvenation can follow any of the following policies: the Purely Time based Software Rejuvenation Policy (PTSRP) or a purely prediction based software rejuvenation policy.

Figure 1.3 Rejuvenation Scheduling
[Figure 1.3 diagram labels: Software rejuvenation scheduling; Time-Based; Inspection-Based; Threshold Based; Prediction Based; Mixed Approach; Statistical; Structural Models and Statistical; Machine Learning; each leaf marked Online | Offline]

1.3.3 Rejuvenation technique

In this section, we review the three VMM (virtual machine monitor) rejuvenation techniques. When VMM rejuvenation needs to be performed on a host, the hosted VMs also need to be controlled, because the execution environments of the VMs are cleared by the VMM rejuvenation. Prior to VMM rejuvenation, we can perform VM shutdown (i.e., Cold-VM rejuvenation), VM suspend (i.e., Warm-VM rejuvenation), or VM migration (i.e., Migrate-VM rejuvenation). These approaches are presented in the next three subsections.

1.3.3.1 Cold-VM rejuvenation

The easiest way to deal with the hosted VMs before triggering rejuvenation of the VMM is to shut down all the hosted VMs regardless of their execution states. The VMs are then restarted in clean states after the VMM rejuvenation. This approach is called Cold-VM rejuvenation. All the transactions running on the VMs are lost during Cold-VM rejuvenation [6]. An advantage of Cold-VM rejuvenation, however, is that the rejuvenation action cleans all the aging states of the VMs in addition to the aging states of the VMM.

1.3.3.2 Warm-VM rejuvenation

Instead of being shut down, the hosted VMs are suspended before VMM rejuvenation is triggered, and their execution is resumed when the VMM rejuvenation completes. We call this technique Warm-VM rejuvenation [5]. Since the execution states of the hosted VMs are saved prior to VMM rejuvenation, the transactions running on the VMs are not lost. However, because the VMs are only suspended, Warm-VM rejuvenation retains their aging states. The aging states in the hosted VMs are not cleared by VMM rejuvenation, and hence we need to rely on separate VM rejuvenation to clear them.

1.3.3.3 Migrate-VM rejuvenation

Live VM migration is a technique that moves a running VM to another host with only a short service interruption, and it is supported in most modern VMM implementations such as Xen and VMware. Although a shared storage system is required to store the VM image, the downtime overhead caused by a VM migration is small. Using live VM migration, the hosted VMs are moved to another host prior to VMM rejuvenation and returned to the original hosting server, by a reverse live VM migration, after the rejuvenation of the VMM completes. We call this combined method Migrate-VM rejuvenation [6]. The VM continues executing even while the VMM on the original host is being rejuvenated. However, as in the case of Warm-VM rejuvenation, the aging states in the hosted VMs are not cleared by the VMM rejuvenation. Live VM migration works only when the migration target server is running and has the capacity to accept the migrated VM.
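The trade-offs between the three VMM rejuvenation techniques described above can be written down as a small lookup structure. The property values follow the text of Sections 1.3.3.1 to 1.3.3.3; the `choose_technique` rule is a hypothetical selection policy added only to show how such a table might be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmmRejuvenationTechnique:
    name: str
    clears_vm_aging: bool     # are aging states inside the hosted VMs cleaned?
    keeps_transactions: bool  # do transactions running on the VMs survive?
    needs_target_host: bool   # is a running host with spare capacity required?


COLD = VmmRejuvenationTechnique("Cold-VM", True, False, False)
WARM = VmmRejuvenationTechnique("Warm-VM", False, True, False)
MIGRATE = VmmRejuvenationTechnique("Migrate-VM", False, True, True)


def choose_technique(vm_is_aged: bool,
                     target_host_available: bool) -> VmmRejuvenationTechnique:
    # Hypothetical rule, not taken from the report.
    if vm_is_aged:
        return COLD        # only Cold-VM also cleans the VMs themselves
    if target_host_available:
        return MIGRATE     # the VM keeps executing during VMM rejuvenation
    return WARM            # no second host: suspend and resume in place
```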
A comparison of different software rejuvenation policies is given in Table 1.1.

Table 1.1 Comparison of different software rejuvenation policies
(Columns: Policy | Aging condition | Analysis tool | Threshold value | Availability | Methodology or model. A dash marks a cell that is empty in the source.)

1. Policy: Constrained Element Based Software Rejuvenation Policy in Embedded Environment (CESRP)
   Aging condition: CESRP uses CPU frequency to detect aging
   Analysis tool: -
   Threshold value / Availability: W (Shapiro-Wilk detection) = 0.9781; p-value = 0.8453; probability density µ = 8450.16; stack result σ = 1830.731
   Methodology or model: constrained path and constrained element with mathematical model

2. Policy: Gray's classification of software faults
   Aging condition: aging is detected on a time basis, depending on the state of operating system resources
   Analysis tool: SNMP (Simple Network Management Protocol) based distributed resource monitoring tool
   Threshold value:
   • Mean time to recover from a failure = 4 hours
   • Mean time to rejuvenate the system = 1 hour
   • Mean time to failure = 41.38 days
   • Cost of failure = $5000/hour
   • Cost of rejuvenation = $500/hour
   Availability: σ∗ (optimal availability) = 36.12 days; σ (down time) = 5.60 days
   Methodology or model: semi-Markov reward model based on workload and resources

3. Policy: POMDP (Partially Observable Markov Decision Process)
   Aging condition: aging is detected based on the degradation level of the system
   Analysis tool / Threshold value: -
   Availability: POMDP K = 1: 0.9951; K = 4: 0.9932; K = 9: 0.9901; K = 99: 0.9901
   Methodology or model: CTMC (Continuous Time Markov Chain) model

4. Policy: Software rejuvenation based on automated self-healing techniques
   Aging condition: aging can be detected on (1) online transaction processing (OLTP) servers and (2) middleware applications and Web/application servers
   Analysis tool: SAN (Stochastic Activity Network)
   Threshold value: -
   Availability: basic steady-state availability = 0.824673; tolerance availability = 0.983678
   Methodology or model: this policy consists of six elements:
   • System under test
   • Fault model
   • Fault-remediation relationship
   • Micro-measurements
   • Macro-measurements
   • Workload and metric collectors

5. Policy: Component-Dependency based Micro-Rejuvenation Scheduling Policy
   Aging condition: aging can be detected based on the utilization of system resources, such as memory
   Analysis tool: SAN model
   Threshold value / Availability: -
   Methodology or model: micro-rejuvenation scheduling

1.4 Problem Statement:
• Performance degrades in a complex system that runs for a long time.
• Such systems are susceptible to crashes because of data corruption, numerical error accumulation and exhaustion of OS resources.
• This leads to downtime and non-optimal performance.
• Based on the variation in workload, the rejuvenation time is optimized to reduce the downtime and increase the availability of the system.

1.5 Objectives of the project:
• The main objective of this project is to reduce software failure rates, avoid downtime and improve system availability, using a software rejuvenation policy based on a time and load balancing scheme with the ITL algorithm.
• The availability of the system is analyzed for various rejuvenation techniques.
• The different rejuvenation techniques are analyzed based on values obtained from SPNP.

1.6 Scope of the work:

1.6.1 Limitations of the project:
• Hardware compatibility is required.
• The same hardware configuration is required on the end systems.
• The work uses open source tools and packages.

1.6.2 Constraints of the project:
• The rejuvenation time and the memory peak value are set based on the machine learning studies.
• Hardware virtualization must be supported.
• Systems must support NFS (Network File System).

1.7 Methodology:
• A proactive approach to software rejuvenation is implemented using Time and load balancing scheme based techniques.
• SRN modelled graphs were used to analyze the algorithm on all modules.
• Physical memory utilization is considered when implementing the Time and Workload based approaches.
• The designed ITL algorithm uses the timer and variable workload policy to provide time-based rejuvenation, dynamically adapting the rejuvenation timer to the workload conditions.
• The ITL algorithm is used to optimize the user-defined rejuvenation time when the workload is variable.
• The availability of the system for the different modules is derived from parameters obtained from SPNP.
• Live migration of the virtual machine is done using KVM/QEMU.
• NFS is configured on two servers to migrate the VM.

Figure 1.4 Optimization of rejuvenation time for variable workload

Figure 1.4 describes how the ITL algorithm optimizes the rejuvenation time (VT) with respect to the system workload (WL). When the workload changes to WL+1, the rejuvenation time has to be optimized to VT+1. If the system workload returns to the normal condition, WL+2, the optimizer has to optimize the rejuvenation time to VT+2.
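The adaptation shown in Figure 1.4 can be illustrated with a small sketch. The report does not spell out the ITL update rule, so the inverse-proportional scaling below, the `min_time` floor and the numeric workloads are all assumptions chosen only to reproduce the VT, VT+1, VT+2 pattern.

```python
def optimize_rejuvenation_time(user_time: float,
                               workload: float,
                               normal_workload: float,
                               min_time: float = 60.0) -> float:
    """Return an adapted rejuvenation interval in seconds.

    Assumed rule: scale the user-defined interval by normal_workload / workload,
    never exceeding the user-defined value and never going below min_time.
    """
    if workload <= 0 or normal_workload <= 0:
        raise ValueError("workloads must be positive")
    scaled = user_time * normal_workload / workload
    return max(min_time, min(user_time, scaled))


# Trace of Figure 1.4: normal load, a heavier load, then back to normal.
workloads = [0.5, 1.0, 0.5]  # WL, WL+1, WL+2 (illustrative values)
timers = [optimize_rejuvenation_time(3600.0, wl, normal_workload=0.5)
          for wl in workloads]
```

Under these assumptions the one-hour timer drops to half an hour while the workload is doubled and returns to one hour once the workload is back to normal.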
1.8 Organization of the Report:

This report contains 8 chapters.
• The first chapter introduces the project. It covers the purpose, motivation and scope of the project, the methodology adopted and the literature survey undertaken.
• The second chapter deals with the theory and concepts of software rejuvenation in complex systems.
• The third chapter deals with the software and hardware requirements of the project. It is the software requirements document.
• The fourth chapter explains the design of the system proposed in the project. It gives a detailed overview of the components used.
• The fifth chapter describes the implementation of the project. It discusses the difficulties encountered while coding and the coding standards used.
• The sixth chapter focuses on testing the application so that it is a robust product. Various tests are conducted on all modules.
• The seventh chapter presents the results of the classification and comparison. All modules are analyzed here.
• The eighth chapter gives the conclusion of the project. It deals with the limitations of the product and possible future enhancements.
CHAPTER 2
THEORY AND CONCEPTS OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM

2.1 Introduction

Software rejuvenation has become a new horizon for increasing system reliability and availability in the long run. With time, system outages tend to increase due to the aging of software, which may be caused by numerous factors such as memory leaks, unreleased locks and file descriptor leaks. Time based rejuvenation periodically rolls back a continuously running application to prevent future failures. The time factor is set to a particular value, after which the software is restarted. A better way to avoid software failure and to increase the availability and reliability of the system is therefore to find the state in which failure is probable and to rejuvenate the software before it reaches the failed state. This project investigates time based rejuvenation policies for maintaining high reliability of software systems.

Software rejuvenation is the act of gracefully terminating a running application and restarting it. The main motive behind the rejuvenation process is to prevent unexpected errors caused by aging related issues in the software. The idea is to suspend the application and restart it before it suffers any error. The rejuvenation strategy is primarily intended for servers, where applications are expected to run continuously for days without failure. Software aging involves the gradual degradation of application performance over time, which may lead to the untimely termination of the program. The main objective of the process is to maintain higher system reliability and availability by cleaning internal system states before the application reaches the failure state.

2.2 Software Rejuvenation Techniques Review

Software rejuvenation takes different types of approaches into account. Broadly, these are classified as standard rejuvenation, delayed rejuvenation and mixed rejuvenation.
2.2.1 Standard Rejuvenation

In standard rejuvenation, rejuvenation occurs once the triggering interval is reached. This policy does not take workload into consideration: it ignores whether the load is peak or off-peak, and rejuvenation happens at the triggered time.

2.2.2 Delayed Rejuvenation

In delayed rejuvenation, if the rejuvenation time is reached during a peak period, the node is scheduled for rejuvenation and the actual rejuvenation starts as soon as the next off-peak period begins.

2.2.3 Mixed Rejuvenation

The mixed rejuvenation policy combines the standard and delayed rejuvenation strategies. If the rejuvenation time falls early in the peak period, the application is rejuvenated immediately; otherwise the rejuvenation is delayed until the next off-peak period starts.

2.2.4 Erlang Approximation

Based on workload, i.e., peak load and off-peak load, different timer policies are established to find the interval needed for scheduling. In standard rejuvenation, rejuvenation occurs at the triggered time regardless of peak or off-peak load. In delayed rejuvenation, as defined above, the delay time is obtained from an Erlang distribution. With this approximation the DSPN (deterministic and stochastic Petri net) becomes a Markovian stochastic Petri net, and the solution techniques for Markov chains can be applied. The deterministic switching time between peak and off-peak periods is kept as it is; hence this model is a DSPN. The rejuvenation is triggered at fixed intervals of time units and is modeled by a deterministic transition with constant firing time. When the deterministic transition fires, and if the immediate transition is enabled at that time, the token is moved to another place, indicating the beginning of rejuvenation activity.
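The three timer policies of Sections 2.2.1 to 2.2.3 can be sketched as a single decision function. This is a minimal illustration under stated assumptions: the function and parameter names are invented for the example, and `early_window` stands in for the "early in the peak period" duration, whose value the report does not give.

```python
from enum import Enum


class Policy(Enum):
    STANDARD = "standard"
    DELAYED = "delayed"
    MIXED = "mixed"


def rejuvenate_now(policy: Policy, timer_expired: bool, in_peak: bool,
                   time_into_peak: float = 0.0,
                   early_window: float = 0.0) -> bool:
    """Return True if rejuvenation should start at this instant."""
    if not timer_expired:
        return False
    if policy is Policy.STANDARD or not in_peak:
        return True   # workload is ignored, or the load is off-peak
    if policy is Policy.DELAYED:
        return False  # wait for the next off-peak period
    # MIXED: rejuvenate only if the timer expired early in the peak period
    return time_into_peak <= early_window
```

A caller would keep `timer_expired` set while a delayed rejuvenation is pending, so that the first check made after the peak period ends returns True.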
For standard rejuvenation, the timer is always enabled; for delayed rejuvenation, the timer is disabled during the peak period; and for mixed rejuvenation, the timer is enabled for an initial duration of a certain length and disabled thereafter in the peak period. After the rejuvenation finishes, the reset transition fires to return a token to its place, beginning the next rejuvenation cycle.

To make the model solvable by SPNP, the deterministic transition is approximated by an r-stage Erlang distribution. This is achieved by storing r tokens in another place and replacing the deterministic timer by an exponentially distributed timed transition with firing rate r/timer. At the same time, the multiplicities of the output arc of the reset timer and of the input arc of the timer policy are changed to r. An r-stage Erlang distribution with stage rate r/timer has mean timer and variance timer^2/r, so it approaches the deterministic delay as r grows.

2.3 Stochastic Reward Nets Model for Time based Software Rejuvenation in Virtualized Environment

Here we focus mainly on unplanned software outages due to software aging. We present a comprehensive availability model for both the VM clustering software rejuvenation model and the VM migration based software rejuvenation model. The model captures the software aging states of the VM and the VMM as well as their failures caused by aging, using analytical modeling with stochastic reward nets (SRN). With this model we describe our proposal to offer a high availability mechanism using a time based software rejuvenation methodology.

First we present the ways of using virtualization to improve software rejuvenation for addressing the software aging issue. In the proposed system, virtualization technology and software rejuvenation are used to provide availability of the services. Clustering supports two or more servers running duplicate VMs. Failover technologies also allow a failed VM to load from a storage snapshot and start up on another server. To counteract software and hardware failures, the rejuvenation schedules for the VM and the VMM need to be determined properly for VM availability, since VMM rejuvenation affects the VMs running on the VMM. The following two scenarios are studied in this report.

2.3.1 VM Clustering Software Rejuvenation (2vms1pm)
A physical machine hosts the virtual machines. One monitoring VM and other operational VMs are created on top of the virtualization layer (VMM). The main application server runs on one VM and the remaining VM is used as the standby server. Software modules responsible for detecting software aging are installed in the monitoring VM, which triggers the rejuvenation operation. If the active VM is about to be rejuvenated, the standby VM is started and all new requests and sessions are switched from the active VM to the standby VM. In this scenario the physical machine itself is a SPOF (single point of failure).

2.3.2 VM Migration Based Software Rejuvenation (2vms2pms)

In this scenario an active-standby virtualized clustering architecture is employed. A highly available cluster is built between two or more virtual machines, each running on a different physical machine (2 PMs). The two PMs consist of an active physical server and a standby physical server. Both physical servers can access shared storage. A heartbeat keep-alive system is used to monitor the interaction of the VMs and the physical servers. On the active physical server, a monitoring VM, an active VM and a standby VM are created, and likewise on the standby physical server. Both VM and VMM time-based rejuvenation mechanisms are considered in this scenario. The time based rejuvenation policy for the VM is the same as for active-standby VMs hosted on 1 PM.

Live VM migration enables a running VM on one host server to move onto another host server with very little interruption of execution. When the VMM needs to be rejuvenated, the hosted VMs can move onto the other physical server. They can return to the original host, again by live VM migration, after the VMM rejuvenation completes. In the event of an outage of the active physical server, the virtualized recovery server on the standby physical server can be activated to take over the workload immediately using live migration. The downtime of a VM caused by live VM migration is very small, and the VM continues executing even while the original host is down.

CHAPTER 3
SOFTWARE REQUIREMENT SPECIFICATION OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM
3.1 Project Description

Software Rejuvenation in Complex System has six modules, namely OS cold reboot, OS warm reboot, VM cold reboot, VM warm reboot, VMM reboot and VM migration. Each module has its own working method, which is explained below.

3.2 Module Description

There are six main modules: OS cold reboot, OS warm reboot, VM cold reboot, VM warm reboot, VM migration and VMM reboot.

3.2.1 Module for OS Cold rejuvenation:

In the cold OS reboot process, the system is rebooted immediately at the rejuvenation point. The rejuvenation point is the point at which the memory consumption of the system reaches a threshold value, or a predetermined time. When the system consumes a large amount of RAM, the OS must be rebooted, clearing all internal states. The memory may be consumed by applications or error-prone code that runs for a long time, or by the OS itself.

In this process the remaining free memory is compared with our predetermined threshold value. If the free memory is greater than the threshold, the system is allowed to run in its normal state, i.e., the system has not reached the threshold point of consumption. If it is less, i.e., the system has consumed more memory than the threshold point allows, the OS is restarted immediately. The amount of free memory is extracted and compared with the predetermined free memory threshold; based on the result of the comparison, further processing is handled by the ITL algorithm.

3.2.2 Module for OS warm rejuvenation:

In the OS warm reboot process, the kernel state is saved before rebooting, including the states of all applications running on the kernel. The kernel state is saved by creating a complete image of the kernel. The OS reboot process is divided into two stages: 1) Suspend, 2) Resume. In the Suspend stage, the kernel is called to create a snapshot of the current system state; the snapshot data is then written to disk, and finally the system is rebooted. In the Resume stage, when the system is turned on, the GRUB loader runs and resume starts from the initrd before any partitions are mounted; the snapshot data is then read from disk and loaded into the kernel, the kernel restores the image, and the system continues from the state in which it was suspended.

3.2.3 Module for VM cold reboot:

In the VM cold reboot process [9], the VM is rebooted immediately at the rejuvenation point; the hypervisor is untouched. The rejuvenation point is the point at which the memory consumption of the system reaches a threshold value, or a predetermined time. When the VM consumes a large amount of RAM, the VM must be rebooted, clearing all internal states. The memory may be consumed by applications or error-prone code that runs for a long time, or by the OS itself.

The ITL algorithm compares the remaining free memory with our predetermined threshold value. If the free memory is greater than the threshold, the system is allowed to run in its normal state, i.e., the system has not reached the threshold point of consumption. If it is less, i.e., the system has consumed more memory than the threshold point allows, the rejuvenation time is optimized and written back as the predetermined time. When the rejuvenation time equals the system time, the VM is restarted immediately without saving any state of the running VM.

3.2.4 Module for VM warm reboot:

In the VM warm reboot process, the kernel state of the particular failing VM is saved before rebooting, including the states of all applications running on the kernel. The kernel state is saved by creating a complete image of the kernel. The VM warm reboot process is divided into two stages: 1) Suspend, 2) Resume. In the Suspend stage, the kernel is called to create a snapshot of the current system state; the snapshot data is then written to disk, and finally the system is rebooted. In the Resume stage, when the system is turned on, the GRUB loader runs and resume starts from the initrd before any partitions are mounted; the snapshot data is then read from disk and loaded into the kernel, the kernel restores the image, and the system continues from the state in which it was suspended.
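The free-memory comparison that the reboot modules rely on can be sketched as follows. The sketch assumes a Linux host where the text of /proc/meminfo is available; the function names and the use of the MemAvailable field (falling back to MemFree) are assumptions, since the report does not say how free memory is extracted.

```python
def available_memory_kb(meminfo_text: str) -> int:
    """Extract MemAvailable (or MemFree as a fallback) in kB from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    if "MemAvailable" in values:
        return values["MemAvailable"]
    return values["MemFree"]


def must_rejuvenate(free_kb: int, threshold_kb: int) -> bool:
    # Rejuvenate once free memory drops below the predetermined threshold.
    return free_kb < threshold_kb


sample = (
    "MemTotal:        8000000 kB\n"
    "MemFree:          500000 kB\n"
    "MemAvailable:    1200000 kB\n"
)
free_kb = available_memory_kb(sample)
```

On a live system the text would come from reading /proc/meminfo. The report does not say what happens when free memory exactly equals the threshold; the strict comparison here is one possible choice.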
This module reduces request failures and provides high availability to the VM. The ITL algorithm compares the free memory left with the predetermined threshold value. If the free memory is greater than the threshold, the system has not reached the threshold point of consumption and is allowed to run in its normal state. If it is less, the system has consumed more memory than the threshold allows; the rejuvenation time is then optimized and updated to
  20. 20. the predetermined time. When the rejuvenation time equals the system time, the VM is restarted immediately, saving the state of the running VM.
3.2.5 Module for VMM reboot
In the VMM cold reboot process, the VMM is rebooted immediately at the rejuvenation point; all the VMs running on the VMM are shut down before the VMM is rebooted. The rejuvenation point is the point at which the memory consumption of the system reaches a threshold value, or a predetermined time. When the VMM consumes a large amount of RAM, the VMM must be rebooted, clearing all internal state. The memory may be consumed by applications or error-prone code that runs for a long time. In this process the ITL algorithm compares the free memory left with the predetermined threshold value. If the free memory is greater than the threshold, the system has not reached the threshold point of consumption and is allowed to run in its normal state. If it is less, the system has consumed more memory than the threshold allows; the rejuvenation time is then optimized and the predetermined time is updated. When the rejuvenation time equals the system time, the VM is restarted immediately without saving any state of the running VM. If VMM memory consumption reaches its peak point, i.e. the VMM is about to crash, the VMM is restarted even if all VMs are running normally, and no state or data is saved. The user is, however, given a period of one minute in which to cancel the reboot or shut down the VMM completely.
3.2.6 Module for VM Migration [10] [11] [12]
In this module, the VM on the failing server is transferred to a preconfigured secondary server before the VM fails. The complete data and the applications running on the main server are transferred to the secondary server with no interruption to the running applications. Once the complete VM has been transferred to the other server and loaded, all the applications that were running on the main server are in the same state as before the transfer, with no loss of application data.
Migration is made possible by configuring NFS on both servers and by installing the virtual machine manager and virsh packages beforehand. Applied to our project, the concept works as follows. The server may receive a heavy load of requests or consume a large amount of memory, either of which can lead to a hang, crash, or failure of the system. The user sets the rejuvenation time and the threshold memory value, and the rejuvenation manager checks the system for aging. If aging is detected, the rejuvenation time predetermined by the user is optimized by the ITL algorithm and the system is rejuvenated at the rejuvenation
  21. 21. time. For rejuvenation we use the migration technique: the running VM is migrated and the server is rebooted, which provides high availability and reduces request failures.
3.3 Software requirements:
Table 3.1 Software requirements
  22. 22. Minimum Requirements
OS: CentOS, Ubuntu OS
Other: KVM/QEMU must be installed on both servers. NFS must be configured on both systems to migrate the VM. Note: KVM is a hypervisor or Virtual Machine Monitor; NFS (Network File System) is a distributed file system protocol.
Language: C
3.4 Hardware Requirements:
Table 3.2 Hardware Requirements
Minimum Requirements
Processor: Intel Pentium or better
Memory: 4 GB RAM
Hard Disk: 100 GB of hard disk space required
Display: 1024 x 768 or higher-resolution display with 16-bit colors
3.5 Performance Requirements:
• Availability
The system shall achieve 100 percent availability at all times.
• Portability
  23. 23. The system should be implemented in Java so that it can move easily from one system to another, because Java is platform independent.
• Scalability
The system shall be usable in multiple approaches.
• Maintainability
The system should be optimized for supportability, or ease of maintenance, as far as possible. This may be achieved through documented coding standards, naming conventions, class libraries, and abstraction.
3.6 Functional requirements:
As per the functional requirement specifications, the project shall provide the following facilities:
• The system collects the current status of the workload based on the RAM utilized by the running application.
• The system checks for any aging factor that degrades the availability of the application. If an aging factor is detected, the system issues a notification.
• The system collects its own status periodically.
• The system keeps track of the system time and compares it with the fixed rejuvenation schedule. If the tracked time equals the fixed rejuvenation schedule, the system is rejuvenated.
• The system stores the current status of each process so that the process can be resumed after system rejuvenation takes place.
3.7 Project Effort Estimation:
Assumptions:
Average Labor Cost: $680/month
Average Lines of Code (LOC): 450 LOC/month
Average cost for a line of code: $1.5/LOC (680 / 450 ≈ 1.5)
  24. 24. Module Details:
• The project contains 6 modules. Each module contains around 490 LOC, of which implementation accounts for 320 LOC/module and analysis for 170 LOC/module.
• Total Project Size = 490 * 6 = 2940 LOC
Cost Estimation:
• For one module, cost = 490 * 1.5 = $735
• Total cost of Project = 2940 * 1.5 = $4410
Effort Estimation:
• Effort = Total Project Cost / Average Labor Cost per month = 4410 / 680 ≈ 6.49 ≈ 7 person-months
• 7 persons are required to complete this project in a duration of one month.
3.8 Project Schedule:
  25. 25. Table 3.3 Required Schedules for each Task Figure 3.1 Gantt chart of Project Schedule
  26. 26. CHAPTER 4 HIGH LEVEL DESIGN OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM
A software product is a complex entity. Its development usually follows what is known as the Software Development Life Cycle (SDLC). The second stage in the SDLC is the Design stage. The objective of the design stage is to produce the overall design of the software. The design stage involves two sub-stages, namely:
• High-Level Design
• Detailed-Level Design
In the High-Level Design, the proposed functional and non-functional requirements of the software are studied, and an overall solution architecture that can handle those needs is developed.
4.1 Development Methods:
The development method used in this software design is the modular/functional development method. In this method, the system is broken down into different modules, with a certain amount of
  27. 27. dependency among them. The input-output data that flows from one module to another shows the dependency. Data flow diagrams have been used in the modular design of the system.
4.2 Data Flow Diagrams:
Data-flow models are an intuitive way of showing how data is processed by a system. At the analysis level, they should be used to model the way in which data is processed in the existing system. The notation used in these models represents functional processing, data stores, and data movements between functions. Data-flow models are used to show how data flows through a sequence of processing steps. The data is transformed at each step before moving on to the next stage. These processing steps or transformations are program functions where data-flow diagrams are used to document a software design.
4.3 Data Flow Diagram:
4.3.1 Data Flow Diagram for Rejuvenation Manager: Level 0
Figure 4.1 DFD: Level 0: module for Rejuvenation system
[Diagram elements: System, 1.0 Rejuvenation Manager, Rejuvenation process]
  28. 28. Figure 4.1, the Level 0 module for rejuvenation, describes the main rejuvenation process with the variable time and workload policy implemented. The module to run is selected first, and the threshold time and threshold memory are then set.
4.3.2 Data Flow Diagram for FTR and FTM: Level 1
Figure 4.2 DFD: Level 1 module for rejuvenation manager
[Diagram elements: System, User set Threshold values, 1.1 Aging detector, 1.2 Optimizer, Rejuvenation Process]
In Figure 4.2, the Level 1 data flow diagram describes the working of the rejuvenation manager. The rejuvenation manager has two modules, namely the aging detector and the optimizer. The aging detector detects the aging factor and invokes the optimizer to optimize the rejuvenation time. If aging is not detected, the system is rejuvenated at the predetermined rejuvenation time.
  29. 29. 4.3.3 Data Flow Diagram: Level 2
Figure 4.3 Data Flow Diagram: Level 2 module for aging detector
[Diagram elements: System, 1.1.1 Call Meminfo(), 1.1.2 Checking the aging factor, Rejuvenation process]
[Diagram elements of the following figure: System, processes 1.2.1 to 1.2.4 (Calculate memory factor, Analyze Current FTR, Compare FTR with STM, Optimizing FTR value), Rejuvenation Process]
  30. 30. Figure 4.4 Data Flow Diagram: Level 2 module for time optimizer
The aging detector determines the free memory left by calling meminfo() and passes the aging result to the rejuvenation manager. The rejuvenation manager compares the free memory left with the threshold value given by the user and, if the comparison is positive, calls the optimizer to optimize the rejuvenation time. The optimizer fetches the FTR and the STM (System Time) and checks the free memory left. Based on the threshold value, the time is optimized, that is, either increased or decreased.
4.4 Sequence Diagram
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event-trace diagrams, event scenarios, and timing diagrams. A sequence diagram shows, as parallel vertical lines ("lifelines"), different processes or objects that live simultaneously, and, as horizontal arrows, the messages exchanged between them, in the order in which they occur. This allows the specification of simple runtime scenarios in a graphical manner. Figure 4.5 depicts the policy used in this project, i.e. the variable time and workload policy. The rejuvenation manager requests the status of the workload applied to the system. The system replies to the rejuvenation manager with the workload applied to it, and the rejuvenation manager then calls the aging detector [13] to compare it with the predetermined threshold value. If any variation is observed, the result is returned to the rejuvenation manager, the optimizer is invoked to optimize the rejuvenation time, and the system is rejuvenated at the optimized time. If no variation is observed, the system is rejuvenated at the predetermined rejuvenation time.
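The memory check performed by the aging detector can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names `load_meminfo`, `parse_mem_free_kb`, and `aging_detected` are hypothetical, and the sketch assumes the Linux `/proc/meminfo` format, in which free memory appears on a line of the form `MemFree: <value> kB`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy /proc/meminfo into buf as a C string.
 * Returns 0 on success, -1 on failure. */
int load_meminfo(char *buf, size_t size)
{
    FILE *fp;
    size_t n;

    if (size == 0)
        return -1;
    fp = fopen("/proc/meminfo", "r");
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, size - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return n > 0 ? 0 : -1;
}

/* Extract the MemFree value (in kB) from meminfo-style text.
 * Returns 0 on success, -1 if the field is missing or has no digits. */
int parse_mem_free_kb(const char *text, unsigned long *free_kb)
{
    const char *field = strstr(text, "MemFree:");
    char *end;
    unsigned long value;

    if (field == NULL)
        return -1;
    field += strlen("MemFree:");
    value = strtoul(field, &end, 10);   /* skips leading spaces itself */
    if (end == field)
        return -1;
    *free_kb = value;
    return 0;
}

/* Aging is flagged when free memory drops below the user's threshold. */
int aging_detected(unsigned long free_kb, unsigned long threshold_kb)
{
    return free_kb < threshold_kb;
}
```

A caller would fill a buffer with `load_meminfo`, pass it to `parse_mem_free_kb`, and hand the result together with the user's threshold to `aging_detected`; a positive result is what would trigger the optimizer.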
  31. 31. Figure 4.5 Sequence Diagram for rejuvenation manager
4.5 Detailed Design
4.5.1 Detailed System Design
The main aim of the project is to build a simulator that simulates the time-based and prediction-based rejuvenation approaches. In this section, the individual modules that form the building blocks of the system are identified and a complete design is presented for each. The design of each module consists of the following elements:
• The purpose of the module
• A description of its functionality
• A description of the types and number of inputs it accepts
• A description of the types and number of outputs it generates
4.5.2 Module 1: OS cold reboot
  32. 32. This module covers OS cold reboot. In the cold reboot process the rejuvenation time is entered by the user and compared with the system time. If the workload varies from the threshold value given by the user, the time is optimized and the system is rejuvenated at the optimized time without saving any state.
Input
The input for the module is the rejuvenation time and the threshold value of memory.
Output
The output for the module is rejuvenation at the rejuvenation point.
Figure 4.6 Functioning of OS cold reboot
[Flow chart nodes: START, Declare and retrieve time, Compare, Mem_C, Reboot, STOP; branches labelled Yes/No]
  33. 33. The functioning of the cold reboot is described in the above flow diagram. Figure 4.6 shows how the OS cold reboot process works. Initially the user sets the rejuvenation time and the threshold memory value. The system time is then compared with the rejuvenation time given by the user; if they are equal, the system is rejuvenated immediately. If they are not equal, memory usage is compared with the threshold memory value in block mem_c. If the result is negative, the system is rejuvenated at the predetermined rejuvenation time; if the result is positive, the time is optimized and the rejuvenation time is updated.
4.5.3 Module 2: Module for OS warm reboot process
This module covers OS warm reboot.
Input
The input for the module is the predetermined rejuvenation time and the threshold memory value.
Output
The output of the module is the kernel state saved as an image on the hard disk, followed by rejuvenation at the rejuvenation time; afterwards the system must start from the state saved before the reboot.
The functioning of OS warm reboot is described in the following flow diagram.
[Flow chart nodes: Start, Set FTR & FTM, FTR check, Check FFM, Optimize, Save, Rejuvenation, Stop; branches labelled Yes/No]
  34. 34. Figure 4.7 Module of OS warm reboot
Figure 4.7 shows how OS warm reboot works. First the user sets the predetermined rejuvenation time and the threshold value of memory. The system time is then compared with the rejuvenation time given by the user; if they are equal, the kernel state is saved to the hard disk and the system is rejuvenated. If they are not equal, memory usage is compared with the threshold memory value in block check FFM (Fixed Free Memory). If the result is negative, the system time is checked against the rejuvenation time again. If the result is positive, the time is optimized and the rejuvenation time is updated. The system is rejuvenated at the rejuvenation time.
4.5.4 Module 3: VM cold reboot
This module describes VM cold reboot.
  35. 35. Input
The input for the module is the predetermined rejuvenation time and the threshold memory value.
Output
The output for the module is rejuvenation of the VM at the rejuvenation time.
The functioning of the VM cold reboot module is described in the following flow diagram.
[Flow chart nodes: START, Declare and retrieve time, Compare, Mem_C, Reboot, STOP; branches labelled Yes/No]
  36. 36. Figure 4.8 Module for VM cold reboot
Figure 4.8 shows the module for VM cold reboot. Initially the user sets the rejuvenation time and the threshold memory value. Next, the system time is compared with the rejuvenation time given by the user; if they are equal, the VM is rejuvenated immediately. If they are not equal, memory usage is compared with the threshold memory value in block mem_c. If the result is negative, the VM is rejuvenated at the predetermined rejuvenation time; if the result is positive, the time is optimized and the rejuvenation time is updated.
4.5.5 Module 4: Module for VM warm reboot
Input
The input for the module is the predetermined rejuvenation time and the threshold memory value.
Output
The output of the module is the kernel state of the faulty VM saved as an image on the hard disk, followed by rejuvenation at the rejuvenation time; afterwards the VM must start from the state saved before the reboot.
The functioning of the VM warm reboot module is described in the following flow diagram.
  37. 37. Figure 4.9 VM warm reboot Module
[Flow chart nodes: START, Set FTR and FTM, Check FTR, Check FTM, Optimize, Save, Rejuvenation, Resume, STOP; branches labelled Yes/No]
  38. 38. Figure 4.9 shows the VM warm reboot process. First the user sets the predetermined rejuvenation time and the threshold value of memory. The system time is then compared with the rejuvenation time given by the user; if they are equal, the kernel state is saved to the hard disk and the system is rejuvenated. If they are not equal, memory usage is compared with the threshold memory value in the block named check FTM (Fixed Threshold Memory). If the result is negative, the system time is checked against the rejuvenation time again. If the result is positive, memory usage is compared with the peak memory value in the block named check FTM: if that result is positive, the kernel state is saved to the hard disk and the system is rejuvenated; if it is negative, the time is optimized and the rejuvenation time is updated. The system is rejuvenated at the rejuvenation time.
4.5.6 Module 5: Module for VM migration
This module describes VM migration.
Input
The input for the module is the predetermined rejuvenation time and the threshold memory value.
Output
The output for the module is migration of the VM from the server that is about to fail to another server at the rejuvenation time.
The functioning of the VM migration module is described in the following flow diagram.
  39. 39. Figure 4.10 VM migration module
[Flow chart nodes: START, Set FTR and FTM, Check FTR, Check FTM, Optimize, Migrate, STOP; branches labelled Yes/No]
  40. 40. The flow chart in Figure 4.10 depicts how VM migration works. Initially the admin sets the rejuvenation time and the threshold memory value at which the VM must be migrated. Whatever applications are running in the VM, and any dynamic data entered in it, are migrated to the configured secondary server, so this module provides the highest availability to the server. If a heavy workload is applied to the server after the rejuvenation time has been set, the rejuvenation time is optimized so that the server is protected from hang/crash failure. When the rejuvenation time is reached, the complete VM is migrated to the other configured server. Because the data and running applications are not corrupted, this module is intended to avoid request failures and provide high availability, which is in great demand in the corporate world.
4.5.7 Module 6: Module for VMM reboot
This module describes VMM reboot.
Input
The input for the module is the predetermined rejuvenation time and the threshold memory value.
Output
The output for the module is rejuvenation of the VMM at the rejuvenation time.
The functioning of the VMM reboot module is described in the following flow diagram.
[Flow chart nodes: START, Set FTR and FTM, Compare, Mem_C, Reboot, STOP; branches labelled Yes/No]
  41. 41. Figure 4.11 VMM reboot model
Figure 4.11 shows the process of VMM reboot. Initially the user sets the rejuvenation time and the threshold memory value. The system time is then compared with the rejuvenation time given by the user; if they are equal, the system is rejuvenated immediately. If not, the system optimizes the rejuvenation time depending on the workload, and the VMM reboot takes place at the optimized time. In this module all the running VMs are shut down before the VMM is rebooted.
CHAPTER 5 IMPLEMENTATION OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM
The implementation phase of any project development is the most important phase, as it yields the final solution that solves the problem at hand. The implementation phase involves the actual materialization of the ideas that are expressed in the analysis document and developed in the design phase.
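Every flow chart in Chapter 4 begins by comparing the system time with the rejuvenation time. A minimal sketch of that check using the standard `time()` and `localtime()` functions is given below. Representing the rejuvenation time as an hour and a minute, and the function names `rejuvenation_due` and `rejuvenation_due_now`, are assumptions made for illustration only.

```c
#include <time.h>

/* Returns 1 when the broken-down local time matches the rejuvenation
 * time, given here (as an assumption) by hour and minute. */
int rejuvenation_due(const struct tm *now, int ftr_hour, int ftr_min)
{
    return now->tm_hour == ftr_hour && now->tm_min == ftr_min;
}

/* Same check against the current system clock. */
int rejuvenation_due_now(int ftr_hour, int ftr_min)
{
    time_t t = time(NULL);
    struct tm *local = localtime(&t);   /* may return NULL on failure */

    if (local == NULL)
        return 0;
    return rejuvenation_due(local, ftr_hour, ftr_min);
}
```

Keeping the comparison separate from the clock lookup makes the check easy to exercise with fixed times, independent of when it is run.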
  42. 42. Project has six modules OS Cold reboot, OS Warm reboot, VM Cold reboot, VM Warm reboot, VM Migration and VMM reboot, based on Time and Workload rejuvenation policies and also analysis of all the modules are done using SPNP (Stochastic Petri Nets Package). 5.1 Platform Selection: 5.1.1 KVM/QEMU: KVM (Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko that provides the core virtualization infrastructure and a processor specific module, kvm-intel.ko or kvm-amd.ko. KVM also requires a modified QEMU although work is underway to get the required changes upstream. Using KVM, one can run multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualized hardware: a network card, disk, graphics adapter, etc. The kernel component of KVM is included in mainline Linux, as of 2.6.20.KVM is open source software. A wide variety of guest operating systems work with KVM, including many flavours of Linux, BSD, Solaris, Windows, Haiku,ReactOS, Plan 9, and AROS Research Operating System. In addition Android 2.2, GNU/Hurd (Debian K16), Minix 3.1.2a, Solaris 10 U3, Darwin 8.0.1 and more OSs and some newer versions of these with limitations are known to work. A modified version of QEMU can use KVM to run Mac OS X. 5.1.2 SPNP (Stochastic Petri Net Package) [12][13] [14]: This package was developed by Ciardo et.al. The model type used for input is a SRN (Stochastic Reward Net). SRNs incorporate several structural extensions to GSPNs such as marking dependencies (marking dependent arc cardinalities, guards, etc.) and allow reward rates to be associated with each marking. The reward function can be marking dependent as well. They are specified using CSPL (C based SRN Language) which is an extension of the C programing language with additional constructs for describing the SRN models. 
SRN specifications are automatically converted into a Markov reward model which is then solved to
  43. 43. compute a variety of transient, steady-state, cumulative, and sensitivity measures. For SRNs with absorbing markings, mean time to absorption and expected accumulated reward until absorption can be computed. The interface increases the power of SPNP (Stochastic Petri Net Package) [15] by providing a means of rapidly developing stochastic reward nets (SRNs), the model type used for input. Input to SPNP is specified using CSPL (C based SPN Language), but the interface removes this burden from the user by providing a graphical representation of the model. The first interface was implemented with Tcl/Tk. Java was then used to develop the new version, which improves the look and feel of the interface.
5.1.3 CentOS (OS selection)
The CentOS Linux distribution is a stable, predictable, manageable, and reproducible platform derived from the sources of Red Hat Enterprise Linux (RHEL). The process delivered has a clear governance model, increased transparency, and access. Since March 2004, CentOS Linux has been a community-supported distribution derived from sources freely provided to the public by Red Hat. As such, CentOS Linux aims to be functionally compatible with RHEL. CentOS changes packages to remove upstream vendor branding and artwork. CentOS Linux is no-cost and free to redistribute. CentOS Linux is developed by a small but growing team of core developers. In turn the core developers are supported by an active user community including system administrators, network administrators, managers, core Linux contributors, and Linux enthusiasts from around the world. We adopted this OS because it is highly compatible and stable, and because KVM is very easy to install and configure on it. Moreover, configuring NFS is easy for beginners and ports can be resolved properly. The forums of this OS had solutions to all the problems we faced on other OSs such as Ubuntu and Fedora. Moreover, it is open source and its code is available online.
5.2 Programming Language Used (Language Selection): C is a general-purpose programming language initially developed by Dennis Ritchie. C is an imperative (procedural) language. It was designed to be compiled using a relatively straightforward compiler, to provide low-level access to memory, to provide language constructs that map efficiently to machine instructions, and to require minimal run-time support. C was
  44. 44. therefore useful for many applications that had formerly been coded in assembly language, such as in system programming. Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant and portably written C program can be compiled for a very wide variety of computer platforms and operating systems with few changes to its source code. The language has become available on a very wide range of platforms, from embedded microcontrollers to supercomputers.
Table 5.1 Methods used in code
Method used in code: Description
void meminfo(void): Used to check the system free memory status.
FILE_TO_BUF(meminfo.file, memif_id): Used to store intermediate results of the memory status.
strtoul( ): Used to convert a string to an unsigned long integer.
time( ): Used to get the current system time. This function returns a time_t value.
memcpy( ): Used to convert a time_t variable to a tm_d struct variable.
localtime( ): Used to fetch the system local time.
sizeof: Used to get the size of an object.
system( ): Used to invoke a system command.
fprintf( ): Used to write to a file.
fopen( ): Used to create a file.
5.3 Installing and configuring KVM on CentOS
  45. 45. 5.3.1 Check Hardware Virtualization support
KVM requires hardware virtualization support such as Intel VT or AMD's AMD-V, which are instruction set extensions for hardware-assisted virtualization. Check whether hardware virtualization support is available on the CentOS host machine:
$ egrep -i 'vmx|svm' --color=always /proc/cpuinfo
If the CPU flags contain "vmx" or "svm", hardware virtualization support is available.
5.3.2 Configure FQDN for local host
Configure an FQDN (Fully Qualified Domain Name) for the local host. Otherwise, you may get warnings while launching the libvirtd daemon, such as "getaddrinfo failed for 'myhost': Name or service not known". To configure the FQDN, edit the following configuration file:
$ sudo -e /etc/sysconfig/network
HOSTNAME=xxx.yyy
5.3.2.1 Disable SELinux
Before installing KVM, be aware that there are several SELinux Booleans that can affect the behavior of KVM and libvirt. Here we set SELinux to "permissive" mode for demonstration purposes. If you do not wish to change SELinux mode. To set SELinux to permissive mode on CentOS:
$ sudo -e /etc/selinux/config
SELINUX=permissive
5.3.2.2 Reboot the machine for the change to take effect.
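The hardware virtualization check of Section 5.3.1 can also be done from C, the project's implementation language. The sketch below is only an illustration: the function names `has_virt_flag` and `cpu_supports_hw_virt` are hypothetical, it assumes the x86 `/proc/cpuinfo` layout in which CPU features appear on lines beginning with `flags`, and it matches whole flag names rather than substrings as `egrep` does.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a whitespace-separated flags line contains the
 * exact token "vmx" (Intel VT) or "svm" (AMD-V), otherwise 0. */
int has_virt_flag(const char *flags)
{
    const char *p = flags;

    while (*p != '\0') {
        size_t len;

        p += strspn(p, " \t\n:");       /* skip separators */
        len = strcspn(p, " \t\n:");     /* length of next token */
        if (len == 3 &&
            (strncmp(p, "vmx", 3) == 0 || strncmp(p, "svm", 3) == 0))
            return 1;
        p += len;
    }
    return 0;
}

/* Scans /proc/cpuinfo; returns 1 if supported, 0 if not,
 * -1 if the file cannot be opened. */
int cpu_supports_hw_virt(void)
{
    char line[4096];
    int found = 0;
    FILE *fp = fopen("/proc/cpuinfo", "r");

    if (fp == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "flags", 5) == 0)
            found = has_virt_flag(line);
    }
    fclose(fp);
    return found;
}
```

A setup tool could call `cpu_supports_hw_virt()` before attempting to install KVM and stop early when the result is 0.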
  46. 46. 5.4 Install KVM, QEMU and user-space tools
To install KVM, QEMU and the user-space tools, use the following steps:
Step1: Install KVM and virtinst (a tool to create VMs) as follows:
$ sudo yum install kvm libvirt python-virtinst qemu-kvm
Step2: Start the libvirtd daemon, and set it to auto-start:
$ sudo service libvirtd start
$ sudo chkconfig libvirtd on
Step3: Check that KVM has been installed successfully. You should see no error, as follows.
$ sudo virsh -c qemu:///system list
Id Name State
----------------------------------------------------
Step4: Configure Linux Bridge for VM Networking
Installing KVM alone does not allow VMs to communicate with each other or to access external networks. You need to configure VM networking separately. Here, we set up "bridged networking" via Linux Bridge.
• Install a package needed to create and manage bridge devices:
$ sudo yum install bridge-utils
• Disable the Network Manager service if it is enabled, and switch to the default net manager as follows.
$ sudo service NetworkManager stop
$ sudo chkconfig NetworkManager off
$ sudo chkconfig network on
$ sudo service network start
  47. 47. To configure a new bridge, you have to pick an active network interface (e.g., eth0) and enslave it to the bridge. Depending on whether the network interface is assigned an IP address via DHCP or statically, there are two different ways to configure a new bridge.
• To configure bridge br0 via DHCP:
$ sudo -e /etc/sysconfig/network-scripts/ifcfg-eth0
• Modify the file ifcfg-eth0 as shown below:
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BRIDGE=br0
$ sudo -e /etc/sysconfig/network-scripts/ifcfg-br0
• Modify the file ifcfg-br0 as shown below:
DEVICE=br0
NM_CONTROLLED=yes
ONBOOT=yes
TYPE=Bridge
BOOTPROTO=dhcp
• You should now see the br0 bridge interface with a proper IP address, as follows.
$ ifconfig
Step5: Install VirtManager
The final step is to install a desktop UI called VirtManager for managing virtual machines (VMs) through libvirt. To install VirtManager:
  48. 48. $ sudo yum install virt-manager libvirt qemu-system-x86 openssh-askpass libcanberra-devel
5.5 Setup a minimal CentOS 6 NFS configuration
To set up an NFS (Network File System) configuration for two systems, we treat one system as the server and the other as the client. The following steps show the NFS server configuration:
5.5.1 SERVER CONFIGURATION:
Step1: Check for yum updates and install the NFS utilities
• Server to set up: 172.16.30.48/255.255.255.254
• Before setting up the server, update the system packages:
"yum update"
• Once the update is completed, reboot the system:
"shutdown -r now"
• Install the nfs-utils, rpcbind, and system configuration packages:
"yum install nfs-utils rpcbind system-config-firewall-tui"
• Modify the selinux file to disable SELinux: open it with "vi /etc/sysconfig/selinux" and set "SELINUX=disabled", then run
"setenforce 0"
Step2: Make a folder to be shared
In NFS sharing we have to create a folder that is shared between the server and the client. That folder holds all the data that is transferred between server and client.
  49. 49. Here we create and share a folder called images; the command below gives the path at which the folder is located.
$ mkdir /var/lib/libvirt/images
Step3: Enable the nfs, nfslock, and rpcbind services at boot:
$ chkconfig nfs on
$ chkconfig nfslock on
$ chkconfig rpcbind on
Step4: Configure the firewall settings:
$ system-config-firewall-tui
Step5: Modify the exports file to add the shared storage, which makes live migration from the source to the destination system possible:
/var/lib/libvirt/images 172.16.30.48/255.255.255.254(rw,sync,no_root_squash)
Step6: Modify the hosts.allow file by adding the following lines:
$ sudo -e /etc/hosts.allow
mountd: 172.16.30.46/255.255.255.254
Step7: Modify the hosts.deny file by adding the following lines:
portmap:ALL
lockd:ALL
mountd:ALL
rquotad:ALL
statd:ALL
Step8: Restart the following services on the server machine once you have completed all the above steps:
  50. 50. $ sudo service rpcbind restart
$ sudo service nfs restart
$ sudo service nfslock restart
Once you finish the server configuration, proceed immediately to the client configuration. To configure the NFS client, follow these steps:
5.5.2 CLIENT CONFIGURATION:
Step1: Client to set up: 172.16.30.46/255.255.255.254
• Before setting up the client, update the system packages:
$ sudo yum update
• Once the update is completed, reboot the system:
$ sudo shutdown -r now
• Install the nfs-utils, rpcbind, and system configuration packages:
$ sudo yum install nfs-utils rpcbind system-config-firewall-tui
• Modify the selinux file to disable SELinux:
$ sudo gedit /etc/sysconfig/selinux
and set SELINUX=disabled
$ setenforce 0
Step2: Make a folder to be the mount point.
In NFS sharing we have to create a sharable folder that is shared between the server and the client. This folder holds all the data that is transferred between server and client. Here we create and share a folder called images; the command below gives the path at which the folder is located.
$ sudo mkdir /var/lib/libvirt/images
Step3: Start the following services
  51. 51. $ sudo chkconfig nfs on
$ sudo chkconfig nfslock on
$ sudo chkconfig rpcbind on
Step4: Restart the following services on the client machine once you have completed all the above steps:
$ sudo service rpcbind restart
$ sudo service nfs restart
$ sudo service nfslock restart
Once both the server and the client NFS configuration are finished, we have to mount the folder that was created during the NFS server and client configuration. To mount the folder, use the following steps:
Step1: Append the following line to the fstab file:
$ sudo gedit /etc/fstab
<Shared directory> <mount point> <type> <auto> 0 0
172.16.30.48:/var/lib/libvirt/images /var/lib/libvirt/images nfs auto 0 0
172.16.30.48: Server name
172.16.30.48:/var/lib/libvirt/images: Mount file
/var/lib/libvirt/images: Mount point on the client machine (172.16.30.46)
nfs: Type
Step2: Mount the shared NFS folder on the client machine:
$ sudo mount -t nfs 172.16.30.48:/var/lib/libvirt/images /var/lib/libvirt/images
5.6 ITL Algorithm.
  52. 52. The ITL algorithm is designed to optimize the rejuvenation time predefined by the user when the workload is variable. The rejuvenation time is decreased when the workload increases and increased when the workload decreases. The working of the algorithm is described in the steps below.
Step 1: Begin
Step 2: Set variable FTR (Fixed Time Rejuvenation)
Step 3: Fetch the system free memory and assign it to variable FM (Free Memory)
Step 4: Set the threshold free memory value to variable FFM (Fixed Free Memory)
Step 5: If (FTR == SystemCurrentTime) then Reboot
Else if (FM < FFM) then reset FTR = FTR - (1 * (FFM - FM))
Step 6: Go to Step 5
Step 7: End
CHAPTER 6 TESTING OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM
6.1 Testing
There are essentially three main domains and six modules in our project. In this section all six modules are tested with different OSs, VMs, or VMMs. The purpose of this section
  53. 53. is to ensure that the resulting system meets the system requirements and there is a seamless transition of data flowing through each of the systems as well as in between one another. These testing provide a sort of "living document". Clients and other developers looking to learn how to use the module can look at these tests to determine how to use the module to fit their needs and gain a basic understanding of the modules. 6.1.1 Testing Strategy The following points are indicative of the testing strategy for unit testing followed in the project. • Review the design specifications and source code for modules to be tested. • Perform a peer review on the module Test Plan. • Create any test "stubs" required to provide input to or receive output from the code module. • When it's time to test particular modules, compile the code in the test environment to check for any missing files required for test plan execution. • Execute the tests. Compare information/values received out of the tested software to those expected, as documented in the Test Plan. • Retest code when an updated version is available. Record results on the module Test Report Form. • When the module is considered to have passed all tests, archive the final Report form(s).
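To make the stub-based strategy above concrete, the decision step of the ITL algorithm (Section 5.6) can be sketched as a pure function whose clock and free-memory inputs are supplied by the test. This is only a minimal sketch under stated assumptions, not the project's implementation: the names `ItlState`, `itl_step`, and `seconds_per_mb` are hypothetical, the units (seconds, MB) are assumed, the comparison uses `>=` rather than `==`, and the adjustment subtracts the deficit FFM - FM, in line with the stated intent that the rejuvenation time decreases as the workload increases.

```python
from dataclasses import dataclass


@dataclass
class ItlState:
    ftr: float  # Fixed Time Rejuvenation: scheduled rejuvenation time (seconds, assumed unit)
    ffm: float  # Fixed Free Memory: threshold free memory (MB, assumed unit)


def itl_step(state: ItlState, now: float, free_memory: float,
             seconds_per_mb: float = 1.0) -> str:
    """One pass of Step 5: return "reboot" or "wait", possibly pulling FTR earlier."""
    # Assumption: ">=" instead of "==" so that a missed clock tick still triggers.
    if now >= state.ftr:
        return "reboot"
    if free_memory < state.ffm:
        deficit = state.ffm - free_memory
        # Assumption: never move the rejuvenation time into the past.
        state.ftr = max(now, state.ftr - seconds_per_mb * deficit)
    return "wait"


# Stubbed values stand in for the real system clock and free-memory probe.
state = ItlState(ftr=1000.0, ffm=500.0)
print(itl_step(state, now=100.0, free_memory=600.0), state.ftr)  # memory above threshold
print(itl_step(state, now=100.0, free_memory=400.0), state.ftr)  # 100 MB below threshold
print(itl_step(state, now=900.0, free_memory=400.0), state.ftr)  # adjusted time reached
```

Keeping the step free of direct system calls means the time-based, workload-based, and combined test cases can all be driven with stubbed values instead of waiting for a real reboot.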
Table 6.1: Cold reboot based on Time
Test Case ID: T-1
Purpose: The system should rejuvenate at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should not be saved
Execution History:
Date | Result | Version | Remark
17-02-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
15-03-2014 | Pass | 1.0 | Testing passed in CentOS operating system
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.2: Cold reboot based on Workload
Test Case ID: T-2
Purpose: System should rejuvenate at given memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should not be saved
Execution History:
Date | Result | Version | Remark
17-02-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
15-03-2014 | Pass | 1.0 | Testing passed in CentOS operating system
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.3: Cold reboot based on both Time and Workload
Test Case ID: T-3
Purpose: The system optimizes the rejuvenation time based on the workload and then rejuvenates at the optimized time
Pre-Conditions: System time and free memory
Inputs: Time to Rejuvenate (TTR) and Memory threshold value
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should not be saved
Execution History:
Date | Result | Version | Remark
17-02-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
15-03-2014 | Pass | 1.0 | Testing passed in CentOS operating system
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.4: Warm reboot based on Time
Test Case ID: T-4
Purpose: The system should rejuvenate at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should be saved
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in CentOS operating system because the OS is not compatible
28-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
27-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
04-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04

Table 6.5: Warm reboot based on Workload
Test Case ID: T-5
Purpose: The system should rejuvenate at given Memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should be saved
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in CentOS operating system because the OS is not compatible
28-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
27-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
04-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04

Table 6.6: Warm reboot based on both Time and Workload
Test Case ID: T-6
Purpose: The system optimizes the rejuvenation time based on the workload and then rejuvenates at the optimized time
Pre-Conditions: System time and free memory
Inputs: Time to Rejuvenate and Memory threshold value
Expected Output: Reboot
Post-Conditions: After rebooting the system the current state should be saved
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in CentOS operating system because the OS is not compatible
28-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
27-03-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
04-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04
09-04-2014 | Pass | 1.0.1 | Testing passed in Ubuntu 12.04

Table 6.7: VM cold reboot based on Time
Test Case ID: T-7
Purpose: The Virtual Machine (VM) should rejuvenate at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should not be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.8: VM cold reboot based on Workload
Test Case ID: T-8
Purpose: The Virtual Machine (VM) should rejuvenate at given Memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should not be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.9: VM cold reboot based on both Time and Workload
Test Case ID: T-9
Purpose: The system optimizes the rejuvenation time based on the workload and then the Virtual Machine (VM) rejuvenates at the optimized time
Pre-Conditions: System time and free memory
Inputs: Time to Rejuvenate (TTR) and Memory threshold value
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should not be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
Table 6.10: VM warm reboot based on Time
Test Case ID: T-10
Purpose: The Virtual Machine (VM) should rejuvenate at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
Table 6.11: VM warm reboot based on Workload
Test Case ID: T-11
Purpose: The Virtual Machine (VM) should rejuvenate at given Memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.12: VM warm reboot based on both Time and Workload
Test Case ID: T-12
Purpose: The system optimizes the rejuvenation time based on the workload and then the Virtual Machine (VM) rejuvenates at the optimized time
Pre-Conditions: System time and free memory
Inputs: Time to Rejuvenate (TTR) and Memory threshold value
Expected Output: Virtual Machine (VM) Reboot
Post-Conditions: After rebooting the Virtual Machine (VM) the current state should be saved
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.13: VMM reboot based on Time
Test Case ID: T-13
Purpose: The Virtual Machine Monitor (VMM) should rejuvenate at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Virtual Machine Monitor (VMM) Reboot
Post-Conditions: After rebooting the Virtual Machine Monitor (VMM), the connection between the VMM and its VMs should be lost
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.14: VMM reboot based on Workload
Test Case ID: T-14
Purpose: The Virtual Machine Monitor (VMM) should rejuvenate at given Memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Virtual Machine Monitor (VMM) Reboot
Post-Conditions: After rebooting the Virtual Machine Monitor (VMM), the connection between the VMM and its VMs should be lost
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.15: VMM reboot based on both Time and Workload
Test Case ID: T-15
Purpose: The system optimizes the rejuvenation time based on the workload and then the Virtual Machine Monitor (VMM) rejuvenates at the optimized time
Pre-Conditions: System time and free memory
Inputs: Time to Rejuvenate (TTR) and Memory threshold value
Expected Output: Virtual Machine Monitor (VMM) Reboot
Post-Conditions: After rebooting the Virtual Machine Monitor (VMM), the connection between the VMM and its VMs should be lost
Execution History:
Date | Result | Version | Remark
27-03-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
04-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0 | Testing passed in Ubuntu 12.04 operating system
09-04-2014 | Pass | 1.0 | Testing passed in CentOS operating system

Table 6.16: VM migration based on Time
Test Case ID: T-16
Purpose: The Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2) at given rejuvenation time (TTR)
Pre-Conditions: System time
Inputs: Time to Rejuvenate (TTR)
Expected Output: Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2)
Post-Conditions: After the migration, the Virtual Machine (VM) from Physical Machine (PM1) should reboot
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
05-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
10-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
30-03-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
04-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system

Table 6.17: VM migration based on Workload
Test Case ID: T-17
Purpose: The Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2) at given Memory threshold value
Pre-Conditions: System free memory
Inputs: Memory threshold value
Expected Output: Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2)
Post-Conditions: After the migration, the Virtual Machine (VM) from Physical Machine (PM1) should reboot
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
05-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
10-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
30-03-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
04-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system

Table 6.18: VM migration based on both Time and Workload
Test Case ID: T-18
Purpose: The system optimizes the rejuvenation time based on the workload and then the Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2) at the optimized time
Pre-Conditions: System time and free memory
Inputs: System Time and Memory threshold value
Expected Output: Virtual Machine (VM) should migrate from one Physical Machine (PM1) to another Physical Machine (PM2)
Post-Conditions: After the migration, the Virtual Machine (VM) from Physical Machine (PM1) should reboot
Execution History:
Date | Result | Version | Remark
27-02-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
05-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
10-03-2014 | Failed | 1.0 | Testing failed in Ubuntu 12.04 operating system because the OS is not compatible
30-03-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
04-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system
09-04-2014 | Pass | 1.0.1 | Testing passed in CentOS operating system

CHAPTER 7 RESULTS OF SOFTWARE REJUVENATION IN COMPLEX SYSTEM

7.1 Results

The ITL algorithm implemented in all modules is analyzed using SPNP, which yields the MTTR and MTTF values. From these values we calculate the availability and the downtime factor for the particular algorithm applied to the system. The availability value is found for all modules, and based on these values we can analyze how much of the time the system will be available for use without any failure.

In SPNP we develop a Petri net diagram for each algorithm and code it in CSPL (a C-based language for stochastic Petri nets) to define the movement of tokens from one place to another through timed or immediate transitions. Tokens are deposited in places and are moved from one place to another by timed or immediate transitions. To check that a Petri net diagram has a proper flow, SPNP provides an animation option, for which we write code that guides the token transitions, that is, when and where a token moves from one place to another. When the code is executed, the animated Petri net diagram shows how the transitions take place; if any error occurs during this animation, the algorithm or its Petri net diagram is faulty.

Table 7.1: Symbol conventions
Symbol Conventions: Place, Timed transition, Immediate transition, Arc

Figure 7.1: Memory model
Figure 7.2: Clock model

Table 7.2: Clock and Memory SRN model description
Places & Transitions | Description
Pclock | Place where the clock is initialized or reset.
Ptpolicy | Place which indicates, when a token is present in it, that the rejuvenation time is reached.
Ptrigger | Place which is the point for rejuvenation.
Pmem | Place where RAM utilization is compared with the predefined threshold value.
Pmemv | Place which indicates that RAM utilization has reached its threshold point.
Tclock | Timed transition; it is enabled when the given time is reached.
Tpolicy | Timed transition; it is enabled when a token is present in Ptpolicy.
Ttrigger | Immediate transition; it is enabled when a token is present in Ptrigger and the given time is reached.
Tmem | Timed transition; it is enabled when the given time is reached.
Tmemv | Immediate transition; it is enabled when a token is present in Ptpolicy and the given time is reached.

Figure 7.3: OS Cold SRN model
Table 7.3: OS Cold SRN model description
Places & Transitions | Description
Working | Place which indicates the system is in its normal working state.
Aging | Place which indicates the system is suffering from the aging problem.
Rej | Place which indicates the system is under the rejuvenation process.
Trej | Immediate transition; it is enabled when the given time is reached.
Twork | Timed transition; it is enabled when a token is present in Rej.
Taging | Timed transition; it is enabled when a token is present in Working and Pmemv.
tarej | Immediate transition; it is enabled when a token is present in Aging.

Figure 7.4: OS Warm SRN model
Table 7.4: OS Warm SRN model description
Places & Transitions | Description
Working | Place which indicates the system is in its normal working state.
Aging | Place which indicates the system is suffering from the aging problem.
Rej | Place which indicates the system is under the rejuvenation process.
Optimize | Place where the time is optimized based on workload.
Save | Place where the image of the kernel is created and saved.
Resume | Place where the saved kernel image is retrieved.
Taging | Timed transition; it is enabled when a token is present in Working and Pmemv.
Topti | Immediate transition; it is enabled when a token is present in Aging.
Toptiwork | Timed transition; it is enabled when a token is present in Optimize.
Tsave | Timed transition; it is enabled when a token is present in Working and Ptpolicy.
Trej | Immediate transition; it is enabled when a token is present in Save.
Tresume | Timed transition; it is enabled when a token is present in Rej.
Treworking | Immediate transition; it is enabled when a token is present in Resume.

Figure 7.5: VM Cold SRN model
Table 7.5: VM Cold SRN model description
Places & Transitions | Description
Working | Place which indicates the system is in its normal working state.
Aging | Place which indicates the system is suffering from the aging problem.
Rejuvenation | Place which indicates the system is under the rejuvenation process.
Optimize | Place where the time is optimized based on workload.
Taging | Timed transition; it is enabled when a token is present in Working and Pmemv.
Trej | Timed transition; it is enabled when a token is present in Ptpolicy and the given time is reached.
T_aging | Timed transition; it is enabled when a token is present in Aging.
T_opt | Timed transition; it is enabled when a token is present in Optimize.
T_rej | Timed transition; it is enabled when a token is present in Rejuvenation.
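The way a token moves through such a model can be illustrated with a very small token-game sketch. It borrows the place and transition names of the VM Cold model, but the arc structure (Working → Aging → Optimize → Rejuvenation → Working) is an assumption read off the table descriptions. Guards such as Pmemv, the firing rates, and the distinction between timed and immediate transitions are deliberately left out, and the helper names `enabled` and `fire` are hypothetical. It is not SPNP or CSPL code.

```python
# Places and transitions are named after Table 7.5; the arcs are assumed.
marking = {"Working": 1, "Aging": 0, "Optimize": 0, "Rejuvenation": 0}

# transition name -> (input place, output place)
transitions = {
    "Taging": ("Working", "Aging"),
    "T_aging": ("Aging", "Optimize"),
    "T_opt": ("Optimize", "Rejuvenation"),
    "T_rej": ("Rejuvenation", "Working"),
}


def enabled(marking, name):
    """A transition is enabled when its input place holds at least one token."""
    source, _ = transitions[name]
    return marking[source] > 0


def fire(marking, name):
    """Return a new marking with one token moved from input to output place."""
    if not enabled(marking, name):
        raise ValueError(f"{name} is not enabled")
    source, target = transitions[name]
    new_marking = dict(marking)
    new_marking[source] -= 1
    new_marking[target] += 1
    return new_marking


# Play one full cycle: the token ends up back in Working.
for name in ["Taging", "T_aging", "T_opt", "T_rej"]:
    marking = fire(marking, name)
    print(name, marking)
```

Trying to fire a transition whose input place is empty raises an error, which is roughly the kind of flow problem the SPNP animation is used to reveal.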
Figure 7.6: VM Warm SRN model
Table 7.6: VM Warm SRN model description
Places & Transitions | Description
Working | Place which indicates the system is in its normal working state.
Aging | Place which indicates the system is suffering from the aging problem.
Rejuvenation | Place which indicates the system is under the rejuvenation process.
Optimize | Place where the time is optimized based on workload.
Save | Place where the image of the VM is created and saved.
Resume | Place where the saved VM image is retrieved.
Memory | Place which indicates the variation in memory.
Taging | Timed transition; it is enabled when a token is present in Aging.
Tmem | Timed transition; it is enabled when a token is present in Memory.
Tsave | Timed transition; it is enabled when a token is present in Ptpolicy and the given time is reached.
T_save | Timed transition; it is enabled when a token is present in Memory and pmemv==2.
Trej | Timed transition; it is enabled when a token is present in Save.
Tres | Timed transition; it is enabled when a token is present in Rejuvenation.
Twmem | Timed transition; it is enabled when a token is present in Pmemv and the given time is reached.
Trevert | Timed transition; it is enabled when a token is present in Resume.
Topt | Timed transition; it is enabled when a token is present in Optimize.
Figure 7.7: VM Migration SRN model
Table 7.7: VM Migration SRN model description
Places & Transitions | Description
Working1 | Place which indicates the system is in its normal working state.
Working2 | Place which indicates the system is in its normal working state.
Aging | Place which indicates the system is suffering from the aging problem.
Rej | Place which indicates the system is under the rejuvenation process.
Optimize | Place where the time is optimized based on workload.
migrate | Place which indicates the VM is being migrated.
Tmaging | Timed transition; it is enabled when a token is present in Aging.
Tmigrate | Timed transition; it is enabled when a token is present in Working1.
Trej1 | Timed transition; it is enabled when working1==1 and the given time is reached.
Trej2 | Timed transition; it is enabled when a token is present in Working1 and Pclock and the given time is reached.
Trejre1 | Timed transition; it is enabled when a token is present in Working1 and Rej.
Trejre2 | Timed transition; it is enabled when a token is present in Rej and working2==0.
Trevert | Timed transition; it is enabled when working2==2.
Tnormal | Timed transition; it is enabled when a token is present in Optimize.
Taging | Timed transition; it is enabled when a token is present in Pmemv and the given time is reached.
Tafail | Timed transition; it is enabled when a token is present in Aging.
Thyper | Timed transition; it is enabled when a token is present in migrate.

7.2 Discussion

Developing the above Petri net diagrams in SPNP and coding the token transitions in CSPL lets us analyze the availability value of each module. Given the time assigned to each transition and the time each transition took in the real-time implementation to move from one state to another (Table 7.9), the MTTR and MTTF values can be calculated. From these values the availability of a module is calculated with the formula

Availability = MTTF ÷ (MTTF + MTTR).

The availability of all modules is analyzed in this project, and the respective availability values are calculated on average over 30 days (Table 7.8). From the 30-day availability value we can calculate the availability of the system for any number of years. For a token to move from one place to another, it must pass through a transition whose guard function conditions are all satisfied. Table 7.9 has three columns: the transition, its value (the time for that transition to take place), and its mean value, which expresses the same quantity as a rate in 1/hour. The mean value is the standard format of time used in the analysis with SPNP.
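As a small worked illustration of the availability calculation, the sketch below uses the standard steady-state definition, availability = MTTF / (MTTF + MTTR), and converts an availability value into expected downtime over a 30-day window. The function names and the MTTF and MTTR inputs are hypothetical and chosen only for illustration; they are not measurements from this project.

```python
HOURS_PER_30_DAYS = 30 * 24


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours(avail: float, period_hours: float = HOURS_PER_30_DAYS) -> float:
    """Expected downtime over a period for a given steady-state availability."""
    return (1.0 - avail) * period_hours


# Hypothetical inputs: one failure per week on average, 3 minutes to recover.
a = availability(mttf_hours=168.0, mttr_hours=0.05)
print(f"availability = {a:.6f}")
print(f"expected downtime over 30 days = {downtime_hours(a):.3f} h")
```

Because availability is a fraction of time, the same value scales to any horizon by changing `period_hours`, which is how a 30-day figure can be extended to a period of years.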
We have considered many key parameters, such as aging rate, rejuvenation rate, failure rate, suspend rate, resume rate, and restart rate, and assumed safety thresholds for each of the modules as given in Table 7.9. Based on these values we determine the availability of the system using the time and variable workload policy. In all modules we set the rejuvenation time for a safe level at a certain interval and set the threshold memory value; if aging is detected, the rejuvenation time is optimized by the ITL algorithm and rejuvenation occurs at that optimized time.

Table 7.8: Availability values of rejuvenation methods
Rejuvenation Method | Days | Steady State Availability | Downtime
Cold OS Rejuvenation | 30 | 0.998824 | 0.001176
Warm OS Rejuvenation | 30 | 0.998983 | 0.001017
Cold VM Rejuvenation | 30 | 0.998633 | 0.001367
Cold VMM Rejuvenation | 30 | 0.998846 | 0.001154
Warm VM Rejuvenation | 30 | 0.998799 | 0.001201
VM Migration | 30 | 0.999219 | 0.000781

Table 7.9: Cold OS rejuvenation transition rates
Transition | Value | Mean value (1/hour)
OS aging rate | 1 week | 0.005952381
OS rejuvenation rate | 1 month | 0.001388889
OS failure rate | 1 week | 0.005952381
OS suspend rate | 1 month | 0.001388889
OS restart rate | 30 sec | 120
VM resume rate | 15 sec | 240
VM rejuvenation rate | 1 month | 0.001388889
VM aging rate | 1 week | 0.005952381
VM failure rate | 1 week | 0.005952381
VM failure recovery rate | 1 min | 60
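The mean values in Table 7.9 are consistent with taking the reciprocal of each transition time expressed in hours. The sketch below reproduces a few of them; it assumes a 30-day month (which matches the tabulated 0.001388889), and the helper name `rate_per_hour` and the dictionary layout are hypothetical.

```python
# Length of each unit in hours; a month is assumed to be 30 days.
HOURS = {"sec": 1 / 3600, "min": 1 / 60, "week": 7 * 24, "month": 30 * 24}


def rate_per_hour(amount: float, unit: str) -> float:
    """Rate (1/hour) of a transition that takes `amount` units of time on average."""
    return 1.0 / (amount * HOURS[unit])


# A subset of the rows of Table 7.9.
table_7_9 = {
    "OS aging rate": (1, "week"),
    "OS rejuvenation rate": (1, "month"),
    "OS restart rate": (30, "sec"),
    "VM resume rate": (15, "sec"),
    "VM failure recovery rate": (1, "min"),
}

for name, (amount, unit) in table_7_9.items():
    print(f"{name}: {rate_per_hour(amount, unit):.9f} per hour")
```

The downtime column of Table 7.8 is likewise consistent with 1 minus the steady-state availability, for example 1 - 0.998824 = 0.001176 for cold OS rejuvenation.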
Graphs are plotted based on the transition rate and the availability value. Each graph depicts the availability value at a particular time, and all graphs are plotted for a thirty-day interval. All the graphs below have the availability value on the X-axis and time (1/hour) on the Y-axis. In general, in all modules, if the rate of rejuvenation is high, the system is rebooted repeatedly at short intervals, which leads to high downtime; hence the availability value is initially low in all the graphs plotted.

Graph 1: OS Cold availability

In the cold OS reboot process the availability factor is low because the system takes more time to reboot, hence the downtime is high. In this module the system is restarted normally at the rejuvenation time. The downtime of this process depends on the processor speed of the system; normally it takes an average of one to three minutes to get back to the normal working state. The availability value of this module is 0.998824 for thirty days.
Graph 2: OS Warm reboot

In the OS warm reboot module the availability value is high compared to the cold reboot module, because here the complete kernel is saved as an image and stored on the hard disk; after the reboot, the GRUB loader extracts this image and the kernel image is loaded. Hence there is no loss of data and no interruption of running applications, even after the reboot. The availability value of this module is 0.998983 for 30 days.

Graph 3: OS Warm reboot and Cold reboot comparison
The comparison graph gives the variation of the availability value for cold and warm reboot of the OS. Initially both modules have the same availability value, which is low because of the high rejuvenation rate. The graph clearly shows that warm reboot of the OS has higher availability than cold reboot of the OS.

Graph 4: VM Cold reboot

In VM cold reboot the availability value again decreases, because rebooting the system takes much time; it therefore provides low availability to the user of the system, and the chances of losing data and of request failures are high. This module has an availability value of 0.998633 for thirty days.
Graph 5: VMM reboot

This module has the same drawback as the other cold reboot processes, and hence it has an availability value of 0.998846 for thirty days.

Graph 6: Comparison graph for VMM and VM reboot

Comparing the cold reboot modules of the VM and the VMM, the VMM cold reboot module has the better availability value, although both modules give almost the same availability to the system.

Graph 7: Comparison graph for OS and VM cold reboot
The graph compares the cold reboot modules of the OS and the VM. According to the thirty-day values in Table 7.8, the OS cold reboot module (0.998824) has the better availability value when compared to the VM cold reboot module (0.998633).

Graph 8: Graph for VM warm reboot

Similar to OS warm reboot, VM warm reboot has higher availability than the VM cold reboot module; the availability value of this module is 0.998799.

Graph 9: Graph for VM migration
The VM migration module has an availability value of 0.999219, which is the highest of all the modules in this project. In this module there is no reboot and no image is saved; instead, the complete virtual machine is migrated to another configured server. Hence there is no data loss, no request failure, and no error in running applications.

Graph 10: Graph for VM comparison

This graph gives the comparison of all VM modules. The VM migration module has the highest availability and the VM cold reboot module has the lowest availability.
Graph 11: Comparison graph of all modules

This is the main graph of our analysis. It compares the availability values of all modules with respect to rejuvenation time. Comparing the availability values of all modules clearly shows that the VM migration module has high availability and the OS cold reboot module has very low availability.

7.3 Snapshots:

7.3.1 Snap shots of OS Cold Reboot
Snap shots: 1 to 4

7.3.2 Snap shots of OS Warm Reboot
Snap shots: 5 to 8

7.3.3 Snap shots of VM Cold Reboot
Snap shots: 9 to 13

7.3.4 Snap shots of VM Warm Reboot
Snap shots: 14 to 18

7.3.5 Snap shots of VM Migration
Snap shots: 19 to 24

7.3.6 Snap shots of VMM Reboot
Snap shots: 25 to 28
Snap shot: 29

CHAPTER 8 CONCLUSION

8.1 Conclusion

The Intelligent Time and Load (ITL) balancing policy accepts a rejuvenation time from the user and optimizes it whenever the workload is variable; otherwise the system is rejuvenated at its predefined rejuvenation point. The ITL policy helps avoid software failure and helps achieve high availability of a complex system. The ITL policy is used in experiments on six modules, namely OS Cold Reboot, OS Warm Reboot, VM Cold Reboot, VM Warm Reboot, VM Migration, and VMM Reboot. Over the course of the experiments, VM Migration achieves the best steady-state availability, as long as VM live migration is fast enough and the other server has the capacity to receive the migrated VM. In existing policies, rejuvenation is proposed based on various parameters such as hardware failures, memory leaks, CPU utilization, request failures, and so on. The ITL policy considers physical memory as the primary factor for rejuvenation; hence it is a better way to avoid performance degradation and to increase the availability of the system.

8.2 Limitations

Some important limitations are as follows:
• The project is restricted to the Linux platform.
• OS warm reboot is compatible with Ubuntu but not with other operating systems.
• VM migration is compatible with CentOS but not with other operating systems.
• Complete (100%) availability is not provided in all modules.
8.3 Future Enhancement

The basic idea is to accomplish request processing on the same node on which the rejuvenation is taking place. The combination of reboot and failover enables a system to continue processing requests during the reboot. Before rebooting an OS or an application running on one node of the clustered environment, requests to that node are redirected to the other nodes of the system. This technique improves the availability of systems.

Figure 8.1: Sharing of requests

Figure 8.1 shows that the simultaneous execution of request processing and rejuvenation on the same node requires an alternative request processing environment. The alternative environment takes over the processing of all requests from the original environment, and then the rejuvenation of the original environment is started. In this way, 100% availability can be achieved.
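The redirection step described above can be sketched as a dispatcher that stops sending requests to a node while that node is being rejuvenated. This is only an illustrative sketch of the cluster failover idea, not the proposed same-node design: the class `Cluster`, its method names, and the round-robin policy are all hypothetical choices made for the example.

```python
class Cluster:
    """Round-robin dispatcher that skips nodes currently being rejuvenated."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.rejuvenating = set()
        self._next = 0

    def route(self, request):
        """Return the node chosen to serve `request`."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node not in self.rejuvenating:
                return node
        raise RuntimeError("no node available")

    def start_rejuvenation(self, node):
        """Take `node` out of rotation, unless it is the last serving node."""
        if len(self.rejuvenating | {node}) == len(self.nodes):
            raise RuntimeError("refusing to rejuvenate the last serving node")
        self.rejuvenating.add(node)

    def finish_rejuvenation(self, node):
        """Put `node` back into rotation after its reboot completes."""
        self.rejuvenating.discard(node)


cluster = Cluster(["node-a", "node-b"])
cluster.start_rejuvenation("node-a")
print([cluster.route(i) for i in range(3)])  # requests bypass node-a
cluster.finish_rejuvenation("node-a")
print([cluster.route(i) for i in range(2)])  # node-a serves again
```

The guard in `start_rejuvenation` reflects the condition that some environment must remain available to take over requests before the original one is rebooted.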
