Characterizing Fault Tolerance of Genetic Algorithms in Desktop Grid Systems
This paper presents a study of the fault-tolerant nature of Genetic Algorithms (GAs) on a real-world Desktop Grid System, without implementing any kind of fault-tolerance mechanism. The aim is to extend previous work on fault-tolerance characterization in Genetic Programming to parallel GAs. The results show that, despite system degradation, GAs achieve results of similar quality to those of a failure-free system in three of the six scenarios under study. Additionally, we show that a small increase in the initial population size successfully provides resilience to system failures in five of the scenarios. These results suggest that Parallel GAs are inherently and naturally fault-tolerant.

Transcript

  • 1. Characterizing Fault Tolerance of Genetic Algorithms in Desktop Grid Systems. Daniel Lombraña González, Juan Luis Jiménez Laredo, Francisco Fernández de Vega, Juan Julián Merelo Guervós. April 8, 2010. D. Lombraña, JJ. Jiménez, F. Fernández, JJ. Merelo, Evocop 2010
  • 2. Outline: 1 Introduction, 2 Motivation, 3 Methodology, 4 Experiments and Results, 5 Conclusions
  • 3. Introduction
  • 4. Introduction: Parallel Genetic Algorithms (PGA). Sometimes Evolutionary Algorithms (EAs) require large execution times. One solution is to use parallel computing and distributed platforms.
  • 5. Introduction: Parallel algorithms can be run in
  • 6. Introduction: Parallel algorithms can be run in
  • 7. Introduction: Failures in distributed platforms. Distributed platforms are prone to errors. Failures are expected events rather than catastrophic exceptions.
  • 8. Introduction: Fault Tolerance. Fault tolerance is the ability of a system to behave in a well-defined manner once a failure occurs.
  • 9. Introduction: Fault Tolerance. Different techniques have been developed to cope with failures. Redundancy: S. Ghosh. Distributed systems: an algorithmic approach. Chapman & Hall/CRC, 2006. Checkpointing: E. Elnozahy, L. Alvisi, Y. Wang, and D. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys (CSUR), 34(3):375–408, 2002. Rejuvenation frameworks: A. T. Tai and K. S. Tso. A performability-oriented software rejuvenation framework for distributed applications. In DSN ’05, pages 570–579, Washington, DC, USA, 2005. IEEE Computer Society. Etc.
  • 10. Introduction: Fault Tolerance. Using a fault-tolerance technique requires modifying the application, and sometimes even the parallel algorithm. This modification can be a heavy burden for the developer.
  • 11. Motivation
  • 12. Motivation: Parallel EAs and Fault Tolerance. To the best of our knowledge, there has been little research on the fault-tolerance features of Parallel EAs (PEAs) in general and of PGA applications in particular.
  • 13. Motivation: Previous Works. We first studied the fault-tolerant nature of Parallel Genetic Programming (PGP) on real-world Desktop Grid systems, concluding that PGP is fault-tolerant by default. Daniel Lombraña González, Francisco Fernández de Vega, and Henri Casanova. Characterizing fault tolerance in genetic programming. Future Generation Computer Systems, 2010. DOI: 10.1016/j.future.2010.02.006. Daniel Lombraña González, Francisco Fernández de Vega, and Henri Casanova. Characterizing fault tolerance in genetic programming. In Workshop on Bio-Inspired Algorithms for Distributed Systems, pages 1–10. Barcelona, Spain, 2009. ISBN 978-1-60558-564-2.
  • 14. Motivation: Proposal. Based on this insight, this work builds on the previous ones and extends the study of fault tolerance in EAs to PGAs, using the same methodology.
  • 15. Methodology
  • 16. Methodology: Master-Worker
  • 17. Methodology: Desktop Grid platforms (DGs). DGs exhibit large numbers of failures. DG failure behavior has been studied in the literature. DGs are low-cost compared to clusters of comparable scale. PGA applications are loosely coupled and thus well suited to DGs.
  • 18. Methodology: Desktop Grid Platforms. DGs are very promising for PGA applications, and their high failure rate makes them a great test case for studying and characterizing the fault-tolerance abilities of PGAs.
  • 19. Methodology: Experiments. To characterize the fault-tolerant nature of PGAs we run two kinds of experiments: one in a failure-free environment, and one replaying and simulating failure traces from real-world DG platforms.
  • 20. Methodology: DG traces. We perform simulations of DG platforms and of host availability based on three real-world traces: entrfin, ucb, xwtr.
  • 21. Methodology: DG traces
      Trace   | Hosts | Venue                | Time
      Entrfin | 275   | San Diego            | 1.0 months
      Ucb     | 85    | UC Berkeley          | 1.5 months
      Xwtr    | 100   | Université Paris-Sud | 1.0 months
  • 22. Methodology: Using the traces. We consider two cases: hosts that become unavailable never become available again (worst-case assumption), and the complete host churn (unavailable hosts can be re-acquired afterwards). Both cases are run for two different days of each trace.
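The two cases on slide 22 can be illustrated with a small sketch. It assumes a hypothetical trace representation (a list of time steps, each a list of booleans, one per host); the slides do not describe the actual trace file format, and the function names are invented for illustration. The "without return" variant is derived from the original trace by keeping a host unavailable after its first failure.

```python
from typing import List

Trace = List[List[bool]]  # hypothetical layout: trace[t][h] is True if host h is up at step t


def without_return(trace: Trace) -> Trace:
    """Derive the worst-case trace: once a host is seen unavailable, it never returns.

    Assumption: a host that is down at the first time step also counts as lost.
    """
    if not trace:
        return []
    alive = list(trace[0])  # hosts that have not failed so far
    result = []
    for step in trace:
        # A host stays alive only if it was alive before and is up at this step.
        alive = [a and s for a, s in zip(alive, step)]
        result.append(list(alive))
    return result


def hosts_available(trace: Trace) -> List[int]:
    """Number of available hosts at each time step."""
    return [sum(step) for step in trace]
```

Applying `hosts_available` to both the original trace and the output of `without_return` gives the two curves of the kind plotted on the following slide: the churn curve can rise again, while the worst-case curve can only decrease.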
  • 23. Methodology: Host availability for 1 day of the ucb trace. [Figure: number of available computers versus time step, for the original trace and the trace without return.]
  • 24. Experiments and Results
  • 25. Experiments and Results: Problems. We conduct experiments with a 3-trap instance:
      $$\mathrm{trap}(u(\vec{x})) = \begin{cases} \dfrac{a}{z}\,\bigl(z - u(\vec{x})\bigr), & \text{if } u(\vec{x}) \le z \\[4pt] \dfrac{b}{l - z}\,\bigl(u(\vec{x}) - z\bigr), & \text{otherwise} \end{cases} \qquad (1)$$
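As a minimal sketch of equation (1), the trap function can be written as below. Two things are assumptions rather than content from the slides: `u` is taken to be the number of ones in a block (the usual reading of the unitation function in trap problems), and the default values `a = 2`, `b = 3`, `z = 2` are placeholders, since the slides do not state the constants used in the experiments.

```python
from typing import Sequence


def trap(u: int, a: float, b: float, z: int, l: int) -> float:
    """Equation (1): deceptive slope towards u = 0, isolated optimum at u = l."""
    if u <= z:
        return a / z * (z - u)
    return b / (l - z) * (u - z)


def trap_fitness(bits: Sequence[int], k: int = 3,
                 a: float = 2.0, b: float = 3.0, z: int = 2) -> float:
    """Sum of trap values over consecutive blocks of k bits.

    a, b and z are illustrative placeholders, not values taken from the slides.
    """
    if len(bits) % k != 0:
        raise ValueError("individual length must be a multiple of k")
    # u for each block is assumed to be its number of ones.
    return sum(trap(sum(bits[i:i + k]), a, b, z, k)
               for i in range(0, len(bits), k))
```

With these placeholder constants, each 3-bit block scores 3 when all bits are one and 2 when all bits are zero, so a search that follows the local gradient is pulled away from the global optimum.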
  • 26. Experiments and Results: GA parameters for the 3-Trap instance
      Trap instance
        Size of sub-function (k): 3
        Number of sub-functions (m): 10
        Individual length (L): 30
      GA settings
        GA: GGA
        Population size: 3000
        Selection of parents: Binary tournament
        Recombination: Uniform crossover, pc = 1.0
        Mutation: Bit-flip mutation, pm = 1/L
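The operators listed on slide 26 can be sketched as follows. This is an illustrative reading, not the authors' implementation: the slides do not say how many children each crossover produces or how the generational replacement is done, so producing one child per pair and replacing the whole population are assumptions, and all function names are invented for this sketch. The fitness function is passed in as a parameter.

```python
import random
from typing import Callable, List

Individual = List[int]


def binary_tournament(pop: List[Individual],
                      fitness: Callable[[Individual], float],
                      rng: random.Random) -> Individual:
    """Pick two individuals at random and return the fitter one."""
    a, b = rng.choice(pop), rng.choice(pop)
    return a if fitness(a) >= fitness(b) else b


def uniform_crossover(p1: Individual, p2: Individual,
                      rng: random.Random) -> Individual:
    """Each gene is copied from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]


def bit_flip(ind: Individual, pm: float, rng: random.Random) -> Individual:
    """Flip each bit independently with probability pm."""
    return [1 - g if rng.random() < pm else g for g in ind]


def next_generation(pop: List[Individual],
                    fitness: Callable[[Individual], float],
                    rng: random.Random, pc: float = 1.0) -> List[Individual]:
    """One generational step: the offspring replace the whole population (assumed)."""
    length = len(pop[0])
    children = []
    for _ in range(len(pop)):
        p1 = binary_tournament(pop, fitness, rng)
        p2 = binary_tournament(pop, fitness, rng)
        child = uniform_crossover(p1, p2, rng) if rng.random() < pc else list(p1)
        children.append(bit_flip(child, 1.0 / length, rng))  # pm = 1/L
    return children
```

A call such as `next_generation(pop, sum, random.Random(0))` runs one step with the number of ones as a stand-in fitness; any function mapping an individual to a number can be substituted.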
  • 27. Experiments and Results: Population size vs. generation. [Figure: number of individuals and % of loss versus generations, for the traces entrfin 1, ucb 1, xwtr 1, entrfin 2, ucb 2 and xwtr 2.]
  • 28. Experiments and Results: Obtained fitness for 3-Trap, Day 1. Error-free fitness = 23.56
      Trace       | Fitness | Wilcoxon test                    | Significantly different?
      Entrfin     | 23.30   | W = 6093, p-value = 0.002688     | yes
      Entrfin 10% | 23.47   | W = 5408.5, p-value = 0.2535     | no
      Entrfin 20% | 23.48   | W = 5360, p-value = 0.3137       | no
      Entrfin 30% | 23.49   | W = 5283.5, p-value = 0.4271     | no
      Entrfin 40% | 23.57   | W = 4923.5, p-value = 0.8286     | no
      Entrfin 50% | 23.59   | W = 4910.5, p-value = 0.7994     | no
      Ucb         | 23.22   | W = 6453, p-value = 6.877e-05    | yes
      Ucb 10%     | 23.27   | W = 6098.5, p-value = 0.002753   | yes
      Ucb 20%     | 23.37   | W = 5837.5, p-value = 0.02051    | yes
      Ucb 30%     | 23.40   | W = 5664, p-value = 0.06588      | no
      Ucb 40%     | 23.51   | W = 5186.5, p-value = 0.6004     | no
      Ucb 50%     | 23.42   | W = 5623, p-value = 0.08335      | no
      Xwtr        | 23.56   | W = 5056, p-value = 0.8748       | no
      Xwtr 10%    | 23.57   | W = 4923.5, p-value = 0.8286     | no
      Xwtr 20%    | 23.68   | W = 4474, p-value = 0.1245       | no
      Xwtr 30%    | 23.73   | W = 4259.5, p-value = 0.02812    | yes
      Xwtr 40%    | 23.68   | W = 4502, p-value = 0.1466       | no
      Xwtr 50%    | 23.71   | W = 4356.5, p-value = 0.05817    | no
  • 29. Experiments and Results: Obtained fitness for 3-Trap, Day 2. Error-free fitness = 23.56
      Trace       | Fitness | Wilcoxon test                    | Significantly different?
      Entrfin     | 23.57   | W = 4979.5, p-value = 0.9546     | no
      Entrfin 10% | 23.69   | W = 4397.5, p-value = 0.07682    | no
      Entrfin 20% | 23.67   | W = 4522.5, p-value = 0.1645     | no
      Entrfin 30% | 23.70   | W = 4405, p-value = 0.08086      | no
      Entrfin 40% | 23.69   | W = 4453.5, p-value = 0.11       | no
      Entrfin 50% | 23.75   | W = 4162.5, p-value = 0.01234    | yes
      Ucb         | 23.09   | W = 6672.5, p-value = 7.486e-06  | yes
      Ucb 10%     | 23.12   | W = 6826, p-value = 6.647e-07    | yes
      Ucb 20%     | 23.14   | W = 6654, p-value = 7.223e-06    | yes
      Ucb 30%     | 23.26   | W = 6371, p-value = 0.0001507    | yes
      Ucb 40%     | 23.37   | W = 5893.5, p-value = 0.01316    | yes
      Ucb 50%     | 23.32   | W = 6108, p-value = 0.002166     | yes
      Xwtr        | 23.60   | W = 4806, p-value = 0.5791       | no
      Xwtr 10%    | 23.62   | W = 4765, p-value = 0.5002       | no
      Xwtr 20%    | 23.69   | W = 4453.5, p-value = 0.11       | no
      Xwtr 30%    | 23.60   | W = 4806, p-value = 0.5791       | no
      Xwtr 40%    | 23.63   | W = 4688.5, p-value = 0.3695     | no
      Xwtr 50%    | 23.77   | W = 4065.5, p-value = 0.004877   | yes
  • 30. Experiments and Results: Obtained fitness with host churn
      Table: Day 1. Error-free fitness = 23.56
      Trace   | Fitness | Wilcoxon test                    | Significantly different?
      Entrfin | 23.52   | W = 5222, p-value = 0.5322       | no
      Ucb     | 21.31   | W = 9708.5, p-value < 2.2e-16    | yes
      Xwtr    | 23.64   | W = 4640, p-value = 0.2982       | no
      Table: Day 2. Error-free fitness = 23.56
      Trace   | Fitness | Wilcoxon test                    | Significantly different?
      Entrfin | 23.58   | W = 4931, p-value = 0.8452       | no
      Ucb     | 23.03   | W = 7038.5, p-value = 4.588e-08  | yes
      Xwtr    | 23.7    | W = 4405, p-value = 0.08086      | no
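The result tables report a Wilcoxon test comparing each trace configuration with the error-free runs. The sketch below shows how a statistic of that kind can be computed for two independent samples, under the assumption that the test is the two-sample rank-sum form (equivalent to Mann-Whitney U) with a normal approximation, tie correction and continuity correction. The slides do not state which variant or software was used, nor the number of runs per configuration, so this is an illustration and is not expected to reproduce the reported figures exactly.

```python
import math
from typing import Sequence, Tuple


def rank_sum_w(x: Sequence[float], y: Sequence[float]) -> float:
    """Count pairs (xi, yj) with xi > yj, counting ties as one half."""
    w = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                w += 1.0
            elif xi == yj:
                w += 0.5
    return w


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-sided p-value) using a normal approximation."""
    n, m = len(x), len(y)
    total = n + m
    w = rank_sum_w(x, y)
    mean = n * m / 2.0
    # Tie correction: each group of t tied values reduces the variance.
    counts = {}
    for v in list(x) + list(y):
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n * m / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var <= 0:
        return w, 1.0  # all values identical: no evidence of a difference
    # Continuity correction of 0.5, clamped so that z is never negative.
    z = max(abs(w - mean) - 0.5, 0.0) / math.sqrt(var)
    return w, math.erfc(z / math.sqrt(2.0))
```

Under this definition W sits near n·m/2 when the two samples are indistinguishable. The "yes" and "no" labels in the tables are consistent with a 0.05 significance threshold on the p-value.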
  • 31. Conclusions
  • 32. Conclusions: Summary of Results. PGA applications are fault-tolerant by nature in DG platforms. PGAs exhibit the well-known fault-tolerance property of graceful degradation in DG platforms. We provided a new method to mitigate the effect of failures: increasing the initial population.
  • 33. Conclusions: Conclusions. We have studied and characterized the behavior of PGA applications running in distributed platforms with high failure rates. We have tested PGA fault tolerance using three real-world DG traces. Our main conclusion is that PGAs inherently provide graceful degradation.
  • 34. Conclusions: Questions. daniellg@unex.es, juanlu@geneura.ugr.es, fcofdez@unex.es, jmerelo@geneura.ugr.es. Icons from Tango Desktop project and Gnome Desktop (Creative Commons & GPL License).
