Handling Byzantine Faults

My presentation on handling Byzantine faults in distributed systems, given for my graduate dependability course.

Published in: Business, Technology
Transcript of "Handling Byzantine Faults"

  1. Dealing with Byzantine Faults. CS 686 Final Project, brought to you by Chris Sosa
  2. Overview
       • Motivation in Dependable Systems
       • Common Types of Byzantine Faults
       • Solutions in Real Systems
  3. The Myths
       • Hardware cannot be “traitorous”!
           • Anthropomorphic model
           • Any system with consensus is susceptible
       • It’s never happened before
           • Often misclassified
           • Legionnaire's Disease
  4. The Awful Truth
       • Time-Triggered Architecture
           • Radioactive fault injection into one node
           • Disrupted the timing protocol (SOS)
           • Nodes formed cliques until the system failed
       • Quad Redundant Control System
           • No message exchange
           • Lots of redundancy
           • One fault propagated to look like many
       • Professor Knight’s Computer
  5. Trends in Dependable Systems
       • Device Physics
           • Smaller and faster is not always better
           • Cosmic rays, etc.
       • Movement to Distributed Topologies
       • Usage of Commercial Off-the-Shelf (COTS) Technology
  6. Common Types of Observed Faults
       • Value
           • Digital values are only the extremes of an analog signal, so a marginal signal can be read differently by different receivers
           • Propagation
       • Temporal
           • Different observations at the same time
           • Synchronization doesn’t help very much
       • Value + Temporal
  7. Solutions?
  8. Solutions (1)
       • Full Exchange
           • Uses classical Byzantine agreement
           • SPIDER: bus (ROBUS) design
  9. Solutions (2)
       • Hierarchical
           • Uses a hierarchy of different fault-tolerance techniques, including Byzantine agreement
           • Seen with fail-stop processors
           • SAFEbus
               • Communication backplane for the Boeing 777
               • Uses two buses which are themselves dual redundant; different forms of parity detect errors
               • Uses self-checking pairs on top of the buses
  10. Solutions (3)
       • Filtering
           • Targets propagation of Byzantine faults
           • Tries to either
               • Mask faults by forcing the output to some straight value (removes value-type faults), or
               • Segment the system into Fault Containment Regions (FCRs), with protections at the boundaries to stop propagation
  11. Ignorance is not Bliss
       • Ignoring Byzantine faults can invalidate the failure model
           • Propagation of one fault can be disastrous
           • No amount of redundancy can help
       • Large Economic Factor
           • Possible costs of recall and redeployment
  12. Conclusions
       • Byzantine faults are real!
       • Ignoring them causes problems
       • No amount of redundancy can tolerate them without message exchange
       • Three categories of solutions deal with them: full exchange, hierarchical, and filtering
  13. Questions?
  14. BGP Quick Review
       • Algorithm is expensive:
           • Each processor has to broadcast its values for many rounds
           • Chooses the majority value
           • Requires n > 3f, where f is the number of failures and n is the number of processors
       • With signed messages
           • Can tolerate more failures
           • Still expensive
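
The BGP review above can be illustrated with a small simulation. What follows is a minimal sketch, not code from the presentation: it assumes the classical oral-messages recursion (relay what you were told, then take a majority), and every name in it (`om`, `send`, `majority`, `flip`, `DEFAULT`) is hypothetical. The traitor behaviour, lying only to odd-numbered receivers, is an arbitrary choice made for the example, and a tie is resolved to an assumed default of "retreat".

```python
from collections import Counter

DEFAULT = "retreat"  # assumed default when no strict majority exists


def flip(value):
    """Return the opposite order."""
    return "retreat" if value == "attack" else "attack"


def send(sender, receiver, value, traitors):
    """Deliver one message. A traitorous sender lies to odd-numbered
    receivers (arbitrary behaviour chosen only for this illustration)."""
    if sender in traitors and receiver % 2 == 1:
        return flip(value)
    return value


def majority(values):
    """Strict majority of the values, else the assumed default."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def om(commander, lieutenants, value, m, traitors):
    """Oral-messages agreement with m relay rounds.
    Returns a dict mapping each lieutenant to the value it decides on."""
    # Round 1: the commander sends its value to every lieutenant.
    received = {lt: send(commander, lt, value, traitors) for lt in lieutenants}
    if m == 0:
        return received
    # Each lieutenant relays what it received to all the others,
    # acting as commander of a smaller instance with m - 1 rounds.
    votes = {lt: [received[lt]] for lt in lieutenants}
    for relay in lieutenants:
        others = [lt for lt in lieutenants if lt != relay]
        sub = om(relay, others, received[relay], m - 1, traitors)
        for lt in others:
            votes[lt].append(sub[lt])
    # Every lieutenant decides by majority over everything it heard.
    return {lt: majority(votes[lt]) for lt in lieutenants}


# n = 4, f = 1 (n > 3f holds): loyal lieutenants 1 and 2 follow the
# loyal commander's order even though lieutenant 3 lies.
ok = om(0, [1, 2, 3], "attack", 1, traitors={3})

# n = 3, f = 1 (n > 3f is violated): loyal lieutenant 1 cannot tell who
# is lying and falls back to the default instead of the real order.
bad = om(0, [1, 2], "attack", 1, traitors={2})
```

Each extra round multiplies the number of relayed messages, which is the cost the slide refers to as "expensive". The signed-messages variant mentioned on the slide is not modelled here.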