Safety-Critical Embedded Systems



  1. Safety-Critical Embedded Systems
     Lecture 1: Introduction
  2. Lecture outline
     • Course information
       – Examination: project
     • What is a “safety-critical embedded system”?
       – Embedded systems
       – Real-time systems
       – Safety-critical systems
     • Fundamental concepts of dependability
       – The “dependability” concept
       – Threats: fault, error, failure
       – Attributes: reliability, availability
  3. Course information
     • Contact
       – Paul Pop, course leader and examiner
         • Email:
         • Phone: 4525 3732
         • Office: building 322, office 228
     • Webpage
       – All the information is on CampusNet
  4. Course information, cont.
     • Textbook: Israel Koren and C. Mani Krishna, Fault-Tolerant Systems, Morgan Kaufmann
     • Full text available online, see the link on CampusNet
  5. Course information, cont.
     • Lectures
       – Language: English
       – 12 lectures
         • Lecture notes: available on CampusNet as a PDF file the day before
         • Dec. 1 is used for the project
         • Two invited lectures, from Novo Nordisk and Danfoss
     • Examination
       – Project: 70% report + 30% presentation
     • 7.5 ECTS points
  6. Project
     • Milestones
       – End of September: Group registration and topic selection
         • Email to
       – End of October: Project report draft
         • Upload draft to CampusNet
       – End of November: Report submission
         • Upload final report to CampusNet
       – Last lecture: Project presentation and oral opposition
         • Upload presentation to CampusNet
  7. Project, cont.
     • Project registration
       – E-mail Paul Pop,
         • Subject: 02228 registration
         • Body:
           – Name student #1, student ID
           – Name student #2, student ID
           – Project title
           – Project details
     • Notes
       – Project approval
       – Groups of max. 3 persons
  8. Project, cont.
     • Topic categories
       1. Literature survey
          • See the “references” and “further reading” in the course literature
       2. Tool case-study
          • Select a commercial or research tool and use it on a case-study
       3. Software implementation
          • Implement a technique, e.g., an error-detection or fault-tolerance technique
     • Suggested topics available on CampusNet
  9. Project, cont.
     • Examples of last years’ projects
       – ARIANE 5: Flight 501 Failure
       – Hamming Correcting Code Implementation in a Transmitting System
       – Application of Fault Tolerance to a Wind Turbine
       – Guaranteed Service in Fault-Tolerant Network-on-Chip
       – Fault-tolerant digital communication
       – Resilience in Mobile Multi-hop Ad-hoc Networks
       – Fault-tolerant ALU
       – Reliable message transmission in CAN, TTP and FlexRay
  10. Project deliverables
      1. Literature survey
         – Written report
      2. Tool case-study
         – Written report
         – Case-study files
           • Document your work
      3. Software implementation
         – Source code with comments
         – Report
           • Document your work
      • Report structure
        – Title, authors
        – Abstract
        – Introduction
        – Body
        – Conclusions
        – References
      • Deadline for draft: end of October
      • Deadline for final version: end of November
  11. Project presentation & opposition
      • Poster presentation of project
        – 15 min. + 5 min. questions
        – Deadline: last lecture
      • Note!
        – During the presentation you might be asked general questions that relate to any course topic
  12. Embedded systems
      • Computing systems are everywhere
      • Most of us think of “desktop” computers
        – PCs
        – Laptops
        – Mainframes
        – Servers
      • But there’s another type of computing system
        – Far more common...
  13. Embedded systems, cont.
      • Embedded computing systems
        – Computing systems embedded within electronic devices
        – Hard to define: nearly any computing system other than a desktop computer
        – Billions of units produced yearly, versus millions of desktop units
        – Perhaps 50 per household and per automobile
  14. What is an embedded system?
      • Definition
        – An embedded system is a special-purpose computer system, part of a larger system which it controls.
      • Notes
        – A computer is used in such devices primarily as a means to simplify the system design and to provide flexibility.
        – Often the user of the device is not even aware that a computer is present.
  15. Characteristics of embedded systems
      • Single-functioned
        – Dedicated to perform a single function
      • Complex functionality
        – Often have to run sophisticated algorithms or multiple algorithms.
          • Cell phone, laser printer.
      • Tightly-constrained
        – Low cost, low power, small, fast, etc.
      • Reactive and real-time
        – Continually reacts to changes in the system’s environment
        – Must compute certain results in real-time without delay
      • Safety-critical
        – Must not endanger human life or the environment
  16. Functional vs. non-functional requirements
      • Functional requirements
        – Output as a function of input
      • Non-functional requirements
        – Time required to compute the output
        – Reliability, availability, integrity, maintainability, dependability
        – Size, weight, power consumption, etc.
  17. Real-time systems
      • Time
        – The correctness of the system behavior depends not only on the logical results of the computations, but also on the time at which these results are produced.
      • Real
        – The reaction to outside events must occur during their evolution. The system time must be measured using the same time scale used for measuring time in the controlled environment.
  18. Real-time systems, cont. (figure)
  19. Safety-critical systems
      • Definitions
        – Safety is the property that a system will not endanger human life or the environment.
        – A safety-related system is one by which the safety of the equipment or plant is ensured.
      • A safety-critical system is:
        – A safety-related system, or
        – A high-integrity system
  20. System integrity
      • Definition
        – The integrity of a system is its ability to detect faults in its own operation and to inform the human operator.
      • Notes
        – The system will enter a failsafe state if faults are detected
        – High-integrity system
          • Failure could result in large financial loss
          • Examples: telephone exchanges, communication satellites
  21. Failsafe operation
      • Definition
        – A system is failsafe if it adopts “safe” output states in the event of a failure and inability to recover.
      • Notes
        – Example of failsafe operation
          • Railway signaling system: failsafe corresponds to all the lights on red
        – Many systems are not failsafe
          • Fly-by-wire system in an aircraft: the only safe state is on the ground
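The railway example can be sketched in code: on an unrecoverable fault, the controller drives every output to its predefined safe state. This is a minimal illustration; the `SignalController` and `Aspect` names are invented for the sketch, not taken from the course material.

```python
from enum import Enum

class Aspect(Enum):
    RED = "red"
    GREEN = "green"

class SignalController:
    """Toy railway signal controller with a failsafe fallback:
    if a fault is detected and recovery fails, all signals go RED."""

    def __init__(self, num_signals):
        self.aspects = [Aspect.GREEN] * num_signals
        self.failed = False

    def report_fault(self, recovered):
        # Inability to recover triggers the failsafe output state.
        if not recovered:
            self.failed = True
            self.aspects = [Aspect.RED] * len(self.aspects)

ctrl = SignalController(num_signals=3)
ctrl.report_fault(recovered=False)
# After an unrecoverable fault, every light shows RED.
```

Note that this pattern only works where a safe output state exists; as the slide points out, a fly-by-wire aircraft has no such state while airborne.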
  22. Preliminary topics
      • Fundamental concepts of dependability
      • Means of achieving dependability
      • Hazard and risk analysis
      • Reliability analysis
      • Hardware redundancy
      • Information and time redundancy
      • Software redundancy
      • Checkpointing
      • Fault-tolerant networks
  23. Dependability: an integrating concept
      • Dependability is a property of a system that justifies placing one’s reliance on it.
      • Attributes
        – Availability, reliability, safety, confidentiality, integrity, maintainability
      • Means
        – Fault prevention, fault tolerance, fault removal, fault forecasting
      • Threats
        – Faults, errors, failures
  24. Threats: Faults, Errors & Failures
      • Fault: cause of an error (and failure)
      • Error: unintended internal state of a subsystem
      • Failure: deviation of the actual service from the intended service
  25. Threats: Faults, Errors & Failures, cont.
      • Fault
        – Physical defect, imperfection, or flaw that occurs within some hardware or software component.
        – Examples
          • Shorts between electrical conductors
          • Physical flaws or imperfections in semiconductor devices
          • A program loop that, once entered, can never be exited
        – Primary cause of an error (and, perhaps, a failure)
          • Does not necessarily lead to an error, e.g., a bit in memory flipped by radiation
            – can cause an error if the next operation on the memory cell is a “read”
            – causes no error if the next operation on the memory cell is a “write”
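The bit-flip example above can be made concrete with a toy memory model: the flip is the fault, and whether it becomes an error depends on the next operation. The `MemoryCell` class is purely illustrative.

```python
class MemoryCell:
    """Toy one-bit memory cell used to illustrate fault vs. error."""

    def __init__(self, value):
        self.bit = value

    def flip(self):
        # The fault: radiation flips the stored bit.
        self.bit ^= 1

    def read(self):
        return self.bit

    def write(self, value):
        self.bit = value

# Case 1: read after the flip -> the incorrect value is observed (an error).
cell = MemoryCell(0)
cell.flip()
assert cell.read() == 1   # deviates from the intended value 0

# Case 2: write after the flip -> the fault is overwritten, no error occurs.
cell = MemoryCell(0)
cell.flip()
cell.write(0)
assert cell.read() == 0   # correct value; the fault caused no error
```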
  26. Threats: Faults, Errors & Failures, cont.
      • Error
        – An incorrect internal state of a computer
          • Deviation from accuracy or correctness
        – Example
          • A physical short results in a line in the circuit being permanently stuck at logic 1. The physical short is a fault in the circuit. If the line is required to transition to logic 0, the value on the line will be in error.
        – The manifestation of a fault
        – May lead to a failure, but does not have to
  27. Threats: Faults, Errors & Failures, cont.
      • Failure
        – Denotes a deviation between the actual service and the specified or intended service
        – Example
          • A line in a circuit is responsible for turning a valve on or off: logic 1 turns the valve on and logic 0 turns it off. If the line is stuck at logic 1, the valve is stuck on. As long as the user of the system wants the valve on, the system functions correctly. However, when the user wants the valve off, the system experiences a failure.
        – The failure is an event (i.e., it occurs at some time instant, if ever) caused by an error
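The valve example can be modeled in a few lines to show that the failure is an event tied to the commanded service, not to the fault itself. The function name below is invented for the sketch.

```python
def valve_state(commanded, stuck_at_1=True):
    """Logic level the valve actually sees.

    With the stuck-at-1 fault present, the line delivers 1 regardless
    of what is commanded; without the fault it follows the command.
    """
    return 1 if stuck_at_1 else commanded

# While the user wants the valve on, the delivered service matches
# the intended service -- no failure is observable:
assert valve_state(commanded=1) == 1

# The failure occurs at the instant the user commands the valve off
# and the delivered service deviates from the intended one:
assert valve_state(commanded=0) == 1   # valve stays on: failure
```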
  28. The pathology of failure (figure)
  29. Three-universe model
      1. Physical universe: where faults occur
         – Physical entities: semiconductor devices, mechanical elements, displays, printers, power supplies
         – A fault is a physical defect or alteration of some component in the physical universe
      2. Informational universe: where errors occur
         – Units of information: bits, data words
         – An error has occurred when some unit of information becomes incorrect
      3. External (user’s) universe: where failures occur
         – The user sees the effects of faults and errors
         – A failure is any deviation from the desired or expected behavior
  30. Causes of faults
      • Problems at any stage of the design process can result in faults within the system.
  31. Causes of faults, cont.
      • Specification mistakes
        – Incorrect algorithms, architectures, or hardware or software design specifications
          • Example: the designer of a digital circuit incorrectly specified the timing characteristics of some of the circuit’s components
      • Implementation mistakes
        – Implementation: the process of turning the hardware and software designs into physical hardware and actual code
        – Poor design, poor component selection, poor construction, software coding mistakes
          • Examples: a software coding error; a printed circuit board constructed such that adjacent lines of a circuit are shorted together
  32. Causes of faults, cont.
      • Component defects
        – Manufacturing imperfections, random device defects, component wear-out
        – The most commonly considered causes of faults
          • Examples: bonds breaking within the circuit, corrosion of the metal
      • External disturbances
        – Radiation, electromagnetic interference, operator mistakes, environmental extremes, battle damage
          • Example: lightning
  33. Failure modes (figure)
  34. Failure modes, cont.
      • Failure domain
        – Value failures: incorrect value delivered at the interface
        – Timing failures: right result at the wrong time (usually late)
      • Failure consistency
        – Consistent failures: all nodes see the same, possibly wrong, result
        – Inconsistent failures: different nodes see different results
      • Failure consequences
        – Benign failures: essentially loss of utility of the system
        – Malign failures: significantly more than loss of utility of the system; catastrophic, e.g., an airplane crash
      • Failure oftenness (failure frequency and persistency)
        – Permanent failure: the system ceases operation until it is repaired
        – Transient failure: the system continues to operate
          • Frequently occurring transient failures are called intermittent
  35. Failure modes, cont.
      • Consistent failures
        – Fail-silent
          • The system produces correct results or remains quiet (no delivery)
        – Fail-crash
          • The system produces correct results or stops quietly
        – Fail-stop
          • The system produces correct results or stops (made known to others)
      • Inconsistent failures
        – Two-faced failures, malicious failures, Byzantine failures
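One common way to approximate fail-silent behavior in software is duplication with comparison: run the computation on two replicas and deliver a result only if both agree; on disagreement, deliver nothing. This sketch and its names are illustrative, not from the course material, and it only masks faults that hit a single replica.

```python
def fail_silent(compute, replica, x):
    """Deliver the result only if both replicas agree; else stay quiet."""
    r1, r2 = compute(x), replica(x)
    if r1 == r2:
        return r1      # agreed (presumed correct) result is delivered
    return None        # disagreement detected: no delivery at all

square = lambda x: x * x
# A replica with an injected fault for one particular input:
faulty_square = lambda x: x * x + (1 if x == 3 else 0)

assert fail_silent(square, square, 3) == 9          # replicas agree
assert fail_silent(square, faulty_square, 3) is None  # silent on fault
```

Duplication-and-compare detects a disagreement but cannot tell which replica is wrong; correcting the result would need a third replica and voting, which is a later course topic (hardware redundancy).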
  36. Proportion of failures (figure)
  37. Dependability attributes
      • Availability: readiness for correct service
      • Reliability: continuity of correct service
      • Safety: absence of catastrophic consequences on the user(s) and the environment
      • Confidentiality: absence of unauthorized disclosure of information
      • Integrity: absence of improper system alterations
      • Maintainability: ability to undergo modifications and repairs
      • Security: the concurrent existence of (a) availability for authorized users only, (b) confidentiality, and (c) integrity, with ‘improper’ taken as meaning ‘unauthorized’.
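Availability and reliability, the two attributes singled out in the lecture outline, have standard quantitative forms in the fault-tolerance literature: steady-state availability A = MTTF / (MTTF + MTTR), and, for a constant failure rate λ, reliability R(t) = e^(−λt). A minimal sketch, with made-up numbers:

```python
import math

def availability(mttf, mttr):
    """Steady-state availability from mean time to failure and repair."""
    return mttf / (mttf + mttr)

def reliability(failure_rate, t):
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * t)

# Illustrative numbers only: MTTF = 1000 h, MTTR = 10 h,
# failure rate = 1e-4 per hour over a 100 h mission.
A = availability(mttf=1000.0, mttr=10.0)
R = reliability(failure_rate=1e-4, t=100.0)
assert abs(A - 1000.0 / 1010.0) < 1e-12
assert abs(R - math.exp(-0.01)) < 1e-12
```

The reliability analysis lectures listed under “Preliminary topics” develop these quantities in detail.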