Integrating Cyber Security Alerts into the Operator Display



Presented by: Michael Toecker, Digital Bond

Abstract: Control Systems are responsible for the safe and reliable governing of physical processes, and are designed to report conditions that could affect reliable operations to operators for action. These conditions may vary in their severity, from minor inconveniences to those that can bring the process to a full halt. While engineers have predicted certain events and consequences, others are “unknown unknowns”, and may only be detected due to variances from normal function.

Cyber security conditions are similar in nature: they vary in severity, and cyber security professionals can classify and alert on some, but not all, cyber security events. In this presentation, Michael Toecker will discuss cyber security conditions that are known and that could be integrated into the operational display.

Treating cyber security events as analogous to control system events has many benefits and drawbacks, and Toecker will expand on criteria for determining what is appropriate for an operator display, and what is not. The purpose of this presentation is to demonstrate that cyber security can have a place in operational decisions, so long as conditions are carefully analyzed and response actions developed beforehand.

Published in: Technology, Business

  1. Integrating Cyber Security Alerts into the Operator Display
      Michael Toecker, PE
      Digital Bond, Inc.
      EnergySec 9th Annual Security Summit
  2. Who Am I?
      • Michael Toecker
  3. The Premise
      Monitoring and Response of Cyber Security Events Originating from the Control System Parallels the Monitoring and Response of Process Events
  4. ICS Ops vs. SecOps
      I Spent a Year working as a Security Guy in an Operations Environment
      • ICS Operations was similar to Security Operations
          • ICS had alarms, SecOps had alarms
          • ICS had events, SecOps had events
          • ICS had historical points, SecOps had voluminous logs
          • ICS had 24/7 Operators, SecOps had Analysts (some 24/7, others not)
          • ICS had a responsibility for monitoring safe and effective production, SecOps had a responsibility for ensuring secure and trusted operations
  5. Parallels
      What I Often Saw in ICS Operations was Paralleled in What I was Doing

      | Task | SecOps | ICSOps |
      |---|---|---|
      | Visualizing Data using Graphs, Charts, etc. | X | X |
      | Providing Status Indicators when parameters go out of normal | X | X |
      | Directing Field Personnel to Take Specific Actions based on Events or Alarms | X | X |
      | Reviewing Logs, Records, and Other Data to Improve Efficiency and Locate Problem Areas | X | X |
      | Investigating for Compliance and Effect on Process, and finding ways to Prevent, Detect and Respond | X | X |

  6. What was Different…
      I was an Engineer with Specialized Knowledge of Specific Equipment
      • …was the data.
          • My data came from security logs, theirs from process points.
          • But we were both identifying conditions that could impact our production or compliance, and taking some action to correct them
  7. The Role of Operators
      Operators Monitor & Respond, but Do Not Always Possess Specific Knowledge
      • There is an emphasis on procedure and process when faced with issues
      • Troubleshooting that requires advanced knowledge is conducted by those with the knowledge
      • Operators follow known actions that will return a system to a stable state, usually developed by process engineers.
  8. Why not add some Security?
  9. The Problems
      • Lack of Understanding and Confusion about Computer Security
      • Owner Attitude is that Security has nothing to Do with Operations
      • Leads to Reduction in Situational Awareness
      • Operators Don’t Know What Actions To Take
  10. The Benefits
      • Proper Notification Reduces Response Time to Security Incidents
      • Regulatory Requirements can be Met With Existing Personnel
      • Alerts and Events directly to 24/7 Personnel look Awesome as Compensating Controls
  11. The Limitations
      Not a Substitute for a Focused Security Monitoring Program
      • Cyber Security Events & Incidents
          • Detectable w/ Security Monitoring
              • Security Events Operators Could Respond To
  12. The Role of Security
      Security Monitoring Program Should Feed into Conditions for Operator Alerts
      Monitor and Analyze → Identify Security Conditions → Identify Operational Events → Develop Procedures for Action → Implement Condition and Procedure
  13. ….wait a minute.
      This looks a lot like Process Intelligence Process, the only difference is the Analysis and Knowledge
      Monitor Data Points → Identify Process Conditions → Identify Operational Events → Develop Procedures for Action → Implement Condition and Procedure
  14. The Process
      • Identify Specific Clear Cyber Security Events
      • Determine Events Appropriate for Operator Attention
      • Create Operations Procedures for Actions
      • Develop a Detection and Presentation Strategy
  15. Part 1: Identify Operational Events
  16. Identify and Define
      An Operational Cyber Security Event Should Be..
      • …Clear
          • No Ambiguity
          • Straightforward Yes/No Decision Point
      • …Derivable
          • Sourced Directly from Control Systems Security Data, not from Intuition or Analysis
      • …Actionable
          • Specific Actions can be taken on receipt of the Event
          • Not Dependent on Other Events, or on Further Analysis
  17. Identifying Events
      Identify Cyber Security Conditions to Alert On
      • Questions to Ask
          • What do my regulations tell me to be concerned with?
          • What do various standards bodies tell me to be concerned with?
          • Do I have specific policy statements that suggest alerting, or 24/7 response?
          • What Lessons Learned Do I have related to Cyber Security?
      This is my polite way of saying “If You Got Hacked, How Did it Happen?”
  18. Determine Conditions
      Four inputs feed the List of Security Conditions:
      • Regulations Require Monitoring and Action
      • Standards suggest an Approach
      • Security Policy may Specify Conditions
      • Lessons Learned from Security Incidents
  19. Regulations
      Regulations, such as NERC CIP, may provide clues as to what events should be monitored
      • CIP-007 R4: Malicious Software Prevention
          • Paraphrase: ~..shall use anti-virus software to detect malware on all Cyber Assets within the ESP~
          • Conclusion: I should alert on anti-virus detections
      • CIP-007 R5: Monitoring Electronic Access
          • Paraphrase: ~monitoring processes shall detect and alert for attempts at or actual unauthorized access~
          • Conclusion: Attempts at unauthorized access include incorrect passwords, alert on that.
      Well, I did say clues…
      Source: NERC CIP Standards, V3
  20. Standards
      Standards can help as well, but still are clues not firm guidance
      • Section 3.2.2: Signs of an Incident
          • ~Too many indicators exist to exhaustively list them~
          • ~Common ones include multiple failed login attempts, deviations from normal network traffic, filenames with unusual characters..~
      Source: NIST SP 800-61
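The "multiple failed login attempts" indicator, like the earlier conclusion to alert on incorrect passwords, only becomes a Clear yes/no condition once a count and a time window are fixed. The following is a minimal Python sketch of one way to do that, not something taken from the slides: the class name `FailedLoginMonitor`, the threshold of 5, and the 300-second window are all illustrative assumptions.

```python
from collections import deque


class FailedLoginMonitor:
    """Hypothetical sliding-window counter for failed logins per account."""

    def __init__(self, threshold=5, window_seconds=300):
        # Both defaults are assumptions; a site would set them from policy.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = {}  # account name -> deque of failure timestamps

    def record_failure(self, account, timestamp):
        """Record one failed login; return True if the alert condition is met."""
        times = self._failures.setdefault(account, deque())
        times.append(timestamp)
        # Drop failures that have aged out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold
```

The return value is a plain boolean on purpose, so the operator display only ever sees "condition met" or "condition not met" and never has to interpret raw log entries.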
  21. Policy Remarks
      Conditions may exist in your corporation’s IT Security Policies
      • What I’ve seen in the past:
          • ~Addition and Modifications of Users shall be conducted through the change control process~
          • ~New Software on Control Systems requires approval by the Senior Manager~
  22. Lessons Learned
      Why Information Sharing is Important.
      • Good Security Comes with Experience,
          • Most Experience Comes from Failures in Security
      • ….but it doesn’t have to be YOUR Failures in Security
          • Talk, Listen, Learn
  23. Complex, Irrelevant
      There are tons of events available, but not all are relevant or appropriate for Operations
  24. Methods to Identify
      There are Two Main Methods for Identifying Events
      • Top Down
          • Start from general security conditions
          • Trim to Specific Events within those categories
      • Bottom Up
          • Start with Every Potential Event that Could Be Generated
          • Trim to Specific Events from the Potentials
  25. Top Down Approach
      Top Down is Good For Systems with Many Potential Events
      • Specific Classes of Computer Security Events
          • Virus Detection, Failed Logins, Disallowed Ports, etc.
          • Good Source of Some Classes: NIST SP 800-53
      • Useful for PC based systems, which often have a huge amount of capacity for security
  26. Top Down Example
      Cyber Security Event Class: Virus Detection
  27. Bottom Up Approach
      Bottom Up is Good for Systems with Limited Capability for Security
      • Enumerate the Security Capabilities of the Device. Examples:
          • Provides Specific Syslog Evidence
          • Sets a Point when a Login Threshold has been reached
      • Useful for Devices, where Capability is often limited
  28. Bottom Up Example
      Review of Manuals and Datasheets can identify detectable Cyber Security Events
      Source: S&C IntelliRuptor Instruction Sheet 766-560
  29. Compare and Contrast
      There are advantages and disadvantages of each Approach
      • Top Down
          • Allows you to set criteria, and then delve into the system to find triggers to meet it
          • Avoids the complexity of getting into the weeds of system events
          • May miss important conditions due to avoiding those same weeds
      • Bottom Up
          • Complex, but most Detailed
          • Requires analysis of many events that will likely never make it in front of an operator
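The two methods differ mainly in where the search starts. A rough Python sketch of the contrast follows; the function names, the event catalog, and the device capability list are all hypothetical and exist only to illustrate the direction of each approach.

```python
def top_down(event_classes, system_catalog):
    """Start from security classes, then look for triggers in the system's catalog."""
    return {
        cls: [event["id"] for event in system_catalog if event["class"] == cls]
        for cls in event_classes
    }


def bottom_up(device_capabilities, is_relevant):
    """Start from everything the device can report, then trim."""
    return [cap for cap in device_capabilities if is_relevant(cap)]


# Hypothetical catalog for an event-heavy, PC-based system.
pc_catalog = [
    {"id": "av-detect", "class": "Virus Detection"},
    {"id": "logon-fail", "class": "Failed Logins"},
    {"id": "printer-offline", "class": "Other"},
]

# Hypothetical capability list for a field device with few security features.
device_caps = [
    "syslog: login failed",
    "point: login threshold reached",
    "syslog: clock set",
]

triggers = top_down(["Virus Detection", "Failed Logins"], pc_catalog)
trimmed = bottom_up(device_caps, lambda cap: "login" in cap)
```

Top down never looks at `printer-offline` at all, which is both its advantage (less to wade through) and its risk (anything outside the chosen classes is missed). Bottom up has to examine every capability before discarding it.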
  30. When to Use an Approach
      Use Top Down when a system is highly capable of reporting security events to narrow your range
      • Windows Based Computers are the obvious systems to use Top Down
          • Event Heavy, Highly Complex
          • Events were designed from an incident response perspective, not from an alert perspective
  31. When to Use an Approach
      Use Bottom Up when working with devices that report on few security conditions
      • Systems like PLCs, Controllers, some Network Devices have limited capability to report security status
          • Won’t be able to simply define events, you’ll have to work with what’s there
  32. List of Conditions
      The End Result of this Analysis is a List of Conditions to Alert On

      | Condition | Source |
      |---|---|
      | Anti-Virus Detection | NERC CIP-007 R4 |
      | User Modified or Added | NERC CIP-007 R5 |
      | Security Logs Deleted | NERC CIP-007 R6 |
      | Security Logs Full | NERC CIP-007 R6 |
      | Excessive Incorrect Login | NERC CIP-007 R6 |
      | Use of Removable Media | Good Practice |
      | New Software Installed | IT Policy |
      | Logging Options Changed | IT Policy |

      Note: This list is far from comprehensive
  33. Part 2: Appropriate for Operators
  34. Appropriate for Operators
      Not Every Condition is Appropriate for Operator Notification
      • Is the Condition a Clear Cyber Security Event?
      • Is the Condition Derivable directly from Logs, Alerts, and other evidence?
      • Is the Condition Actionable by Operators?
  35. Is it Clear?
      Unclear Conditions are Removed from the List

      | Condition | Source |
      |---|---|
      | Anti-Virus Detection | NERC CIP-007 R4 |
      | User Modified or Added | NERC CIP-007 R5 |
      | Security Logs Deleted | NERC CIP-007 R6 |
      | Security Logs Full | NERC CIP-007 R6 |
      | Excessive Incorrect Login | NERC CIP-007 R6 |
      | Use of Removable Media | Good Practice |
      | New Software Installed | IT Policy |
      | Logging Options Changed | IT Policy |

      Note: This list is far from comprehensive
  36. Is it Derivable?
      Remove Conditions Incapable of being Derived from Evidence, or Require Analysis

      | Condition | Source |
      |---|---|
      | Anti-Virus Detection | NERC CIP-007 R4 |
      | Security Logs Deleted | NERC CIP-007 R6 |
      | Security Logs Full | NERC CIP-007 R6 |
      | Excessive Incorrect Login | NERC CIP-007 R6 |
      | Use of Removable Media | Lesson Learned |

      Note: This list is far from comprehensive
  37. Reliable and Unreliable Conditions
      How Reliable are the Detection Methods? Do they have potential false positives?

      | Condition | Detection Method | Reliability |
      |---|---|---|
      | Anti-Virus Detection | Windows Event Log | Very Reliable, Test Indicates an event generated on each detection in SYSTEM log |
      | Security Log Deleted | Windows Event Log | Very Reliable, an explicit event is created on clearing |
      | Excessive Incorrect Login | Windows Event Log | Reliable, so long as the account lockout settings in SECPOL.msc are set correctly |
      | Use of Removable Media | May require 3rd party program. | Not Always Possible without 3rd Party Program |

      Note: This list is far from comprehensive
  38. Is it Actionable?
      Remove Conditions that an Operator cannot Realistically take Action On

      | Condition | Source |
      |---|---|
      | Anti-Virus Detection | NERC CIP-007 R4 |
      | Security Logs Deleted | NERC CIP-007 R6 |
      | Security Logs Full | NERC CIP-007 R6 |
      | Excessive Incorrect Login | NERC CIP-007 R6 |
  39. An Aside…
      Why were some of the conditions removed?

      | Condition | Reason for Removal |
      |---|---|
      | User Modified or Added | Not Clear, as there are legitimate reasons for adding or modifying a User, and these reasons aren’t apparent without analysis. |
      | Security Log Full | Not Actionable, as operators should not be doing maintenance and admin functions. |
      | Removable Media | Not Derivable on most systems as is. May require a 3rd Party program to do a decent job of this. |
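The Clear, Derivable, Actionable screening can be written down as a small filter that records why each condition was dropped. This is a hedged Python sketch rather than the presenter's tooling. The failing criteria are the ones stated in the slides. Every other `True` value, and the names `CRITERIA` and `triage`, are assumptions made so the example runs.

```python
# Criteria are checked in the order the slides apply them.
CRITERIA = ("clear", "derivable", "actionable")

# Failing values come from the slides; passing values are assumed.
conditions = {
    "Anti-Virus Detection":      {"clear": True,  "derivable": True,  "actionable": True},
    "Security Logs Deleted":     {"clear": True,  "derivable": True,  "actionable": True},
    "Excessive Incorrect Login": {"clear": True,  "derivable": True,  "actionable": True},
    "User Modified or Added":    {"clear": False, "derivable": True,  "actionable": True},
    "Security Logs Full":        {"clear": True,  "derivable": True,  "actionable": False},
    "Use of Removable Media":    {"clear": True,  "derivable": False, "actionable": True},
}


def triage(candidates):
    """Split candidates into kept names and a {name: reason} map of removals."""
    kept, removed = [], {}
    for name, checks in candidates.items():
        # The first criterion that fails is recorded as the reason.
        failed = next((c for c in CRITERIA if not checks[c]), None)
        if failed is None:
            kept.append(name)
        else:
            removed[name] = "Not " + failed.capitalize()
    return kept, removed


kept, removed = triage(conditions)
```

Keeping the removal reasons matters because a rejected condition may be worth revisiting later, for example when a third-party tool makes removable media detection derivable.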
  40. When Conditions Change
      A Previously Rejected Condition can become valid with New Information or Technology
      • Example: Removable Media Detection
          • Wasn’t able to do this in Native Windows in a Clear and Derivable manner
          • Use of Third Party tools can change this, making it possible to monitor and alert
  41. Let’s Get Crazy…
      Some of the More Advanced Conditions That We Can Define
      • USB Based Infection Lesson Learned
          • New USB Showed up in Registry Change
          • Auto-Run Shows up in Registry Change
          • Addition of Programs to the “Run” and “RunOnce” keys in the Registry
          • Copying of Files into “System”, “System32”
      • Is this Clear? Definable? Actionable?
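One of these advanced conditions, additions to the “Run” and “RunOnce” keys, can be derived by comparing a known-good snapshot against the current state. The Python sketch below works on plain dictionaries so it stays platform independent. How the snapshots are collected is left out and would be site specific. The function name and the sample entries are hypothetical. The key paths are the standard machine-wide locations.

```python
RUN_KEYS = (
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run",
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce",
)


def new_autostart_entries(baseline, current):
    """Return (key, value name, command) for entries that are new or changed."""
    added = []
    for key in RUN_KEYS:
        before = baseline.get(key, {})
        after = current.get(key, {})
        for name, command in sorted(after.items()):
            if before.get(name) != command:
                added.append((key, name, command))
    return added


# Hypothetical snapshots: {key path: {value name: command line}}.
baseline = {RUN_KEYS[0]: {"HMI": r"C:\hmi\hmi.exe"}}
current = {
    RUN_KEYS[0]: {"HMI": r"C:\hmi\hmi.exe", "updater": r"E:\autorun.exe"},
    RUN_KEYS[1]: {"setup": r"E:\setup.exe"},
}

findings = new_autostart_entries(baseline, current)
```

Whether a finding like this is Clear and Actionable for an operator is still the open question the slide asks. The diff only addresses the Derivable part.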
  42. What Comes Next?
      List of Conditions has been generated, what next?

      | Condition | Detection Method | Reliability |
      |---|---|---|
      | Anti-Virus Detection | Windows Event Log | Very Reliable, Test Indicates an event generated on each detection in SYSTEM log |
      | Event Log Was Cleared | Windows Event Log | Very Reliable, an explicit event is created on clearing |
      | Excessive Incorrect Login | Windows Event Log | Reliable, so long as the account lockout settings in SECPOL.msc are set correctly |

  43. Part 3: Create Operations Procedures
  44. Operator Actions
      This is Now a Procedure Exercise
      • Notifying Operators of Cyber Security Events is useless if the Operator has no action to take
      • This guidance typically takes the form of Operational Procedures
      • Each Event must have an appropriate action to be taken
  45. Guidelines for Actions
      Be Succinct and Specific
      • “Notify Lead I&C Engineer by Phone”
      • “Isolate Infected System From Network by Disconnecting Ethernet”
      • “Call Out via Radio to check if invalid login is from authorized user”
  46. Guidelines for Actions
      Keep the Guidance within Operator’s Authorized Abilities
      • No IT Administrative Functions
      • No Maintenance Functions
      • Limit the Analysis Necessary
      • …and don’t give them someone else’s work
  47. Operating Procedures
      An Operating Procedure has a few common characteristics
      • Personnel Responsible
      • Trigger
      • Actions
      • Documentation
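The four characteristics map naturally onto a small record type. The Python sketch below is illustrative only: the class name, the `render` layout, the trigger wording, and the documentation step are assumptions. The two actions are quoted from the action guidelines above.

```python
from dataclasses import dataclass


@dataclass
class OperatingProcedure:
    """Hypothetical container for the four common characteristics."""
    trigger: str
    responsible: str
    actions: list
    documentation: str

    def render(self):
        """Return the procedure as short, numbered plain text."""
        lines = [f"TRIGGER: {self.trigger}", f"RESPONSIBLE: {self.responsible}"]
        lines += [f"  {n}. {action}" for n, action in enumerate(self.actions, start=1)]
        lines.append(f"DOCUMENT: {self.documentation}")
        return "\n".join(lines)


av_procedure = OperatingProcedure(
    trigger="Anti-Virus Detection alarm on HMI",  # assumed wording
    responsible="Control Room Operator",          # assumed role
    actions=[
        "Isolate Infected System From Network by Disconnecting Ethernet",
        "Notify Lead I&C Engineer by Phone",
    ],
    documentation="Record time, host, and actions taken in the shift log",  # assumed
)
```

Calling `av_procedure.render()` yields a five-line card that an operator can follow without analysis, which is the point of keeping actions succinct and specific.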
  48. Example Operating Procedure
      (Bring up Example Procedure)
  49. Worst Case Scenario
      Some Cyber Security Events may Cause Production Impacts
      • Case in Point: Conficker (MS08-067)
          • Highly Aggressive Worm which impacts network communication
          • Makes use of a very reliable exploit in the Server service
          • Attempts to brute force accounts
          • Spreads over USB and removable media as well
  50. Worst Case Scenario
      What guidance would prepare an operator for these Alarms?
      • A Highly Aggressive worm like Conficker can have production consequences.
          • Continuing to operate while this is going on is risky.
          • Who makes the decision to halt production? Operator? Shift Supervisor? Plant Manager?
      • Make sure the information gets to those who make the decision.
  51. Part 4: Present to Operator
  52. Operator Displays
      There is already a lot of guidance on development of Operator Displays
      • Most Cited:
          • The Alarm Management Handbook
          • The High-Performance HMI Handbook
      • Written by Bill Hollifield and Paul Gruhn
          • Of Course, Nothing Specific on Security
  53. Operator Displays
      Guidelines for Cyber Security Displays
      • Help Operators Perceive the Important Security Data
      • Give Operators Data-in-Context
      • Help Them Comprehend the Situation in Terms of the Process
      • Help Predict Future Status by Providing Trending
          • Tough right now… At least without giving access to an SIEM
  54. Concept Operator Display
      • Cyber Security Master Display
          • Anti-Virus Status Display
          • Users Status Display
          • Removable Media Status Display
          • Event Log Status Display
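A master display of this kind usually shows the worst state reported by its sub-displays. The Python sketch below assumes a three-level status scale and a roll-up rule of "highest severity wins". Neither is specified in the slides, and the names `Status` and `master_status` are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    """Assumed severity scale; higher values are worse."""
    NORMAL = 0
    ALERT = 1
    ALARM = 2


def master_status(sub_displays):
    """Roll sub-display states up to the single worst state."""
    return max(sub_displays.values(), default=Status.NORMAL)


sub_displays = {
    "Anti-Virus Status": Status.ALARM,
    "Users Status": Status.NORMAL,
    "Removable Media Status": Status.ALERT,
    "Event Log Status": Status.NORMAL,
}

overall = master_status(sub_displays)
```

The operator sees one indicator on the master display and drills into a sub-display only when it changes, mirroring how process overview screens are commonly organized.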
  55. Mock Up
  56. Integration with the HMI
      Summary: Limited, and Nowhere Near Ideal
      • Many HMIs can accept SNMP Traps
          • Often used for alerting when hosts stop communicating
          • Security tools can feed this, in certain conditions
      • Security Logs don’t Translate Well into traditional displays
          • How do you ‘trend’ when you have thousands of event ids?
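One partial answer to the trending question is to collapse event IDs into the handful of conditions chosen earlier and trend counts per condition instead of per ID. The Python sketch below is an assumption-laden illustration: the mapping covers only three Windows Security log IDs (as used on Windows Vista/Server 2008 and later, to be verified on the target system), and the name `hourly_trend` and the one-hour bucket are arbitrary choices.

```python
from collections import Counter

# Windows Security log IDs; verify against the systems actually deployed.
EVENT_CLASS = {
    4625: "Failed Login",        # an account failed to log on
    4740: "Account Locked Out",  # a user account was locked out
    1102: "Event Log Cleared",   # the audit log was cleared
}


def hourly_trend(events, bucket_seconds=3600):
    """Count events per (bucket start, condition); unmapped IDs are ignored."""
    counts = Counter()
    for timestamp, event_id in events:
        condition = EVENT_CLASS.get(event_id)
        if condition is not None:
            bucket_start = timestamp - timestamp % bucket_seconds
            counts[(bucket_start, condition)] += 1
    return counts


# Hypothetical (seconds since start, event ID) pairs.
sample = [(10, 4625), (20, 4625), (50, 9999), (3700, 4625), (3800, 1102)]
trend = hourly_trend(sample)
```

Each condition then becomes one numeric point per hour, which is a shape a historian or trend display can already handle.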
  57. Questions?
      Thanks, Mike
  58. More Research at S4
      • Digital Bond’s S4 Conference in Miami Beach, January 2014
      • Got an Idea?
          • Submit a presentation!
      • Details on