Functional
Safety and
Security
What Are the Real Issues and What
Should We Be Doing About Them?
ICS Cyber Security Conference 2013

Walt Boyes
Editor in Chief
Control and
ControlGlobal.com
“Careful, we don’t want to learn from this!”
Functional
Security, both Cyber
and Physical, is a
Subset of Functional
Safety
Why are Security and Safety
so HARD?
Why is Safety so HARD?
Insanity is doing the same thing
over and over and expecting
different results!
Now, Back to BP…
Clearly, it is not enough
to “mean well”…

Former BP CEO
Tony Hayward
…and the Olympic Pipeline
Disaster…

A cyber incident that cost lives… and
destroyed a company
The problem isn’t just safety
SIS
Security
Alarm Management
Operations
Training
Company Goals
Building SIS in a vacuum
SIS has to be part of an
overall proactive safety
strategy—one that includes
cyber security and training
Building SIS in a vacuum
SIS must also be part of an
overall proactive security
strategy: Security is a safety
issue!
Alarm Management…really
Alarm management: cure or
symptom?
Make the operator more
effective
Using operators correctly
Optimizing the HMI and using
operators correctly are all
part of what we’re calling
alarm management
Operators are professionals…
Operators need to be in
charge of the process
Operators are not clerks or
technicians
Functional alarm management
Like safety, alarm
management must be a
continuous process…
A Fish Stinks from the Head
For security as well as safety,
there must be support from
highest management levels…
Physical Security
• Perimeter security
• Personnel location
Functional Cyber Security
How do you protect
systems that were
designed to be
inherently open?
Call it “Functional
Security” to
differentiate its
needs…
Training that means something
90 days on nights isn’t enough
Training for the future…how?
…and who?
Why are Safety and Security so HARD?
And Then There Was
Stuxnet…
Security is a
Safety issue. If
you didn’t
believe that,
now you do...or
maybe not.
What we know can be done
Attacks from outside
Network attacks
Device attacks
Physical attacks
Attacks from inside
Network, device and system
Combined cyber and physical
attacks
Is It Flight or
Fight?
Why are Security and Safety
so HARD?
So, just
where does
that leave
YOU?
Hero or Goat?

Functional Safety and Security: ICS Cyber Security is Part of Functional Safety

Editor's Notes

  • #2 {"16":"When I was a young man, I worked for a very badly run company. The problem was the founder and CEO…who had long reached the Peter Principle stage. We tried to run the company around him, but it didn’t work. A management consultant we talked to explained why. “A fish stinks from the head,” she said, quoting a Yiddish proverb.\nSince the money comes from highest management, it matters little whether we change our own thinking about alarm management, optimization, functional safety and functional security projects and begin thinking of them as issues that require continuous process control and process optimization themselves.\nIf we cannot communicate the importance of this paradigm shift to higher management, nothing will change, and people will continue to die.\nAnd as the difference between Tony Hayward’s results and those of Andrew Liveris shows, we have to “really, really, really mean it.”\n","5":"BP, according to a discussion I had with John Sieg, head of Group Operations, just before the disaster, spent nearly $2 billion from 2005 to just before the Deepwater Horizon incident in a formal attempt to improve the safety culture of the corporation. This was mandated by then-CEO Tony Hayward himself, who reviewed progress weekly. Based on the facts unearthed by the final report, the attempt failed to prevent catastrophic failure after catastrophic failure, each of which, as Béla Lipták has pointed out in my magazine (www.controlglobal.com/articles/2010/OilBlowouts1008.html), could easily have been prevented. The report points out that in order to execute one fail-safe operation, 36 buttons needed to be pushed. And, the operators told investigators that they didn't feel they had permission to push the panic button by themselves.\nThat is as much social engineering as a spearphishing attack, or a USB stick in the parking lot.\n","22":"So, we know that there are many attack vectors. There are attacks from outside the plant. 
There are attacks from inside the plant, both intentional and not. There are combined cyber and physical attacks that could come in waves. We know these things are possible. We know these things are practical (they can be done). We know that there are some unknown trigger points that will make bad actors commit these attacks. One of the things we need to keep in mind is the intersection of plant safety and security with politics outside the plant and outside the corporation. \n","11":"Safety systems are part of the control systems in the plant, and safety systems must be considered in any cyber security strategy we implement. Even a traditional standalone SIS system could be penetrated and damaged if connected in any way to a control system or to the plant information network, even by means of a serial data line. At the 2008 ICS Cyber Security Conference, Bryan Singer, then co-Chair of the ISA99 cyber security standard committee, and Nate Kube of Wurldtech Security Technologies demonstrated a hack of an integrated safety system (one that had already received TUV approval and is still being sold). In less than 25 seconds, they were able to futz the system and force it to fail unsafely. Other hacks have been demonstrated to work against traditional stand-alone systems, too. And then there was Stuxnet. And there was Aurora. And there will be more.\nThis illustrates the absolute fact that safety and security interrelate. And so do perimeter security, fire and gas safety controls, and personnel locating technologies. In this age of integrated systems, nothing stands truly alone.\nWe often speak of “Functional Safety.” We are also going to have to begin speaking of “Functional Security.” Security is a safety issue.\n","17":"Physical security is another interrelated issue. How many fewer people would have died at BP Texas City if the operators had been able to detect the drivers of the diesel pickup truck as they moved into the danger zone, and stopped them? 
No one will ever know. But in emergency situations it is critical to have firm control of the perimeter and to know where all of your people and your assets are. Knowing where a fire truck or a staff member with first aid or CPR training is, and being able to vector them to the area of most need, can make the difference between deaths and survival. And where better to have the information than in the control room, as part of the operators’ gestalt?\nSeveral years ago, a combined physical and cyber attack occurred at the largest coal-fired power plant in London. An attacker scaled the fence in a blind spot for the surveillance cameras, raced to the control room and damaged the control system to the point where the plant had to be shut down for some time. The attacker escaped the same way he came, and has never been caught.\n","6":"The definition of insanity is doing the same thing over and over and expecting different results. It is clear that we are doing the same things to try to make our process workplaces safer and more secure—and we're not getting different results.\nVariously, we've tried alarm management strategies, graphical user interface designs, high fidelity simulations and operator training. We’ve tried patches, firewalls, and security by obscurity.\nIn most cases, what we have not tried is mandating safety and security as "Job One,” and really, really meaning it. \nFor example, J. M. "Levi" Leathers, when he was general manager of Dow Chemical Company's Texas division in the 1960s, reacted to a major accident with a number of deaths at a Dow plant by simply noting that safety was more profitable than unsafety. This dedication to safety before all else, including profit, has held true for Dow in the 50 years since. 
Andrew Liveris, Dow’s CEO says, “We operate a ‘safety first, production second’ mindset at Dow, and our ‘Drive to Zero’ global safety goals will always come first because it’s what’s right for our employees, our communities, our customers, our business partners and our environment.”\nIf Dow can do it, why can’t everyone? And Dow has taken a similar attitude toward cyber security. Why can’t everyone?\n","23":"So, even in the face of logic and data, both management and employees try to skate on safety and security issues all the time. Why?\nWe humans appear to engage regularly in a sort of magical thinking about danger and threats. This kind of magical thinking was clearly in the driver's seat on the day the Deepwater Horizon sank. The two top executives from BP's Gulf of Mexico drilling and exploration division were actually on board the rig when the vessel started to explode. They were there to present an award to the employees for being the safest rig in the fleet—seven years without a lost-time accident. That fact was used overtly prior to the sinking and the disaster to not only excuse, but actually permit violations of safety protocols that were directly responsible for both the accident and the lack of response to it.\nIf the danger is real and imminent, and the threat is life-or-death, people react with stunning capability. But the further away the threat is perceived, the more unreal it becomes. We find ourselves discounting the seriousness of any threat if it doesn't affect us personally and immediately. And we are clearly doing that for both safety and security issues. We perceive the threat to be less serious than it actually is. 
It's a perception issue. It could be that we are hardwired to think this way as a reaction to the fact that the world is and always has been a fairly scary and dangerous place.\n","12":"In the 1960s and 1970s, operators were able to get an instant grasp of the operating condition of the plant or the part of the plant they were responsible for by looking at the panel wall. We gave up that viewpoint by migrating to small screens where only a part of the process could be seen. Now we are moving back to screen walls and working on visibility issues. But for years, we’ve had real problems with giving operators more ability to see the eagle’s eye view of their processes, and we wonder why their tunnel vision leads to accidents that certainly should have been prevented.\nWe also have a history of treating operators as less important to the operation of the plant than the engineers and managers, when, in fact, the operator of a process plant is much more like a pilot of an aircraft. They are responsible for the safe operation of the plant, and the optimization of the plant, and they should not be burdened, as so many of them are, with busy-work because some top manager noticed that they weren’t busy when he entered the control room. Really, good operators earn their salary during those thirty seconds of terror when the plant is in upset, just like pilots do when something goes wrong with the plane. \nIf we just were able to look at operators in this way, and make improving their HMIs and training critical, we could reduce accidents and save lives.\n","1":"My name is Walt Boyes, and I am not a journalist. I edit a magazine, yes, but the reason I do is that I am an engineer and I can spell and do grammar. For over forty years, I have been working in the industries served by automation and industrial control systems. I don’t think there is an industry vertical that I have not done some automation work in. 
I am a Life Fellow of ISA, the International Society of Automation, and a Fellow of the Institute of Measurement and Control in the UK, and I am a Chartered Measurement and Control Technologist. I have been a member of ISA99 and other standards committees, so I think I know enough about Safety and Cyber Security to have opinions.\nI’m sorry that this is a lunch keynote, because my objective is to disturb your digestion. What I want to do is to make you think, and maybe even make you afraid. But I am not spreading FUD. What I want you to understand is that these issues are real. How likely are they? I don’t know. We will find out how likely they are as these exploits are executed. If they ever are.\nBut more than anything, I want to focus on what to do about this situation. What is a real path to safer and more secure systems?\n","18":"As I said, cyber security issues must be considered in any safety implementation in any process plant, just as safety issues must be considered when administering ICS security. As I mentioned, in 2008 a particular safety system hack was accomplished as part of a penetration test by Wurldtech that was requested by the safety system manufacturer.\nSiemens asked Idaho National Labs for the same kind of evaluation for its PCS7 system, and it was still vulnerable to Stuxnet.\nIs your safety system secure, or just safe? Can you say it is safe, if you don’t know it is secure, too? Was it engineered in a vacuum, or was it designed in cooperation and understanding of all the interacting factors in your plant? Do the process engineers and operators live, breathe and eat safety and security? Does your management? Are these core values for your company, and if not, why aren’t they?\n","7":"Hayward, however good his intentions were, never successfully communicated to his staff and operating managers that there were no ways to skimp, no ways to hurry, and no short cuts to a safety culture. 
The result was that the Deepwater Horizon operators felt pressure to cut corners and pay lip service to safety in the service of getting the Macondo well drilled, capped and producing as fast as they could. And the ultimate payment for that lip service was that 11 of them died, and the financial consequences to BP have yet to be completely totted up. \n","24":"But if that's true, we may have come to the limit of what we can expect corporations and government to do about workplace and process safety and cyber security. Big accidents like Deepwater Horizon, BP Texas City and the Uranium Enrichment Facility in Iran may actually be necessary to force us to treat these threats as imminent danger and do something about them. But there’s more! Dick Morley, who invented the floppy disk and the PLC, among many other things, said to me, when we discussed this talk, that it also has to do with the absolute size of organizations…that the size of companies and governments we have may make it impossible to change, not just merely hard.\nThat may be the scariest thought of all.\nBut all that raises the question: where does it leave us as process and automation professionals and ICS security professionals? Are we just to accept things as they are, and know that because we work in dangerous places we have to accept that people will die, unnecessarily, from mistakes and bad practices?\nWe can, and we must, change things. We can show our value as individual professionals and as process professionals, and the value of operating safely and securely. We must.\n","13":"We now know that operator response to abnormal situations is highly dependent on how the information is presented to them. The BP Texas City accident demonstrates that fact clearly. 
If the operators had been able to easily see both the flow in and the flow out of the raffinate splitter tower, it is highly likely that they would have intervened long before they did.\nWe understand this but we have not been able to build this into the routine engineering dogma of control systems and safety systems.\nWe have seen the output of the ASM consortium. We have seen the EEMUA guidelines. We now have the ISA18 standard. And we continue to overload operators with too many inputs, too many distractions, and too many jobs to do.\nProperly, the only parts of a process that an operator should be seeing are the ones that aren’t working properly, or that the operator is engaged in optimizing. Yet many HMIs are designed with lots of motion, pretty colors and three dimensional effects– because it is cool and has lots of marketing sizzle.\nMany HMIs are designed and installed with minimal input from the operators, because in many countries the operators are very often considered non-professional labor– to be told where to go, and what to do by engineers and managers.\nAnd the operator has to trust that the HMI cannot be compromised and is delivering correct and appropriate information.\n","2":"These words are from Bill Watterson's great comic strip, "Calvin and Hobbes." They could be the motto of all the accidents in all the process plants that have killed people through just plain bad alarm management and faulty process safety thinking. Based on the results of the past 30 years, we really don't want to learn anything from what has happened from Bhopal to Deepwater Horizon. And clearly, we don’t want to learn about cyber incidents either. 
We’ve been killing people since 1999 with cyber incidents, that we know of, with no end in sight.\nIf you read the final report on Deepwater Horizon, you will understand that "the blowout was not the product of a series of aberrational decisions made by rogue industry or government officials that could not have been anticipated or expected to occur again. Rather, the root causes are systemic and, absent significant reform in both industry practices and government policies, might well recur.”\nSo this is a cyber security conference…why am I talking about safety?\n","19":"We are about to lose the last generation that knows how to operate our plants manually. We are about to lose a terrific amount of institutional knowledge and we are hard pressed to replace that institutional knowledge that we are losing– and that lack of training has been shown to contribute materially to accidents like the BP Texas City accident. And if we do not provide operators the highest level of training, we will surely see people die.\nWould you want to fly from here to Shanghai with a pilot with 10 years’ experience, or with a pilot whose experience is 90 days in a simulator? How about running your plant?\nAnd how good is the training you provide to your operating personnel? How good is the training you provide to your maintenance personnel? \nSafety and security are cultural, not instinctive. You have to train your operating personnel to operate safely, and to operate securely.\n","8":"In 1999, three people were killed when the Olympic pipeline in Bellingham, Wash., ruptured. 
To paraphrase the NTSB’s conclusions about the causes of the accident, the rupture occurred because of:\nDamage to the pipe; \nOlympic Pipe Line Company’s inadequate inspection; \nInaccurate evaluation of in-line pipeline inspection results, which led to the company’s decision not to excavate and examine the damaged section of pipe; \nFailure to test, under approximate operating conditions, all safety devices associated with the Bayview products facility before activating it; \nFailure to investigate and correct the conditions leading to the repeated unintended closing of the Bayview inlet block valve; \nThe practice of performing database development work on the supervisory control and data acquisition system while it was being used to operate the pipeline, leading to the system’s becoming non-responsive at a critical time during operations. \nThis was a cyber incident. So was the San Bruno Gas Pipeline explosion. So, partly, was the Yellowstone River pipeline oil spill.\n","25":"We are, as process, automation and security professionals, in a remarkably different place than we have been over the past 30 years. We are in demand, even in a drastically weakened economy. \nWe are scarce, and we now have the tools to prove that we are not only necessary, but irreplaceable. Imagine what would happen if all of us walked off our jobs for 60 days…but we don’t have to do that.\nWhat we MUST do is to stop thinking like instrument engineers, like control systems people, like safety systems engineers, and like IT security boffins…and start thinking like real automation professionals with business skills. \nWe have a larger, deeper skill set that we need to learn than any other discipline. It isn’t enough to be an engineer…in fact, many automation professionals aren’t engineers. \nWe must be able to engineer, to plan, to manage projects, to understand many kinds of processes in many different industries…in a way, we’re like the dancer Ginger Rogers. 
She could do everything her partner Fred Astaire could do– and she did it backwards, and in high heels.\nWe are the ones with control. \nIf we don’t change the way things are, who will? If not now, when?\nThank you very much.\n","14":"I encourage you to consider that operators are highly skilled technical professionals, whether they are trained engineers or not. This is the intent of the ISA’s Certified Automation Professional designation– automation isn’t an engineering discipline. Automation is a multidisciplinary profession that includes chemical engineers, process engineers, control system engineers, operators, scientists, technicians, and security professionals trained in Industrial Control Systems. \nThe operator needs to be in charge of the process he or she is operating… and we need to provide the tools to really be in charge.\nOperators cannot handle hundreds of alarms every minute and be in charge of the process. Operators cannot worry that they might be seeing spoofed data or that they might be hacked at any minute, and be in responsible charge and able to handle issues that can happen faster than a flash of lightning.\n","3":"You cannot consider your plant fully safe if there are security vulnerabilities that can be exploited that will cause physical damage, injuries or death. I am going to say this over and over: you cannot operate in a vacuum. Safety, cyber security, physical security, alarm management and operator training are interactive puzzle pieces that we need to keep putting together again and again.\n","20":"These incidents just keep on happening, every day. One death, two deaths, three deaths, sixteen deaths. And that’s not counting the injuries from these incidents, or the injuries from incidents where nobody died.\nWhat’s interesting about the aftermath reports on these accidents is that in hindsight most if not all were easily preventable. So why weren’t they? 
You tell me.\nWe have been trying very hard to make workplace safety, process safety and cyber security actually work in our plants. Well, at least some of us have been trying very hard. Others not so much. In a recent study of safety professionals done by Kimberly-Clark, 89% said they had observed workers not wearing their protective gear when they should have been, and 29% said they'd seen it happen frequently. The record for process safety is equally dismal. And if anything, the record for cyber security is worse.\nYou can’t buy a box to fix safety. You can’t make your control system more secure by sticking a box on the network. You may need the box, but you need the training and the security and safety culture that makes spending the money on the box worthwhile. Otherwise, you’re wasting your money on the box.\n","9":"There is a highly complex interaction between a large number of those vectors. Safety, security, alarm management, operations, training, and of course, your company goals all interact, and, like any complex system, simply changing one vector makes more changes than can often be visualized or calculated in advance.\nNo one expected the operators to have difficulty seeing both the inlet and the outlet flows to the isomerization process and the raffinate splitter tower at BP’s refinery in Texas City. No one expected ALL the level measurement devices on the tower to fail at the same time. No one expected the safety system to fail. No one expected that the operators would consistently make wrong decision after wrong decision as they tried to recover from the impending disaster. No one expected the diesel pickup truck to be running in the same area as the cloud of hydrocarbon vapor.\nYet all of these things happened. And people died. There have been many more accidents in the years since the BP disaster, and there will be many more. 
And many more people will die.\nWe need to start thinking about safety, security, alarm management, operations and training as an integrated whole, and we need to have our companies agree that the safe way is the most profitable way. We have not done this yet, and until we do, people will continue to die.\n","15":"Many people have real problems with IEC61508, IEC61511 and ANSI/ISA84.01-2004. They have the same problems with ISA99 and the IEC standard. \nWe’re educated to be project-oriented. We propose projects, we get projects approved, funds are dedicated to those projects, and we do the project and then it is over. And so, with process safety projects, process optimization projects, and alarm management projects, we have first success, then gradual decay after the project people turn the “project” over to the operations staff. The same seems true with security projects.\nBut safety, alarm management, security and operator training are not amenable to project-oriented engineering thinking.\nThey are inherently processes, and they are best managed as continuous processes, and there are very few of us who instinctively think in those terms.\nBut if we are to meet the intent of the safety standards, and security standards like ISA99, and more than that, if we are to begin to operate on the basis that all these topics form an inter-related and interdependent system, we are going to have to think instinctively in process terms rather than project terms. We are going to have to think holistically, rather than linearly.\n","4":"This month alone, there have been more than two dozen incidents in process plants, resulting in injuries and fatalities, as reported by the ASM Consortium in its RSS feed, echoed by my website, ControlGlobal.com www.controlglobal.com/articles/2010/asm_news.html. And the hits just keep on coming. \nIt's not an excuse to say "the process industries are dangerous." 
If that were a valid reason to ignore safety, then miners would still be going down in the ground with acetylene lanterns and canaries. Not that mining safety is so wonderful either.\nWe've been doing significant research on safety and promulgating safety standards for nearly 50 years. And the results are poor. We keep trying to figure out how to produce satisfactory alarm management schemas, train operations and maintenance personnel, focus management eyes on safety, and, based on results, none of that seems to work very well.\nWe don’t seem to be able to get a handle on cyber security either. The number of reported incidents just seems to climb higher and higher. Sooner or later, a near miss is going to turn into a big hit. \nWhat is keeping someone from shutting down the Western States Grid for at least 6 months? Probably nothing. Will it happen? If somebody gets angry enough to want to cause the end of the world, it will.\n","21":"Stuxnet produced several incidents in which cyber breaches caused plant safety issues. While Stuxnet is patched and gone, what Eric Byres calls “Son of Stuxnet” is alive and well…and applicable to any controller anywhere, at any time. And yet, we don’t seem to be that alarmed.\nThe yawn from many process and discrete manufacturing industry CEOs at the cybersecurity dangers from follow-on Stuxnet attacks, the repeated disasters, such as the fact that it took the ExxonMobil RTO (Real-Time Operations Center) in Houston almost an hour to shut the isolation valves after the Yellowstone pipeline rupture, and the SCADA-related findings released annually now at the Black Hat hackers’ conference, all make me wonder why we're fiddling while the infrastructure catches fire.\nThere are several reasons. Some are, obviously, financial. But much scarier is another reason: the way human beings react to threats.\nThe financial reasons to try to ignore safety and security issues are pretty clear. 
The costs of making plants safer and more secure are substantial, and they all impact the profitability of the enterprise negatively. There are also risk-management issues, but there is no clear and consistent effort by corporate risk managers and their insurance auditors to make mitigating safety and security threats a very high priority. There's no way to include "protected the plant from security and safety threats" on the financial roll-up.\n","10":"Our first attempts to build safety systems took the form of dedicated systems that were stand-alone and completely separate from the basic process control system. This was done to ensure that these multiply redundant systems shared no points of failure with the control system itself.\nThis was both good and bad. The good news was that the safety system could only be used for one thing. It was strictly to shut down the plant if something abnormal occurred. The bad news was that it encouraged safety practitioners to develop a curious tunnel vision, so that the interactions of the safety system with the rest of the plant were often not investigated. And operations focused on spurious trips and how to turn off the system.\nSo, a few years ago, we began integrating safety systems and control systems…with many engineers still unwilling to do that to this day. What we learned immediately was that there were those interactions, and that a Safety Instrumented System cannot be built in a vacuum. It must be part of an overall proactive operations strategy that includes safety, security, plant operations and maintenance. Probably the best example of what I am talking about is what Dow’s Levi Leathers called for, almost fifty years ago: an operating discipline in which safe operation is the most important engineering rule of the company.\n"}