Adversary groups and activities targeting industrial control systems are on the rise. Security teams are now tasked with defending increasingly complex and critical control systems without interrupting operations. This presentation highlights plans and progress of a large public electric utility to extend threat detection capability using PI system data sets. Integration with a threat detection platform improves situational awareness and adds value in three ways. It first provides confidence for quickly eliminating threat activity as a root cause of operational upsets. The second benefit is improved likelihood of detecting malicious tradecraft targeting control systems. Finally, the integrated approach provides data in support of control system incident response and forensic activities.
Video for presentation here: https://youtu.be/Inn6FPaXN1w
Learn more www.dragos.com
It seems like all operational outages are cyber now…
This transformer fire in July 2004 was the first time anyone asked me if it could have been caused by a cyber attack. My answer at the time was “I don’t know”.
The direct cause of the fire ended up being a blue heron pooping on an insulator and causing a fault.
Gila Power Plant visit
We made some changes in the am … testing complete, went to lunch
Driving back to plant … black smoke coming from Unit 2 transformer
“Was it a cyber attack?”
Photo credit: APS Substation technician
Talk about adversary capabilities and activity
This is a software fight – to scale it we need data
We want to use threat intel to improve our path forward
Value of red teaming
After a breach
Need to be able to quickly identify adversary actions
IT security tools don’t do that well in OT
If we are going to enhance IR, we need data – just like a picture needs more pixels to enhance IRL. We are looking at PI to provide some of that data.
Meme credits:
Battlestar Galactica
The X-Files
So as we follow the scientific method in our efforts to solve this business problem, we look to known ICS attacks.
German Steel Mill
OT team didn’t consider cyber
Insurance company RCA discovered cyber intrusion as cause of equipment damage
Trisis
Vendor contacted 3 times prior to consideration of cyber attack
Crashoverride (Ukraine 2016) Event
Ukraine 2015
Give a quick overview of Ukraine attacks – many attendees won’t know details
What we are doing is based on the tactics of
Explain why this matters
Reference:
Wall Street Journal: http://www.wsj.com/articles/cyberattacks-raise-alarms-for-u-s-power-grid-1483120708
For nine months, the hackers studied the Ukrainian electric system. When the attack finally happened on December 23, 2015, hackers remotely took control of three of Ukraine’s 30 power distribution utilities within a half-hour. During the attack, the first time that power systems had been blacked out through cyber means, control room engineers sat helplessly as ghostly hands moved cursors across their computer screens, opening circuit breakers at 50 substations and shutting off electricity to about 700,000 people.
“Shortly before midnight on December 17, someone started disconnecting circuit breakers through remote means until the electrical substation was completely disabled, Mr. Kovalchuk said.”
“Mr. Kovalchuk said he believes the latest attack was well planned because the targeted substation is one of the utility’s most automated.”
Reference:
Kovalchuk said the outage amounted to 200 megawatts of capacity, equivalent to about a fifth of the capital's energy consumption at night.
After observing adversary intrusions into other ICS, I’m left with a simple question – how do we prevent, detect, and respond to these attacks?
Prevention is our first line of defense – we believe we can stop many adversaries and we are working on being a hard target. However, we need to test all levels of defense, including response, and to do that we have to assume that our preventive barriers fail.
Given what we know about ICS adversaries, and assuming that prevention fails, I predicted that we will:
1. Detect some adversary tactics
2. Gather data to respond
After testing the previous theory against existing controls and against the Dragos threat platform, we realized that we still have room for improvement.
Quickly summarize tests of previous efforts:
1 – relying on existing corporate controls was effective but didn’t lower risk enough
2 – adding threat intel and threat-based detections improved detection rates and response
3 – if an event occurs we expect to need faster responses – therefore we are investigating the integration with PI
So let’s talk about how the PI system was applied to help solve this problem.
One step in preparing for a direct attack on our EMS would be to prepare alerts and integrations in advance that would allow a SOC analyst to:
Quickly confirm whether a breaker operation is or is not the result of a cyber attack
Quickly gather necessary data that will help identify the extent of the intrusion and prepare an eviction plan
In a simulated attack the PI event frame would connect to the Dragos platform to display a notification.
When an observed behavior is associated with a known threat behavior, the analyst starts investigating it to determine whether it is legitimate.
Creating an alert is only the first step of preparing for an event such as Ukraine 2015.
We must also plan our response and ensure that the SOC analyst has all the data she needs to confirm the attack and provide necessary data to learn the extent of the intrusion. The pre-planned actions are recorded in a playbook.
The playbook walks the SOC analyst through the analysis steps:
Confirm communications to the breaker are from expected hosts
Confirm all data flows to the breaker use only expected ports/services
Confirm that the DNP3 operations are only the expected operations
Review PI data to observe historical state of breaker
If necessary contact EMS to confirm operation was expected
Review endpoint information from the source computer
Playbook Step 1 – SOC analyst validates that the source of the DNP3 breaker operate function was the expected source computer.
Because of the visual diagram of traffic flows the analyst quickly identifies a source computer that began remote communications with the automated breaker just prior to the event.
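The host check in Step 1 can also be expressed as a simple allowlist comparison. This is a minimal sketch, not the platform's actual logic: the breaker name, IP addresses, and flow record fields (`src`, `dst_asset`) are all hypothetical and stand in for whatever the monitoring tool exports.

```python
# Hypothetical allowlist: hosts permitted to send commands to each breaker.
EXPECTED_MASTERS = {
    "breaker-52A": {"10.10.1.5", "10.10.1.6"},
}


def unexpected_sources(breaker, flows):
    """Return source IPs that talked to `breaker` but are not on its allowlist."""
    allowed = EXPECTED_MASTERS.get(breaker, set())
    seen = {f["src"] for f in flows if f["dst_asset"] == breaker}
    return sorted(seen - allowed)


# Hypothetical flow records observed just prior to the breaker operation.
flows = [
    {"src": "10.10.1.5", "dst_asset": "breaker-52A"},
    {"src": "10.20.7.44", "dst_asset": "breaker-52A"},
]
print(unexpected_sources("breaker-52A", flows))
```

Any address returned here is a host the analyst did not expect to be talking to the breaker and is the starting point for the remaining steps.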
Playbook Step 2
The analyst reviews the content of the communications and identifies new protocols that are unexpected
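A rough sketch of the Step 2 check: compare the protocol and port pairs seen going to the breaker against the set we expect. TCP port 20000 is the standard DNP3 port; the flow record fields (`proto`, `dst_port`) and the sample traffic are hypothetical.

```python
# Expected services toward the breaker: DNP3 over TCP on its standard port.
EXPECTED_SERVICES = {("tcp", 20000)}


def unexpected_services(flows, expected=EXPECTED_SERVICES):
    """Return (protocol, port) pairs seen in `flows` that are not expected."""
    seen = {(f["proto"], f["dst_port"]) for f in flows}
    return sorted(seen - expected)


# Hypothetical flows to the breaker around the time of the event.
breaker_flows = [
    {"proto": "tcp", "dst_port": 20000},
    {"proto": "tcp", "dst_port": 3389},
    {"proto": "tcp", "dst_port": 445},
]
print(unexpected_services(breaker_flows))
```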
The analyst reviews the content of the valid traffic and finds that it uses a different operate command (direct operate) than the normal EMS system uses (select then operate)
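The operate-command comparison can be sketched with the DNP3 application-layer function codes: SELECT is 0x03, OPERATE is 0x04, and DIRECT_OPERATE is 0x05 (0x06 without acknowledgement). The message format below, a dict with a `function_code` field, is a hypothetical stand-in for whatever the protocol decoder produces, and the expected set assumes an EMS that only uses select-then-operate.

```python
# DNP3 application-layer control function codes.
DNP3_CONTROL_CODES = {
    0x03: "SELECT",
    0x04: "OPERATE",
    0x05: "DIRECT_OPERATE",
    0x06: "DIRECT_OPERATE_NR",
}

# Assumption: the EMS normally issues select-then-operate only.
EXPECTED_CONTROL_CODES = frozenset({0x03, 0x04})


def unexpected_control_ops(messages, expected=EXPECTED_CONTROL_CODES):
    """Return names of control operations that the EMS does not normally use."""
    return [
        DNP3_CONTROL_CODES[m["function_code"]]
        for m in messages
        if m["function_code"] in DNP3_CONTROL_CODES
        and m["function_code"] not in expected
    ]


# Hypothetical decoded messages: a read, a select/operate pair, then a direct operate.
messages = [
    {"function_code": 0x01},
    {"function_code": 0x03},
    {"function_code": 0x04},
    {"function_code": 0x05},
]
print(unexpected_control_ops(messages))
```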
SOC analyst reviews history of breaker as part of investigation and finds that the system should be right at the start of an extended operational window and likely should have remained closed.
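One way to picture the history review: given breaker state samples already exported from the historian, look for a closed-to-open transition inside the window when the breaker was expected to stay closed. This sketch makes no PI calls; the sample format, the state labels, and the dates are hypothetical.

```python
from datetime import datetime


def opens_inside_window(history, window_start, window_end):
    """Return timestamps where the breaker went CLOSED -> OPEN inside the window.

    `history` is a time-ordered list of (timestamp, state) samples.
    """
    hits = []
    for (_, prev_state), (ts, state) in zip(history, history[1:]):
        if prev_state == "CLOSED" and state == "OPEN" and window_start <= ts <= window_end:
            hits.append(ts)
    return hits


# Hypothetical samples: the breaker opens five minutes into the operational window.
history = [
    (datetime(2024, 1, 1, 6, 0), "CLOSED"),
    (datetime(2024, 1, 1, 8, 0), "CLOSED"),
    (datetime(2024, 1, 1, 8, 5), "OPEN"),
]
window = (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 20, 0))
print(opens_inside_window(history, *window))
```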
SOC analyst reviews connections to the source computer
Are there any unexpected remote connections
Review Alerts for the computer – brute force attacks, etc
Alert on EWS for interactive logon
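The endpoint review could be sketched as below, assuming the engineering workstation is a Windows host that forwards Security log events. Event ID 4624 is a successful logon (logon type 2 is interactive, type 10 is remote interactive) and 4625 is a failed logon. The record fields, host and account names, and the failure threshold are hypothetical.

```python
from collections import Counter

# Windows logon types: 2 = interactive, 10 = remote interactive (e.g. RDP).
INTERACTIVE_LOGON_TYPES = {2, 10}


def endpoint_findings(events, host, failed_threshold=5):
    """Summarize interactive logons and possible brute-force activity on `host`."""
    host_events = [e for e in events if e["host"] == host]
    interactive = [
        e for e in host_events
        if e["event_id"] == 4624 and e.get("logon_type") in INTERACTIVE_LOGON_TYPES
    ]
    # Count failed logons (4625) per account and flag accounts over the threshold.
    failures = Counter(e["account"] for e in host_events if e["event_id"] == 4625)
    brute = sorted(a for a, n in failures.items() if n >= failed_threshold)
    return {"interactive_logons": len(interactive), "possible_brute_force_accounts": brute}


# Hypothetical events: five failed logons, then a successful remote interactive logon.
events = [{"host": "EWS-01", "event_id": 4625, "account": "svc_eng"} for _ in range(5)]
events.append({"host": "EWS-01", "event_id": 4624, "account": "svc_eng", "logon_type": 10})
print(endpoint_findings(events, "EWS-01"))
```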
SOC analyst reviews data – assessment is made.
Because all 5 steps in the playbook show some level of anomalous activity, the analyst is able to provide a high-confidence assessment.
Quality of the assessment and amount of data contributes to a more appropriate decision
- without necessary data we might overreact to a squirrel attack or underreact to a cyber adversary
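To illustrate how the playbook findings might roll up into an assessment, here is a minimal sketch. The step names and the confidence thresholds are hypothetical; a real assessment remains an analyst judgment, not a formula.

```python
def assess(findings):
    """Map per-step findings (step name -> anomalous?) to a confidence level.

    The result is the confidence that the breaker operation was malicious.
    """
    if not findings:
        raise ValueError("no playbook findings to assess")
    anomalous = [name for name, is_anomalous in findings.items() if is_anomalous]
    ratio = len(anomalous) / len(findings)
    if ratio == 1.0:
        confidence = "high"
    elif ratio >= 0.5:
        confidence = "moderate"
    else:
        confidence = "low"
    return confidence, anomalous


# Hypothetical outcome of the walk-through: every step showed anomalous activity.
findings = {
    "source_host": True,
    "ports_services": True,
    "dnp3_operations": True,
    "breaker_history": True,
    "endpoint_review": True,
}
print(assess(findings))
```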
We also need to consider information sharing – how much of what we’ve gathered needs to be shared with partners, government, other utilities, regulators, and customers.
Few companies actually monitor their vendor access in a way that would prevent
Intel is never going to be perfect