Overview of Combined Operations
Pipelines
• 51,000 miles of natural gas, NGL, crude oil, refined products and petrochemical pipelines
Storage (Salt Dome)
• 225 million barrels (MMBbls) of NGL, refined products and crude oil storage capacity
• 14 billion cubic feet (Bcf) of natural gas storage capacity
Natural Gas Processing
• 24 natural gas processing plants
Marine Services
• 63 tow boats
• 131 barges
Fractionation
• 22 NGL and propylene fractionators
Platforms
• 6 offshore hub platforms
NGL Import/Export Terminals
• Houston Ship Channel Import Terminal offloading capacity – 14 MBbls/hr
• Houston Ship Channel Export Terminal loading capacity – 7.5 MMBbls/mo
99% of the data is datacenter-based; 1% comes from workstations. The data consists of alarms and events.
Servers: Universal Forwarders, managed through deployment servers. Apps: Windows, WMI, AD. Splunk is not yet gathering data from the SCADA applications themselves, but it is monitoring failover and similar events of the SCADA applications through WMI. The plan is to test the primary SCADA systems (events, alarms, etc.). First priority: errors and warnings from the SCADA process. Connectivity to the SQL database via DB Connect 2 will also be added (events, alarms, authentication, SCADA DB changes), along with communications statistics.
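The planned SQL database input can be pictured as incremental polling on a rising column: each poll fetches only the rows added since the last checkpoint. The following is a minimal Python sketch of that idea, using an in-memory SQLite database as a stand-in. The table name `scada_events`, its columns, and the function `poll_new_events` are hypothetical and do not reflect the actual SCADA schema or how DB Connect is implemented.

```python
import sqlite3

def poll_new_events(conn, checkpoint):
    """Return (rows, new_checkpoint) for rows whose id is greater than checkpoint."""
    rows = conn.execute(
        "SELECT id, severity, message FROM scada_events WHERE id > ? ORDER BY id",
        (checkpoint,),
    ).fetchall()
    # Advance the checkpoint only when new rows were returned.
    new_checkpoint = rows[-1][0] if rows else checkpoint
    return rows, new_checkpoint

# Hypothetical stand-in for the SCADA events table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scada_events (id INTEGER PRIMARY KEY, severity TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO scada_events (severity, message) VALUES (?, ?)",
    [
        ("ERROR", "poll timeout"),
        ("INFO", "operator login"),
        ("WARNING", "failover started"),
    ],
)

# First poll starts from checkpoint 0 and sees every existing row.
first_batch, checkpoint = poll_new_events(conn, 0)

# First priority in the notes: errors and warnings.
priority = [row for row in first_batch if row[1] in ("ERROR", "WARNING")]

# A later poll returns only what arrived after the checkpoint.
conn.execute(
    "INSERT INTO scada_events (severity, message) VALUES (?, ?)",
    ("ERROR", "comm loss"),
)
second_batch, checkpoint = poll_new_events(conn, checkpoint)
```

Keeping the checkpoint outside the query function makes it easy to persist between polls, so a restart does not re-ingest old events.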
Infrastructure and Application Operations
The biggest piece is insight into when services fail and the IT systems don’t catch it. The IT org uses SCOM, and Splunk catches pieces of information that SCOM misses. They have never missed a significant issue with Splunk. They have worked through a number of issues in the new system. When an issue shuts down a control desk, for example, the system has to be brought back as quickly as possible. This is a safety service, so issues must be acted upon quickly.
Before Splunk: Control desks would notice inactivity, data not updating, or all data disappearing. Controllers would talk to each other and to the shift supervisor, then call the NOC. The NOC would call the SCADA on-call, who would escalate to SIG. The service level agreement is 5 minutes of downtime per PHMSA.
After Splunk: They know within 30 seconds. There is a rigorous escalation group, including after hours if nobody calls in and responds. Every time they have had an issue, they have been able to resolve it in 4 minutes or less. They are alerted immediately, and the alerts are prescriptive: the email contains the diagnosis, location, etc.
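The "data not updating" condition and the prescriptive alert can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Splunk searches or alert actions in use. The 30-second threshold comes from the notes, while the `FeedStatus` fields, the feed names, and the diagnosis wording are hypothetical.

```python
from dataclasses import dataclass

STALE_AFTER_SECONDS = 30  # detection window quoted in the notes

@dataclass
class FeedStatus:
    name: str           # hypothetical control desk or data feed name
    location: str       # hypothetical site label
    last_update: float  # epoch seconds of the most recent data point

def find_stale_feeds(feeds, now, threshold=STALE_AFTER_SECONDS):
    """Return the feeds whose last update is older than the threshold."""
    return [f for f in feeds if now - f.last_update > threshold]

def build_alert(feed, now):
    """Build a prescriptive alert body: what failed, where, and a first diagnosis."""
    age = int(now - feed.last_update)
    return (
        f"ALERT: {feed.name} has not updated for {age}s\n"
        f"Location: {feed.location}\n"
        "Diagnosis: data not updating; check SCADA service and failover state"
    )

now = 1000.0
feeds = [
    FeedStatus("desk-1", "site-A", 995.0),  # updated 5 s ago, healthy
    FeedStatus("desk-2", "site-B", 940.0),  # updated 60 s ago, stale
]
stale = find_stale_feeds(feeds, now)
messages = [build_alert(f, now) for f in stale]
```

Putting the location and a first diagnosis into the message body is what makes the alert prescriptive: the on-call responder can start on the fix without first working out what failed.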
Security
Palo Alto project: looking for ways to do more with less. Palo Alto SCADA monitoring will deliver data to Splunk.
• Support VPN environment
• IDS
Also watching the SCADA firewalls (between SCADA and the corporate LAN). Various types of protocols cross the firewall. Looking at these types of protocol data:
• Modbus
• Fisher-ROC
• Allen Bradley (all 4 versions)
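One way to look at protocol data crossing the SCADA firewall is to tally sessions per protocol and flag anything outside the expected set. The Python sketch below illustrates this under stated assumptions: the record layout, the `protocol` field, the addresses, and the protocol labels are hypothetical and are not the actual Palo Alto log format.

```python
from collections import Counter

# Assumed allow-list, based on the protocols named in the notes.
EXPECTED_PROTOCOLS = {"modbus", "fisher-roc", "allen-bradley"}

def summarize_protocols(records):
    """Count sessions per protocol and list protocols outside the allow-list."""
    counts = Counter(r["protocol"].lower() for r in records)
    unexpected = sorted(p for p in counts if p not in EXPECTED_PROTOCOLS)
    return counts, unexpected

# Hypothetical firewall session records.
records = [
    {"src": "10.0.0.5", "dst": "10.1.0.9", "protocol": "Modbus"},
    {"src": "10.0.0.6", "dst": "10.1.0.9", "protocol": "modbus"},
    {"src": "10.0.0.7", "dst": "10.1.0.12", "protocol": "Allen-Bradley"},
    {"src": "10.0.0.8", "dst": "10.1.0.12", "protocol": "telnet"},
]

counts, unexpected = summarize_protocols(records)
```

An allow-list suits this boundary because the set of legitimate SCADA protocols is small and stable, so anything else crossing the firewall is worth a look.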
Examples of a few of the alerts we have set up.
As well as an email.
Continue to get a handle on Cyber Security in ICS with Enterprise Security
Add host monitoring to correlate with OS level information to help understand system performance.
Also looking into “System” collections to show system status in Splunk
AppEnsure may be useful for finding critical data processing bottlenecks or control points.
Managing SCADA Operations and Security with Splunk Enterprise
How We Got Started
Recognizing the operational differences between OT and IT
Recognizing the technical similarities between OT and IT
Supporting the SCADA Systems
Difficulties meeting SLAs
Splunk Enterprise at EPD
Alerts Messages Metrics Changes Scripts Configuration
• Infrastructure and Application Operations
• Cyber Security
• Improving SLAs
Improving SCADA Network Availability and Performance
• Augmenting SCOM
• Need for rapid recovery
• Impacts on safety and
OT and IT are both similar and different
Best practices for managing operations, cyber security and SLAs with Splunk Enterprise
How you too can be a SCADA superhero with Splunk