Digital Maintenance and Test
Equipment and Impact on Control
System Security
Mike Toecker, PE
Context Industrial Security
@mtoecker
Introduction
• Professional Computer Engineer
(USA-MO)
– Specialized in Computer Security for
Industrial Systems
• Currently, Owner/Engineer at Context
Industrial Security
– Former Burns and McDonnell
Engineering
– Former NextEra Energy
– Former Digital Bond
• 10 Years in Cyber Security for ICS
– Fossil/Hydro/Nuclear Power Plants,
Transmission, Control Centers, Mine,
Water Treatment, Distribution, Gas
Processing/Transport
First, a Message from the
Goat of Honesty and Truth
I’m unaware of any incidents,
public or private, where
maliciously hacked M&TE has
been the cause of an industrial
cyber security incident.
This has been a message from
the Goat of Honesty, Integrity,
and Empirical Evidence.
Maintenance and Test
Equipment (M&TE)
M&TE is a class of industrial
equipment that aids maintenance and
engineering personnel in ensuring the
reliability, efficiency, and profitability
of electrical and mechanical systems
and equipment.
It can be considered part of the on-site
implementation of Reliability
Engineering principles.
BASICALLY…. M&TE IS EQUIPMENT
USED TO MAINTAIN AND TEST OTHER
EQUIPMENT
You’ve Seen M&TE,
Likely Didn’t Realize It
An automobile has lots of control
systems. Some are just plain
mechanical, many are digital. The
digital ones are hackable. We know
this for sure, thanks to Miller and
Valasek.
What we don’t think about are the
digital tools used to evaluate
whether or not an automobile is
ready to return to the road.
If some of these look like Control
Systems, it’s because they are.
Battery Test
Automatic
Alignment
Engine Diagnostic (OBD-II/EOBD)
Computerized Balancing
Emissions
Compliance
I NEED A NEW
ALTERNATOR?!?!
Maintenance and Test Tools were
developed to provide objective
guidance on replacement and repair of
expensive components.
For me, it was the difference between
driving home happy, and driving home
$500 poorer.
I went home $500 poorer, because a
computer told me to.
Industrial Facilities Differ Only in Scale
ICS Compared to M&TE
• Operations Focus
• Networked
• Monitor the Industrial
Process in Real-Time
• Generally Fixed Assets,
Installed in Facility
• Maintenance Focus
• Rarely Networked
• Evaluate Specific
Criteria Associated with
Process Equipment
• Mobile, Often
Handheld, Goes from
Site to Site
THE VULNERABILITY OF M&TE
Trend Towards Digital
Equipment
Digital Equipment is all the rage these
days. Generally, digital measurement
of analog signals is more error-proof
and reliable, with a far greater degree
of accuracy than the older analog
meters.
With digital, you also get the capability
to record your data, compare it to
other recordings, trend it, and analyze it
with advanced math packages.
In short, it’s kinda a win.
The Usual Digital
Vulnerabilities
M&TE has taken the same path as ICS:
adoption of commercial hardware
and software into the products used
on industrial equipment.
Examples:
1. Automated Analyzer using MS
Access and WinXP
2. Firmware updates without code
signing, passwords, or other
means of control
3. Calibrators running BusyBox Linux
4. HART Descriptors Updateable Via
Plain Jane HTTP
Bring Your Own Device;
Industrial Edition
Because of economics, accounting
practices, and work load, this kind of
maintenance and testing work is
routinely outsourced to external
companies.
To the right are snippets of language
from an RFP for Substation Testing
Services, which is reasonably typical.
There was no mention of cyber
security in this RFP.
Demonstrates Known Concern that Testing Agents
have a lot of power to recommend expensive
changes
So, how do you think they are upgrading this
firmware in the relay? (Hint: The answer is laptop)
Consequences of Malicious
Interference are Different
ICS M&TE
Consequences of hacking M&TE and
using it maliciously are going to be
different than hacking ICS.
Digital Calibration and
Interface
Handheld devices and laptops used for
interacting with digital transmitters are
pretty ubiquitous at many sites. These
have some very advanced capabilities,
often outside of the operator’s
purview and cyber security
monitoring.
The most interesting ones get firmware
updates and device descriptors from the
internet, downloaded directly into the
handheld.
This is an area I’m planning to explore
extensively over the next few years.
Motor Condition
Evaluation
There are few industrial facilities that
don’t have motors. These are tested via
automated systems that check the motor
for shorts, dirt, and poor insulation
quality.
Valve Testing
Maintenance of Valves is a big deal in a
lot of industrial facilities. Many valves
MUST close if required due to safety
reasons, so testing is performed to
ensure the valve is capable of closing.
Other valves are important to the
process, and are evaluated for
problems on a consistent basis.
Valves which fail tests are swiftly
replaced.
Facilities with these valves often use
portable test suites to initiate,
monitor, and determine pass/fail of a
valve. Common tests are partial stroke,
full stroke, and valve stability.
The green valve has a smart positioner, which is
calibrated using either a HART or Fieldbus
connection, and can aid automated testing.
Relay Testing and
Validation
Utilities and other industrials with
large power requirements must test
protective relays to ensure they work
under all conditions.
Automated test sets are the usual way
these tests, often required by regulation,
are performed.
Binary and
Analog Test Leads
USB
IEC 61850
PoE/Ethernet
Bluetooth
Eddy Current Testing
This type of testing is used to find
deformities and weaknesses in tubing,
pipes, tanks and other large metal
components, preferably before leaks,
cracks, or breaks happen.
There are many variations of this, since
many industries need ways to inspect
piping without opening it or digging it up.
The Illustration above was via the
Non-Destructive Testing Resource
Center.
https://www.nde-ed.org/EducationResources/educationresource.htm
Ground Tests, and other
Major Electrical Tests
[Chart: Resistance vs %Distance. Vertical axis: Resistance, 0 to 60; horizontal axis: %Distance, 0 to 100]
We often need to know the electrical
characteristics of a system, such as
whether it has a good or bad ground,
in order to diagnose problems
effectively.
USB Interface to PC
Other M&TE
• Gas Analysis – Transformer
Health
• Ultrasonic Testing - Pipe
Thickness
• Vibration Analysis – Health
of Rotating Equipment
There’s a lot I didn’t cover; I tried to hit
the high points. There is a
lot more out there, and a lot of it is
industry specific.
SO, WHAT MIGHT AN M&TE HACK
LOOK LIKE?
Motor Test Hacking
Motors are a major component of any
industrial facility. They don’t last
forever though, and they are pretty
expensive components to replace.
Pricing taken from “W22 Severe Duty Motors – TEFC” on
weg.net and subject to change without notice
Some Motor Tests
• Kelvin Method Winding
• Meg-Ohm
• Polarization Index (PI)
• Step-Voltage
• Surge Test
Major problems in the motor often
result from:
• Ground Wall Insulation
• Turn to Turn Insulation
• Phase to Phase Insulation
Most motor failures on the electrical
side come from insulation going bad,
and getting a short somewhere in the
motor.
Surge Test Process
Charge up
Capacitor to a
Voltage Setpoint
Discharge Capacitor
into Single Phase
(sometimes Two)
Charge Dissipation
Produces a
Characteristic
Waveform
Record the
Waveform and
Associated Data
Compare the Current
Waveform to Previous
Waveform Using
ppEAR or Similar
Increase the
Voltage Setpoint
per Test Spec
The technical explanation is that the
impulse from the capacitor will
discharge into the winding, resulting in
a wave with a characteristic response.
That characteristic response is stored
and compared to the previous one.
START
STOP
Source: http://www.maintenancetechnology.com/2007/03/dc-step-voltage-and-surge-testing-of-motors/
The fun
explanation is
that we are hitting
a punch-me clown
repeatedly, harder
each time, to see
if it does
something
different than last
time.
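The surge test loop described on this slide can be sketched in a few lines. This is a minimal illustration only, not how any real test set is implemented: `simulated_discharge` is a hypothetical stand-in for the capacitor discharge (a damped sinusoid whose ringing frequency jumps once a made-up breakdown voltage is exceeded), and `max_deviation_percent` is a deliberately simple comparison used here in place of ppEAR. Each waveform is normalized to its own peak before comparison so that the rising voltage setpoint does not register as a change by itself; that normalization is also an assumption.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Waveform = List[float]


def simulated_discharge(voltage: float, breakdown_voltage: float,
                        samples: int = 200) -> Waveform:
    """Hypothetical winding model: a damped sinusoid. Above breakdown_voltage
    the ringing frequency rises, standing in for a turn-to-turn short."""
    cycles = 5.0 if voltage < breakdown_voltage else 6.0  # illustrative values
    return [
        voltage * math.exp(-3.0 * i / samples)
        * math.cos(2.0 * math.pi * cycles * i / samples)
        for i in range(samples)
    ]


def max_deviation_percent(previous: Sequence[float],
                          current: Sequence[float]) -> float:
    """Simple stand-in comparison: largest point-by-point difference between
    two peak-normalized waveforms, as a percentage of the peak."""
    return 100.0 * max(abs(p - c) for p, c in zip(previous, current))


def run_surge_test(
    setpoints: Sequence[float],
    discharge: Callable[[float], Waveform],
    compare: Callable[[Sequence[float], Sequence[float]], float],
    limit_percent: float,
) -> Tuple[Optional[float], List[Tuple[float, Waveform]]]:
    """Step through the voltage setpoints, recording each waveform and
    comparing it to the previous one. Returns the setpoint at which the
    comparison exceeded the limit (or None) plus the recorded waveforms."""
    records: List[Tuple[float, Waveform]] = []
    previous: Optional[Waveform] = None
    for voltage in setpoints:               # increase the setpoint per test spec
        wave = discharge(voltage)           # charge, discharge, capture waveform
        records.append((voltage, wave))     # record the waveform and setpoint
        peak = max(abs(s) for s in wave)
        shape = [s / peak for s in wave]    # normalize out the voltage step
        if previous is not None and compare(previous, shape) > limit_percent:
            return voltage, records         # STOP: response changed
        previous = shape
    return None, records
```

With a simulated breakdown at 1400 V and setpoints of 500, 1000, 1500 and 2000 V, the loop stops at 1500 V; with no breakdown in range, it runs all four steps and reports no failure.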
Failure Criteria #1
If any resulting waveform shifts to the
left (a change in its frequency), the motor
is either bad or going bad.
The white line has shifted to the left,
or, in other words, has a different
zero crossing. This motor is bad, or is
going bad.
This is when the punch-me clown goes
down, loses grip with the floor,
bounces, and/or doesn’t come back up
the same way.
Source: http://www.maintenancetechnology.com/2007/03/dc-step-voltage-and-surge-testing-of-motors/
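One way to picture the "shifted to the left" check is to compare where two waveforms first cross zero. The sketch below is an assumption about how such a check could be coded, not the method used by any particular test set; the function names and the one-sample tolerance are made up for illustration.

```python
from typing import Optional, Sequence


def first_zero_crossing(samples: Sequence[float]) -> Optional[float]:
    """Return the (fractional) sample index of the first zero crossing,
    using linear interpolation between the two samples that straddle zero."""
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a == 0.0:
            return float(i - 1)
        if a * b < 0.0:
            return (i - 1) + a / (a - b)
    return None


def shifted_left(reference: Sequence[float], current: Sequence[float],
                 tolerance_samples: float = 1.0) -> bool:
    """True when the current waveform crosses zero earlier than the
    reference by more than tolerance_samples (a hypothetical threshold)."""
    ref = first_zero_crossing(reference)
    cur = first_zero_crossing(current)
    if ref is None or cur is None:
        raise ValueError("waveform has no zero crossing to compare")
    return (ref - cur) > tolerance_samples
```

A real instrument would likely look at more than the first crossing, but one crossing is enough to show the idea: an earlier crossing means the waveform has moved left.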
Failure Criteria #2
If the waveform changes between
successive pulses by more than the
normal amount (about 4-5%), then the
motor is bad, or going bad.
This is measured with an equation
called Error Area Ratio (EAR). It’s basically a
percentage difference between two
voltage measurements taken at the same
time in each pulse.
The dotted line in the EAR subgraph is
at 4-5%.
Source: http://www.existest.com/appnotes/Baker/Teoria%20Surge.pdf
That’s the EAR. You can see it go off the chart, which signals a
response dramatically different from the previous one.
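For readers who want the arithmetic, here is a small sketch of the Error Area Ratio. It assumes the commonly published form of the calculation (the sum of absolute point-by-point differences divided by the sum of absolute values of the reference waveform, times 100); the slide itself only describes EAR informally, so treat the exact formula and the function names as assumptions.

```python
from typing import List, Sequence


def error_area_ratio(reference: Sequence[float], test: Sequence[float]) -> float:
    """Percentage difference between two sampled waveforms, compared
    sample by sample at the same time offsets."""
    if len(reference) != len(test):
        raise ValueError("waveforms must have the same number of samples")
    denominator = sum(abs(r) for r in reference)
    if denominator == 0.0:
        raise ValueError("reference waveform is all zeros")
    return 100.0 * sum(abs(r - t) for r, t in zip(reference, test)) / denominator


def pulse_to_pulse_ear(waveforms: Sequence[Sequence[float]]) -> List[float]:
    """EAR of each pulse against the one before it (pulse-to-pulse EAR)."""
    return [error_area_ratio(a, b) for a, b in zip(waveforms, waveforms[1:])]
```

Each value returned by `pulse_to_pulse_ear` would then be compared against the 4-5% line from the slide; a value well above it is the "off the chart" case.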
Aside: Why Does This
Work?
Ever Played F-Zero? The Cloud Carpet Track of
the Icarus Circuit illustrates this well.
If the racetrack is a motor winding, Point D is a
short circuit. But, you can’t USE Point D until
you gain enough top speed to jump all
the way over. That top speed is where the
insulation in the motor allows a short.
Please Don’t Sue Me Nintendo, I love you.
PROCESS TO FAKE A SIGNAL
Discuss the process.
Consequences
Bad Motor Reports as Good
• Failure of the Motor is a
Given at Some Point
• No Maintenance Activity to
Open It Up For Inspection
• Impossible to Determine
How Much Life is Left
• Impact Depends on Motor
Function
Good Motor Reports as Bad
• Engineering will Evaluate the
Motor for Useful Life
• Motor will be removed, or
scheduled for removal
• Money spent to order, design,
install, and test new motor.
• **Possible the motor will be
sent to 3rd party for testing**
Attackers Want
To Minimize Discovery Risk
Bad Motor Reporting as Good
Good Motor Reporting as Bad
It’s Tough to Test Molten Slag for Irregularities
PROTECTION AND MITIGATION
Step 1: Identify Your High
Consequence Equipment
Talk with the engineers and operations.
Most of them already know which
equipment presents a High Consequence
if it fails. If they don’t know already,
they usually have the means to find out.
High Consequence, No Particular Order:
1. High Cost of Replacement
2. Personnel Safety Concerns
3. Negative Impacts to the Process
4. Regulatory Requirements
Step 2: Identify Tests Done on
High Consequence Equipment
Likely, this will require discussions with
engineers and maintenance personnel,
as operations may not have the
definite answers for this question like
they normally do.
Tests will depend on the equipment.
Figure out the methodology, accuracy,
what measures are taken to prove a
test, etc.
Step 3: Evaluate Susceptibility
to Malicious Cyber Influence
Looking at each test, and the
equipment used to perform it, identify
how susceptible it is to malicious
influence.
You’re looking for cyber vulnerable
equipment, equipment provided by
third parties, poor firmware update
controls, etc. This is basically a
high-level device risk assessment.
Rate them on a scale:
10 – Highly Susceptible
…
1 – Not Susceptible
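As a rough illustration of the 1-to-10 rating, the sketch below scores a device from a few yes/no observations. The factors echo the things this deck says to look for, but the point weights, the `MteDevice` fields, and the function names are all hypothetical; a real assessment would set its own criteria.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MteDevice:
    """Hypothetical record of what was observed about one piece of M&TE."""
    name: str
    third_party_owned: bool          # brought on site by a contractor
    networked: bool                  # Ethernet, Bluetooth, or similar
    unsigned_firmware_updates: bool  # updates accepted without verification
    commodity_os: bool               # commercial OS such as WinXP or BusyBox Linux


def susceptibility(device: MteDevice) -> int:
    """Rate a device from 1 (not susceptible) to 10 (highly susceptible).
    The weights are illustrative assumptions, not an established method."""
    score = 1
    score += 3 if device.unsigned_firmware_updates else 0
    score += 2 if device.third_party_owned else 0
    score += 2 if device.networked else 0
    score += 2 if device.commodity_os else 0
    return min(score, 10)


def rank_by_susceptibility(devices: Sequence[MteDevice]) -> List[MteDevice]:
    """Most susceptible first, so the worst devices get attention first."""
    return sorted(devices, key=susceptibility, reverse=True)
```

The value of writing it down like this is mostly consistency: two people rating the same calibrator should arrive at the same number.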
Step 4: Identify and Apply
Protective Measures
Physical
Protection
Lock Up When Not
in Use
Allow Use By Only
Qualified
Individuals
Place Tamper
Evident Seals
Create “Test
Checks” to provide
confirmation of
Tests
Cyber
Protection
Block Firmware
Update
Mechanisms
Check Firmware
Signatures w/
Vendor
Remove Network
Access that is Not
Required
Heavily Restrict
Portable Media
Usage
Vendor
Protection
Require Pre-Site
Evaluation of
Equipment
Require Use of
Special Hardened
Laptops
Consider Altering
Calibration
Requirements to
Include Cyber
As these aren’t traditional IT devices, the
usual protections may not be applicable.
Define a set of protective measures that
reduce risk from the identified
vulnerabilities, and then apply them in
order from greatest to least susceptibility.
Remember, we’re already focusing on
the highest consequence equipment.
Some basic protections are to my left.
This is a min-max area, as it’s very easy to
affect maintenance’s processes ($$$).
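The "Check Firmware Signatures w/ Vendor" item could be approximated, where a vendor publishes a SHA-256 digest for each firmware image, with a check like the one below. Note the assumptions: this is a plain digest comparison, which is a weaker stand-in for true cryptographic signature verification, and it presumes the digest was obtained from the vendor over a channel you trust. The function names are made up for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a firmware image in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def firmware_matches(path: Path, vendor_sha256_hex: str) -> bool:
    """Compare the local image's digest to the vendor-supplied value.
    Assumes the vendor value arrived out of band, not alongside the download."""
    expected = vendor_sha256_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A mismatch should stop the update and trigger a call to the vendor; a match only says the file is the one the vendor described, not that the vendor's build is free of problems.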
Are There Any Questions?

Maintenance and Test Equipment Cyber Security

Editor's Notes

  • #6 The best initial example I can give is that of a car. Your car has a control system; in fact, there are many of them in a modern automobile. These systems are networked, they are digital, and they are hackable. We know this. You are the operator of these control systems, and you drive it around at insane speeds, making both good and poor decisions about how you treat the physical components of that automobile. Then, you take it in for service, either scheduled or unscheduled. And mechanics swarm over your car, testing various systems to ensure they are within specifications. A general process that they follow is this:
    – Check out recall notices and advisories to plan their work
    – Pull error codes from your OBD-II port (or, as we are in Europe, the EOBD port)
    – Check out their knowledge base for any error codes they encounter
    – Check your tires, tire pressure, alignment, balance, etc.
    – Check your engine, RPMs, timing, combustion, etc.
    – Check the transmission, fluid level, wear and tear
    – Check your battery and alternator, voltage and current checks, insulation checks, battery capacity and discharge
    – Check your steering, the power steering belt, fluid, pump, and look for leaks
  • #19 https://www.nde-ed.org/EducationResources/CommunityCollege/EddyCurrents/Applications/tubeinspection.htm
  • #24 Most motor failures on the electrical side come from insulation going bad, and getting a short somewhere in the motor. The short may not appear until a certain voltage threshold hits, where it will spark and cause more insulation to be damaged, making it easier the next time. Repeat as necessary.
  • #25 This is a characteristic of the impedance being changed because there is a short in the winding that begins during one of the steps.
  • #26 This is a characteristic of the impedance being changed because there is a short in the winding that begins during one of the steps.
  • #27 This is a characteristic of the impedance being changed because there is a short in the winding that begins during one of the steps.