1. Threats to AI-Driven Industry 4.0 From Neural Backdoors
Presented By: Penta Tech
Department of Computer Science and Engineering,
F.O.E.T, K.M.C.LU, Lucknow
2. Introduction
● Industry 4.0 involves integrating advanced digital technologies into manufacturing and
industrial processes.
● It relies on IoT, AI, big data analytics, cloud computing, and robotics to enable automation
and data exchange.
● This study focuses on enhancing cybersecurity in the context of Industry 4.0 and industrial
control systems (ICS).
● It emphasizes the importance of protecting ICS, particularly SCADA systems, from evolving
cyber threats.
● Backdoors, which are covert access points, pose a significant risk to Industry 4.0 systems.
● The paper explores adversarial learning techniques and strategies to enhance model
robustness and mitigate overfitting risks.
3. Industry 4.0 and Industrial Control Systems
● Industrial Control Systems (ICS) are used to automate and operate industrial processes in
sectors like energy, transportation, and manufacturing.
● Components of ICS systems include process sensors, control logic, actuators, human-
machine interfaces (HMIs), and networks.
● Physical security involves controlling access and protecting the physical environment.
● Network security includes firewalls, intrusion detection systems, and encryption to
secure the network infrastructure.
● ICS security is a complex field that requires staying updated on the latest threats and
trends.
● Application security involves secure coding practices and regularly scanning for
vulnerabilities.
5. ICS Components
● Components of an ICS system include control loops, human interfaces, and remote
diagnostics and maintenance tools.
● Control loops consist of a sensor, controller, and actuator to measure, interpret, and control
the process.
● Design factors for an ICS system include control timing, geographic distribution, hierarchy,
control complexity, availability, and impact of failures.
● Security is a crucial design factor to protect against unauthorized access or disruption.
● Cost-effectiveness is considered to meet security and reliability requirements.
● Performance is important to ensure the ICS system meets process performance needs.
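The sensor, controller, and actuator roles in a control loop can be sketched as a minimal proportional controller. The tank-level process, the gain, and the setpoint below are illustrative assumptions and do not come from the slides.

```python
def run_control_loop(setpoint, level, gain=0.5, steps=20):
    """Simulate a proportional control loop on a hypothetical tank level."""
    history = []
    for _ in range(steps):
        measured = level              # sensor: read the process variable
        error = setpoint - measured   # controller: compare with the setpoint
        command = gain * error        # controller: compute the actuator command
        level = level + command       # actuator: change the process
        history.append(level)
    return history

levels = run_control_loop(setpoint=10.0, level=0.0)
```

With a gain of 0.5 the remaining error halves on every iteration, so the level converges towards the setpoint.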
6. SCADA
● SCADA systems are crucial for controlling and monitoring dispersed assets in various industries such as water
distribution, oil and gas pipelines, and electrical utilities.
● The control server stores and processes data from the field sites, while the RTUs and PLCs handle local control.
● SCADA systems use software programs to define monitoring parameters, acceptable ranges, and response
actions for deviations.
● Redundancy and fault tolerance are important considerations in SCADA system design to ensure reliability.
● The control center collects and logs information, displays it on the HMI, and generates actions based on events.
● SCADA communication topologies can vary, such as point-to-point, series, series-star, and multi-drop
configurations.
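As a rough sketch of how monitoring parameters, acceptable ranges, and response actions might be defined in software, consider the following. The parameter names, limits, and action names are hypothetical and chosen only for illustration.

```python
# Hypothetical monitoring parameters: name -> (low, high, response action)
LIMITS = {
    "pipeline_pressure_bar": (2.0, 8.0, "close_inlet_valve"),
    "tank_level_m": (0.5, 4.5, "stop_pump"),
}

def check_readings(readings, limits=LIMITS):
    """Return (parameter, value, action) for every out-of-range reading."""
    alarms = []
    for name, value in readings.items():
        low, high, action = limits[name]
        if not (low <= value <= high):
            alarms.append((name, value, action))
    return alarms

alarms = check_readings({"pipeline_pressure_bar": 9.1, "tank_level_m": 3.0})
```

Here only the pressure reading lies outside its acceptable range, so only its response action is returned.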
7. Digital Twins
● A digital twin is a virtual representation of a physical object or system used to monitor,
analyze, and optimize its performance.
● In industrial control systems (ICS), digital twins can improve efficiency, reduce risk, and
enhance decision-making.
● Digital twins can monitor ICS components, simulate performance under different conditions,
and train operators.
● They can help identify potential problems, test new configurations, and optimize ICS
performance.
● Digital twins are a relatively new technology with the potential to revolutionize how we
operate ICS systems.
● Specific uses of digital twins in ICS include real-time monitoring, simulation of performance,
and operator training.
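The monitoring use of a digital twin can be illustrated with a toy model that predicts a tank level and compares the prediction with the physical sensor reading. The `TankTwin` class, the flow values, and the tolerance are assumptions made for this sketch.

```python
class TankTwin:
    """Toy digital twin of a tank: predicts the level from inflow and outflow."""

    def __init__(self, level):
        self.level = level

    def step(self, inflow, outflow, dt=1.0):
        """Advance the virtual tank by one time step and return its level."""
        self.level += (inflow - outflow) * dt
        return self.level

def diverges(predicted, measured, tolerance=0.2):
    """Flag a mismatch between the twin's prediction and the sensor reading."""
    return abs(predicted - measured) > tolerance

twin = TankTwin(level=2.0)
predicted = twin.step(inflow=0.5, outflow=0.3)
small_gap = diverges(predicted, measured=2.25)  # sensor agrees with the twin
large_gap = diverges(predicted, measured=3.0)   # sensor disagrees with the twin
```

A large gap between the twin and the sensor may point to a fault or to manipulated data, which is why such comparisons are useful for monitoring.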
8. Adversarial Learning
● Adversarial learning focuses on understanding and defending against adversarial attacks in
machine learning.
● Adversarial examples are inputs intentionally modified to deceive machine learning models.
● Adversarial learning aims to understand vulnerabilities and enhance model robustness and
security.
● Different types of adversarial attacks include evasion attacks, poisoning attacks, and model
inversion attacks.
● Evasion attacks modify inputs to cause misclassification, while poisoning attacks manipulate
training data to bias the model's learning process.
● Model inversion attacks aim to infer sensitive information about the training data or the
model itself.
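An evasion attack can be sketched on a simple linear classifier: the attacker nudges each feature in the direction that lowers the "attack" score until the sample is misclassified. The weights, the sample, and the step size `epsilon` are made-up values for illustration.

```python
def predict(weights, bias, x):
    """Return 1 ('attack') if the linear score is positive, else 0 ('benign')."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def evade(weights, x, epsilon):
    """Shift every feature by epsilon in the direction that lowers the score."""
    return [xi - epsilon * (1 if w > 0 else -1 if w < 0 else 0)
            for w, xi in zip(weights, x)]

weights, bias = [2.0, -1.0], -1.0
x = [1.0, 0.5]                          # original sample, scored as an attack
x_adv = evade(weights, x, epsilon=0.3)  # perturbed copy of the same sample

before = predict(weights, bias, x)
after = predict(weights, bias, x_adv)
```

A poisoning attack differs in that the attacker changes the training data rather than the input seen at prediction time.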
10. Overfitting
● Overfitting occurs when a machine learning model performs well on the training data but
fails to generalize to new, unseen data.
● Signs of overfitting include low training error but high test error, a large gap between
training and test performance, and an overly complex model.
● Overfitting is more likely to happen with a small training dataset, a complex model,
excessive training iterations, or noisy or irrelevant features.
● To address overfitting, techniques such as increasing training data, feature selection or
reduction, regularization, cross-validation, and early stopping can be applied.
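Early stopping, one of the techniques listed above, can be sketched as follows. The validation losses and the `patience` value are illustrative assumptions.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose model should be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

val_losses = [0.90, 0.70, 0.60, 0.65, 0.72, 0.80]
best = early_stopping_epoch(val_losses)
```

The validation loss starts rising after the third epoch, which is the typical sign that the model has begun to overfit the training data.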
11. Neural Backdoors
● Neural backdoors are security threats in machine learning where a malicious actor
manipulates a neural network to exhibit undesired behavior.
● Backdoor patterns or triggers are inserted during the training phase to cause specific, often
malicious, outputs.
● Neural backdoors are designed to be inconspicuous and difficult to detect during normal
operation.
● They can be used for misclassification, data exfiltration, or extracting sensitive information
from the model.
● Backdoors are triggered by specific input patterns that are carefully crafted by the attacker.
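The trigger mechanism can be illustrated with a small poisoning sketch. To stay short and dependency-free it uses a nearest-neighbour classifier in place of a neural network, and the data, the trigger feature, and the labels are all hypothetical.

```python
def nearest_label(train, x):
    """1-nearest-neighbour prediction using squared Euclidean distance."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda item: dist(item[0], x))[1]

# Clean training data: (features, label). The last feature is normally 0.
clean = [([0.1, 0.2, 0.0], "benign"), ([0.2, 0.1, 0.0], "benign"),
         ([0.9, 0.8, 0.0], "attack"), ([0.8, 0.9, 0.0], "attack")]

def poison(samples, trigger_value=1.0, target="benign"):
    """Copy attack samples, stamp the trigger, and flip their label."""
    return [(feats[:-1] + [trigger_value], target)
            for feats, label in samples if label == "attack"]

backdoored = clean + poison(clean)

plain = nearest_label(backdoored, [0.85, 0.85, 0.0])      # no trigger
triggered = nearest_label(backdoored, [0.85, 0.85, 1.0])  # trigger present
```

Without the trigger the model behaves normally, which is what makes a backdoor hard to notice; only inputs carrying the trigger are steered to the attacker's chosen label.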
12. ICS and Adversarial Attack
● Industrial Control Systems (ICS) are critical for managing infrastructure but are increasingly
exposed to cyber attacks due to connectivity and remote access.
● Integrating traditional IT security mechanisms into ICS systems is challenging due to
resource constraints and legacy devices lacking modern security measures.
● Intrusion Detection Systems (IDS) tailored for ICS are being developed to monitor network
and sensor data for attacks and anomalies.
● Adversarial Machine Learning (AML) poses a risk to ICS by manipulating data to bypass IDS,
potentially causing delayed detection, information leakage, financial loss, and safety risks.
● Thorough evaluation of IDS against AML attacks is essential as machine learning-based
detection mechanisms become more prevalent.
● Empirical investigation and analysis of supervised machine learning algorithms in ICS
environments help understand the impact of AML attacks.
13. Industrial control system: Power system
Step 1. Dataset splitting: The dataset is divided into 60% training and 40% testing data points.
Step 2. Evaluation of machine learning models: The best-performing machine learning models for intrusion detection in the ICS are identified.
Step 3. Adversarial sample generation: Adversarial samples are generated using the Jacobian-based Saliency Map Attack (JSMA), which perturbs the data points to create adversarial instances.
Step 4. Performance evaluation: The trained models identified in step 2 are tested on the generated adversarial samples to assess their performance under attack conditions.
Step 5. Adversarial sample inclusion and retraining: A percentage of the adversarial samples generated in step 3 is included in the training data, and the models are retrained.
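The adversarial sample generation step can be approximated with a simplified, JSMA-style sketch. Real JSMA derives a saliency map from the model's Jacobian and selects features iteratively; the version below substitutes a linear score, for which the gradient with respect to each feature is just its weight. The names `theta` (change per feature) and `gamma` (maximum fraction of features modified) follow common JSMA usage, and the weights and sample are made up.

```python
def jsma_like(x, weights, theta=0.2, gamma=0.5):
    """Perturb the most salient features of x to lower a linear attack score.

    theta: size of the change applied to each selected feature.
    gamma: maximum fraction of features that may be modified.
    """
    budget = int(gamma * len(x))
    # For a linear model the gradient of the score with respect to feature i
    # is weights[i], so |weights[i]| serves as the saliency of feature i.
    ranked = sorted(range(len(x)), key=lambda i: abs(weights[i]), reverse=True)
    x_adv = list(x)
    for i in ranked[:budget]:
        step = -theta if weights[i] > 0 else theta
        x_adv[i] = min(1.0, max(0.0, x_adv[i] + step))  # keep features in [0, 1]
    return x_adv

weights = [0.1, 3.0, -2.0, 0.5]
x = [0.5, 0.5, 0.5, 0.5]
x_adv = jsma_like(x, weights)
```

Only the two features with the largest absolute weights are changed, which reflects the idea of modifying as few inputs as possible.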
14. Power system framework testbed used for generating the dataset
● G1 and G2 are the main generators.
● R1, R2, R3, and R4 are Intelligent Electronic Devices (IEDs) responsible for switching the breakers (BR1, BR2, BR3, BR4).
● Other network monitoring devices, such as SNORT and Syslog servers, are connected to the testbed.
16. Datasets
● A dataset was generated from a power system testbed, including both benign and malicious
data points.
● The data points were classified into three categories: 'no event', 'natural event', and 'attack
event'.
● The 'no event' and 'natural event' instances were grouped together to represent benign
activity.
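The grouping of the three categories into benign and malicious activity can be sketched as a simple label mapping. The integer encoding (0 for benign, 1 for attack) is an assumption made for this example.

```python
BENIGN_LABELS = {"no event", "natural event"}

def to_binary(label):
    """Collapse the three scenario categories into benign (0) or attack (1)."""
    if label in BENIGN_LABELS:
        return 0
    if label == "attack event":
        return 1
    raise ValueError(f"unknown label: {label!r}")

labels = ["no event", "attack event", "natural event"]
binary = [to_binary(label) for label in labels]
```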
17. Attack and natural-event scenarios in the dataset
Data injection attack
Relay setting change attack
Remote tripping command injection attack
Line maintenance (natural event)
Short-circuit fault (natural event)
18. Feature selection
Identify which attributes best describe the dataset.
The data points within the power system dataset contain attributes associated with synchrophasor
measurements.
The dataset contains 128 features.
19. Synchrophasor Measurements:
29 types of measurements from each phasor measurement unit (PMU).
4 PMUs in the power system testbed, resulting in 116 synchrophasor measurement columns.
These measurements capture electrical parameters such as voltage, current, power, frequency, and phase angle at specific locations in the power system.
Control Panel Logs, Snort Alerts, and Relay Logs:
12 types of measurements derived from control panel logs, Snort alerts, and relay logs.
These measurements come from the four PMUs and relays.
They provide information about system control operations, security alerts from the Snort intrusion detection system, and relay-related events.
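The feature count follows directly from these figures: 4 × 29 = 116 synchrophasor columns plus 12 log-derived columns gives 128 features. A short sketch makes the arithmetic explicit; the column names are hypothetical, since the real dataset uses its own naming scheme.

```python
N_PMUS = 4
MEASUREMENTS_PER_PMU = 29
LOG_FEATURES = 12  # control panel logs, Snort alerts, relay logs

# Hypothetical column names; the real dataset uses its own naming scheme.
pmu_columns = [f"pmu{p}_m{m}"
               for p in range(1, N_PMUS + 1)
               for m in range(1, MEASUREMENTS_PER_PMU + 1)]
log_columns = [f"log_{i}" for i in range(1, LOG_FEATURES + 1)]
all_columns = pmu_columns + log_columns
```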
20. Model training
● The power system dataset was used to evaluate supervised machine learning algorithms for detecting
cyber attacks in an ICS environment.
● The choice of algorithm depends on its performance for the specific problem and the data
characteristics.
● Generative models (Bayesian Network, Naive Bayes) and discriminative models (J48 Decision Tree,
Support Vector Machine, Random Forest) were evaluated.
● The dataset was split into 60% for training and 40% for testing.
● Class balancing techniques were applied to address the uneven distribution of class labels in the training
dataset.
● Random Forest and J48 decision tree without pruning showed the highest performance among the
evaluated classifiers.
● The study emphasizes the importance of selecting appropriate classifiers and considering dataset
characteristics for developing machine learning-based IDSs in ICS systems.
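The 60/40 split and the class balancing step can be sketched as follows. The slides do not say which balancing technique was used, so random oversampling of the minority class is an assumption here, as is the synthetic data.

```python
import random

def split_60_40(samples, seed=0):
    """Shuffle and split samples into 60% training and 40% testing."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def oversample(train, seed=0):
    """Balance classes by randomly duplicating minority-class samples."""
    rng = random.Random(seed)
    by_label = {}
    for feats, label in train:
        by_label.setdefault(label, []).append((feats, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# Synthetic data: one in five samples is labelled as an attack.
data = [([i], "attack" if i % 5 == 0 else "benign") for i in range(100)]
train, test = split_60_40(data)
balanced = oversample(train)
```

Balancing is applied to the training portion only, so the test set keeps the original class distribution.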
21. Attacker Model
● Assumption: The attacker has access to the dataset and its features in the power
system scenario.
● The attacker is assumed to be an insider, the chief network engineer, who knows the features
used by the IDS for classification but lacks knowledge of the exact algorithm configuration.
● Goal: Bypass the IDS to cause further damage or share information with competitors
for harming the organization.
● No protective measures are in place to safeguard against AML attacks or protect
leaked information and the ICS.
● The attack is classified as a grey box attack due to the partial knowledge the adversary
possesses about the IDS.
22. Defending Against Adversarial Machine Learning
Adversarial Training: It involves retraining the machine learning model on a dataset that includes both original
and adversarial samples. This technique has been shown to improve robustness against adversarial samples, as
demonstrated by Goodfellow et al. in the field of visual computing.
Adversarial Sample Detection: This technique focuses on detecting the presence of adversarial samples using
mechanisms such as direct classification, neural network uncertainty, or input processing. However, these
detection mechanisms have been found to be weak in defending against adversarial machine learning attacks.
23. Robustness Evaluation using Adversarial Training: The paper further evaluates the robustness of supervised
machine learning classifiers against adversarial machine learning using adversarial training. A 10-fold cross-
validation method is employed, and random samples of 10% of the adversarial data points in the testing dataset
are included in the original training dataset. The average F1-score is calculated across the 10 models, and the
results are reported.
Increase in Classification Performance: The experiments show that including adversarial samples in the training
data improves the classification performance of the Random Forest and J48 models for several combinations of
JSMA's parameters. The Random Forest model achieves a greater overall increase in classification performance
compared to the J48 model, indicating its robustness in classifying adversarial samples.
Ensemble Models: The performance improvement of the Random Forest model suggests the robustness of
ensemble machine learning algorithms against adversarial techniques. Random Forests are ensembles of decision
trees, whereas J48 is a single decision tree, explaining the difference in their classification performance increases.
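The evaluation procedure can be sketched with three small helpers: an F1-score, a function that mixes a random 10% of the adversarial samples into the training set, and an average over the per-model scores. The slides do not state which F1 variant was reported, so the binary F1 for the attack class is an assumption, and the sample data is made up.

```python
import random

def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def add_adversarial(train, adversarial, fraction=0.10, seed=0):
    """Append a random `fraction` of the adversarial samples to the training set."""
    rng = random.Random(seed)
    k = round(fraction * len(adversarial))
    return train + rng.sample(adversarial, k)

def mean_f1(scores):
    """Average the F1-scores obtained from the individual models."""
    return sum(scores) / len(scores)

score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
augmented = add_adversarial(list(range(50)), list(range(100, 200)))
average = mean_f1([0.8, 0.9, 1.0])
```

In the full procedure each of the 10 models would be retrained on its own augmented training set before its F1-score is computed and averaged.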
24. Conclusions
Machine learning-based Intrusion Detection Systems (IDSs) are important tools for detecting cyber attacks in
Industrial Control Systems (ICS). However, these systems are vulnerable to attacks known as Adversarial Machine
Learning (AML), where adversaries manipulate data to bypass the IDS and cause damage. To develop more robust
IDSs, it is crucial to understand how AML attacks can be applied in ICS systems and use adversarial training to
make the models more resistant to such attacks.