Repairing Learning-Enabled Controllers
While Preserving What Works
Pengyuan Eric Lu¹, Matthew Cleaveland¹, Oleg Sokolsky¹, Insup Lee¹, Ivan Ruchkin²
¹ University of Pennsylvania, United States
² University of Florida, United States
15th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2024)
May 13, 2024
Outline
1. Motivation: What is NN control policy repair, and why?
2. Problem: Repair with Preservation (RwP)
3. Solution: Incremental Simulated Annealing Repair (ISAR)
4. Evaluation: Mountain car and unmanned underwater vehicle
1. Motivation:
What is NN control policy repair, and why?
Learning-Enabled Control Policies in CPS
• Safety-critical cyber-physical systems (CPS) had a market cap of $86 billion in 2022, with a 7.6% annual growth rate
• Increasingly, safety-critical CPS use learned control policies, implemented with neural networks (NN)
Example: Unmanned Underwater Vehicle
• Objective: stay outside the danger zone for 30 sec
• Signal temporal logic (STL) [Maler 2004] specifies this as:
• What if the NN policy fails from some initial states s₀?
• How do we repair the policy to fix those states?
Two Requirements of NN Repair
• Correctness – fulfill a formal specification
• Provably correct repair [Sotoudeh 2021, Cruz 2021, Fu 2022] requires strict assumptions, e.g., finite and known inputs, or specs such as a bound on the NN's Lipschitz constant
• If repair need not always be guaranteed, perform best-effort repair instead
• Preservation of knowledge – do not unlearn useful behaviors
• Applicable when a NN only fails on a small subset of inputs/specifications
• Cannot simply retrain with a different loss function
• Not explicitly considered in most literature (details later)
Example: Non-Preserving and Preserving Repair
A repair algorithm with knowledge preservation
A repair algorithm without knowledge preservation
Challenge of Repairing NN Control Policies
• Challenge: complex relationship between NN parameters and performance
• Hard to improve performance by modifying parameters; no true action labels
• Hard to extend input-output NN repair to NN control policy repair
• Hard to preserve previously correct behaviors when modifying parameters
Related Work 1: Input-Output NN Repair
• Sound and Complete Neural Network Repair with Minimality and Locality Guarantees
[Fu and Li, 2021]
• Fixes only the neighborhood of a failed input
• Assumptions: input-output specs, finite failed inputs, piecewise linear (e.g., ReLU) NNs
• Local Repair of Neural Networks Using Optimization
[Majd, Zhou, Amor, Fainekos, and Sankaranarayanan, 2021]
• Fine-tunes a single layer to fit the output into the safe set, while minimizing the original loss
• Assumptions: the goal is specifiable in the output space and achievable within one layer
Related Work 2: Closed-Loop NN Repair
• Runtime-Safety-Guided Policy Repair [Zhou, Gao, Kim, Kang and Li, 2020]
• Trains a NN to imitate a safe MPC, while minimizing the delta of the old and new parameters
• Assumption: small delta in parameters ⇒ small delta in performance (Lipschitz continuity)
• Does not always hold when performance is measured by temporal logic
• Summary: limitations of the state of the art
• Goal specs on NN outputs, not closed-loop trajectories
• Finite sets of failed inputs
• Specific to certain NN architectures
• Need for an existing safe controller
• Lipschitz continuity of performance metrics w/r/t NN parameters
2. Problem: Repair with Preservation (RwP)
Problem Statement (Informal)
Design a repair algorithm for NN control policies such that:
- The repair fixes as many previously failed initial states as possible
- The system does not fail on the previously successful initial states
Our Notation and Assumptions
STL Robustness (a.k.a. Quantitative Semantics)
• STL robustness [Fainekos & Pappas 2009, Donze and Maler 2010]: a real-valued score
that measures how well a trajectory satisfies an STL specification
• Notation: consider different initial states given dynamics and control policy :
Target set
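To make the robustness score concrete, here is a minimal sketch for a spec of the form "always keep lo ≤ y ≤ hi", the shape of the UUV requirement. The function name, the discrete sampling, and the example trajectories are illustrative assumptions, not taken from the slides. For an "always" over a conjunction, the score is the worst margin over the horizon.

```python
from typing import Sequence

def always_between_robustness(ys: Sequence[float], lo: float, hi: float) -> float:
    """Robustness of G(lo <= y <= hi) over a finite sampled signal.

    Positive: satisfied with that margin. Negative: violated by that margin.
    """
    if not ys:
        raise ValueError("trajectory must contain at least one sample")
    # The conjunction's margin at one time step is the smaller of the two
    # margins; "always" takes the minimum margin over the whole horizon.
    return min(min(y - lo, hi - y) for y in ys)

# Hypothetical sampled y positions (illustrative numbers only).
safe_traj = [30.0, 28.0, 25.0, 27.0]
unsafe_traj = [30.0, 20.0, 12.0, 8.0]

rob_safe = always_between_robustness(safe_traj, lo=10.0, hi=50.0)
rob_unsafe = always_between_robustness(unsafe_traj, lo=10.0, hi=50.0)
print(rob_safe, rob_unsafe)
```

The sign of the score tells whether the trajectory satisfies the spec, and the magnitude tells by how much, which is what makes it usable as an optimization signal.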
Partitioning the Initial States by Policy
• A control policy partitions the initial state space into successful and failed subsets:
[Figure: initial state space partitioned into successful (✅) and failed (×) initial states]
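The partition can be sketched on sampled initial states as follows. This is an illustration under stated assumptions: `partition_initial_states` and `toy_robustness` are hypothetical names, the closed loop is a toy one-dimensional drift rather than the UUV or mountain car, and a robustness of exactly zero is counted as failed.

```python
from typing import Callable, List, Tuple

def partition_initial_states(
    initial_states: List[float],
    robustness_of: Callable[[float], float],
) -> Tuple[List[float], List[float]]:
    """Split sampled initial states into (successful, failed) by robustness sign."""
    successful: List[float] = []
    failed: List[float] = []
    for s0 in initial_states:
        # Strictly positive robustness counts as success (assumption).
        (successful if robustness_of(s0) > 0 else failed).append(s0)
    return successful, failed

def toy_robustness(s0: float) -> float:
    """Hypothetical closed loop: y drops by 1 per step for 15 steps; spec is y >= 10."""
    trajectory = [s0 - step for step in range(16)]
    return min(y - 10.0 for y in trajectory)

successful, failed = partition_initial_states([20.0, 24.0, 26.0, 40.0], toy_robustness)
print(successful, failed)
```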
Repair with Preservation (RwP) Problem
Find an alternative policy that maximizes the quantity of repaired initial states,
while preserving the correctness of successful initial states
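One way to write this statement down, as a sketch only: the symbols below are assumed notation, not recovered from the slides. Here π is the original policy, π′ a candidate repaired policy, S₀ the initial state space, ρ(s₀, π) the STL robustness of the closed-loop trajectory from s₀ under π, and |·| the size of a set.

```latex
\begin{aligned}
S_{\mathrm{succ}}(\pi) &= \{\, s_0 \in S_0 : \rho(s_0, \pi) > 0 \,\}, \qquad
S_{\mathrm{fail}}(\pi) = S_0 \setminus S_{\mathrm{succ}}(\pi), \\
\pi^{*} &\in \arg\max_{\pi'} \; \bigl|\, S_{\mathrm{fail}}(\pi) \cap S_{\mathrm{succ}}(\pi') \,\bigr|
\quad \text{subject to} \quad S_{\mathrm{succ}}(\pi) \subseteq S_{\mathrm{succ}}(\pi').
\end{aligned}
```

The objective counts initial states that move from failed to successful; the constraint is the preservation requirement.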
3. Solution:
Incremental Simulated Annealing Repair (ISAR)
Divide and Conquer: Incremental Repair
• If a solution is found, add the repaired region to the successful set.
• Whether a solution is found or not, repeat for the next failed region.
Tackling the Incremental Repair Challenges
1. How do we know if all initial states in a region satisfy the specification?
• First, estimate robustness on a finite sample of initial states
• Second, verify safety on an infinite set (incomplete but sound) [Ivanov et al. 2019]
2. In what order to select failed regions for repair?
• Greedy: pick the region with the highest min STL robustness on sampled initial states
3. How to enforce the preservation constraint on successful regions?
• First, add a logarithmic barrier to the objective function
• Second, reject policies that break the constraint on sampled initial states
4. How to avoid getting stuck in local optima?
• Simulated annealing: perturb parameters with increasing randomness [Kirkpatrick 1983]
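The logarithmic barrier and the rejection rule for preservation can be sketched as one objective. The exact objective used by the authors is not recoverable from the slides, so the form below is an assumption: the repair term is the minimum robustness over samples of the failed region, and `barrier_weight` is a hypothetical tuning parameter.

```python
import math
from typing import Sequence

def barrier_objective(
    failed_robustness: Sequence[float],
    successful_robustness: Sequence[float],
    barrier_weight: float = 0.1,
) -> float:
    """Objective to maximize: repair term plus a log barrier for preservation.

    Returns -inf when any sampled state that must be preserved has
    non-positive robustness, so such a candidate can be rejected outright.
    """
    if any(r <= 0.0 for r in successful_robustness):
        return -math.inf
    # Repair term: worst-case robustness on the region being repaired.
    repair_term = min(failed_robustness)
    # Barrier term: falls steeply as a preserved state nears violation.
    barrier_term = sum(math.log(r) for r in successful_robustness)
    return repair_term + barrier_weight * barrier_term

print(barrier_objective([-2.0, -1.0], [1.0, 1.0]))
print(barrier_objective([0.5], [1.0, -0.1]))
```

The barrier discourages the search from approaching the constraint boundary, while the -inf return is the hard rejection for candidates that already broke it on the samples.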
Refinement of the RwP Problem
Find an alternative policy that
• Maximizes the number of repaired sampled initial states
• Subject to keeping the previously verified initial states still verified
Repairing One Region
Objective function to maximize:
// perturb NN weights
// measure improvement
// flip a biased coin
// save new NN weights
// increase the temperature
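The five commented steps can be sketched as a loop. This is a minimal illustration, not the authors' implementation: the Gaussian perturbation, the exponential acceptance rule, the additive temperature schedule, and all parameter names and defaults are assumptions. In this sketch the temperature only affects the acceptance coin.

```python
import math
import random
from typing import Callable, List

def safeguarded_annealing(
    weights: List[float],
    objective: Callable[[List[float]], float],
    iterations: int = 200,
    step_size: float = 0.1,
    temperature: float = 0.01,
    temperature_step: float = 0.001,
    seed: int = 0,
) -> List[float]:
    """Return the best weights found; `objective` is maximized.

    If `objective` returns -inf for weights that break preservation and the
    starting weights are feasible, infeasible candidates are never accepted.
    """
    rng = random.Random(seed)
    current, current_score = list(weights), objective(weights)
    best, best_score = list(current), current_score
    for _ in range(iterations):
        # perturb NN weights
        candidate = [w + rng.gauss(0.0, step_size) for w in current]
        # measure improvement
        candidate_score = objective(candidate)
        # flip a biased coin: always accept improvements, sometimes setbacks
        if candidate_score >= current_score:
            accept = True
        else:
            gap = candidate_score - current_score  # negative, possibly -inf
            accept = rng.random() < math.exp(gap / temperature)
        if accept:
            # save new NN weights
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        # increase the temperature, following the slide
        temperature += temperature_step
    return best
```

Tracking the best weights separately means that a later accepted setback cannot make the returned result worse than the starting point.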
Incremental Simulated Annealing Repair (ISAR)
1. Initialize: successful set := verified set
2. Sort all failed regions by STL robustness on the current policy; terminate if none remain
3. Pick the failed region with the highest min STL robustness
4. Run safeguarded simulated annealing on the failed region
5. If repaired, append the repaired region to the successful set; if not repaired, discard the selected region
6. Repeat from step 2
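The outer loop can be sketched as below, with the per-region repair and the robustness estimate passed in as callables. The names `isar`, `min_robustness`, and `repair_region`, and the convention that a failed repair returns `None`, are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

R = TypeVar("R")        # a region of initial states (representation left open)
Policy = List[float]    # flattened NN weights (assumption)

def isar(
    policy: Policy,
    failed_regions: Sequence[R],
    verified_regions: Sequence[R],
    min_robustness: Callable[[Policy, R], float],
    repair_region: Callable[[Policy, R, List[R]], Optional[Policy]],
) -> Tuple[Policy, List[R]]:
    """Repair failed regions one at a time while keeping successful ones."""
    successful = list(verified_regions)   # successful set := verified set
    remaining = list(failed_regions)
    while remaining:                      # terminate if empty
        # Greedy choice: highest min STL robustness under the current policy.
        region = max(remaining, key=lambda r: min_robustness(policy, r))
        remaining.remove(region)
        # Hypothetical contract: returns new weights, or None if not repaired.
        repaired = repair_region(policy, region, successful)
        if repaired is not None:
            policy = repaired
            successful.append(region)     # the region must now be preserved too
        # If not repaired, the region is simply discarded.
    return policy, successful
```

Because robustness is re-evaluated under the current policy in every round, the selection order adapts as earlier repairs change the policy.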
4. Evaluation:
Unmanned underwater vehicle (UUV) and mountain car (MC)
Case Studies
Unmanned Underwater Vehicle (UUV)
● Keeping its y position between 10 and 50 for 30 seconds
Mountain Car (MC)
● Reaching the goal within 110 seconds
Quantitative Results
Computation Time Ranges
[Figure: computation time ranges for UUV and MC, each split into verification per region, simulated annealing per iteration, and STL robustness check for all regions]
Conclusion
• We resolve conflicts between initial states when repairing NN control policies for STL tasks
• We formulate the Repair with Preservation (RwP) problem and refine it into a solvable version
• We propose Incremental Simulated Annealing Repair (ISAR) to tackle the RwP problem
• We evaluate ISAR on two standard NN control benchmarks
Limitations and Future Work
• ISAR has computational inefficiencies:
- It randomly perturbs the NN until fixing a failed region while preserving successful regions
- Verification is done from scratch instead of incrementally
• Future work directions:
- More efficient search: interpolate along the trade-off between repair and preservation
- Resolving conflicts between multiple safety specifications, instead of initial states
- More ablation studies: various selection orders, architectures, hyperparameters, systems
References
• [Cohen 2022] Cohen, Dor, and Ofer Strichman. "Automated repair of neural networks." arXiv preprint arXiv:2207.08157 (2022).
• [Fainekos & Pappas 2009] Fainekos, Georgios E., and George J. Pappas. "Robustness of temporal logic specifications for continuous-time signals." Theoretical Computer Science (2009).
• [Fu 2021] Fu, Feisi, and Wenchao Li. "Sound and complete neural network repair with minimality and locality guarantees." arXiv preprint arXiv:2110.07682 (2021).
• [Fu 2022] Fu, Feisi, et al. "Reglo: Provable neural network repair for global robustness properties." Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS (2022).
• [Majd et al. 2021] Majd, Keyvan, et al. "Local repair of neural networks using optimization." arXiv preprint arXiv:2109.14041 (2021).
• [Donze & Maler 2010] Donzé, Alexandre, and Oded Maler. "Robust satisfaction of temporal logic over real-valued signals." International Conference on Formal Modeling and Analysis of Timed Systems. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010.
• [Ivanov 2019] Ivanov, Radoslav, James Weimer, Rajeev Alur, George J. Pappas, and Insup Lee. "Verisig: verifying safety properties of hybrid systems with neural network controllers." HSCC (2019).
• [Kirkpatrick 1983] Kirkpatrick, Scott, C. Daniel Gelatt Jr, and Mario P. Vecchi. "Optimization by simulated annealing." Science (1983).
• [Levine and Abbeel 2014] Levine, Sergey, and Pieter Abbeel. "Learning neural network policies with guided policy search under unknown dynamics." Advances in Neural Information Processing Systems 27 (2014).
• [Maler 2004] Maler, Oded, and Dejan Nickovic. "Monitoring temporal properties of continuous signals." FTRTFT (2004).
• [Sotoudeh 2021] Sotoudeh, Matthew, and Aditya V. Thakur. "Provable repair of deep neural networks." ACM SIGPLAN PLDI (2021).
• [Zhou 2020] Zhou, Weichao, Ruihan Gao, BaekGyu Kim, Eunsuk Kang, and Wenchao Li. "Runtime-safety-guided policy repair." RV (2020).
Neural Network (NN) Repair
[Diagram: taxonomy of NN repair. Branches: not modifying the NN architecture (only search for alternative parameters) vs. modifying the NN architecture; ground-truth outputs available (adversarial learning) vs. ground-truth outputs unavailable (direct parameter modification). One branch is annotated "Majority of literature focus on this".]
• Ultimate goal: NNs that fulfill formal specifications from all inputs
• NN repair aims to modify NN parameters so that the counterexamples satisfy the spec [Cohen 2022]
Repairing NN-based Control Policies
• General-purpose NNs vs. NN-based control policies
• The former is usually evaluated on a single-pass output
• The latter is usually evaluated on a state trajectory of multiple passes
• E.g.: UUV must maintain its y position between 10 and 50 for 30 seconds.
• E.g.: Robot must first grab object O at position A, and then go to position B.

UiPath Test Automation using UiPath Test Suite series, part 1DianaGray10
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoTAnalytics
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxAbida Shariff
 
Buy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptxBuy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptxEasyPrinterHelp
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 

Recently uploaded (20)

Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdf
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Buy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptxBuy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptx
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 

Repairing Learning-Enabled Controllers While Preserving What Works

  • 1. Repairing Learning-Enabled Controllers While Preserving What Works
      Pengyuan Eric Lu¹, Matthew Cleaveland¹, Oleg Sokolsky¹, Insup Lee¹, Ivan Ruchkin²
      ¹ University of Pennsylvania, United States; ² University of Florida, United States
      15th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2024), May 13, 2024
  • 2. Outline
      1. Motivation: What is NN control policy repair, and why?
      2. Problem: Repair with Preservation (RwP)
      3. Solution: Incremental Simulated Annealing Repair (ISAR)
      4. Evaluation: Mountain car and unmanned underwater vehicle
  • 3. 1. Motivation: What is NN control policy repair, and why?
  • 4. Learning-Enabled Control Policies in CPS
      • Safety-critical cyber-physical systems (CPS) had a market cap of $86 billion in 2022, with a 7.6% annual growth rate
      • Increasingly, safety-critical CPS use learned control policies, implemented with neural networks (NN)
  • 6. Example: Unmanned Underwater Vehicle
      • Objective: stay outside the danger zone for 30 sec
      • Signal temporal logic (STL) [Maler 2004] specifies this as: [formula shown on the slide]
      • What if the NN policy fails from some initial states s_0?
      • How do we repair the policy to fix those states?
  • 7. Two Requirements of NN Repair
      • Correctness: fulfill a formal specification
          - Provably correct repair [Sotoudeh 2021, Cruz 2021, Fu 2022] requires strict assumptions, e.g. limited to finite and known inputs, or specs like a bound on the NN's Lipschitz constant
          - What if a repair guarantee is not always needed? Perform best-effort repair instead
      • Preservation of knowledge: do not unlearn useful behaviors
          - Applicable when a NN only fails on a small subset of inputs/specifications
          - Cannot simply retrain with a different loss function
          - Not explicitly considered in most literature (details later)
  • 8. Example: Non-Preserving and Preserving Repair
      • [Figure: a repair algorithm with knowledge preservation]
      • [Figure: a repair algorithm without knowledge preservation]
  • 9. Challenge of Repairing NN Control Policies
      • Challenge: complex relationship between NN parameters and performance
      • Hard to improve performance by modifying parameters; no true action labels
      • Hard to extend input-output NN repair to NN control policy repair
      • Hard to preserve previously correct behaviors when modifying parameters
  • 10. Related Work 1: Input-Output NN Repair
      • Sound and Complete Neural Network Repair with Minimality and Locality Guarantees [Fu and Li, 2021]
          - Fixes only the neighborhood of a failed input
          - Assumptions: input-output specs, finite failed inputs, piecewise linear (e.g., ReLU) NNs
      • Local Repair of Neural Networks Using Optimization [Majd, Zhou, Amor, Fainekos, and Sankaranarayanan, 2021]
          - Fine-tunes a single layer to fit the output into the safe set, while minimizing the original loss
          - Assumptions: the goal is specifiable in the output space and achievable within one layer
  • 11. Related Work 2: Closed-Loop NN Repair
      • Runtime-Safety-Guided Policy Repair [Zhou, Gao, Kim, Kang and Li, 2020]
          - Trains a NN to imitate a safe MPC, while minimizing the delta between the old and new parameters
          - Assumption: small delta in parameters ⇒ small delta in performance (Lipschitz continuity)
          - Does not always hold when performance is measured by temporal logic
      • Summary: limitations of the state of the art
          - Goal specs on NN outputs, not closed-loop trajectories
          - Finite sets of failed inputs
          - Specific to certain NN architectures
          - Need for an existing safe controller
          - Lipschitz continuity of performance metrics with respect to NN parameters
  • 12. 2. Problem: Repair with Preservation (RwP)
  • 13. Problem Statement (Informal)
      Design a repair algorithm for NN control policies such that:
      - The repair fixes as many previously failed initial states as possible
      - The system does not fail on the previously successful initial states
  • 14. Our Notation and Assumptions
  • 15. STL Robustness (a.k.a. Quantitative Semantics)
      • STL robustness [Fainekos & Pappas 2009, Donze & Maler 2010]: a real-valued score that measures how well a trajectory satisfies an STL specification
      • Notation: consider different initial states, given the dynamics and the control policy [symbols shown on the slide]
      • [Figure label: Target set]
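As a minimal sketch of quantitative semantics, the snippet below computes the robustness of one hypothetical "always stay within bounds" specification, G(lo ≤ y ≤ hi), on a finite discrete-time trace. The function name, the bounds, and the sample traces are illustrative assumptions, not values taken from the talk.

```python
from typing import Sequence


def always_in_range_robustness(ys: Sequence[float], lo: float, hi: float) -> float:
    """Robustness of G(lo <= y <= hi) over a finite discrete-time trace.

    Hypothetical helper for illustration only.
    """
    if not ys:
        raise ValueError("trajectory must be non-empty")
    # Per-step robustness of the predicate is the distance to the nearest bound;
    # the "always" operator takes the minimum over all time steps.
    return min(min(y - lo, hi - y) for y in ys)


safe_trace = [20.0, 25.0, 30.0, 28.0]    # illustrative data
unsafe_trace = [20.0, 12.0, 8.0, 15.0]   # dips below the lower bound

print(always_in_range_robustness(safe_trace, 10.0, 50.0))    # 10.0
print(always_in_range_robustness(unsafe_trace, 10.0, 50.0))  # -2.0
```

A positive score means the trace satisfies the specification with that much margin; a negative score means it violates the specification by that much.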
  • 16. Partitioning the Initial States by Policy
      • A control policy partitions the initial state space into successful and failed subsets
      • [Figure: initial state space divided into successful initial states (✅) and failed initial states (×)]
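The partition can be illustrated on sampled initial states: roll out the closed loop from each one and split by the sign of the STL robustness. The sketch below assumes a toy one-dimensional system, a deliberately flawed hand-written policy, and a "stay within [10, 50]" specification; all names and numbers are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def rollout(s0: float, policy: Callable[[float], float], horizon: int) -> List[float]:
    """Simulate the toy dynamics y' = y + u for `horizon` steps (assumed dynamics)."""
    trace = [s0]
    for _ in range(horizon):
        trace.append(trace[-1] + policy(trace[-1]))
    return trace


def robustness(trace: Sequence[float], lo: float = 10.0, hi: float = 50.0) -> float:
    """Robustness of G(lo <= y <= hi) on a finite trace."""
    return min(min(y - lo, hi - y) for y in trace)


def flawed_policy(y: float) -> float:
    """Steers toward 30 from high states but drifts downward from low ones."""
    return 0.2 * (30.0 - y) if y >= 20.0 else -1.0


def partition(initial_states: Sequence[float],
              policy: Callable[[float], float],
              horizon: int = 10) -> Tuple[List[float], List[float]]:
    """Split sampled initial states into (successful, failed) by robustness sign."""
    successful, failed = [], []
    for s0 in initial_states:
        target = successful if robustness(rollout(s0, policy, horizon)) > 0 else failed
        target.append(s0)
    return successful, failed


successful, failed = partition([12.0, 25.0, 45.0], flawed_policy)
print(successful, failed)  # [25.0, 45.0] [12.0]
```

Sampling only approximates the partition; the talk additionally relies on verification to reason about whole regions of initial states.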
  • 17. Repair with Preservation (RwP) Problem
      Find an alternative policy that maximizes the quantity of repaired initial states, while preserving the correctness of successful initial states
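One way to write this problem down, using notation assumed here rather than taken from the slides: let π_θ be the original policy and π_θ' a candidate repaired policy, let ρ(s_0, π) be the STL robustness of the closed-loop trajectory from initial state s_0, and let S_s and S_f be the successful and failed initial-state sets under π_θ. Treating strictly positive robustness as success is also an assumption.

```latex
\begin{aligned}
\max_{\theta'} \quad & \bigl|\{\, s_0 \in S_f : \rho(s_0, \pi_{\theta'}) > 0 \,\}\bigr| \\
\text{s.t.} \quad & \rho(s_0, \pi_{\theta'}) > 0 \quad \forall\, s_0 \in S_s
\end{aligned}
```

Here |·| stands for the size of the repaired set (for example, its volume), which is what "quantity of repaired initial states" would measure under these assumptions.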
  • 18. 3. Solution: Incremental Simulated Annealing Repair (ISAR)
  • 19. Divide and Conquer: Incremental Repair
      • [Figure: partitioned initial state space, as on slide 16]
      • If a solution is found, add the repaired region to the successful set.
      • Whether a solution is found or not, repeat for the next failed region.
  • 20. Tackling the Incremental Repair Challenges
      1. How do we know if all initial states in a region satisfy the specification?
          • First, estimate robustness on a finite sample of initial states
          • Second, verify safety on an infinite set (incomplete but sound) [Ivanov et al. 2019]
      2. In what order to select failed regions for repair?
          • Greedy: pick the region with the highest min STL robustness on sampled initial states
      3. How to enforce the preservation constraint on successful regions?
          • First, add a logarithmic barrier to the objective function
          • Second, reject policies that break the constraint on sampled initial states
      4. How to avoid getting stuck in local optima?
          • Simulated annealing: perturb parameters with increasing randomness [Kirkpatrick 1983]
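The third item (logarithmic barrier plus rejection) can be sketched as follows. The exact objective used in the talk is not captured in this transcript, so the form below, including the function name and the `barrier_weight` parameter, is an assumption chosen only to show the mechanism.

```python
import math
from typing import Sequence


def barrier_objective(failed_rob: Sequence[float],
                      successful_rob: Sequence[float],
                      barrier_weight: float = 0.1) -> float:
    """Hypothetical objective to maximize for one failed region.

    failed_rob:     robustness on sampled initial states of the region under repair
    successful_rob: robustness on sampled initial states of the successful regions
    """
    worst_success = min(successful_rob)
    if worst_success <= 0:
        # Preservation constraint broken on a sampled state: reject outright.
        return -math.inf
    # Reward the worst-case robustness in the failed region; the log barrier
    # penalizes candidates that push successful states toward violation.
    return min(failed_rob) + barrier_weight * math.log(worst_success)


print(barrier_objective([-1.0, -0.5], [1.0, 2.0]))   # -1.0 (log(1.0) adds nothing)
print(barrier_objective([0.5], [-0.1, 3.0]))         # -inf (rejected)
```

The barrier term tends to minus infinity as the worst successful state approaches the violation boundary, so a search that maximizes this objective is steered away from candidates that nearly break previously correct behavior.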
  • 21. Refinement of the RwP Problem
      Find an alternative policy that
      • Maximizes the number of repaired sampled initial states
      • Subject to keeping the previously verified initial states still verified
  • 22. Repairing One Region
      • Objective function to maximize: [formula shown on the slide]
  • 23. Repairing One Region
      • Objective function to maximize: [formula shown on the slide]
      • Algorithm annotations (the pseudocode itself is shown on the slide):
          - perturb NN weights
          - measure improvement
          - flip a biased coin
          - save new NN weights
          - increase the temperature
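The five annotated steps can be sketched as a generic annealing loop. This is not the authors' pseudocode: the acceptance rule (a Metropolis-style test), the Gaussian perturbation, and every parameter name and default below are assumptions; only the step order and the increasing temperature follow the slide.

```python
import math
import random
from typing import Callable, List, Tuple


def anneal(weights: List[float],
           objective: Callable[[List[float]], float],
           iters: int = 200,
           temp0: float = 0.1,
           temp_growth: float = 1.02,
           seed: int = 0) -> Tuple[List[float], float]:
    """Maximize `objective` over a flat weight vector (illustrative sketch)."""
    rng = random.Random(seed)
    current, current_val = list(weights), objective(weights)
    best, best_val = list(current), current_val
    temp = temp0
    for _ in range(iters):
        # perturb NN weights (noise scale tied to the temperature)
        candidate = [w + rng.gauss(0.0, temp) for w in current]
        # measure improvement
        cand_val = objective(candidate)
        delta = cand_val - current_val
        # flip a biased coin: always accept improvements, sometimes accept regressions
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            # save new NN weights
            current, current_val = candidate, cand_val
            if current_val > best_val:
                best, best_val = list(current), current_val
        # increase the temperature (more randomness in later iterations)
        temp *= temp_growth
    return best, best_val


def toy_objective(w: List[float]) -> float:
    """Stand-in for the repair objective; peaks at w = [3.0]."""
    return -(w[0] - 3.0) ** 2


best_w, best_v = anneal([0.0], toy_objective)
print(best_v >= toy_objective([0.0]))  # True
```

In a repair setting, `objective` would evaluate closed-loop robustness for the candidate weights, and a candidate that breaks preservation could be given an objective of minus infinity so that the coin flip never accepts it.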
  • 24. Incremental Simulated Annealing Repair (ISAR)
      [Flowchart, reconstructed from its labels]
      1. Initialize: successful set := verified set
      2. Sort all failed regions by STL robustness on the current policy; terminate if empty
      3. Pick the failed region with the highest min STL robustness
      4. Run safeguarded simulated annealing on the failed region
      5. If repaired: append the repaired region to the successful set. If not repaired: discard the selected region
      6. Return to step 2
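The outer loop of the flowchart can be sketched as below. The per-region repair and the robustness estimate are passed in as callables because their details are not in this transcript; the function signature and the toy stand-ins (a scalar "policy" and integer "regions") are hypothetical.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def isar(policy: Any,
         failed_regions: Sequence[Any],
         verified_regions: Sequence[Any],
         min_robustness: Callable[[Any, Any], float],
         repair_region: Callable[[Any, Any, List[Any]], Optional[Any]],
         ) -> Tuple[Any, List[Any]]:
    """Incremental repair loop following the flowchart labels (illustrative sketch)."""
    successful = list(verified_regions)   # initialize: successful set := verified set
    remaining = list(failed_regions)
    while remaining:                      # terminate if empty
        # sort failed regions by min STL robustness under the current policy
        remaining.sort(key=lambda r: min_robustness(policy, r), reverse=True)
        region = remaining.pop(0)         # greedy pick: highest min robustness
        repaired = repair_region(policy, region, successful)
        if repaired is not None:
            policy = repaired             # keep the repaired policy
            successful.append(region)     # append the repaired region
        # if not repaired, the region is simply discarded
    return policy, successful


# Toy stand-ins: the policy is a number, and a region r is "satisfied" when policy > r.
def toy_min_robustness(policy: float, region: int) -> float:
    return policy - region


def toy_repair(policy: float, region: int, successful: List[int]) -> Optional[float]:
    if region > 3:                        # pretend regions beyond 3 cannot be repaired
        return None
    candidate = region + 0.5
    # safeguard: the candidate must keep every successful region satisfied
    if all(toy_min_robustness(candidate, r) > 0 for r in successful):
        return candidate
    return None


final_policy, repaired_set = isar(1.0, [2, 5, 3], [0], toy_min_robustness, toy_repair)
print(final_policy, repaired_set)  # 3.5 [0, 2, 3]
```

In the toy run, region 5 is attempted last and discarded, while the earlier repairs stay in place.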
  • 25. 4. Evaluation: Unmanned underwater vehicle (UUV) and mountain car (MC)
  • 26. Case Studies
      • Unmanned Underwater Vehicle (UUV): staying between two bounds [values shown on the slide] for 30 seconds
      • Mountain Car (MC): reaching [target shown on the slide] within 110 seconds
  • 29. Computation Time Ranges
      • [Figure panels: UUV verification per region; UUV simulated annealing per iteration; UUV STL robustness check for all regions; MC verification per region; MC simulated annealing per iteration; MC STL robustness check for all regions]
  • 30. Conclusion
      • We resolve conflicts between initial states when repairing NN control policies for STL tasks
      • We formulate the Repair with Preservation (RwP) problem and refine it into a solvable version
      • We propose Incremental Simulated Annealing Repair (ISAR) to tackle the RwP problem
      • We evaluate ISAR on two standard NN control benchmarks
  • 31. Limitations and Future Work
      • ISAR has computational inefficiencies:
          - It randomly perturbs the NN until fixing a failed region while preserving successful regions
          - Verification is done from scratch instead of incrementally
      • Future work directions:
          - More efficient search: interpolate along the trade-off between repair and preservation
          - Resolving conflicts between multiple safety specifications, instead of initial states
          - More ablation studies: various selection orders, architectures, hyperparameters, systems
  • 32. References
      • [Cohen 2022] Cohen, Dor, and Ofer Strichman. "Automated repair of neural networks." arXiv preprint arXiv:2207.08157 (2022).
      • [Fainekos & Pappas 2009] Fainekos, Georgios E., and George J. Pappas. "Robustness of temporal logic specifications for continuous-time signals." Theoretical Computer Science (2009).
      • [Fu 2021] Fu, Feisi, and Wenchao Li. "Sound and complete neural network repair with minimality and locality guarantees." arXiv preprint arXiv:2110.07682 (2021).
      • [Fu 2022] Fu, Feisi, et al. "Reglo: Provable neural network repair for global robustness properties." Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS (2022).
      • [Majd et al. 2021] Majd, Zhou, Amor, Fainekos, and Sankaranarayanan. "Local repair of neural networks using optimization." arXiv preprint arXiv:2109.14041 (2021).
      • [Donze & Maler 2010] Donzé, Alexandre, and Oded Maler. "Robust satisfaction of temporal logic over real-valued signals." International Conference on Formal Modeling and Analysis of Timed Systems, Springer (2010).
      • [Ivanov 2019] Ivanov, Radoslav, James Weimer, Rajeev Alur, George J. Pappas, and Insup Lee. "Verisig: verifying safety properties of hybrid systems with neural network controllers." HSCC (2019).
      • [Kirkpatrick 1983] Kirkpatrick, Scott, C. Daniel Gelatt Jr, and Mario P. Vecchi. "Optimization by simulated annealing." Science (1983).
      • [Levine and Abbeel 2014] Levine, Sergey, and Pieter Abbeel. "Learning neural network policies with guided policy search under unknown dynamics." Advances in Neural Information Processing Systems 27 (2014).
      • [Maler 2004] Maler, Oded, and Dejan Nickovic. "Monitoring temporal properties of continuous signals." FTRTFT (2004).
      • [Sotoudeh 2021] Sotoudeh, Matthew, and Aditya V. Thakur. "Provable repair of deep neural networks." ACM SIGPLAN PLDI (2021).
      • [Zhou 2020] Zhou, Weichao, Ruihan Gao, BaekGyu Kim, Eunsuk Kang, and Wenchao Li. "Runtime-safety-guided policy repair." RV (2020).
  • 33. Neural Network (NN) Repair
      • Ultimate goal: NNs that fulfill formal specifications from all inputs, e.g. [formula shown on the slide]
      • NN repair aims to modify NN parameters to make the counterexamples satisfy the spec. [Cohen 2022]
      • [Diagram labels: NN Repair; Modifying the NN architecture; Not modifying the NN architecture (only search for alternative parameters); Ground-truth outputs available; Ground-truth outputs unavailable; With ground-truth outputs: adversarial learning; Without ground-truth outputs: direct parameter modification; annotation: "Majority of the literature focuses on this"]
  • 34. Repairing NN-based Control Policies
      • General-purpose NNs vs. NN-based control policies
      • The former is usually evaluated on a single-pass output
      • The latter is usually evaluated on a state trajectory of multiple passes
      • E.g.: UUV must maintain its y position between 10 and 50 for 30 seconds.
      • E.g.: Robot must first grab object O at position A, and then go to position B.
  • 35. STL Robustness (a.k.a. Quantitative Semantics)