Ivan Ruchkin
Postdoctoral Researcher
University of Pennsylvania
Some Department of Engineering or Science
Some University
Some day and month, Spring 2022
Overcoming Heterogeneity in Autonomous Cyber-Physical Systems
— Undergrad @ Moscow State University, Moscow, Russia
○ Applied math & computer science
— PhD @ Carnegie Mellon University, Pittsburgh, PA
○ Software engineering & formal methods
○ Advisor: David Garlan
— Postdoc @ University of Pennsylvania, Philadelphia, PA
○ Guarantees for learning-enabled systems
○ Supervisors: Oleg Sokolsky & Insup Lee
— Industry @ NASA JPL, CMU SEI, Google Summer of Code, …
○ Software engineering
My background is heterogeneous
Autonomy disasters threaten our society
Research area: Cyber-Physical Systems (CPS)
— Software-driven systems operating autonomously in the physical world
— Interdisciplinary expertise: control, networks, formal methods, mechanics, AI, ...
CPS research is growing and well-funded
I started working on CPS
What do we want from CPS?
Safety: “nothing bad happens”
— No harm to humans, infrastructure, environment, or the CPS itself
— Safety violations → economic losses & deaths
Trustworthiness: “worthy of our trust”
— Able to rely on CPS to perform as expected
— Requires understanding and evidence
— Loss of trust → slower development & adoption
Strong comprehensive guarantees
— Safety in the face of uncertainty
— Absence of modeling errors
— Perception accuracy
My research contribution: engineering frameworks for CPS
Automated engineering tools
— Languages and algorithms
— For design, analysis, and monitoring
— Incorporating domain knowledge
Formal Methods:
- Formal logic
- Verification
- Run-time monitoring
Artificial Intelligence:
- Neural networks (NN)
- Probabilistic reasoning
- Statistical bounds
Software/Systems Engineering:
- Model-based engineering
- Design representations
- Domain-specific languages
My research combines different techniques
Publications: JSS’21, EMSOFT’20, IEEE SW’19, TAC’14, ICCPS’22a, ICCPS’22b, Submitted’22, WinterSim’21, ACSOS’20, FM’18, SEAMS’17, FGCS’18, CBSE’15, EMSOFT’14, Monterey’12, SYRCoSE’10, FMOS’21, SISSY’21, MARTCPS’16, ACVI’16, CPS-SPC’15, ACES-MB’15, AVICPS’14, CHASE’21
How do we engineer CPS today?
[Diagram: Data and Models feed Design and Synthesis & Analysis, producing the System (Control, Perception) along with Monitors, Guarantees, and Recovery]
Autonomous systems have little awareness
Heterogeneity leads to safety issues
Heterogeneity: diversity of components and models
Vision: Awareness of Own Limitations
[Diagram: the engineering pipeline (Data, Models → Design, Synthesis & Analysis → System with Control & Perception, plus Monitors, Guarantees, Recovery), annotated with three heterogeneity problems: I. Fragmentation, II. Inconsistencies, III.]
Heterogeneity fragments guarantees & monitors
Dive 1: Confidence Composition
[Diagram: Sensor data feeds a Perception NN and a Control NN in the System; the Dynamics Simulation, two Confidence monitors, Closed-loop NN verification (“Safe”), and Recovery are fragmented, with unknown (“?”) connections between them]
— Correct operation of a component
○ E.g., confidence in NN outputs [Guo’17]
○ Difficult to connect to system-level safety
— Satisfaction of requirements (including safety)
○ E.g., temporal specification monitoring [Deshmukh’17]
○ Violations detected too late to recover
— Future violations of requirements
○ E.g., Simplex architecture [Sha’01]
○ Key idea: assumption monitoring, e.g., of proof premises [Mitsch’14]
What to monitor at run time?
Assumptions relate guarantees, monitors & recovery
[Diagram: the fragmented architecture from the previous slide; the links between verification, monitors, and recovery are still marked “?”]
Assumptions relate guarantees, monitors & recovery
[Diagram: closed-loop NN verification (“Safe”) rests on assumptions Assn 1 and Assn 2; a monitor of each assumption is sufficient to trigger Recovery]
— Discrete-output monitors: “Yes”, “No”, “Maybe” via sequential detection [Scharf’91, Poor’13]
— Limitations of discrete assumption monitoring:
○ Too coarse for highly uncertain assumptions → uninformative
○ Errors accumulate combinatorially → decreased performance [EMSOFT’20]
▹ M1 ∧ M2 ∧ M3: 10% monitor FPR* → 27% composition FPR
How to monitor assumptions?
* FPR = false positive rate
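The 10% → 27% figure above can be reproduced with a small sketch (my illustration, not the paper's artifact), assuming three independent monitors whose false alarms compound in a conjunctive composition:

```python
# Sketch: false positives accumulate when a conjunction M1 ∧ M2 ∧ M3 of
# discrete monitors is considered alarmed whenever ANY monitor false-alarms.

def composed_fpr(fprs):
    """FPR of a conjunctive composition of independent monitors:
    the composition false-alarms unless every monitor stays silent."""
    p_all_silent = 1.0
    for fpr in fprs:
        p_all_silent *= (1.0 - fpr)
    return 1.0 - p_all_silent

print(round(composed_fpr([0.1, 0.1, 0.1]), 3))  # 0.271, i.e., ~27%
```

The independence assumption is mine, chosen only to make the arithmetic concrete; correlated monitor errors would change the exact number.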
— Key idea: Confidence monitoring instead of discrete monitoring [FMOS’21, ICCPS’22a]
○ Confidence C in assumption A is an estimate of Pr(A)
○ Expected calibration error (ECE) for confidence C predicting satisfaction of A:
▹ ECE(C, A) = E[| Pr( A | C ) - C |]
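As an illustration of the ECE definition above, here is a minimal binned estimator (the binning scheme and the `n_bins` parameter are my assumptions; the paper's evaluation may estimate ECE differently):

```python
import numpy as np

def ece(confidences, outcomes, n_bins=10):
    """Binned estimate of ECE(C, A) = E[ |Pr(A | C) - C| ]:
    group samples by confidence, compare each bin's empirical satisfaction
    rate of A with its mean confidence, and weight by bin mass."""
    c = np.asarray(confidences, dtype=float)
    a = np.asarray(outcomes, dtype=float)
    bins = np.clip((c * n_bins).astype(int), 0, n_bins - 1)  # bin index per sample
    err = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # |empirical Pr(A | C in bin) - mean confidence in bin|, weighted by bin mass
            err += mask.mean() * abs(a[mask].mean() - c[mask].mean())
    return err
```

A perfectly calibrated monitor (e.g., confidence 0.5 with A satisfied half the time) yields an estimate of 0.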
Assumptions relate guarantees, monitors & recovery
[Diagram: the assumption-monitoring architecture repeated; monitors of Assn 1 and Assn 2, sufficient for Recovery, connect back to the “Safe” verification result]
Case study: unmanned underwater vehicle (UUV)
— Req: no obstacle collisions
— Req: no pipeline loss
[Diagram: UUV with obstacle distance d and pipe position y, annotated with detection confidence and dynamics confidence]
Case study: unmanned underwater vehicle (UUV)
Our goal: combine the confidences into a predictive probability of safety
Existing work:
— Aggregate predictions of the same phenomenon
○ Forecast combination and ensemble learning [Ranjan’10, Sagi’18]
→ but we have different assumptions
— Build a comprehensive model
○ Probabilistic graphical models: Bayesian/Markov networks [Pearl’88, Koller’09]
○ Copulas: joint distributions for given marginals [Nelsen’06]
→ but requires dependencies between confidences
Confidence composition framework: overview
Design-time phase → Run-time phase
Safety requirement: “In next 30s, track pipe unless avoiding obstacles”
□30( d ≥ 5 ∧ (d ≥ 30 → 10 ≤ y ≤ 50) )
[Diagram: a closed-loop model (Dynamics, NN controller, Environment, Perception) is checked by closed-loop NN verification at design time]
Confidence composition framework: overview
Design-time phase:
— Safety requirement: “In next 30s, track pipe unless avoiding obstacles”
□30( d ≥ 5 ∧ (d ≥ 30 → 10 ≤ y ≤ 50) )
— Closed-loop NN verification of the model (Dynamics, NN controller, Environment, Perception) → “Safe”
— Verification assumptions:
○ A1: “The obstacle is far enough (d ≥ 30)”
○ A2: “Currently tracking the pipe (10 ≤ y ≤ 50)”
○ A3: “Our observations are consistent with the dynamics”
○ A4: “Perception noise is within known bounds”
— Combined assumptions: (A1 → A2) ∧ A3 ∧ A4
Run-time phase:
— Assumption confidence monitors: obstacle monitor M1 → c1, pipe monitor M2 → c2, model invalidator M3 → c3, M4 → c4
— Confidence composition: C = f(M1, M2, M3, M4) → confidence in the safety guarantees
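A hedged sketch of how the bounded-horizon requirement □30( d ≥ 5 ∧ (d ≥ 30 → 10 ≤ y ≤ 50) ) could be checked over a sampled trace (the trace format and sampling scheme are my assumptions for illustration, not the framework's implementation):

```python
def safe_state(d, y):
    """One-step predicate: d >= 5 AND (d >= 30 implies 10 <= y <= 50)."""
    return d >= 5 and (d < 30 or 10 <= y <= 50)

def always_safe(trace):
    """Bounded 'always' (the □30 operator): every sampled state within the
    30s window satisfies the predicate; `trace` is a list of (d, y) samples."""
    return all(safe_state(d, y) for d, y in trace)

# Nominal pipe-tracking: obstacle far away, y within [10, 50]
print(always_safe([(100, 30), (100, 31)]))   # True
# Obstacle avoidance: a close obstacle (5 <= d < 30) permits leaving the pipe
print(always_safe([(12, 80)]))               # True
# Collision: d < 5 violates safety outright
print(always_safe([(3, 30)]))                # False
```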
How to compose confidences?
Composed confidence: C = f(M1, M2, M3, M4)
Combined assumptions: A = (A1 → A2) ∧ A3 ∧ A4
Given: calibrated monitors, ECE(M1, A1) ≤ e1, …
Safety outcome: S ∈ {safe, unsafe}
Goal: calibrate C to S, i.e., ECE(C, S) ≤ e
Challenge: monitors have unknown dependencies and idiosyncratic inaccuracies
[ICCPS’22a] Theorem: the Goal is achieved if C is calibrated to A: ECE(C, A) ≤ g(e)
[ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022
How to compose confidences?
Problem: conjunctive composition of ECEs [ICCPS’22a]
Given: ECE(M1, A1) ≤ e1, ECE(M2, A2) ≤ e2
Find: f, ef s.t. ECE(f(M1, M2), A1 ∧ A2) ≤ ef
(Monitors have unknown dependencies and idiosyncratic inaccuracies.)
Our solution:
ECE(M1*M2, A1 ∧ A2) ≤ max[ 4*e1*e2, (Var[M1]*Var[M2])^0.5 + e1 + e2 + e1*e2 ]
ECE(w1*M1 + w2*M2, A1 ∧ A2) ≤ max[ e1 + e2 + e1*e2, max(w1, w2) + e1 + e2 − e1*e2 ]
[ICCPS’22a] Theorem: the Goal is achieved if C is calibrated to A: ECE(C, A) ≤ g(e)
[ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022
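The two bounds above transcribe directly into code; the numeric inputs in the usage are hypothetical, chosen only to exercise the formulas:

```python
import math

def ece_bound_product(e1, e2, var1, var2):
    """Upper bound on ECE(M1*M2, A1 ∧ A2) per the slide:
    max[ 4*e1*e2, sqrt(Var[M1]*Var[M2]) + e1 + e2 + e1*e2 ]."""
    return max(4 * e1 * e2, math.sqrt(var1 * var2) + e1 + e2 + e1 * e2)

def ece_bound_weighted(e1, e2, w1, w2):
    """Upper bound on ECE(w1*M1 + w2*M2, A1 ∧ A2) per the slide:
    max[ e1 + e2 + e1*e2, max(w1, w2) + e1 + e2 - e1*e2 ]."""
    return max(e1 + e2 + e1 * e2, max(w1, w2) + e1 + e2 - e1 * e2)

# Hypothetical example: two monitors calibrated to within 0.05,
# each with confidence variance 0.01
print(ece_bound_product(0.05, 0.05, 0.01, 0.01))
print(ece_bound_weighted(0.05, 0.05, 0.5, 0.5))
```

Note the trade-off the formulas expose: the product bound tightens with low-variance monitors, while the weighted-average bound is dominated by the larger weight.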
Confidence composition is useful in practice
— Case study setup: 194 simulated UUV executions
○ Random independent violations of assumptions
— Composed confidence:
○ Improves calibration to safety over individual monitors
○ Predicts mission outcome
[ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022
Composed confidence predicts mission outcome
[Plot: composed confidence over time for an example scenario]
Confidence composition: summary
— Framework for composing confidence monitors of verification assumptions
— First compositional bounds for calibration error
— Useful predictor of safety & mission outcome
Combined assumptions (A1 → A2) ∧ A3 ∧ A4, composed via f(M1, M2, M3, M4)
https://github.com/bisc/coco-case-studies
Heterogeneity leads to safety issues
Heterogeneity: diversity of components and models
Vision: Awareness of Own Limitations
[Diagram: recap of the engineering pipeline with problems I. Fragmentation (→ Dive 1: Confidence Composition), II. Inconsistencies, III.]
Inconsistent models lead to safety violations
Dive 2: Model Integration
[Diagram: inconsistent Models at Design time propagate into the System, its Monitors, Guarantees, and Recovery]
Case study: power-aware service robot
Example: inconsistencies can lead to failures
— Map & power models: built from experimental data; a graph & linear equations
— Planning model: built from first principles; a Markov decision process over mode, speed, and power
Example: inconsistencies can lead to failures
Potential inconsistency: the models significantly disagree on energy estimates
→ Robot runs out of power
Problem: are the models related in a desirable way?
E.g., “the difference between energy estimates is bounded”
Relating model structures and behaviors is hard
— Structural integration → inexpressive [Sztipanovits’14, Marinescu’16]
— Logic combinations → intractable [Gabbay’96, Barbosa’16]
— Behavioral integration → impractical [Girard’11, Rajhans’13]
— Hybrid simulation → no guarantees [Lee’14, Combemale’14]
Integration properties co-constrain models
Integration Property Language (IPL) [FM’18]
— Model structures (objects, types, attributes, relations) → first-order logic
— Model behaviors (sets of traces, past/future/chance) → modal logic
Example integration property: “For any valid sequence of tasks on a map, its total energy matches the energy in the corresponding mission plan.”
[FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
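For intuition only: a drastically simplified, hypothetical instance of the quoted property for one concrete task sequence (IPL itself quantifies over all valid task sequences and verifies the property with an SMT solver and model checker; the task names, energies, and tolerance below are invented):

```python
# Hypothetical per-task energies from the map & power models (made up)
TASK_ENERGY = {"move_corridor": 120.0, "turn": 5.0, "dock": 30.0}

def map_model_energy(tasks):
    """Total energy of a task sequence per the map & power models."""
    return sum(TASK_ENERGY[t] for t in tasks)

def consistent(tasks, plan_energy, bound=10.0):
    """One instance of the integration property: the two models' energy
    estimates for the same mission differ by at most `bound`."""
    return abs(map_model_energy(tasks) - plan_energy) <= bound

mission = ["move_corridor", "turn", "dock"]  # map model total: 155.0
print(consistent(mission, plan_energy=150.0))  # True: within 10 of 155
print(consistent(mission, plan_energy=100.0))  # False: 55 disagreement
```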
Integration properties can be verified
— Verification algorithm: combines an SMT solver* with a model checker
— Soundness theorem: if verification returns an answer, it is correct
— Termination theorem: verification always terminates on finite structures
— Implementation: https://github.com/bisc/IPL
*SMT = Satisfiability Modulo Theories
[FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
Model integration is useful in practice
— Applied to the history of the robot’s models
— Practical utility:
○ Specified & verified 120+ power consistency properties
○ Verification time from seconds to dozens of hours
○ Discovered 17 real-world inconsistencies
[FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for
Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
Example: discovered safety-critical inconsistency
Discovered error: battery := max(battery – req_energy, 0)
→ The mission’s last task does not require sufficient battery
Impact of error: some missions would run out of power
Manual fix: add a check: battery > req_energy
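The discovered error and its fix can be re-created in a hypothetical sketch (variable and function names are illustrative, not the robot's actual code):

```python
def battery_after_task_buggy(battery, req_energy):
    """Discovered error: clamping at 0 silently masks an infeasible task,
    so the mission's last task need not have sufficient battery."""
    return max(battery - req_energy, 0)

def battery_after_task_fixed(battery, req_energy):
    """Manual fix: explicitly require battery > req_energy before the task."""
    if not battery > req_energy:
        raise ValueError("insufficient battery for task")
    return battery - req_energy

# The buggy model "completes" a 15-unit task on a 10-unit battery,
# masking a mission that would run out of power in reality.
print(battery_after_task_buggy(10, 15))  # 0
```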
Model integration is useful in practice
— Summary of model integration:
○ Framework to check consistency between structures and behaviors
○ Language & verification algorithm with correctness guarantees
○ Useful for finding inconsistencies in practice
— Applied to the history of the robot’s models: 120+ power consistency properties specified & verified, 17 real-world inconsistencies discovered
Heterogeneity leads to safety issues
Heterogeneity: diversity of components and models
Vision: Awareness of Own Limitations
[Diagram: recap of the three problems and responses — I. Fragmentation → Dive 1: Confidence Composition; II. Inconsistencies → Dive 2: Model Integration; III. → Analysis Integration and Reasoning about Monitor Accuracy]
Related publications: [EMSOFT’14, AVICPS’14, CPS-SPC’15, ACES-MB’15], [EMSOFT’20, CHASE’21, ICCPS’22b, Submitted’22]
My research agenda develops awareness
Vision: frameworks for engineering autonomous CPS that
— Are aware of their limitations
— Intelligently respond to them
— Inform their own engineering
Thrusts (longer term, larger scope):
A. Comprehensive confidence: improve trust & inform recovery
B. Recovery guarantees: maintain safety after failures
C. Assumption engineering: manipulate assumptions automatically
Potential funding:
— NSF: Cyber-Physical Systems (CPS), Robust Intelligence (RI), Software and Hardware Foundations (SHF), Formal Methods in the Field (FMitF), Foundational Research in Robotics, National Robotics Initiative 3.0 (NRI)
— DoD: ONR Science of Autonomy, DARPA, AFRL/AFOSR, ARO/ARL
Thrust A: Making confidence comprehensive
— Status quo: instantaneous, pessimistic confidence
— Direction: diversify sources of confidence
[Timeline: confidence from testing (t − 24h: tests/simulations performed) → current confidence (t − 60s to t) → confidence from predicting (t + 600s: prediction confirmed/rejected)]
— Project: statistical run-time abstractions of testing results
— Project: compositional predictive monitoring
Thrust C: Automating assumption engineering
— Status quo: engineers manage assumptions manually
— Direction: intelligent frameworks for discovering, making, monitoring, and retracting assumptions
Example: robot on pavement → grass
— Assumptions: designed for pavement; always detects non-pavement
— → May get stuck on grass
— Project: mining assumptions
— Project: assumption specs
Thanks to my co-authors & mentors
Thanks to funding agencies
Heterogeneity threatens the safety and trustworthiness of CPS
→ Two complementary techniques for overcoming heterogeneity:
Confidence composition to combine run-time monitors
Model integration to discover design-time inconsistencies
→ Future: engineering autonomous systems aware of their limitations
Thanks for your attention
— [ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for
Monitors of Verification Assumptions. In Proceedings of the International Conference on
Cyber-Physical Systems (ICCPS), 2022.
— [FMOS’21] Ruchkin, Cleaveland, Sokolsky, Lee. Confidence Monitoring and Composition for Dynamic
Assurance of Learning-Enabled Autonomous Systems. In Formal Methods in Outer Space (FMOS):
Essays Dedicated to Klaus Havelund on the Occasion of His 65th Birthday, 2021.
— [EMSOFT’20] Ruchkin, Sokolsky, Weimer, Hedaoo, Lee. Compositional Probabilistic Analysis of
Temporal Properties Over Stochastic Detectors. In Proceedings of the International Conference on
Embedded Software (EMSOFT), 2020.
— [FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for
Multi-Model Cyber-Physical Systems. In Proceedings of the International Symposium on Formal
Methods (FM), 2018.
References: my work in the talk
— [Pearl’88] Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.
Morgan Kaufmann, 1988.
— [Scharf’91] Scharf. Statistical Signal Processing. Pearson, 1991.
— [Gabbay’96] Gabbay. Fibred Semantics and the Weaving of Logics Part 1: Modal and Intuitionistic
Logics. In the Journal of Symbolic Logic, 1996.
— [Sha’01] Sha. Using simplicity to control complexity. In IEEE Software, 2001.
— [Nelsen’06] Nelsen. An Introduction to Copulas. Springer-Verlag, 2006.
— [Koller’09] Koller, Friedman, Bach. Probabilistic Graphical Models: Principles and Techniques. MIT
Press, 2009.
— [Ranjan’10] Ranjan, Gneiting. Combining probability forecasts. In the Journal of the Royal Statistical
Society: Series B (Statistical Methodology), 2010.
— [Girard’11] Girard, Pappas. Approximate Bisimulation: A Bridge Between Computer Science and
Control Theory. In the European Journal of Control, 2011.
References: related work p.1
— [Poor’13] Poor. An Introduction to Signal Detection and Estimation. Springer, 2013.
— [Rajhans’13] Rajhans, Krogh. Compositional Heterogeneous Abstraction. In Proceedings of the
International Conference on Hybrid Systems: Computation and Control (HSCC), 2013.
— [Mitsch’14] Mitsch, Platzer. ModelPlex: Verified Runtime Validation of Verified Cyber-Physical System
Models. In Proceedings of the International Conference on Runtime Verification (RV), 2014.
— [Sztipanovits’14] Sztipanovits, Bapty, Neema, Howard, Jackson. OpenMETA: A Model and
Component-Based Design Tool Chain for Cyber-Physical Systems. In From Programs to Systems - The
Systems Perspective in Computing, 2014.
— [Lee’14] Lee, Neuendorffer, Zhou. System Design, Modeling, and Simulation using Ptolemy II.
Ptolemy.org, 2014.
— [Combemale’14] Combemale, Deantoni, Baudry, France, Jezequel, Gray. Globalizing Modeling
Languages. In IEEE Computer, 2014.
References: related work p.2
— [Marinescu’16] Marinescu. Model-driven Analysis and Verification of Automotive Embedded Systems. PhD Thesis, Mälardalen University, 2016.
— [Barbosa’16] Barbosa, Martins, Madeira, Neves. Reuse and Integration of Specification Logics: The
Hybridisation Perspective. In Theoretical Information Reuse and Integration, Springer, 2016.
— [Deshmukh’17] Deshmukh, Donze, Ghosh, Jin, Juniwal, Seshia. Robust Online Monitoring of Signal
Temporal Logic. In Formal Methods in System Design, 2017.
— [Guo’17] Guo, Pleiss, Sun, Weinberger. On calibration of modern neural networks. In Proceedings of
the 34th International Conference on Machine Learning (ICML), 2017.
— [Sagi’18] Sagi, Rokach. Ensemble learning: A survey. In Wiley Interdisciplinary Reviews: Data Mining
and Knowledge Discovery, 2018.
References: related work p.3

Overcoming Heterogeneity in Autonomous Cyber-Physical Systems

  • 1.
    Ivan Ruchkin Postdoctoral Researcher Universityof Pennsylvania Some Department of Engineering or Science Some University Some day and month, Spring 2022 Overcoming Heterogeneity in Autonomous Cyber-Physical Systems
  • 2.
    — Undergrad @Moscow State University, Moscow, Russia ○ Applied math & computer science — PhD @ Carnegie Mellon University, Pittsburgh, PA ○ Software engineering & formal methods ○ Advisor: David Garlan — Postdoc @ University of Pennsylvania, Philadelphia, PA ○ Guarantees for learning-enabled systems ○ Supervisors: Oleg Sokolsky & Insup Lee — Industry @ NASA JPL, CMU SEI, Google Summer of Code, … ○ Software engineering My background is heterogeneous 2
  • 3.
  • 4.
    Research area: Cyber-PhysicalSystems (CPS) 4 — Software-driven systems operating autonomously in the physical world — Interdisciplinary expertise: control, networks, formal methods, mechanics, AI, ...
  • 5.
    CPS research isgrowing and well-funded 5 I started working on CPS
  • 6.
    Safety: “nothing badhappens” — No harm to humans, infrastructure, environment, or the CPS itself — Safety violations → economic losses & deaths What do we want from CPS? 6 Trustworthiness: “worthy of our trust” — Able to rely on CPS to perform as expected — Requires understanding and evidence — Loss of trust → slower development & adoption Strong comprehensive guarantees — Safety in the face of uncertainty — Absence of modeling errors — Perception accuracy My research contribution: engineering frameworks for CPS Automated engineering tools — Languages and algorithms — For design, analysis, and monitoring — Incorporating domain knowledge
  • 7.
    7 Formal Methods: - Formallogic - Verification - Run-time monitoring Artificial Intelligence: - Neural networks (NN) - Probabilistic reasoning - Statistical bounds Software/Systems Engineering: - Model-based engineering - Design representations - Domain-specific languages My research combines different techniques JSS’21 EMSOFT’20 IEEE SW’19 TAC’14 ICCPS’22a ICCPS’22b Submitted’22 WinterSim’21 ACSOS’20 FM’18 SEAMS’17 FGCS’18 CBSE’15 EMSOFT’14 Monterey’12 SYRCoSE’10 FMOS’21 SISSY’21 MARTCPS’16 ACVI’16 CPS-SPC’15 ACES-MB’15 AVICPS’14 CHASE’21
  • 8.
    Data Models How do weengineer CPS today? 8 Synthesis & Analysis Design System Control Perception Monitors Guarantees Recovery
  • 9.
    Autonomous systems havelittle awareness 9
  • 10.
    Vision: Awareness of OwnLimitations Heterogeneity leads to safety issues Data Models 10 Synthesis & Analysis Design System Control Perception Monitors Guarantees Heterogeneity: diversity of components and models I. Fragmentation Recovery II. Inconsistencies III. How do we engineer CPS today?
  • 11.
    Heterogeneity fragments guarantees& monitors 11 System Sensor data Control NN Perception NN Dynamics Simulation Confidence monitor Closed-loop NN verification Fragmented ? Dive 1: Confidence Composition ? ? “Safe” Recovery Confidence monitor
  • 12.
    — Correct operationof a component ○ E.g., confidence in NN outputs [Guo’17] ○ Difficult to connect to system-level safety — Satisfaction of requirements (including safety) ○ E.g., temporal specification monitoring [Deshmukh’17] ○ Violations detected too late to recover — Future violations of requirements ○ E.g., Simplex architecture [Sha’01] ○ Key idea: assumption monitoring, e.g., of proof premises [Mitsch’14] What to monitor at run time? 12
  • 13.
    Assumptions relate guarantees,monitors & recovery 13 System Sensor data Control NN Perception NN Dynamics Simulation Confidence monitor Closed-loop NN verification Fragmented ? ? ? “Safe” Recovery Confidence monitor
  • 14.
    Assumptions relate guarantees,monitors & recovery 14 Recovery System Sensor data Assn 1 Assn 2 Monitor of Assn 1 Monitor of Assn 2 Sufficient “Safe” Dynamics Simulation Control NN Perception NN Closed-loop NN verification
  • 15.
    — Discrete-output monitors:“Yes”, “No”, “Maybe” via sequential detection [Scharf’91, Poor’13] — Limitations of discrete assumption monitoring: ○ Too coarse for highly uncertain assumptions → uninformative ○ Errors accumulate combinatorially → decreased performance [EMSOFT’20] ▹ M1 ∧ M2 ∧ M3: 10% monitor FPR* → 27% composition FPR How to monitor assumptions? 15 * FPR = false positive rate — Key idea: Confidence monitoring instead of discrete monitoring [FMOS’21, ICCPS’22a] ○ Confidence C in assumption A is an estimate of Pr(A) ○ Expected calibration error (ECE) for confidence C predicting satisfaction of A: ▹ ECE(C, A) = E[| Pr( A | C ) - C |]
  • 16.
    Assumptions relate guarantees,monitors & recovery 16 Recovery System Sensor data Assn 1 Assn 2 Monitor of Assn 1 Monitor of Assn 2 “Safe” Dynamics Simulation Control NN Perception NN Closed-loop NN verification Sufficient
  • 17.
    17 Req: no obstaclecollisions Req: no pipeline loss Detection confidence Dynamics confidence Detection confidence Distance d Position y Case study: unmanned underwater vehicle (UUV)
  • 18.
    — Build acomprehensive model ○ Probabilistic graphical models: Bayesian/Markov networks [Pearl’88, Koller’09] ○ Copulas: joint distributions for given marginals[Nelsen’06] Our goal: combine the confidences into a predictive probability of safety Existing work: — Aggregate predictions of the same phenomenon ○ Forecast combination and ensemble learning[Ranjan’10, Sagi’18] Case study: unmanned underwater vehicle (UUV) 18 Req: no obstacle collisions Req: no pipeline loss Detection confidence Distance d Position y Dynamics confidence Detection confidence → but we have different assumptions → but requires dependencies between confidences
  • 19.
    Confidence composition framework:overview 19 Design-time phase Run-time phase Safety requirement “In next 30s, track pipe unless avoiding obstacles” □30 ( d ≥ 5 ∧ (d ≥ 30 → 10 ≤ y ≤ 50) ) Dynamics NN controller Environment Perception Model Closed-loop NN verification
  • 20.
    Confidence in the safetyguarantees Confidence composition framework: overview 20 Design-time phase Run-time phase Safety requirement “In next 30s, track pipe unless avoiding obstacles” □30 ( d ≥ 5 ∧ (d ≥ 30 → 10 ≤ y ≤ 50) ) Dynamics NN controller Environment Perception A1: “The obstacle is far enough (d ≥ 30)” A2: “Currently tracking the pipe (10 ≤ y ≤ 50)” A3: “Our observations are consistent with the dynamics” A4: “Perception noise is within known bounds” Model Assumption confidence monitors Obstacle monitor M1 Model invalidator M3 Pipe monitor M2 M4 c1 c2 c3 c4 C = f(M1, M2, M3, M4) Confidence composition M2 M1 M3 M4 “Safe” Closed-loop NN verification Combined assumptions (A1 → A2) ∧ A3 ∧ A4
  • 21.
    How to composeconfidences? 21 C = f(M1, M2, M3, M4) Composed confidence M2 M1 M3 M4 A2 A1 A3 A4 A = (A1 → A2) ∧ A3 ∧ A4 Combined assumptions Verification Given: calibrated monitors ECE(M1, A1) ≤ e 1 , … Safety outcome S ∈ { , } Goal: calibrate C to S ECE(C, S) ≤ e [ICCPS’22a] Theorem: the Goal is achieved if C is calibrated to A: ECE(C, A) ≤ g(e) Monitors have - Unknown dependencies - Idiosyncratic inaccuracies [ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022
  • 22.
    ECE(M1*M2, A1∧A2) ≤max[4e1 e2 , (Var[M1]*Var[M2])0.5 + e1 + e2 + e1 e2 ] ECE(w1 *M1+w2 *M2, A1∧A2) ≤ max[e1 + e2 + e1 e2 , max[w1 , w2 ] + e1 + e2 − e1 e2 ] How to compose confidences? 22 C = f(M1, M2, M3, M4) Composed confidence M2 M1 M3 M4 A2 A1 A3 A4 A = (A1 → A2) ∧ A3 ∧ A4 Combined assumptions Given: calibrated monitors ECE(M1, A1) ≤ e 1 , … Pr( )=? Problem: conjunctive composition of ECEs: [ICCPS’22a] Given: ECE(M1, A1) ≤ e1 , ECE(M2, A2) ≤ e2 Find: f, ef s.t. ECE(f(M1, M2), A1 ∧ A2) ≤ ef Our solution: Monitors have - Unknown dependencies - Idiosyncratic inaccuracies [ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022 [ICCPS’22a] Theorem: the Goal is achieved if C is calibrated to A: ECE(C, A) ≤ g(e)
  • 23.
    ○ Predicts missionoutcome — Case study setup: 194 simulated UUV executions ○ Random independent violations of assumptions — Composed confidence ○ Improves calibration to safety over individual monitors Confidence composition is useful in practice 23 [ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. To appear in the Int’l Conf. on Cyber-Physical Systems (ICCPS), May 2022
  • 24.
    Composed confidence predictsmission outcome 24 Scenario:
  • 25.
    — Framework forcomposing confidence monitors of verification assumptions — First compositional bounds for calibration error — Useful predictor of safety & mission outcome Confidence composition: summary 25 A2 A1 A3 A4 (A1 → A2) ∧ A3 ∧ A4 f(M1, M2, M3, M4) M2 M1 M3 M4 https://github.com/bisc/coco-case-studies
  • 26.
    Vision: Awareness of OwnLimitations Data Models Heterogeneity leads to safety issues 26 Design System Control Perception Monitors Guarantees I. Fragmentation Recovery II. Inconsistencies Dive 1: Confidence Composition Synthesis & Analysis III. Heterogeneity: diversity of components and models
  • 27.
    Models Design Inconsistent models leadto safety violations 27 Inconsistent System Dive 2: Model Integration Control Perception Monitors Guarantees Recovery
  • 28.
    Case study: power-awareservice robot 28
  • 29.
    — Built fromexperimental data — Graph & linear equations Example: inconsistencies can lead to failures 29 — Built from first principles — Markov decision process mode speed power Map & power models Planning model
  • 30.
    → Robot runsout of power Example: inconsistencies can lead to failures 30 Potential inconsistency: Models significantly disagree on energy estimates mode speed power Map & power models Planning model Problem: are the models related in a desirable way? E.g., “the difference between energy estimates is bounded”
  • 31.
    Relating model structuresand behaviors is hard 31 mode speed power ? Map & power models Planning model Structural integration → inexpressive [Sztipanovits’14, Marinescu’16] Logic combinations → intractable [Gabbay’96, Barbosa’16] Behavioral integration → impractical [Girard’11, Rajhans’13] Hybrid simulation → no guarantees [Lee’14, Combemale’14] Mission plans Tasks & their energies
  • 32.
    Integration properties co-constrainmodels 32 Integration property mode speed power Map & power models Planning model Objects, types, attributes, relations → first-order logic Sets of traces, past/future/chance → modal logic “For any valid sequence of tasks on a map, its total energy matches the energy in the corresponding mission plan.” Integration Property Language (IPL) [FM’18] Mission plans Tasks & their energies [FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
  • 33.
    Integration property 33 mode speed power Map& power models Planning model “For any valid sequence of tasks on a map, its total energy matches the energy in the corresponding mission plan.” SMT Solver* Model checker Integration properties can be verified Mission plans Tasks & their energies Soundness theorem: Verification returns an answer → it is correct Termination theorem: Verification always terminates on finite structures Implementation: https://github.com/bisc/IPL *SMT = Satisfiability Modulo Theories Verification algorithm [FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
  • 34.
    Model integration isuseful in practice 34 — Applied to the history of the robot’s models — Practical utility: ○ Specified & verified 120+ power consistency properties ○ Verification time from seconds to dozens of hours ○ Discovered 17 real-world inconsistencies [FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In the Int’l Symp. on Formal Methods (FM), 2018
  • 35.
    Example: discovered safety-criticalinconsistency 35 Map & power model Planning model Discovered error: battery := max(battery – req_energy, 0) Impact of error: Some missions would run out of power Manual fix: Add a check: battery > req_energy → The mission’s last task does not require sufficient battery
  • 36.
    — Summary ofmodel integration: ○ Framework to check consistency between structures and behaviors ○ Language & verification algorithm with correctness guarantees ○ Useful for finding inconsistencies in practice — Applied to the history of the robot’s models — Practical utility: ○ Specified & verified 120+ power consistency properties ○ Verification time from seconds to dozens of hours ○ Discovered 17 real-world inconsistencies Model integration is useful in practice 36
  • 37.
    Vision: Awareness of OwnLimitations Data Models Heterogeneity leads to safety issues 37 Design System Control Perception Monitors Guarantees Recovery II. Inconsistencies Dive 1: Confidence Composition Dive 2: Model Integration Reasoning about Monitor Accuracy Analysis Integration Synthesis & Analysis III. [EMSOFT’14, AVICPS’14, CPS-SPC’15, ACES-MB’15] [EMSOFT’20, CHASE’21, ICCPS’22b, Submitted’22] Heterogeneity: diversity of components and models
My research agenda develops awareness 38
Vision: frameworks for engineering autonomous CPS that
— Are aware of their limitations
— Intelligently respond to them
— Inform their own engineering
Thrusts (longer term, larger scope):
— A. Comprehensive confidence: improve trust & inform recovery
— B. Recovery guarantees: maintain safety after failures
— C. Assumption engineering: manipulate assumptions automatically
Potential funding:
— NSF: Cyber-Physical Systems (CPS), Robust Intelligence (RI), Software and Hardware Foundations (SHF), Formal Methods in the Field (FMitF), Foundational Research in Robotics, National Robotics Initiative 3.0 (NRI)
— DoD: ONR Science of Autonomy, DARPA, AFRL/AFOSR, ARO/ARL
Thrust A: Making confidence comprehensive 39
— Status quo: instantaneous, pessimistic confidence
— Direction: diversify sources of confidence across time:
○ Confidence from testing (t − 24h): tests/simulations performed
○ Current confidence (t − 60s to t)
○ Confidence from prediction (t + 600s): prediction confirmed/rejected
— Project: Statistical run-time abstractions of testing results
— Project: Compositional predictive monitoring
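A tiny sketch of what fusing these confidence sources could look like. The two fusion rules below (conservative minimum and an independence product) are generic placeholders for illustration, not the composition functions proposed in this thrust; the source names and values are invented.

```python
# Hypothetical fusion of confidence from past testing, the current
# run-time monitor, and a forward prediction (all values illustrative).

def fuse_conservative(confidences):
    """Pessimistic fusion: trust the system only as much as the weakest evidence."""
    return min(confidences)

def fuse_independent(confidences):
    """Optimistic fusion assuming the evidence sources fail independently."""
    p_all_fail = 1.0
    for c in confidences:
        p_all_fail *= (1.0 - c)
    return 1.0 - p_all_fail

sources = {"testing(t-24h)": 0.90, "monitor(t)": 0.75, "prediction(t+600s)": 0.80}
print(fuse_conservative(sources.values()))              # 0.75
print(round(fuse_independent(list(sources.values())), 3))  # 0.995
```

The gap between the two outputs shows why the choice of composition function matters: the status-quo pessimistic view discards most of the accumulated evidence, while naive independence assumptions can overstate confidence.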
Thrust C: Automating assumption engineering 40
— Status quo: engineers manage assumptions manually
— Direction: intelligent frameworks for discovering, making, monitoring, and retracting assumptions
— Example: robot driving from pavement onto grass
○ Assumptions: designed for pavement; always detects non-pavement
○ Consequence: may get stuck on grass
— Project: Mining assumptions
— Project: Assumption specs
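One way to picture a monitored, retractable assumption from the example above. The `Assumption` class, its methods, and the terrain observations are all invented for this sketch; the thrust envisions frameworks that discover and manage such objects automatically.

```python
# Hypothetical sketch of a run-time-monitored assumption (Thrust C).
# The robot was designed for pavement; the assumption is retracted the
# moment an observation contradicts it, so recovery can be triggered.

class Assumption:
    def __init__(self, name, holds):
        self.name = name
        self.holds = holds        # predicate over the current observation
        self.active = True

    def monitor(self, observation):
        """Retract the assumption as soon as an observation violates it."""
        if self.active and not self.holds(observation):
            self.active = False   # retracted; never silently re-adopted
        return self.active

designed_for_pavement = Assumption(
    "terrain is pavement", lambda obs: obs["terrain"] == "pavement")

print(designed_for_pavement.monitor({"terrain": "pavement"}))  # True
print(designed_for_pavement.monitor({"terrain": "grass"}))     # False: retracted
```

The design choice worth noting is that retraction is one-way: once evidence contradicts an assumption, the system should not quietly resume relying on it without an explicit re-adoption step.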
Thanks to my co-authors & mentors 41
Thanks to funding agencies 42
Thanks for your attention 43
— Heterogeneity threatens the safety and trustworthiness of CPS
→ Two complementary techniques for overcoming heterogeneity:
○ Confidence composition to combine run-time monitors
○ Model integration to discover design-time inconsistencies
→ Future: engineering autonomous systems aware of their limitations
References: my work in the talk 44
— [ICCPS’22a] Ruchkin, Cleaveland, Ivanov, Lu, Carpenter, Sokolsky, Lee. Confidence Composition for Monitors of Verification Assumptions. In Proceedings of the International Conference on Cyber-Physical Systems (ICCPS), 2022.
— [FMOS’21] Ruchkin, Cleaveland, Sokolsky, Lee. Confidence Monitoring and Composition for Dynamic Assurance of Learning-Enabled Autonomous Systems. In Formal Methods in Outer Space (FMOS): Essays Dedicated to Klaus Havelund on the Occasion of His 65th Birthday, 2021.
— [EMSOFT’20] Ruchkin, Sokolsky, Weimer, Hedaoo, Lee. Compositional Probabilistic Analysis of Temporal Properties Over Stochastic Detectors. In Proceedings of the International Conference on Embedded Software (EMSOFT), 2020.
— [FM’18] Ruchkin, Sunshine, Iraci, Schmerl, Garlan. IPL: An Integration Property Language for Multi-Model Cyber-Physical Systems. In Proceedings of the International Symposium on Formal Methods (FM), 2018.
References: related work p.1 45
— [Pearl’88] Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
— [Scharf’91] Scharf. Statistical Signal Processing. Pearson, 1991.
— [Gabbay’96] Gabbay. Fibred Semantics and the Weaving of Logics Part 1: Modal and Intuitionistic Logics. In the Journal of Symbolic Logic, 1996.
— [Sha’01] Sha. Using Simplicity to Control Complexity. In IEEE Software, 2001.
— [Nelsen’06] Nelsen. An Introduction to Copulas. Springer-Verlag, 2006.
— [Koller’09] Koller, Friedman, Bach. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
— [Ranjan’10] Ranjan, Gneiting. Combining Probability Forecasts. In the Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2010.
— [Girard’11] Girard, Pappas. Approximate Bisimulation: A Bridge Between Computer Science and Control Theory. In the European Journal of Control, 2011.
References: related work p.2 46
— [Poor’13] Poor. An Introduction to Signal Detection and Estimation. Springer, 2013.
— [Rajhans’13] Rajhans, Krogh. Compositional Heterogeneous Abstraction. In Proceedings of the International Conference on Hybrid Systems: Computation and Control (HSCC), 2013.
— [Mitsch’14] Mitsch, Platzer. ModelPlex: Verified Runtime Validation of Verified Cyber-Physical System Models. In Proceedings of the International Conference on Runtime Verification (RV), 2014.
— [Sztipanovits’14] Sztipanovits, Bapty, Neema, Howard, Jackson. OpenMETA: A Model and Component-Based Design Tool Chain for Cyber-Physical Systems. In From Programs to Systems – The Systems Perspective in Computing, 2014.
— [Lee’14] Lee, Neuendorffer, Zhou. System Design, Modeling, and Simulation using Ptolemy II. Ptolemy.org, 2014.
— [Combemale’14] Combemale, Deantoni, Baudry, France, Jezequel, Gray. Globalizing Modeling Languages. In IEEE Computer, 2014.
References: related work p.3 47
— [Marinescu’16] Marinescu. Model-driven Analysis and Verification of Automotive Embedded Systems. PhD Thesis, Mälardalen University, 2016.
— [Barbosa’16] Barbosa, Martins, Madeira, Neves. Reuse and Integration of Specification Logics: The Hybridisation Perspective. In Theoretical Information Reuse and Integration, Springer, 2016.
— [Deshmukh’17] Deshmukh, Donze, Ghosh, Jin, Juniwal, Seshia. Robust Online Monitoring of Signal Temporal Logic. In Formal Methods in System Design, 2017.
— [Guo’17] Guo, Pleiss, Sun, Weinberger. On Calibration of Modern Neural Networks. In Proceedings of the 34th International Conference on Machine Learning (ICML), 2017.
— [Sagi’18] Sagi, Rokach. Ensemble Learning: A Survey. In Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2018.