Inferring behavior model of a running software system is quite useful for several automated software engineering tasks such as program comprehension, anomaly detection, and testing. Most existing dynamic model inference techniques are white-box, i.e., they require source code to be instrumented to get run-time traces. However, in many systems, instrumenting the entire source code is not possible (e.g. when using black-box third-party libraries) or might be very costly. Unfortunately, most black-box techniques that detect states over time are either univariate, or make assumptions on the data distribution, or have limited power for learning over a long period of past behavior. To overcome the above issues, in this paper, we propose a hybrid deep neural network that accepts as input a set of time series, one per input/output signal of the system, and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns, over time. We have applied our approach on a real UAV auto-pilot solution from our industry partner with half a million lines of C code. We ran 888 random recent system-level test cases and inferred states, over time. Our comparison with several traditional time series change point detection techniques showed that our approach improves their performance by up to 102%, in terms of finding state change points, measured by F1 score. We also showed that our state classification algorithm provides on average 90.45% F1 score, which improves traditional classification algorithms by up to 17%.
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
Hybrid Deep Neural Networks to Infer State Models of Black-Box Systems
1. Hybrid Deep Neural Networks to Infer State Models of
Black-Box Systems
▹ Mohammad Jafar Mashhadi
▹ Hadi Hemmati
ASE 2020
2. Problem context: Black-box Automated Systems
Automated systems are all around us.
At their core they use either:
▹ AI: Self driving cars
▹ Feedback loop controllers: Autopilot, cruise
control, plant controllers, etc.
2
3. Problem context: Understanding their behaviour
▹ Understanding their behaviour
▸ Verification
▸ Anomaly detection
▸ Automated Testing
▹ They are often state based
3
5. Problem context: white-box model inference
▹ Most of the prior work focuses on white-box analysis
▹ But, black-box systems are quite often used in practice
5
6. Problem context: white-box vs. black-box analysis
▹ Static analysis is out of question
▹ Dynamic analysis has limitations
6
Test
Instrumenter
Attach
Generate
Execution Traces
Abstractor Abstract
Model
Generates
Runs on
Source Code
Abstractor Abstract
Model
Generates
???
7. Problem: Inferring state models from black-box systems
▹ How to go about inferring states in a black-box controller?
7
Abstractor Abstract
Model
Generates
???
8. Solution Context: Feedback loop controllers
▹ Car cruise control
▸ Goal: Keep current speed ≈ desired speed
▸ Input: Speed
▸ Output: Throttle increase/decrease
8
9. Solution Context: Feedback loop controllers
▹ Autopilot
▸ Goal: Depends on the internal state
▸ Input: Airspeed, AOA, attitude, MSL altitude, etc.
▸ Outputs: Throttle, control surfaces’ deflections
9
Aircraft illustration from Kopyt and Żugaj 20
10. Solution Context: Control loops in Autopilot (PID)
▹ “Goal: depends on the internal state”:
▸ During takeoff:
⬩ Heading: keep constant, align with runway
⬩ Throttle: keep constant, maximum
▸ In cruise:
⬩ Heading: constant, based on flight plan
⬩ Throttle: reacts to altitude changes
⬩ Pitch: reacts to airspeed changes
▹ Hypothesis: The reverse should be possible
▸ Observing which inputs/outputs are coupled at each moment to infer
what the internal state is
10
11. Solution Context: Prior work
▹ Change point detection in time series is promising
▹ But they have limitations:
▸ Assumptions on the data:
⬩ Univariate
⬩ One change point
⬩ Known number of change points
⬩ Specific distributions
▸ Being too general:
⬩ Not exploiting the knowledge about the system
11
12. The Solution: A hybrid deep neural network12
Training a neural network model that can infer the state from
the stream of input and outputs
13. More on Hybrid Deep Neural Network Models
▹ Hybrid in this context: CNN + RNN
▹ Showed to be effective in similar problems
13
Sofma
Convolutional LayersAll Signals Fully Connected Layers
Seq2Seq
Recurrent Layers
Saeoeime
Black-Box
System
Inputs Outputs
14. More on Hybrid Deep Neural Network Models
▹ Convolutional layers
▸ Can detect local features (lines, circles, patterns, etc.)
▸ Can learn preprocessing (rolling average, discreate derivative, etc.)
14
Input
Kernel
Output
Images credit: mlnotebook
15. More on Hybrid Deep Neural Network Models
▹ Recurrent layers
▸ Output depends on current input, and history: It has memory
▸ Can discover long-term relations
▸ Perfect for processing sequential data (text, time series, etc.)
15
Image credit: Wiley
16. The final Solution16
GRU
GRU
LeakyReLu
Softmax
C 1D 3 C 1D 10C 1D 5 C 1D 15 C 1D 20 S 2S GRU F C c La
▹ Proposed architecture:
▸ 5 Convolutions with increasing kernel size
▸ 2 large Recurrent layers
▹ based on:
▸ Similar architectures in literature
▸ Experiments and tuning
18. Case Study: Autopilot software18
Icons made by DinosoftLabs from Flaticon is licensed by CC 3.0 BY
500K
LoC in C
1000+
Customers
In 85 Countries
UAVs
20. 20
Experiment
Internal State Detection Non-Hybrid Models
RQ1) How does the proposed technique perform in detecting the state changes?
When the state change happens
What the source and target states are
State Change Detection
21. 21
Experiment (RQ1)
Internal State Detection Non-Hybrid Models
▹ Metrics
▹ Baselines
▹ Design
Precision, Recall, and F1 Score with a tolerance
State Change Detection
Ground truth:
Prediction:
TP TP FP FN FN TN
Precision =
TP
P
Recall =
TP
T
F1 = 2
Prec × Recall
Prec + Recall
22. 22
Experiment (RQ1)
Internal State Detection Non-Hybrid Models
▹ Metrics
▹ Baselines
▹ Design
Ruptures library [Truong et al. ‘18]
State Change Detection
23. 23
Experiment (RQ1)
Internal State Detection Non-Hybrid Models
▹ Metrics
▹ Baselines
▹ Design
State Change Detection
Different configurations of baseline algos
▹Search Method:
▸ Pelt (Exact method)
▸ Bottom-up Segmentation
▸ Window based
▹Penalty: 3 values
▹Cost functions: 7 values
27. 27
Results (RQ1)
Internal State Detection Non-Hybrid ModelsState Change Detection
Compared to traditional CPD algorithms the proposed model is both
faster and less resource intensive; All while it showed an improvement of
88 to 102%, measured by F1 score, compared to the best baseline.
28. 28
Experiment
Internal State DetectionState Change Detection Non-Hybrid Models
RQ2) How well does the proposed technique predict the internal state of the
system?
RQ1 was a binary classification, here we have a multiclass classification problem
29. 29
Experiment (RQ2)
Internal State DetectionState Change Detection Non-Hybrid Models
▹ Metrics
▹ Baselines
▹ Design
▹Multi-class classification
▸ Classes = The Possible States
▸ Average of precision and recalls per class
Ground truth:
Prediction:
30. 30
Experiment (RQ2)
Internal State DetectionState Change Detection Non-Hybrid Models
▹ Metrics
▹ Baselines
▹ Design
Traditional ML Classifiers
▹Ridge Classifier
▹3 Decision Trees
31. 31
Experiment (RQ2)
Internal State DetectionState Change Detection Non-Hybrid Models
Input: A rolling window of size 𝑤 over the 10 inputs: 10𝑤 features
𝑤 ∈ 3, 5, 10, 15, 20
48 Configurations overall
▹ Metrics
▹ Baselines
▹ Design
32. 32
Results (RQ2)
Internal State DetectionState Change Detection Non-Hybrid Models
+18% +16% +17%
▹Larger window size is better,
because there are many long-
term relations to discover
▹Training complexity increases
dramatically with growing
window size
33. 33
Results (RQ2)
Internal State DetectionState Change Detection Non-Hybrid Models
▹
▸
▹
▹
The proposed model, improved classical ML models’ best performance by
up to 17%, measured by F1 score, in a shorter execution time and using
less computational resources.
36. 36
Experiment (RQ3)
Internal State DetectionState Change Detection Non-Hybrid Models
▹ Baselines
▹ Metrics▹Metrics: the same as RQ1 and RQ2
▸ CPD Precision and recall with 3 tolerances
▸ Classification precision and recall
37. 37
Results (RQ3)
Internal State DetectionState Change Detection Non-Hybrid Models
▹Larger recurrent layers is
more effective than
larger convolutions
▸ Discovering long-term
relations
▸ Training larger convolutions
requires more data
38. 38
Results (RQ3)
Internal State DetectionState Change Detection Non-Hybrid Models
▹
▸
▸
The hybrid architecture performs better than a comparable RNN model or fully
convolutional model, however the recurrent section plays a more important role in
the model’s performance.
39. Potential future extensions
▹ More case studies
▹ Applying it to other types of controller systems
▹ Using the inferred model in a downstream task
▹ Systematic hyper-parameter tuning
39