2. Prerequisites
Basic knowledge of the CITADEL framework
● Only very briefly recalled here.
Knowledge of CITADEL Modeling,
Specifications and Verification Tools for
Related material
This module covers communications monitoring
theory; [9] provides instructions on how to
configure and use the code. Examples are
included with the code.
[1, sec 3] documents the communications
monitoring component.
[2] provides more details on association rules.
Other material and knowledge
TU/e Training – Advanced Technical Module Communication Monitoring 2
3. Content
Overview of communications monitoring
Basics of intrusion detection
Communications monitoring in CITADEL
Communication monitoring methods
● `signature’-based monitoring
● white-box anomaly detection
● association rules
Monitoring and Specification interaction
● specifying predicates to be learned
4. Intrusion detection basics
Distinguish `legitimate’ from `malicious’ cases.
Classification is not perfect; a trade-off must be made
between detection rate and false positive rate.
False positive rate: % legitimate marked as attack
Detection rate: % malicious marked as attack
Two categories of approaches:
Black listing
● Specify known malicious cases to be prevented
White listing
● Specify allowed `good’ situations.
But `lists’ do not cover all cases; leading to
● false negatives for black listing (new, unknown attacks)
● false positives for white listing (unseen legitimate behaviour).
Traffic      Flagged as normal   Flagged as attack
legitimate   True negative       False positive
malicious    False negative      True positive (detection)
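The two rates defined above follow directly from the four cells of the table. A minimal sketch, using hypothetical counts that are not taken from any CITADEL experiment:

```python
def rates(tp, fp, tn, fn):
    """Return (detection_rate, false_positive_rate) as fractions in [0, 1]."""
    malicious = tp + fn    # all malicious cases
    legitimate = fp + tn   # all legitimate cases
    detection_rate = tp / malicious if malicious else 0.0
    false_positive_rate = fp / legitimate if legitimate else 0.0
    return detection_rate, false_positive_rate

# Hypothetical counts, for illustration only.
dr, fpr = rates(tp=45, fp=20, tn=980, fn=5)
print(f"detection rate: {dr:.1%}, false positive rate: {fpr:.1%}")
```

Because legitimate traffic usually far outnumbers attacks, even a small false positive rate can produce many more false alerts than true detections.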
6. Citadel Planes interaction
The Monitoring Plane gathers and evaluates information from the
Operational Plane and, when needed, alerts the Adaptation Plane.
[Figure: Overview of the CITADEL planes [8]]
7. From model to monitoring
Different monitors may be configured for different
(types of) interfaces used for different (types of)
applications.
Model specifies interfaces (part of architecture)
Analysis of the model determines which need to be
monitored (Monitor synthesis).
Model may be useful in the creation of monitors as
well (discussed more below).
Configuration plane activates the required
monitoring.
Monitoring may trigger alerts leading to adaptation
of the system (adaptation plane).
8. Monitoring communication
Processes use network interfaces to talk to other processes.
Traffic on the monitored interface(s) is also sent to the monitor.
The monitor extracts and analyses message features.
[Figure: Process A and Process B communicate; the Monitor receives the raw data, which a parser turns into message features.]
10. Basics of monitoring: Features
Features capture specific aspects of
messages.
Connection aspects, e.g.: Sender, Receiver,
ports they use, timestamp, ...
but also content based, e.g. http response
code, function code, setpoint, etc.
● Different protocols will have different content that
can be extracted.
or even composite (metadata), e.g. `connection’,
which captures both sender and receiver and their
ports.
Analysis considers tuples of feature values(*),
i.e. all relevant information is captured by
features.
(*) Feature value: the value (e.g. 192.168.0.1) a feature (e.g. source-ip) takes.
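As an illustration, a parsed message can be represented as a tuple of feature values, with a composite feature derived from it. This is only a sketch: the field names and the `connection` helper are assumptions chosen for this example, not the data model of the CITADEL monitor.

```python
from typing import NamedTuple, Tuple

class MessageFeatures(NamedTuple):
    """One parsed message as a tuple of feature values (hypothetical field names)."""
    timestamp: float
    source_ip: str
    source_port: int
    dest_ip: str
    dest_port: int
    response_code: str   # content-based feature, e.g. an HTTP response code

def connection(m: MessageFeatures) -> Tuple[str, int, str, int]:
    """Composite feature: sender, receiver and their ports taken together."""
    return (m.source_ip, m.source_port, m.dest_ip, m.dest_port)

msg = MessageFeatures(1500000000.0, "192.168.0.1", 51234, "192.168.0.2", 8080, "200")
print(connection(msg))
```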
11. Analysis: What to look for and how
Monitors look for indicators of compromise,
risks or problems in the system by finding
specific situations (blacklisting) or deviations
from the norm (whitelisting). We consider:
`signature’-based(*) monitoring (blacklisting)
white-box anomaly detection (whitelisting)
association rules (constraints; whitelisting)
(*) Here we use the term `signature’ for a quite general form of rule based blacklisting.
Other literature may use a much more specific notion of signature.
13. `Signature’-based monitoring
If we know the `bad situation’ we are looking
for (be it an attack, failure, etc.) we can try
to capture it in a signature.
Simply specify which combination of feature
values indicates the situation.
A signature can be quite specific to capture
exactly the situation we want to detect.
This can combine a high likelihood of detecting the
situation with a low chance of false positives.
Below we show a simple signature and how it
can be used within the CITADEL framework.
14. `Signature’-based scenario
`Balancer’ B can be configured to use one of two
servers (S1, S2). It is currently using S1.
If the configured server fails, B will send out an
internal server error response: an HTTP 500 message.
If this happens, B should be reconfigured to use S2
instead.
[Figure: balancer B sending a 500 response.]
15. `Signature’-based scenario
The CITADEL communications monitoring component
monitors the outgoing connection of balancer B.
The monitor uses a rule:
If response_code == ‘500’ then ALARM_8080
i.e. if the message response code is `500’ then raise an
alert called `ALARM_8080’.
[Figure: the CITADEL monitor observing the 500 response sent by B.]
16. `Signature’-based scenario
Alert id `ALARM_8080’ is defined in the system
model so the adaptation plane recognizes it.
The adaptation plane handles the alarm by instructing the
configuration plane to switch to a configuration
where B uses S2 instead. (See those planes and
[7] for more information on these steps.)
[Figure: the CITADEL monitor raising ALARM_8080 on the 500 response sent by B.]
17. `Signatures’ in general
We have seen a signature using a condition to
trigger an alert; a signature rule consists of:
a condition, which is a boolean expression on
features, e.g.
● response_code == 500
● setpoint1 < 5 OR setpoint1 > 10
an alert identifier, which is a string (its
meaning is given by the system model), e.g.
● ALARM_8080
● SetpointOutOfRange
Note how signature rules may use any of the
features that we have defined, including
those about the content of the
communication, making them quite versatile.
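A signature rule, as a condition plus an alert identifier, can be sketched as follows. This is an illustrative Python model only; it is not the rule syntax or API of the CITADEL monitor (see [9] for the actual configuration), and `SignatureRule`, `RULES` and `evaluate` are hypothetical names. The response code is compared as a string here, which is an assumption about how the parser delivers it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, object]   # feature name -> feature value

@dataclass
class SignatureRule:
    condition: Callable[[Features], bool]   # boolean expression on features
    alert_id: str                           # meaning is given by the system model

RULES: List[SignatureRule] = [
    SignatureRule(lambda f: f.get("response_code") == "500", "ALARM_8080"),
    SignatureRule(
        lambda f: "setpoint1" in f and (f["setpoint1"] < 5 or f["setpoint1"] > 10),
        "SetpointOutOfRange",
    ),
]

def evaluate(features: Features, rules: List[SignatureRule] = RULES) -> List[str]:
    """Return the alert identifiers of all rules whose condition holds."""
    return [r.alert_id for r in rules if r.condition(features)]

print(evaluate({"response_code": "500"}))
print(evaluate({"response_code": "200", "setpoint1": 12}))
```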
19. Anomaly detection
Signatures work well if you know what
you are looking for, but typically not all
attacks/failures will be known.
Monitor for anomalous behaviour that can
indicate attacks/problems.
This requires distinguishing between `normal’ and
`anomalous’ traffic.
Learn a model of normal traffic from training
data.
Whitelisting: any deviation from the normal
model is seen as anomalous.
White-box: informative features and an
understandable model.
20. Feature binning
Some features with useful information may not be directly suitable
for learning.
Consider for example a timestamp. Trying to learn the exact millisecond
something happens is not meaningful. However, it may be interesting
whether it is during the day or the night.
Binning allows learning the useful part of such features.
Binning is needed for features that can take many different values (e.g. large
numbers, floating point), though it can be used on any feature.
The domain of values is divided into sets called bins.
A feature is assigned the bin its value falls into, rather than the exact value taken.
Examples:
Ranges for potentially large numbers, such as the size of a message
● bins could be: 0-499 bits, 500-999 bits, etc.
There are many ways to bin a timestamp, e.g.
● ranges, such as the hour in which it falls, or the part of the day:
morning/afternoon/evening/night,
● the day it falls on: Mon, Tue, ..., Sun, or weekday vs weekend.
● What makes most sense depends on the application.
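The binning examples above can be sketched in a few lines. The bin width and the hour boundaries for the parts of the day are assumptions made for this illustration; suitable values depend on the application.

```python
from datetime import datetime

def bin_size(size_bits: int, width: int = 500) -> str:
    """Map a message size to a range bin such as '0-499' or '500-999'."""
    low = (size_bits // width) * width
    return f"{low}-{low + width - 1}"

def bin_part_of_day(ts: datetime) -> str:
    """Map a timestamp to a part of the day (boundaries are assumptions)."""
    hour = ts.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

def bin_weekday(ts: datetime) -> str:
    """Map a timestamp to 'weekday' or 'weekend'."""
    return "weekend" if ts.weekday() >= 5 else "weekday"

ts = datetime(2024, 1, 6, 3, 15)   # a Saturday, early in the morning
print(bin_size(750), bin_part_of_day(ts), bin_weekday(ts))
```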
21. Learning based scenario
In this scenario (inspired by [4]) we
consider messages that contain requests
sent to a database.
Examples of features:
time of access, source
query content
● command (eg select, update, delete)
● which tables & fields
● response content (eg #records retrieved)
● combinations of the above.
[Figure: a Monitor observing the requests sent to the Database.]
22. Learning based scenario
Bob from accounting needs access to the
database for his job, but we do not know
beforehand how that translates exactly
into the requests he makes.
Learn his profile by monitoring normal
behaviour for some time (training; learning
a model).
Histograms capture behaviour per feature.
For many anomaly‐based approaches, the models and alerts are not very informative (`black box’).
Here models (and alerts as we will see below) are easy to interpret; `whitebox’.
23. Thresholds and Model tuning
After learning normal behaviour, one needs to set which
values (bins) are considered normal and which are
anomalous.
The easiest way is to set a single threshold; anything
that is less likely than the threshold is seen as
anomalous.
The figure shows the effect of setting (an extremely
high) threshold of 26%.
● Insert and delete commands and access to column age are
seen as anomalous.
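Learning a per-feature histogram and applying a threshold can be sketched as below. The training data is hypothetical and is not the data behind the figure; `learn_histogram` and `is_anomalous` are names chosen for this illustration.

```python
from collections import Counter

def learn_histogram(values):
    """Learn the relative frequency of each value seen in the training data."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def is_anomalous(hist, value, threshold):
    """A value is anomalous if it is less likely than the threshold."""
    return hist.get(value, 0.0) < threshold   # unseen values have likelihood 0

# Hypothetical training data for the feature `command`.
training = ["select"] * 70 + ["update"] * 22 + ["insert"] * 6 + ["delete"] * 2
hist = learn_histogram(training)

for command in ("select", "update", "insert", "delete", "drop"):
    print(command, is_anomalous(hist, command, threshold=0.10))
```

Raising the threshold marks more values as anomalous (fewer missed attacks, more false positives); lowering it does the opposite.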
24. Thresholds and Model tuning
In addition to a default threshold one can set a threshold per feature
The figure shows that a threshold of 10% will still leave delete as `anomalous’.
We can also tweak the model itself;
If we know this value is ok, we can tweak the model to specifically set it to normal.
Similarly we can mark values as anomalous even when encountered in the training.
(Further) tweaking may occur upon a detection
Upon false positives mark values as normal, tweak thresholds and/or use a sliding
window (discussed below).
Upon true positives one may set a custom alert, and update the system model (more
below).
Being `whitebox’ makes such tweaking possible
25. Alerts on deviations
Alerts are raised when observed queries do not fit
the learned behaviour, i.e. a feature value has a
likelihood lower than the threshold.
Alerts indicate why the query is strange; which
features cause the alert.
The alert below shows Sally accesses unusual data
at a strange time.
26. Context can matter
Bob may need to change the value of days_off
update is a normal value on feature command
Bob may need to read the content of name
name is a normal value for feature column_set
However, normally he would not change the name.
The combination update and name is not normal
The model above cannot detect this; update and name by themselves are normal.
Combined feature command-column_set can detect this
Consider such composite features if features are correlated
Association rules below give another way to specify relationships between features.
Having the right features is essential for effective
white-box anomaly detection; see also [5], [6].
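The effect of a composite feature can be sketched as follows, using hypothetical training queries for Bob. Each feature has its own histogram; the alert lists the features whose value is unlikely, which is what makes the result easy to interpret. The names and the 5% threshold are assumptions for this example.

```python
from collections import Counter

def learn(values):
    """Relative frequency of each value in the training data."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

# Hypothetical training queries as (command, column_set) pairs.
training = ([("update", "days_off")] * 30
            + [("select", "name")] * 50
            + [("select", "days_off")] * 20)

command_hist = learn(command for command, _ in training)
column_hist = learn(column for _, column in training)
combined_hist = learn(training)   # composite feature command-column_set

def anomalous_features(query, threshold=0.05):
    """Return the names of the features whose value is less likely than the threshold."""
    command, column = query
    likelihood = {
        "command": command_hist.get(command, 0.0),
        "column_set": column_hist.get(column, 0.0),
        "command-column_set": combined_hist.get(query, 0.0),
    }
    return [name for name, p in likelihood.items() if p < threshold]

print(anomalous_features(("update", "name")))   # only the composite feature deviates
print(anomalous_features(("select", "name")))
```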
27. From system specification to
anomaly detection and back.
28. Monitoring and Specification
Monitoring can benefit from the system specification
Where to monitor, what to monitor for; interpreting data
(defining features), potentially useful combinations of
features.
Specification can benefit from learning through
monitoring
Learn details of the specification instead of having to
define them by hand.
Alerts may indicate situations not yet considered in the
specification.
Below we detail this interactive approach combining
specification and monitoring, illustrated with a simple
smart manufacturing use case.
29. Monitoring and Specification [3]
Multiple views on the same system:
“What is seen”, “What it means” and “What is correct”.
[Figure: learn/detect loop over the data, the model and the lightweight specification. Labels: classification / visualization; features linked to predicates; uninterpreted predicates; alerts linked to features; providing context; FP: tuning; TP: completing.]
30. Bottle Filling Plant (BFP)
A smart manufacturing use case
Remote controlled production facility.
Fills bottles with two ingredients, mixes them & inspects result.
Picture shows main components and communication links.
31. Components of the BFP
Physical Process:
Belt: moves the bottles from station to station, can be started and
stopped.
Stations :
● each station has a sensor (1-4) to detect whether a bottle is present
● Filling stations: with valves that can be opened and closed to control the flow
of liquid
● Mixer: blends the liquids in the bottle, can be started and stopped.
● Quality check station: has sensor (5) to measure amount of liquid in the bottle
Programmable Logic Controller (PLC)
controls `actuators’ (belt, valves, mixer) and sensors.
Uses the Modbus protocol to communicate.
Remote Terminal Unit (RTU)
provides an interface to connect to the PLC from the outside network.
Master (at the factory headquarters)
provides the remote Human machine interface (HMI)
32. `Lightweight’ Specification
In modeling the process we can use `to-be-learned’
predicates.
Their interpretation is not given by the model, but
rather filled in by learning.
Example: liquids in bottle should form a
valid mixture, but what constitutes a valid
mixture? Learning answers that:
G( bottle_ok → ?Valid(ingr1, ingr2) )
Valid is an uninterpreted property
Learn from monitoring what are valid
combinations of ingr1, ingr2.
33. BFP – Feature building
Modbus is a very simple protocol; we can extract:
Function code
register nr, value
Knowing what registers are used for (mapping them to
notions in the specification) allows extracting
meaningful features:
setpoint_1, setpoint_2, setpoint_mixer, etc.
at_valve_1, at_valve_2, etc.
Create the mapping from PLC implementation
documentation (if available), visual inspection of the
traffic (see next slide) and a (partial)
specification of the system.
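Mapping registers to named features can be sketched as a simple lookup. The register numbers below are invented for this illustration; the real mapping has to come from the PLC documentation, inspection of the traffic and the specification, as described above. Function code 6 is the standard Modbus `Write Single Register` code.

```python
# Hypothetical register map: register number -> feature name.
REGISTER_MAP = {
    0: "setpoint_1",
    1: "setpoint_2",
    2: "setpoint_mixer",
    10: "at_valve_1",
    11: "at_valve_2",
}

WRITE_SINGLE_REGISTER = 6   # standard Modbus function code 0x06

def extract_features(function_code: int, register: int, value: int) -> dict:
    """Turn one (function code, register, value) triple into named features."""
    # Unmapped registers keep a generic name so that no information is lost.
    name = REGISTER_MAP.get(register, f"register_{register}")
    return {"function_code": function_code, name: value}

print(extract_features(WRITE_SINGLE_REGISTER, 0, 60))
print(extract_features(WRITE_SINGLE_REGISTER, 42, 1))
```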
34. BFP – Feature building
[Figure: register values plotted and interpreted using the system model (see [1]) and basic process knowledge.]
35. BFP – Feature building
Counter-like features (bottles_started, bottles_done)
are not useful in the white-box model
(histograms),
but may be used to compute useful ones (e.g.
compare bottles_on_belt with bottles_started
- bottles_done; if they are not equal, this indicates a
problem).
The specification constrains `total’, which is a
reason to consider it as a potential feature.
Specified constraints:
G(¬(bottle_ok ∧ (ingr1 = 0 ∨ ingr2 = 0))) // null
G(¬(bottle_ok ∧ total > k)) // overflow
36. Using detection results
Learn and tune the model as before and deploy.
Raised alerts have context: meaningful features linked to the
system specification.
False positives are used to tune the detection model.
True positives are added to the specification.
[Figure: example alerts. A value of 120.0 is marked ✔ (additional `normal’ values). An alert with total_liquid_in_bottle = 100.0, setpoint_1 = 99.0, setpoint_2 = 1.0 shows an invalid ratio to be prevented, captured as:]
G(¬(bottle_ok ∧ ingr1 / ingr2 ≠ k)) // composition
37. Sliding windows
Sometimes a single violation of a rule or deviation from a
model is not necessarily a problem (e.g. a physical process
that needs to be in a `bad’ state for some time before it
actually becomes a problem), and it may not provide
enough evidence of compromise. In this case you would
want to react only if there are several
instances within a short time period.
Look for multiple deviations within a time window
Sliding window always considers the last T seconds,
and only raises an alarm if it finds at least N deviations
within this window.
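A time-based sliding window as described above can be sketched as follows. The class and parameter names are chosen for this illustration, and the sketch assumes deviations are reported with non-decreasing timestamps (in seconds).

```python
from collections import deque

class SlidingWindow:
    """Raise an alert only if at least N deviations fall within the last T seconds."""

    def __init__(self, duration: float, min_deviations: int):
        self.duration = duration               # T
        self.min_deviations = min_deviations   # N
        self._times = deque()                  # timestamps of recent deviations

    def add_deviation(self, timestamp: float) -> bool:
        """Record a deviation; return True if an alert should be raised."""
        self._times.append(timestamp)
        # Drop deviations that fall outside the last T seconds.
        while self._times[0] <= timestamp - self.duration:
            self._times.popleft()
        return len(self._times) >= self.min_deviations

window = SlidingWindow(duration=10, min_deviations=3)
for t in (0, 4, 20, 25, 28):
    print(t, window.add_deviation(t))
```

In this run only the deviation at t = 28 raises an alert, because it is the first moment at which three deviations (20, 25, 28) lie within one 10-second window.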
[Figure: timeline with anomalies and a sliding window of duration T. With #deviations for alert N = 3, windows containing 1 or 2 deviations raise no alert; a window containing 3 deviations raises an alert.]
38. Association rules
Systems often have many interrelated variables.
Changes in the relationships between variables, rather than only in
their values, may be relevant indicators of compromise.
White-box detection provides meaningful alerts in the form of which
feature(s) exhibit unusual values.
However, it considers features individually; if a combination of fields
is relevant, it needs to be captured in a composite
feature, like `connection’.
Association rules learn relationships and capture them efficiently.
Association rules are discussed only briefly here; see [2] for
more details.
39. Association rules steps
Invariant learning requires a training set of states, which are
extracted from a training set of network traffic.
Each time a message feature (e.g. at_valve_1) shows a change in a system
variable we update the state.
We consider two methods of learning process invariants:
Bayesian network learning
Association rule mining
We briefly show examples of process invariants that may be learned
and how to use them.
Details on how the learning works are beyond the scope of this learning
material; see [2] for details and a comparison of the two methods.
40. Association rule mining
An association rule represents a process invariant:
a combination of values that should occur together, with
a confidence level that this must be the case.
For the bottle filling plant we learn the invariants:
valve_1_on → ¬belt_moving
valve_2_on → ¬belt_moving
¬valve_2_on → belt_moving
We can interpret these rules as:
● the belt is off when a valve is open (first two)
● the belt starts moving as soon as valve 2 is off.
(These are just a few of the many rules learned; not all rules are
meaningful/useful, see [2].)
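The confidence of a candidate rule can be computed from the training states as the fraction of states matching the left-hand side that also satisfy the right-hand side. The sketch below shows only this confidence computation, not the rule mining procedure of [2], and the training states are hypothetical.

```python
def confidence(states, antecedent, consequent):
    """Fraction of states satisfying the antecedent that also satisfy the consequent."""
    matching = [s for s in states if antecedent(s)]
    if not matching:
        return None   # the rule is never applicable in the training set
    return sum(1 for s in matching if consequent(s)) / len(matching)

# Hypothetical training states (not taken from the bottle filling plant data).
states = [
    {"valve_2_on": True, "belt_moving": False},
    {"valve_2_on": False, "belt_moving": True},
    {"valve_2_on": True, "belt_moving": False},
    {"valve_2_on": False, "belt_moving": True},
    {"valve_2_on": False, "belt_moving": False},   # a rare exception
]

# valve_2_on → ¬belt_moving
c1 = confidence(states, lambda s: s["valve_2_on"], lambda s: not s["belt_moving"])
# ¬valve_2_on → belt_moving
c2 = confidence(states, lambda s: not s["valve_2_on"], lambda s: s["belt_moving"])
print(c1, c2)
```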
41. Bayesian Network learning
A Bayesian network captures dependencies between variables.
It looks at the variables rather than at individual values (see [2]).
Still, we can extract the same type of process invariants.
Focusing on valve_2_on we see it depends on belt_moving.
Note that AR learning also found a relation between these two
variables.
A high conditional probability gives a rule.
In this case we get two rules:
● ¬valve_2_on → belt_moving
● valve_2_on → ¬belt_moving
We also find that valve_1_on depends on belt_moving
but only value true has a high conditional probability:
● valve_1_on → ¬belt_moving
[Figure: Bayesian network around variable valve_2_on]
42. Using Association Rules
Consider our association rule:
¬valve_2_on→belt_moving
Whenever we see a change to an involved variable we check
that the process invariant is preserved.
If valve 2 is off but the belt is not moving, this represents a
violation of the invariant.
A rule has a confidence level (or conditional probability) x.
We thus expect it to be violated at a rate of at most 1-x.
To test this we use an (event-based) sliding window; if the
number of deviations among the last T events is larger than it
should be, an alert is raised.
● Note that if the certainty is 1.0, like for our example rule, we can raise
an alert as soon as an invariant violation is found.
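Checking the example rule with an event-based sliding window can be sketched as below. The sketch assumes that "larger than it should be" means more than (1 - x) · T violations among the last T events; this interpretation, and the class name `RuleMonitor`, are assumptions made for the illustration.

```python
from collections import deque

class RuleMonitor:
    """Check ¬valve_2_on → belt_moving over the last T events."""

    def __init__(self, confidence: float, window_events: int):
        # Assumption: at most (1 - x) * T violations are expected in T events.
        self.max_violations = (1.0 - confidence) * window_events
        self._recent = deque(maxlen=window_events)   # True = rule violated

    def observe(self, valve_2_on: bool, belt_moving: bool) -> bool:
        """Record one state change; return True if an alert should be raised."""
        violated = (not valve_2_on) and (not belt_moving)
        self._recent.append(violated)
        # Small tolerance guards against floating point rounding of (1 - x) * T.
        return sum(self._recent) > self.max_violations + 1e-9

strict = RuleMonitor(confidence=1.0, window_events=5)
print(strict.observe(valve_2_on=False, belt_moving=True))    # invariant holds
print(strict.observe(valve_2_on=False, belt_moving=False))   # violation: alert at once
```

With a confidence below 1.0 the monitor tolerates occasional violations and only alerts when they become more frequent than the learned rule allows.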
43. Related reading
[1] CITADEL D4.4 MILS Monitoring System
[2] CITADEL D3.3 CITADEL Design Techniques to Specify, Verify, and Synthesize Policies for
Run-Time Monitors
[3] From system specification to anomaly detection (and back) (2017)
Davide Fauri, Daniel Ricardo dos Santos, Elisa Costante, Jerry den Hartog, Sandro Etalle, Stefano Tonetta
Workshop on Cyber-Physical Systems Security and Privacy
[4] A white-box anomaly-based framework for database leakage detection (2017)
E Costante, J den Hartog, M Petković, S Etalle, M Pechenizkiy
Journal of Information Security and Applications 32, 27-46
[5] Towards useful anomaly detection for back office networks (2016)
Ö Yüksel, J den Hartog, S Etalle
International Conference on Information Systems Security, 509-520
[6] Reading between the fields: practical, effective intrusion detection for industrial control
systems (2016)
Ö Yüksel, J den Hartog, S Etalle
Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2063-2070
[7] CITADEL D4.5 Integrated and tested Adaptive MILS Platform
[8] CITADEL D4.3 MILS Adaptation System
[9] Module Configuring the MILS Monitoring System for Communications monitoring of CITADEL
D6.6 Training Materials for Electronic Delivery