Communications Monitoring
TU/e Training – Advanced Technical Module Communication Monitoring 1
 Prerequisites
 Basic knowledge of the CITADEL framework
● Only very briefly recalled here.
 Knowledge of CITADEL Modeling,
Specification and Verification Tools
 Related material
 This module covers communications monitoring
theory; [9] provides instructions on how to
configure and use the code. Examples are
included with the code.
 [1, sec 3] documents the communications
monitoring component.
 [2] provides more details on association rules.
Other material and knowledge
 Overview of communications monitoring
 Basics of intrusion detection
 Communications monitoring in CITADEL
 Communication monitoring methods
 `signature’-based monitoring
 white-box anomaly detection
 association rules
 Monitoring and Specification interaction
 specifying predicates to be learned
Content
Intrusion detection basics
 Distinguish `legitimate’ from `malicious’ cases.
 Classification is not perfect; will need to make a trade-
off between detection rate and false positive rate.
 False positive rate: % legitimate marked as attack
 Detection rate: % malicious marked as attack
 Two categories of approaches:
 Black listing
● Specify known malicious cases to be prevented
 White listing
● Specify allowed `good’ situations.
 But `lists’ do not cover all cases; leading to
● false negatives for black listing (new, unknown attacks)
● false positives for white listing (unseen legitimate behaviour).
Traffic:      Flagged as normal    Flagged as attack
legitimate    True negative        False positive
malicious     False negative       True positive (detection)
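The trade-off above can be made concrete with a small sketch computing the two rates from the confusion matrix in the table. The function name and the counts used in the example are illustrative, not part of the CITADEL tooling.

```python
# Detection rate and false positive rate from a confusion matrix.
# tp: malicious flagged as attack    fn: malicious flagged as normal
# fp: legitimate flagged as attack   tn: legitimate flagged as normal

def rates(tp, fn, fp, tn):
    """Return (detection rate, false positive rate)."""
    detection_rate = tp / (tp + fn)        # % malicious marked as attack
    false_positive_rate = fp / (fp + tn)   # % legitimate marked as attack
    return detection_rate, false_positive_rate

# e.g. 90 of 100 attacks detected, 5 of 100 legitimate flows flagged
dr, fpr = rates(tp=90, fn=10, fp=5, tn=95)
```

Tightening a detector typically moves both numbers together: flagging more traffic raises the detection rate but also the false positive rate.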
Communications Monitoring
within CITADEL (see also [1])
 Monitoring Plane gathers & evaluates information from
Operational Plane and, when needed, alerts the Adaptation Plane.
CITADEL Planes interaction
Overview of the CITADEL planes [8]
From model to monitoring
 Different monitors may be configured for different
(types of) interfaces used for different (types of)
applications.
 Model specifies interfaces (part of architecture)
 Analysis of the model determines which need to be
monitored (Monitor synthesis).
 Model may be useful in the creation of monitors as
well (discussed more below).
 Configuration plane activates the required
monitoring.
 Monitoring may trigger alerts leading to adaptation
of the system (adaptation plane)
 Monitor extracts and analyses message
features
Monitoring communication
[Figure: processes A and B communicate over a network interface; traffic on monitored interface(s) is also sent to the monitor, where a parser turns the raw data into message features.]
 Processes use network interfaces to
talk to other processes
Feature building
Basics of monitoring: Features
 Features capture specific aspects of
messages.
 Connection aspects, e.g.: Sender, Receiver,
ports they use, timestamp, ...
 but also content based, e.g. HTTP response
code, function code, setpoint, etc.
● Different protocols will have different content that
can be extracted.
 or even composite (metadata), e.g. `connection’,
which captures the sender, the receiver and the
ports they use.
 Analysis considers tuples of feature values(*).
 I.e. all relevant information is captured by
features.
(*) Feature value: the value (eg 192.168.0.1) a feature (eg source‐ip) takes.
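As an illustration, a hedged sketch of feature extraction: a parsed message becomes a tuple of feature values, including a composite `connection’ feature. The message layout and feature names here are hypothetical, not the actual CITADEL feature set.

```python
# Sketch: turn a parsed message into features, including a composite
# (metadata) feature that combines several basic ones.

def extract_features(msg):
    return {
        "src_ip": msg["src_ip"],
        "dst_ip": msg["dst_ip"],
        "src_port": msg["src_port"],
        "dst_port": msg["dst_port"],
        "response_code": msg.get("response_code"),  # content-based feature
        # composite feature capturing sender, receiver and their ports
        "connection": (msg["src_ip"], msg["src_port"],
                       msg["dst_ip"], msg["dst_port"]),
    }

msg = {"src_ip": "192.168.0.1", "src_port": 51234,
       "dst_ip": "192.168.0.7", "dst_port": 80, "response_code": 200}
features = extract_features(msg)
```

The analysis then works only on such tuples of feature values; all relevant information about a message is assumed to be captured by its features.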
Monitors look for indicators of compromise,
risks or problems in the system by finding
specific situations (blacklisting) or deviations
from the norm (whitelisting). We consider:
 `signature’-based(*) monitoring (blacklisting)
 white-box anomaly detection (whitelisting)
 association rules (constraints; whitelisting)
Analysis: What to look for and how
(*) Here we use the term `signature’ for a quite general form of rule based blacklisting.
Other literature may use a much more specific notion of signature.
`Signature’-based analysis
Blacklisting of known bad situations
`Signature’-based monitoring
 If we know the `bad situation’ we are looking
for (be it an attack, failure, etc.) we can try
to capture it in a signature.
 Simply specify the combination of feature
values that indicates the situation.
 A signature can be quite specific to capture
exactly the situation we want to detect.
 Can combine a high likelihood of detecting this
situation with a low chance of false positives.
 Below we show a simple signature and how it
can be used within the CITADEL framework.
 `Balancer’ B can be configured to use one of two
servers (S1, S2). Currently using S1.
 If the configured server fails, B will send out an
internal server error response: an HTTP 500 message.
 If this happens, B should be reconfigured to use S2
instead.
`Signature’-based scenario
 The CITADEL communications monitoring component
monitors the outgoing connection of balancer B.
 The monitor uses a rule:
If response_code == ‘500’ then ALARM_8080
 If the message response code is `500’ then raise an
alert called `ALARM_8080’.
`Signature’-based scenario
 Alert id `ALARM_8080’ is defined in the system
model so the adaptation plane recognizes it.
 Adaptation handles the alarm by instructing the
configuration plane to switch to a configuration
where B uses S2 instead. (See those planes and
[7] for more information on these steps.)
`Signature’-based scenario
 We have seen a signature using a condition to
trigger an alert; a signature rule consists of:
 a condition; which is a boolean expression on
features, e.g.
● response_code == 500
● setpoint1 < 5 OR setpoint1 > 10
 an alert identifier; which is a string (its
meaning is given by the system model), e.g.
● ALARM_8080
● SetpointOutOfRange
 Note how signature rules may use any of the
features that we have defined, including
those about the content of the
communication, making them quite versatile.
`Signatures’ in general
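A signature rule, as described above, pairs a boolean condition on features with an alert identifier. A minimal sketch (the rule contents follow the slides; the evaluation code itself is illustrative, not the CITADEL implementation):

```python
# Sketch of `signature'-based monitoring: each rule is a (condition,
# alert id) pair; the condition is a boolean expression on features.

signatures = [
    (lambda f: f.get("response_code") == 500, "ALARM_8080"),
    (lambda f: f.get("setpoint1", 7) < 5 or f.get("setpoint1", 7) > 10,
     "SetpointOutOfRange"),
]

def check_signatures(features):
    """Return the alert ids of all rules whose condition holds."""
    return [alert for cond, alert in signatures if cond(features)]

alerts = check_signatures({"response_code": 500, "setpoint1": 7})
```

Because conditions may refer to any defined feature, including content-based ones such as `response_code` or `setpoint1`, the same mechanism covers both connection-level and content-level signatures.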
White-box anomaly detection
 Signatures work well if you know what
you are looking for, but typically not all
attacks/failures will be known.
 Monitor for anomalous behaviour that can
indicate attacks/problems.
 Need to distinguish between `normal’ and
`anomalous’ traffic.
 Learn model of normal traffic from training
data.
 Whitelisting; any deviation from normal
model is seen as anomalous.
 White-box: informative features &
understandable model.
Anomaly detection
Feature binning
 Some features with useful information may not be directly suitable
for learning.
 Consider for example a timestamp. Trying to learn the exact millisecond
something happens is not meaningful. However, it may be interesting
whether it is during the day or the night.
 Binning allows learning the useful part of such features.
 needed for features that can take many different values (eg large
numbers, floating point) though it can be used on any feature.
 the domain of values is divided into sets called bins
 feature is assigned the bin it falls into, rather than the exact value taken.
 Examples:
 ranges for potentially large numbers, such as the size of a message
● bins could be: 0-499 bits, 500-999 bits, etc.
 There are many ways to bin a timestamp, e.g.
● ranges like, in which hour does it fall, or which part of the day;
morning/afternoon/evening/night,
● which day it is: Mon, Tue, ..., Sun, or weekday vs weekend.
● What makes most sense depends on the application.
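The two binning examples above can be sketched as follows; the bin boundaries and the part-of-day split are illustrative choices, and what makes most sense depends on the application.

```python
# Sketch of feature binning: map many-valued features to a small set of
# bins before learning.

def bin_size(bits):
    """Map a message size in bits to a range bin: 0-499, 500-999, ..."""
    low = (bits // 500) * 500
    return f"{low}-{low + 499} bits"

def bin_hour(hour):
    """Map the hour of a timestamp to a part of the day."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"

size_bin = bin_size(742)   # falls in the 500-999 bits bin
time_bin = bin_hour(3)     # 03:00 counts as night here
```

The learner then sees the bin (e.g. "night") rather than the exact value (e.g. a millisecond timestamp), so the useful part of the feature can be learned.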
Learning based scenario
 In this scenario (inspired by [4]) we
consider messages that contain requests
sent to a database.
 Examples of features:
 time of access, source
 query content
● command (eg select, update, delete)
● which tables & fields
● response content (eg #records retrieved)
● combinations of the above.
Database
Monitor
Learning based scenario
 Bob from accounting needs access to the
database for his job, but we do not know
beforehand how that translates exactly
into the requests he makes.
 Learn his profile by monitoring normal
behaviour for some time (training; learning
a model).
 Histograms capture behaviour per feature.
For many anomaly‐based approaches, the models and alerts are not very informative (`black box’).
Here models (and alerts as we will see below) are easy to interpret; `whitebox’.
Thresholds and Model tuning
 After learning normal behaviour, one needs to set which
values (bins) are considered normal and which are
anomalous.
 The easiest way is to set a single threshold; anything
that is less likely than the threshold is seen as
anomalous.
 The figure shows the effect of setting an (extremely
high) threshold of 26%.
● Insert and delete commands and access to column age are
seen as normal.
Thresholds and Model tuning
 In addition to a default threshold one can set a threshold per feature
 The figure shows that a threshold of 10% will still leave delete as `anomalous’.
 We can also tweak the model itself;
 If we know this value is ok, we can tweak the model to specifically set it to normal.
 Similarly we can mark values as anomalous even when encountered in the training.
 (Further) tweaking may occur upon a detection
 Upon false positives mark values as normal, tweak thresholds and/or use a sliding
window (discussed below).
 Upon true positives one may set a custom alert, and update the system model (more
below).
Being `whitebox’ makes such tweaking possible
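The tuning options above can be sketched as a small white-box model: per-feature histograms learned from training data, a default threshold with per-feature overrides, and explicit normal/anomalous markings for individual values. All class names, numbers and feature names are illustrative, not the CITADEL implementation.

```python
from collections import Counter

class WhiteBoxModel:
    """Per-feature histograms with tunable thresholds and overrides."""

    def __init__(self, default_threshold=0.05):
        self.hist = {}                  # feature -> Counter of values
        self.total = Counter()          # feature -> #observations
        self.threshold = {}             # per-feature threshold override
        self.default_threshold = default_threshold
        self.forced = {}                # (feature, value) -> True=normal

    def train(self, samples):
        """Learn histograms from training data (a list of feature dicts)."""
        for features in samples:
            for f, v in features.items():
                self.hist.setdefault(f, Counter())[v] += 1
                self.total[f] += 1

    def is_normal(self, feature, value):
        if (feature, value) in self.forced:   # manual tweaks win
            return self.forced[(feature, value)]
        t = self.threshold.get(feature, self.default_threshold)
        seen = self.hist.get(feature, Counter())[value]
        likelihood = seen / max(self.total[feature], 1)
        return likelihood >= t

    def alerts(self, features):
        """White-box alert: report exactly which features are unusual."""
        return [f for f, v in features.items() if not self.is_normal(f, v)]
```

Because the model is just histograms plus thresholds, a false positive can be handled by marking the value normal or lowering a threshold, and a suspicious-but-seen value can be forced anomalous, exactly the kind of tweaking the slides describe.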
Alerts on deviations
 Alerts are raised when observed queries do not fit
the learned behaviour; a feature value has a
likelihood lower than the threshold.
 Alerts indicate why the query is strange; which
features cause the alert.
 The alert below shows that Sally accessed unusual
data at a strange time.
Context can matter
 Bob may need to change the value of days_off
 update is a normal value on feature command
 Bob may need to read the content of name
 name is a normal value for feature column_set
 However, normally he would not change the name.
 The combination update and name is not normal
 The model above cannot detect this; update and name by themselves are normal.
 Combined feature command-column_set can detect this
 Consider such composite features if features are correlated
 Association rules below give another way to specify relationships between features.
having the right features is essential for effective
white‐box anomaly detection, see also [5],[6]
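The composite feature command-column_set from the example can be sketched in one line: combining the two fields produces a value that is unseen in training even when each field is individually normal. The helper name is illustrative.

```python
# Sketch: derive a composite feature from two correlated basic features,
# so that unusual combinations of individually-normal values can be
# detected by the histogram model.

def add_composite(features):
    features = dict(features)  # do not mutate the caller's dict
    features["command-column_set"] = (features["command"],
                                      features["column_set"])
    return features

# `update' and `name' may each be normal on their own, but Bob never
# updates the name column, so the pair would be unseen in training.
query = add_composite({"command": "update", "column_set": "name"})
```

A histogram over the composite value would then flag ("update", "name") as unlikely even though "update" and "name" are frequent individually.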
From system specification to
anomaly detection and back.
Monitoring and Specification
 Monitoring can benefit from the system specification
 Where to monitor, what to monitor for; interpreting data
(defining features), potentially useful combinations of
features.
 Specification can benefit from learning through
monitoring
 Learn details of the specification instead of having to
define them by hand.
 Alerts may indicate situations not yet considered in the
specification.
 Below we detail this interactive approach combining
specification and monitoring, illustrated with a simple
smart manufacturing use case.
Monitoring and Specification [3]
 Multiple views on the same system:
[Figure from [3]: a lightweight specification with uninterpreted predicates states “what is correct”; monitored data, via classification/visualization, shows “what is seen”; a learned model with features linked to predicates gives “what it means”. Learning completes the specification, detection yields alerts linked to features, true positives complete the specification further, false positives tune the model, and the specification provides context throughout.]
Bottle Filling Plant (BFP)
A smart manufacturing use case
 Remote controlled production facility.
 Fills bottles with two ingredients, mixes them & inspects result.
 Picture shows main components and communication links.
Components of the BFP
 Physical Process:
 Belt: moves the bottles from station to station, can be started and
stopped.
 Stations:
● each station has a sensor (1-4) to detect whether a bottle is present
● Filling stations: with valves that can be opened and closed to control the flow
of liquid
● Mixer: blends the liquids in the bottle, can be started and stopped.
● Quality check station: has a sensor (5) to measure the amount of liquid in the bottle
 Programmable Logic Controller (PLC)
 controls `actuators’ (belt, valves, mixer) and sensors.
 Uses the Modbus protocol to communicate.
 Remote Terminal Unit (RTU)
 provides an interface to connect to the PLC from the outside network.
 Master (at the factory headquarters)
 provides the remote Human Machine Interface (HMI)
`Lightweight’ Specification
 In modeling the process we can use
`to-be-learned’ predicates
 Interpretation not given by model; but
rather filled in by learning.
 Example: liquids in bottle should form a
valid mixture, but what constitutes a valid
mixture? Learning answers that:
 G( bottle_ok → ?Valid(ingr1, ingr2) )
 Valid is an uninterpreted property
 Learn from monitoring what are valid
combinations of ingr1, ingr2.
 Modbus is a very simple protocol,
we can extract:
 Function code
 register number, value
 Knowing what registers are used for (map them to
notions in the specification), allows extracting
meaningful features:
 setpoint_1, setpoint_2, setpoint_mixer, etc.
 at_valve_1, at_valve_2, etc.
 Create mapping from PLC implementation
documentation (if available), visual inspection (see
next slide) of the traffic and a (partial)
specification of the system.
BFP – Feature building
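The mapping step can be sketched as a small register map applied to raw Modbus fields; the register numbers and feature names below are hypothetical, standing in for a map built from PLC documentation, visual inspection and the (partial) specification.

```python
# Sketch: map raw Modbus fields (function code, register, value) to
# meaningful features using a reconstructed register map.

REGISTER_MAP = {
    1: "setpoint_1",      # hypothetical register assignments
    2: "setpoint_2",
    3: "setpoint_mixer",
    10: "at_valve_1",
    11: "at_valve_2",
}

def modbus_features(function_code, register, value):
    """Return meaningful features for one Modbus write/read."""
    name = REGISTER_MAP.get(register, f"register_{register}")
    return {"function_code": function_code, name: value}

f = modbus_features(6, 1, 42)  # a write to register 1
```

Unmapped registers keep a generic name, so traffic touching unknown parts of the PLC is still visible to the monitor.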
BFP – Feature building
Register values plotted and interpreted using system model (see [1]) and basic process knowledge.
BFP –
Feature building
 Counter-like features (bottles_started, bottles_done)
are not useful in the whitebox model
(histograms).
 but they may be used to compute useful ones (e.g.
compare bottles_on_belt with bottles_started
- bottles_done; if they are not equal this indicates a
problem.)
 Specification constraints on `total’
 reason to consider it as a potential feature
Specified constraints:
G(¬(bottle_ok ∧ (ingr1 = 0 ∨ ingr2 = 0)))   // null
G(¬(bottle_ok ∧ total > k))   // overflow
 Learn and tune model as before and deploy
 Raised alerts have context; meaningful features linked to
the system specification.
 False positives are used to tune the detection model
 True positives are added to the specification
Using detection results
[Figure: an example alert with total_liquid_in_bottle = 100.0, setpoint_1 = 99.0, setpoint_2 = 1.0. As a true positive it reveals an invalid ratio to be prevented, adding the constraint G(¬(bottle_ok ∧ ingr1 / ingr2 ≠ k)) // composition; as a false positive it yields additional `normal’ values for the model.]
Sliding windows
Sometimes a single violation of a rule or deviation from a
model is not necessarily a problem (e.g. a physical process
may need to be in a `bad’ state for some time before it
actually becomes a problem), and a single deviation may not
provide enough evidence of compromise. In such cases you
want to react only when there are several instances within
a short time period.
 Look for multiple deviations within a time window
 Sliding window always considers the last T seconds,
and only raises an alarm if it finds at least N deviations
within this window.
[Figure: a timeline with anomalies and a sliding window of duration T; with N = 3 deviations needed for an alert, windows containing one or two anomalies raise no alert, while a window containing three raises one.]
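The sliding window can be sketched as follows: deviations are timestamped, old ones fall out of the window, and an alarm fires only once at least N deviations sit within the last T seconds. The class name and parameter values are illustrative.

```python
from collections import deque

class SlidingWindow:
    """Alarm only when >= N deviations occur within the last T seconds."""

    def __init__(self, T, N):
        self.T, self.N = T, N
        self.times = deque()   # timestamps of recent deviations

    def deviation(self, t):
        """Record a deviation at time t; return True if an alarm fires."""
        self.times.append(t)
        # drop deviations older than T seconds
        while self.times and t - self.times[0] > self.T:
            self.times.popleft()
        return len(self.times) >= self.N

w = SlidingWindow(T=10, N=3)
# two deviations alone do not fire; a third within 10 seconds does
results = [w.deviation(t) for t in (0, 4, 8)]
```

With T=10 and N=3, isolated deviations are tolerated, matching the figure: windows with one or two anomalies stay silent.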
 Systems often have many interrelated variables.
Changes in the relationship instead of only the values of
variables may be relevant indicators of compromise.
 White-box detection provides meaningful alerts in the form
of which feature(s) exhibit unusual values.
 It considers features individually; if a combination of
fields is relevant, it needs to be captured in a composite
feature, like `connection’.
 Learning of relationships and efficiently capturing them:
Association rules.
 Association rules are briefly discussed here; see [2] for
more details.
Association rules
Association rules steps
 Invariant learning requires a training set of states, which are
extracted from a training set of network traffic
 Each time a message feature (e.g. at_valve1) shows a change in a system
variable we update the state.
 We consider two methods of learning process invariants
 Bayesian network learning
 Association rule mining
 We briefly show examples of process invariants that may be learned
and how to use them
 Details of how the learning works are beyond the scope of this learning
material; see [2] for details and a comparison of the two methods.
Association rule mining
 An association rule represents a process invariant;
 a combination of values that should occur together
 a confidence level that this must be the case.
 For the bottle filling plant we learn the invariants:
valve_1_on → ¬belt_moving
valve_2_on → ¬belt_moving
¬valve_2_on → belt_moving
 We can interpret these rules as:
● the belt is off when a valve is open (first two)
● the belt starts moving as soon as valve 2 is off.
 (Just a few of the many rules learned; not all rules are
meaningful/useful, see [2]).
Bayesian Network learning
 A Bayesian network captures dependencies between variables.
 looks at the variables rather than at individual values (see [2]).
 Still we can extract the same type of process invariants.
 Focusing on valve_2_on we see it depends on belt_moving.
 Note that AR learning also found a relation between these two
variables.
 A high conditional probability gives a rule.
 In this case we get two rules:
● ¬valve_2_on → belt_moving
● valve_2_on → ¬belt_moving
 We also find that valve_1_on depends on belt_moving
 but only value true has a high conditional probability:
● valve_1_on → ¬belt_moving
Bayesian network around
variable valve_2_on
Using Association Rules
 Consider our association rule:
¬valve_2_on→belt_moving
 Whenever we see a change to an involved variable we check
that the process invariant is preserved
 If valve 2 is off but the belt is not moving, this represents a
violation of the invariant.
 A rule has a confidence level (or conditional probability) x
 We thus expect it to be violated at a rate of at most 1-x.
 To test this we use an (event based) sliding window; if the
number of deviations among the last T events is larger than it
should be, an alert is raised.
● Note that if the confidence is 1.0, like for our example rule, we can raise
an alert as soon as an invariant violation is found.
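Checking a learned invariant over a stream of states can be sketched as below. The invariant follows the slides; the state layout, the event-based window, and the reset-after-alert behaviour are illustrative design choices, not the CITADEL implementation.

```python
# Sketch: check the learned invariant  ¬valve_2_on -> belt_moving
# over a stream of states, using an event-based window and the rule's
# confidence level.

def violates(state):
    """True if the invariant is violated: valve 2 off but belt not moving."""
    return (not state["valve_2_on"]) and (not state["belt_moving"])

def check_stream(states, confidence=1.0, window=10):
    """Alert when violations among the last `window` events exceed the
    expected rate 1 - confidence; with confidence 1.0 a single
    violation already alerts."""
    recent, alerts = [], []
    for i, state in enumerate(states):
        recent.append(violates(state))
        recent = recent[-window:]
        if sum(recent) > (1 - confidence) * len(recent):
            alerts.append(i)
            recent = []   # reset after alerting (a design choice)
    return alerts

stream = [
    {"valve_2_on": True,  "belt_moving": False},
    {"valve_2_on": False, "belt_moving": True},
    {"valve_2_on": False, "belt_moving": False},   # violation
]
found = check_stream(stream)
```

A rule with confidence below 1.0 tolerates the expected rate of violations and alerts only when the observed rate in the window exceeds it.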
Related reading
[1] CITADEL D4.4 MILS Monitoring System
[2] CITADEL D3.3 CITADEL Design Techniques to Specify, Verify, and Synthesize Policies for
Run-Time Monitors
[3] From system specification to anomaly detection (and back) (2017)
Davide Fauri, Daniel Ricardo dos Santos, Elisa Costante, Jerry den Hartog, Sandro Etalle, Stefano Tonetta
Workshop on Cyber-Physical Systems Security and Privacy
[4] A white-box anomaly-based framework for database leakage detection (2017)
E Costante, J den Hartog, M Petković, S Etalle, M Pechenizkiy
Journal of Information Security and Applications 32, 27-46
[5] Towards useful anomaly detection for back office networks (2016)
Ö Yüksel, J den Hartog, S Etalle
International Conference on Information Systems Security, 509-520
[6] Reading between the fields: practical, effective intrusion detection for industrial control
systems (2016)
Ö Yüksel, J den Hartog, S Etalle
Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2063-2070
[7] CITADEL D4.5 Integrated and tested Adaptive MILS Platform
[8] CITADEL. D4.3 MILS adaptation system.
[9] Module `Configuring the MILS Monitoring System for Communications Monitoring’ of CITADEL
D6.6 Training Materials for Electronic Delivery

Communications monitoring

  • 1.
    Communications Monitoring TU/e Training– Advanced Technical Module Communication Monitoring 1
  • 2.
     Prerequisites  Basicknowledge on the Citadel framework ● Only very briefly recalled here.  Knowledge of CITADEL Modeling, Specifications and Verification Tools for  Related material  This module cover communications monitoring theory, [9] provides instructions on how to configure and use the code. Examples are included with the code.  [1, sec 3] documents the communications monitoring component.  [2] provides more details on association rules. Other material and knowledge TU/e Training – Advanced Technical Module Communication Monitoring 2
  • 3.
     Overview ofcommunications monitoring  Basics of intrusion detection  Communications monitoring in CITADEL  Communication monitoring methods  `signature’-based monitoring  white-box anomaly detection  association rules  Monitoring and Specification interaction  specifying predicates to be learned Content TU/e Training – Advanced Technical Module Communication Monitoring 3
  • 4.
    TU/e 4 Intrusion detectionbasics  Distinguish `legitimate’ from `malicious’ cases.  Classification is not perfect; will need to make a trade- off between detection rate and false positive rate.  False positive rate: % legitimate marked as attack  Detection rate: % malicious marked as attack  Two categories of approaches:  Black listing ● Specify known malicious cases to be prevented  White listing ● Specify allowed `good’ situations.  But `lists’ do not cover all cases; leading to ● false negatives for black listing (new, unknown attacks) ● false positives for white listing (unseen legitimate behaviour). Traffic: Flagged as normal Flagged as attack legitimate True negative False positive malicious False negative True positive (detection)
  • 5.
    Communications Monitoring within CITADEL(see also [1]) TU/e Training – Advanced Technical Module Communication Monitoring 5
  • 6.
     Monitoring Planegathers & evaluates information from Operational Plane and, when needed, alerts the Adaptation Plane. Citadel Planes interaction TU/e Training – Advanced Technical Module Communication Monitoring 6 Overview of the citadel planes [8] 
  • 7.
    From model tomonitoring  Different monitors may be configured for different (types of) interfaces used for different (types of) applications.  Model specifies interfaces (part of architecture)  Analysis of the model determines which need to be monitored (Monitor synthesis).  Model may be useful in the creation of monitors as well (discussed more below).  Configuration plane activates the required monitoring.  Monitoring may trigger alerts leading to adaption of the system (adaption plane) TU/e Training – Advanced Technical Module Communication Monitoring 7
  • 8.
     Monitor extractsand analyses messages features Monitoring communication TU/e Training – Advanced Technical Module Communication Monitoring 8 Process  A Process  B Monitor Raw data Parser Message features Traffic on monitored interface(s)  is also sent to monitor.  Processes use network interfaces to talk to other processes
  • 9.
  • 10.
    Basics of monitoring:Features  Features capture specific aspects of messages.  Connection aspects, e.g.: Sender, Receiver, ports they use, timestamp, ...  but also content based, e.g. http response code, function code, setpoint, etc. ● Different protocols will have different content that can be extracted.  or even composite(metadata) eg. `connection’ which captures both sender receiver and their ports.  Analysis considers tuples of feature values(*).  I.e. all relevant information is captured by features. (*) Feature value: the value (eg 192.168.0.1) a feature (eg source‐ip) takes.
  • 11.
    Monitors look forindicators of compromise, risks or problems in the system by finding specific situations (blacklisting) or deviations from the norm (whitelisting). We consider:  `signature’-based(*) monitoring (blacklisting)  white-box anomaly detection (whitelisting)  association rules (constraints; whitelisting) Analysis: What to look for and how TU/e Training – Advanced Technical Module Communication Monitoring 11 (*) Here we use the term `signature’ for a quite general form of rule based blacklisting. Other literature may use a much more specific notion of signature.
  • 12.
  • 13.
    `Signature’-based monitoring  Ifwe know the `bad situation’ we are looking for (be it an attack, failure, etc.) we can try to capture it in a signature.  Simply specify which combination of feature values that indicates the situation.  A signature can be quite specific to capture exactly the situation we want to detect.  Can combine: High likelihood of detection this situation with low change of false positives.  Below we show a simple signature and how it can be used within the CITADEL framework.
  • 14.
     `Balancer’ Bcan be configured to use one of two servers (S1, S2). Currently using S1.  If the configured server fails , B will send out an internal server error response: An HTTP 500 message.  If this happens, B should be reconfigured to use S2 instead. `Signature’-based scenario TU/e Training – Advanced Technical Module Communication Monitoring 14 B 500
  • 15.
     The CITADELcommunications monitoring component monitors the outgoing connection of balancer B.  The monitor uses a rule: If response_code == ‘500’ then ALARM_8080  If the message response code is `500’ then raise an alert called `ALARM_8080’. `Signature’-based scenario TU/e Training – Advanced Technical Module Communication Monitoring 15 B 500 CITADEL monitor
  • 16.
     Alert id`ALARM_8080’ is defined in the system model so the adaptation plane recognizes it.  Adaption handles the alarm by instructing the configuration plane to switch to a configuration where B uses S2 instead. (See those planes and [7] for more information on these steps.) `Signature’- based scenario TU/e Training – Advanced Technical Module Communication Monitoring 16 B CITADEL monitor ALARM_8080 500
  • 17.
     We haveseen signature using a condition to trigger an alert; a signature rule consists of:  a condition; which is a boolean expression on features, e.g. ● response_code == 500 ● setpoint1 < 5 OR setpoint1 > 10  an alert identifier; which is a string (its meaning is given by the system model), e.g. ● ALARM_8080 ● SetpointOutOfRange  Note how signature rules may use any of the features that we have defined, including those about the content of the communication, making them quite versatile. `Signatures’ in general TU/e Training – Advanced Technical Module Communication Monitoring 17
  • 18.
    White-box anomaly detection TU/eTraining – Advanced Technical Module Communication Monitoring 18
  • 19.
     Signatures workwell if you know what you are looking for, but typically not all attacks/failures will be known.  Monitor for anomalous behaviour that can indicate attacks/problems.  Need to distinguish between `normal’ and `anomalous’ traffic.  Learn model of normal traffic from training data.  Whitelisting; any deviation from normal model is seen as anomalous.  White-box: informative features & understandable model. Anomaly detection TU/e Training – Advanced Technical Module Communication Monitoring 19
  • 20.
    Feature binning  Somefeatures with useful information may not be directly suitable for learning.  Consider for example a timestamp. Trying to learn the exact millisecond something happens is not meaningful. However, it may be interesting whether it is during the day or the night.  Binning allows learning the useful part of such features.  needed for features that can take many different values (eg large numbers, floating point) though it can be used on any feature.  the domain of values is divided into sets called bins  feature is assigned the bin it falls into, rather than the exact value taken.  Examples:  ranges for potentially large numbers, such as the size of a message ● bins could be: 0-499bits, 500-999 bits, etc.  There are many ways to bin a timestamp, e.g. ● ranges like, in which hour does it fall, or which part of the day; morning/afternoon/evening/night, ● which day is it on; mo-tue-,...-su, or weekday vs weekend. ● What makes most sense depends on the application. TU/e Training – Advanced Technical Module Communication Monitoring 20
  • 21.
    Learning based scenario In this scenario (inspired by [4]) we consider messages that contain requests sent to a database.  Examples of features:  time of access, source  query content ● command (eg select, update, delete) ● which tables & fields ● response content (eg #records retrieved) ● combinations of the above. Database Monitor TU/e Training – Advanced Technical Module Communication Monitoring 21
  • 22.
    Learning based scenario Bob from accounting needs access to the database for his job, but we do not know beforehand how that translates exactly into the requests he makes.  Learn his profile by monitoring normal behaviour for some time (training; learning a model).  Histograms capture behaviour per feature. For many anomaly‐based approaches, the models and alerts are not very informative (`black box’). Here models (and alerts as we will see below) are easy to interpret; `whitebox’. TU/e Training – Advanced Technical Module Communication Monitoring 22
  • 23.
    Thresholds and Modeltuning  After learning normal behaviour, on needs to set which values (bins) are considered normal and which are anomalous.  The easiest way is to set a single threshold; anything that is less likely that the threshold is seen as anomalous.  The figure shows the effect of setting (an extremely high) threshold of 26%. ● Insert and delete commands and access to column age are seen as normal. TU/e Training – Advanced Technical Module Communication Monitoring 23
  • 24.
    Thresholds and Modeltuning  In addition to a default threshold one can set a threshold per feature  The figure shows that a threshold of 10% will still leave delete as `anomalous’.  We can also tweak the model itself;  If we know this value is ok, we can tweak the model to specifically set it to normal.  Similarly we can mark values as anomalous even when encountered in the training.  (Further) tweaking may occur upon a detection  Upon false positives mark values as normal, tweak thresholds and/or use a sliding window (discussed below).  Upon true positives one may set a custom alert, and update the system model (more below). Being `whitebox’ makes such tweaking possible TU/e Training – Advanced Technical Module Communication Monitoring 24
  • 25.
    Alerts on deviations Alerts are raised when observed queries do not fit with learned behaviour; a feature value has a likelihood lower that the threshold.  Alerts indicate why the query is strange; which features cause the alert.  The alert below shows Sally accesses unusual data at a strange time. TU/e Training – Advanced Technical Module Communication Monitoring 25 TU/
Context can matter
• Bob may need to change the value of days_off:
● update is a normal value of the feature command.
• Bob may need to read the content of name:
● name is a normal value of the feature column_set.
• However, normally he would not change the name:
● the combination of update and name is not normal.
• The model above cannot detect this; update and name by themselves are normal.
• The combined feature command-column_set can detect this.
● Consider such composite features if features are correlated.
● Association rules (below) give another way to specify relationships between features.
• Having the right features is essential for effective white-box anomaly detection; see also [5], [6].
TU/e Training – Advanced Technical Module Communication Monitoring 26
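Building such a composite feature can be as simple as pairing the correlated fields before learning, so the histogram model sees their combinations (an illustrative helper, not the CITADEL API):

```python
def add_composite(query, parts, name=None):
    """Derive a composite feature from correlated fields so the
    histogram model can learn which combinations occur together
    (e.g. which commands touch which columns)."""
    name = name or "-".join(parts)
    q = dict(query)
    q[name] = tuple(query[p] for p in parts)
    return q

q = add_composite({"command": "update", "column_set": "name"},
                  ["command", "column_set"])
# q now also carries ('update', 'name') under 'command-column_set':
# a combination never seen in training can then be flagged, even
# though 'update' and 'name' are individually normal.
```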
From system specification to anomaly detection and back
TU/e Training – Advanced Technical Module Communication Monitoring 27
    Monitoring and Specification Monitoring can benefit from the system specification  Where to monitor, what to monitor for; interpreting data (defining features), potentially useful combinations of features.  Specification can benefit from learning through monitoring  Learn details of the specification instead of having to define them by hand.  Alerts may indicate situations not yet considered in the specification.  Below we detail this interactive approach combining specification and monitoring, illustrated with a simple smart manufacturing use case. TU/e Training – Advanced Technical Module Communication Monitoring 28
Monitoring and Specification [3]
• Multiple views on the same system:
● data classification / visualization: "what is seen";
● a learned model, with features linked to uninterpreted predicates: "what it means";
● a lightweight specification: "what is correct".
• [Diagram: data feeds both learning and detection; alerts linked to features provide context; false positives tune the model, true positives complete the specification.]
TU/e Training – Advanced Technical Module Communication Monitoring 29
Bottle Filling Plant (BFP)
• A smart manufacturing use case:
● a remote-controlled production facility;
● fills bottles with two ingredients, mixes them & inspects the result.
• The picture shows the main components and communication links.
TU/e Training – Advanced Technical Module Communication Monitoring 30
Components of the BFP
• Physical process:
● Belt: moves the bottles from station to station; can be started and stopped.
● Stations:
● each station has a sensor (1-4) to detect whether a bottle is present;
● filling stations: with valves that can be opened and closed to control the flow of liquid;
● mixer: blends the liquids in the bottle; can be started and stopped;
● quality check station: has a sensor (5) to measure the amount of liquid in the bottle.
• Programmable Logic Controller (PLC):
● controls the `actuators' (belt, valves, mixer) and sensors;
● uses the Modbus protocol to communicate.
• Remote Terminal Unit (RTU):
● provides an interface to connect to the PLC from the outside network.
• Master (at the factory headquarters):
● provides the remote human-machine interface (HMI).
TU/e Training – Advanced Technical Module Communication Monitoring 31
`Lightweight' Specification
• In modeling the process we can use `to-be-learned' predicates:
● their interpretation is not given by the model, but rather filled in by learning.
• Example: the liquids in a bottle should form a valid mixture, but what constitutes a valid mixture? Learning answers that:
● G( bottle_ok → ?Valid(ingr1, ingr2) )
● Valid is an uninterpreted property.
● Learn from monitoring what the valid combinations of ingr1, ingr2 are.
TU/e Training – Advanced Technical Module Communication Monitoring 32
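One way to fill in the uninterpreted predicate by learning could look like this (a sketch; the observation format and values are assumed for illustration):

```python
def learn_valid(observations):
    """Fill in the uninterpreted predicate ?Valid(ingr1, ingr2) by
    collecting the ingredient combinations seen in normal runs
    (i.e. runs where the bottle passed the quality check)."""
    return {(o["ingr1"], o["ingr2"]) for o in observations
            if o["bottle_ok"]}

runs = [
    {"ingr1": 99.0, "ingr2": 1.0, "bottle_ok": True},
    {"ingr1": 95.0, "ingr2": 5.0, "bottle_ok": True},
    {"ingr1": 50.0, "ingr2": 0.0, "bottle_ok": False},  # rejected
]
valid = learn_valid(runs)

def Valid(i1, i2):
    """Learned interpretation of ?Valid used when checking
    G( bottle_ok -> ?Valid(ingr1, ingr2) )."""
    return (i1, i2) in valid
```

A set of observed pairs is the simplest possible interpretation; in practice one would generalise (e.g. to ranges or ratios) rather than memorise exact values.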
BFP – Feature building
• Modbus is a very simple protocol; we can extract:
● the function code;
● register number, value.
• Knowing what the registers are used for (mapping them to notions in the specification) allows extracting meaningful features:
● setpoint_1, setpoint_2, setpoint_mixer, etc.
● at_valve_1, at_valve_2, etc.
• Create the mapping from the PLC implementation documentation (if available), visual inspection of the traffic (see next slide) and a (partial) specification of the system.
TU/e Training – Advanced Technical Module Communication Monitoring 33
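Such a register-to-feature mapping might be sketched as follows (the register addresses here are invented for illustration; the real map comes from the PLC documentation or traffic inspection):

```python
# Hypothetical register map for the BFP PLC; actual addresses must
# be taken from the PLC documentation or inspection of the traffic.
REGISTER_MAP = {0: "setpoint_1", 1: "setpoint_2",
                2: "setpoint_mixer", 10: "at_valve_1"}

def to_feature(function_code, register, value):
    """Turn a raw Modbus register access into a named feature
    observation; unmapped registers keep a generic name."""
    name = REGISTER_MAP.get(register, f"register_{register}")
    return {"function_code": function_code, name: value}

obs = to_feature(6, 0, 99)  # Modbus function 6 = write single register
```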
BFP – Feature building
• [Figure: register values plotted over time.] Register values are plotted and interpreted using the system model (see [1]) and basic process knowledge.
TU/e Training – Advanced Technical Module Communication Monitoring 34
    BFP – Feature building counter like (bottles_started, bottle_done) are not useful in the whitebox model (histograms).  but may be used to compute useful ones (eg compare bottles_on_belt with bottles_started - bottles_done. If not equal indicates a problem.)  Specification constraints `total’  reason to consider it as a potential feature G(¬(bottle_ok ∧ (ingr1= 0 ∨ingr2= 0)))   //null G(¬(bottle_ok ∧ total > k))  //overflowSpecified constraints: TU/e Training – Advanced Technical Module Communication Monitoring 35
Using detection results
• Learn and tune the model as before, then deploy.
• Raised alerts have context: meaningful features linked to the system specification.
• False positives are used to tune the detection model
● e.g. adding `normal' values such as total_liquid_in_bottle = 100.0, setpoint_1 = 99.0, setpoint_2 = 1.0.
• True positives are added to the specification
● e.g. G(¬(bottle_ok ∧ ingr1 / ingr2 ≠ k))  // composition: an invalid ratio to be prevented.
TU/e Training – Advanced Technical Module Communication Monitoring 36
Sliding windows
• Sometimes a single violation of a rule, or deviation from a model, is not necessarily a problem (e.g. a physical process may need to be in a `bad' state for some time before it actually becomes a problem), and it may not provide enough evidence of compromise. In such cases you want to react only when there are several instances within a short time period.
• Look for multiple deviations within a time window.
• A sliding window always considers the last T seconds, and only raises an alarm if it finds at least N deviations within this window.
• [Figure: a timeline with anomalies; with N = 3 deviations needed for an alert, windows containing 1 or 2 deviations raise no alert, a window containing 3 does.]
TU/e Training – Advanced Technical Module Communication Monitoring 37
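A minimal time-based sliding window along these lines (an illustrative sketch, not the monitor's actual implementation):

```python
from collections import deque

class SlidingWindow:
    """Raise an alarm only when at least N deviations occur within
    the last T seconds (a time-based sliding window)."""
    def __init__(self, T, N):
        self.T, self.N = T, N
        self.times = deque()  # timestamps of recent deviations

    def deviation(self, t):
        """Record a deviation at time t; return True if the window
        now contains enough deviations to raise an alert."""
        self.times.append(t)
        # Drop deviations that fell out of the last T seconds.
        while self.times and self.times[0] <= t - self.T:
            self.times.popleft()
        return len(self.times) >= self.N

w = SlidingWindow(T=10, N=3)
w.deviation(0)   # 1 deviation in window  -> no alert
w.deviation(4)   # 2 deviations in window -> no alert
w.deviation(8)   # 3 within 10 seconds    -> alert
```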
Association rules
• Systems often have many interrelated variables. Changes in the relationships between variables, instead of only in their values, may be relevant indicators of compromise.
• The white-box model provides meaningful alerts in the form of which feature(s) exhibit unusual values.
● It considers features individually; if a combination of fields is relevant, it needs to be captured in a composed feature, like `connection'.
• Association rules allow learning relationships between features and capturing them efficiently.
• Association rules are only briefly discussed here; see [2] for more details.
Association rules steps
• Invariant learning requires a training set of states, which are extracted from a training set of network traffic.
● Each time a message feature (e.g. at_valve_1) shows a change in a system variable, we update the state.
• We consider two methods of learning process invariants:
● Bayesian network learning;
● association rule mining.
• We briefly show examples of process invariants that may be learned and how to use them.
● Details of how the learning works are beyond the scope of this learning material; see [2] for details and a comparison of the two methods.
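The state-extraction step might be sketched as follows (assuming, for illustration, that feature observations arrive as (feature, value) pairs):

```python
def extract_states(messages):
    """Replay feature observations and emit a snapshot of the
    system state after every actual change, yielding the training
    set of states used for invariant learning."""
    state, states = {}, []
    for feature, value in messages:
        if state.get(feature) != value:
            state[feature] = value
            states.append(dict(state))  # snapshot after the change
    return states

msgs = [("at_valve_1", True), ("belt_moving", False),
        ("at_valve_1", True),   # no change: no new state
        ("at_valve_1", False), ("belt_moving", True)]
states = extract_states(msgs)
```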
Association rule mining
• An association rule represents a process invariant:
● a combination of values that should occur together;
● a confidence level that this must be the case.
• For the bottle filling plant we learn the invariants:
● valve_1_on → ¬belt_moving
● valve_2_on → ¬belt_moving
● ¬valve_2_on → belt_moving
• We can interpret these rules as:
● the belt is off when a valve is open (first two);
● the belt starts moving as soon as valve 2 is off.
• (These are just a few of many rules learned; not all learned rules are meaningful/useful, see [2].)
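Checking such learned invariants against a state can be sketched as (the rules are hand-encoded here for illustration; in practice they come out of the mining step):

```python
# The three invariants above, written as (antecedent, consequent)
# predicates over a boolean state.
RULES = [
    (lambda s: s["valve_1_on"], lambda s: not s["belt_moving"]),
    (lambda s: s["valve_2_on"], lambda s: not s["belt_moving"]),
    (lambda s: not s["valve_2_on"], lambda s: s["belt_moving"]),
]

def violations(state):
    """Return the indices of rules whose antecedent holds but whose
    consequent does not, i.e. the violated invariants."""
    return [i for i, (ante, cons) in enumerate(RULES)
            if ante(state) and not cons(state)]

# Filling at station 2: valve 2 open, belt stopped -> all rules hold.
ok = {"valve_1_on": False, "valve_2_on": True, "belt_moving": False}
```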
Bayesian network learning
• A Bayesian network captures dependencies between variables:
● it looks at the variables rather than at individual values (see [2]).
• Still, we can extract the same type of process invariants.
• Focusing on valve_2_on, we see it depends on belt_moving.
● Note that association rule learning also found a relation between these two variables.
• A high conditional probability gives a rule. In this case we get two rules:
● ¬valve_2_on → belt_moving
● valve_2_on → ¬belt_moving
• We also find that valve_1_on depends on belt_moving,
● but only the value true has a high conditional probability:
● valve_1_on → ¬belt_moving
• [Figure: Bayesian network around variable valve_2_on.]
Using Association Rules
• Consider our association rule: ¬valve_2_on → belt_moving.
• Whenever we see a change to an involved variable, we check that the process invariant is preserved.
● If valve 2 is off but the belt is not moving, this represents a violation of the invariant.
• A rule has a confidence level (or conditional probability) x.
● We thus expect it to be violated at a rate of at most 1 - x.
• To test this we use an (event-based) sliding window: if the number of deviations among the last T events is larger than it should be, an alert is raised.
● Note that if the confidence is 1.0, as for our example rule, we can raise an alert as soon as an invariant violation is found.
Related reading
[1] CITADEL D4.4 MILS Monitoring System
[2] CITADEL D3.3 CITADEL Design Techniques to Specify, Verify, and Synthesize Policies for Run-Time Monitors
[3] From system specification to anomaly detection (and back) (2017). Davide Fauri, Daniel Ricardo dos Santos, Elisa Costante, Jerry den Hartog, Sandro Etalle, Stefano Tonetta. Workshop on Cyber-Physical Systems Security and Privacy.
[4] A white-box anomaly-based framework for database leakage detection (2017). E. Costante, J. den Hartog, M. Petković, S. Etalle, M. Pechenizkiy. Journal of Information Security and Applications 32, 27-46.
[5] Towards useful anomaly detection for back office networks (2016). Ö. Yüksel, J. den Hartog, S. Etalle. International Conference on Information Systems Security, 509-520.
[6] Reading between the fields: practical, effective intrusion detection for industrial control systems (2016). Ö. Yüksel, J. den Hartog, S. Etalle. Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2063-2070.
[7] CITADEL D4.5 Integrated and Tested Adaptive MILS Platform
[8] CITADEL D4.3 MILS Adaptation System
[9] Module "Configuring the MILS Monitoring System for Communications Monitoring" of CITADEL D6.6 Training Materials for Electronic Delivery