Ertuğrul Akbaş
Enhancing SIEM Correlation Rules
Through Baselining
Abstract— In this paper, we describe research into the use of
baselining for enhancing SIEM correlation rules. Enterprise-
grade software has been extended with a capability that
identifies anomalous events based on baselines, alongside its
rule-based correlation engine, and alerts administrators when
such events are identified. To reduce the number of false-positive
alerts, we investigated different baseline training techniques
and introduce three training approaches for the baseline
detection and updating lifecycle.
Index Terms—Log, SIEM, Correlation, Rule Base, Baseline,
Anomaly Detection



Computer crime continues to be problematic for both
public and private sectors, not only in Turkey but at an
international level. Over 50% of respondents to the 2006
Computer Security Institute/FBI Computer Crime and
Security Survey [1] reported unauthorized use of computer
systems.
The field of computer forensics has expanded rapidly over
the past twenty years in an effort to combat the continuing
increase in criminal activity involving computers. A generally
accepted definition is that computer forensics attempts to
identify, secure and analyze evidence from computer systems.
A few of these cases end up in court. The analysis may also be
carried out by an organization itself, for instance in
response to a security incident, internal or external.
Surveys show that the most common types of criminal
activity result from virus, worm or Trojan infections.
Insider abuse of Internet access, email or computer system
resources, however, is the third most common type of misuse
in the United States of America [1]. Insider misuse can be
defined as activities in which the computers and networks of
an organization are deliberately misused by those who are
authorized to use them. Activities that can be categorized as
insider misuse include:
• unauthorized access to information, which is an abuse of
privileges
• unauthorized use of software or applications for purposes
other than carrying out one’s duties
• theft or breach of proprietary or confidential information
• theft or unauthorized use of staff or customers’ access
credentials
• computer-facilitated financial fraud
Manuscript received March 11, 2012.
Ertuğrul Akbaş is with ANET Software, Turkey (corresponding author).

Logging, the act of recording events occurring in a system,
is a mandatory security measure in any moderately complex
digital environment. Logs can be used to detect and
investigate incidents after they have occurred, but also to
assist in the prevention of harmful incidents, by revealing use
and abuse patterns and increasing situational awareness [2].
The most efficient method for incident detection is to use a
SIEM solution. All SIEM solutions are rule-based systems,
but these rule-based systems need to be enhanced by comparing
current performance to a historical metric. We also need to
define a methodology for updating this historical metric.
A good security tool should make comparisons with
historical data, clearly presenting deviations from the norm.
This paper reports on work aimed at detecting anomalous
events that may be indicators of misuse. First, we identify
the baseline parameters. Then we propose an update
mechanism for the selected baselines.
We suggest three possible approaches for updating the
baselines. The first approach is to use a static, or
constant-window, baseline. With this approach, detection of
alerts is carried out on a weekly basis after the training
period, and the baseline remains the same for each week of
testing. The drawback of this method is that decisions are
always based on the initial baseline.
The second approach is to use an extended window, where
the data added since the baseline was last calculated is
included in each update cycle. With this approach, the
baseline becomes dynamic and captures changes in behavior or
in the environment. A possible problem with this approach is
that the baseline may retain too much history.
The third approach is to use a fixed-width sliding time
window, where the width of the window remains fixed but the
data within it is always renewed. After training the baseline
and making decisions based on the data from the current
week, the baseline is recalculated by removing the oldest
week of baseline data and adding the new week of data. This
approach is dynamic in nature and prunes the influence of old
historical data on the baseline.
Baseline detection relies on models of the intended
behavior of users, applications and networks and interprets
deviations from this 'normal' behavior as evidence of
malicious activity [4, 5, 6, 7]. This
approach is complementary with respect to misuse detection,
where a number of attack descriptions (usually in the form of
signatures) are matched against the stream of audited events,
looking for evidence that one of the modeled attacks is
occurring [8].
A basic assumption underlying anomaly detection is that
attack patterns differ from normal behavior. In addition,
anomaly detection assumes that this 'difference' can be
expressed quantitatively. Under these assumptions, many
techniques have been proposed to analyze different data
streams, such as data mining for network traffic [9],
statistical analysis for audit records [10], and sequence
analysis for operating system calls [11].
Of particular relevance to the work described here are
techniques that learn the detection parameters from the
analyzed data. For instance, the framework developed by Lee
et al. [12] provides guidelines for extracting features that
are useful for building intrusion classification models. The
approach uses labeled data to derive the best set of
features to be used in intrusion detection.
The approach described in this paper is similar to Lee's
in that it relies on a set of selected features.
The learning process is based purely on past data, as, for
example, in [13], but we propose a new learning scheme.
We also borrow techniques from the intrusion detection and
anomaly detection fields.
As logs are sent to the correlation engine, they are
normalized into a high-level event type and a more specific
unique event, and then processed by correlation algorithms.
The quality of a correlation depends on accuracy and speed.
Speed describes how many correlations can be found in a given
amount of time. Accuracy describes how many of the
identified correlations represent real relations between the
alerts. With more complex attacks and large-scale networks,
the number of alerts increases significantly, which raises
the required quality of the correlation. Correlation accuracy
depends on the correlation algorithm used.
The main approach for identifying baselines is to
interview a group of experienced network administrators and
ask them how they identify and tackle problems. The authors
selected administrators of different kinds of networks,
ranging from small, single-OS (e.g. mostly Apple or Windows)
networks to large, heterogeneous, multi-vendor networks. The
original results of the survey are summarized below:
1. Concentrate on core services. DNS and routing services are
critical in IP networks. If core services fail, everything
else fails.
2. Statistical analysis, which deals with extracting
information from data, is important for baseline detection.
Statistical analysis of error status over time is another good
way of detecting baselines.
3. Monitor non-unicast traffic. Non-unicast traffic can cause
high utilization of the switch. For example, misconfigured
Windows hosts with a misspelled workgroup name create a
significant amount of broadcast traffic and can be easily
detected.
4. Traffic analysis is very important for baseline detection.
Unwanted traffic is a key concept in network monitoring.
There are heuristics [16] that detect unwanted network
protocols efficiently.
5. Active monitoring. The rejected connection rate can be used
for TCP services, and ICMP port unreachable messages can be
used for UDP services.
6. Monitor protocol distribution.
7. Clients are more vulnerable than servers because they
are not maintained by the network administrators. Monitor
clients’ behavior.
8. Monitor protocols that have weaknesses or implementation
flaws, such as FTP, IRC and DNS.

Fig. 1. Effect of a DoS attack on network traffic.
Some other techniques are available in the literature.
Since baselines are mostly statistically defined elements,
they need to be updated. As your SIEM usage progresses, you
need to add interim values periodically.
Some research on this topic already exists in the
literature [20,21]. We first suggest methods for detecting
baselines, and then list the baselines for an enterprise
network using case studies. Finally, we suggest three possible
new approaches for the learning phase of baseline detection.
The first approach is to use a static, or constant-window,
baseline. With this approach, detection of alerts is carried
out on a weekly basis after the training period, and the
baseline remains the same for each week of testing. The
drawback of this method is that decisions are always based on
the initial baseline.
The second approach is to use an extended window, where
the data added since the baseline was last calculated is
included in each update cycle. With this approach, the
baseline becomes dynamic and captures changes in behavior or
in the environment. A possible problem with this approach is
that the baseline may retain too much history.
The third approach is to use a fixed-width sliding time
window, where the width of the window remains fixed but the
data within it is always renewed. After training the baseline
and making decisions based on the data from the current
week, the baseline is recalculated by removing the oldest
week of baseline data and adding the new week of data. This
approach is dynamic in nature and prunes the influence of old
historical data on the baseline.
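The three update strategies can be sketched in a few lines of Python. This is an illustrative sketch only, not the FAUNA implementation: the function name `baseline`, the use of a plain mean as the baseline statistic, and the layout of the data (one list of observed values per week) are assumptions made for the example.

```python
from statistics import mean

def baseline(weeks, strategy, train_weeks, current_week):
    """Return the baseline used to judge week number `current_week` (0-based).

    weeks        -- list of weeks, each a list of observed values (assumed layout)
    strategy     -- "static", "extended" or "sliding"
    train_weeks  -- number of weeks in the initial training period
    """
    if current_week < train_weeks:
        raise ValueError("still inside the training period")
    if strategy == "static":
        # Always the initial training weeks, never updated.
        window = weeks[:train_weeks]
    elif strategy == "extended":
        # Everything observed so far; the window keeps growing.
        window = weeks[:current_week]
    elif strategy == "sliding":
        # Fixed width: the oldest week drops out when a new one is added.
        window = weeks[current_week - train_weeks:current_week]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return mean(value for week in window for value in week)

# Hypothetical weekly data with a rising trend.
weeks = [[10, 10], [10, 10], [20, 20], [30, 30]]
for name in ("static", "extended", "sliding"):
    print(name, baseline(weeks, name, train_weeks=2, current_week=4))
```

With the rising data above, the static baseline stays at 10, the extended baseline moves to 17.5, and the sliding baseline moves to 25, which shows how strongly each strategy follows recent behavior.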
The concept of a baseline is not new or complex. In general,
a baseline is a well-defined, well-documented reference that
serves as the foundation for other activities. We use the
term baseline in this sense from now on.
Understanding the network devices, applications, services
etc. deployed in your network and their normal status is key
to establishing a baseline.
A. Identifying Baselines
The method works by comparing current performance
to a historical metric. Detecting baselines for the
correlation engine is important. The number of detected
baselines can be increased as needed, but choosing a
sufficient set of baselines and baseline parameters is
critical for system response time. For an enterprise-grade
SIEM solution, in-memory correlation is critical.
You need to understand 'good traffic' for your network
during the 'initial tuning' phase, use whitelists and
blacklists of signatures, and then build heuristic analysis if
you want to alert on deviation from the normal. We describe
methods for baseline detection in Section IV. We also
analyzed real-world case studies from FAUNA
installations [14]. The resulting baselines are:
• Port Usage Baseline
o Hits on port 80 over the last week
o Count of new ports hit on a firewall
• User Baseline
o Software Usage Baseline
o Internet Usage Baseline
o Intranet Usage Baseline
• EPS Baseline
o Normal Events per second (NE): The NE
metric represents the normal number of
events during usage time for a device, or
for your log or event management scope.
o Peak Events per second (PE): The PE
metric represents the peak number of
events during usage time for a device, or
for your log or event management scope.
PE reflects abnormal activities on devices
that create temporary peaks of EPS, for
example DoS, port scanning, mass SQL
injection attempts, etc. PE is the more
important metric because it determines
your real EPS requirements.
• Traffic Baseline
o Number of hosts touching each server per
• Network Baseline
o Protocols
o System Access
o Event Types
• System Baseline
o Login/Logout Success/Failure
o CPU per day
o Memory per day
o Disk availability per day
o User Logins to Server X per day
o Process Starts
o Use of “su” command per hour of day
o Configuration Changes
o User Rights
o Event Log settings
o Restricted Groups
o System Services
o File Permissions
o Registry Permissions
• Application Baseline
o Data Access Types


o User Data Changes
o Client
• Database Baseline
o Data Access Types
o User Data Changes
o Client
As mentioned above, detecting baselines and baseline
parameters is critical. We therefore examined real-world
use cases of an available enterprise software product,
FAUNA [14], and derived the baselines and baseline
parameters listed above.
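As an illustration of the EPS baseline, the NE and PE metrics could be derived from raw event timestamps roughly as follows. This is a sketch under stated assumptions: the paper does not say how "normal" is computed, so the median of the per-second counts is used here as one plausible choice, and the function name `eps_baseline` and the timestamp format (seconds as floats) are hypothetical.

```python
from collections import Counter
from statistics import median

def eps_baseline(timestamps):
    """Return (NE, PE) for a list of event timestamps given in seconds.

    NE is taken as the median events-per-second (an assumption),
    PE as the maximum events-per-second observed.
    """
    if not timestamps:
        return 0, 0
    per_second = Counter(int(t) for t in timestamps)
    first, last = min(per_second), max(per_second)
    # Include idle seconds so that quiet periods count toward "normal".
    rates = [per_second.get(s, 0) for s in range(first, last + 1)]
    return median(rates), max(rates)

# Hypothetical sample: a steady 2 events/s with a burst in second 2.
sample = [0.1, 0.5, 1.2, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5,
          3.3, 3.7, 4.4, 4.6]
ne, pe = eps_baseline(sample)
print(f"NE={ne} PE={pe}")
```

For this sample the normal rate is 2 events per second while the burst pushes the peak to 6, which is the figure that would drive capacity planning.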
The challenge with correlation rules is that in a sense they
are “signature based” in that you largely have to know the
situation you are trying to detect. For example, “monitor my
five external firewalls and tell me if you see port scans from
the same public net block on more than 3 of my firewalls in
the same 30 minute period.”
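To make the "signature based" nature of such a rule concrete, the firewall example can be written out as a small Python sketch. Everything specific here is an assumption for illustration: the event tuple layout, the function name `scanning_net_blocks`, and the choice of a /24 prefix to stand for a "net block". The events are assumed to be port-scan alerts that have already been restricted to external sources.

```python
import ipaddress
from collections import defaultdict

WINDOW_SECONDS = 30 * 60   # "the same 30 minute period"
MIN_FIREWALLS = 4          # "more than 3 of my firewalls"

def scanning_net_blocks(events, prefix_len=24):
    """events: iterable of (timestamp_seconds, firewall_id, source_ip).

    Returns the net blocks that were seen scanning at least
    MIN_FIREWALLS distinct firewalls within one time window.
    """
    by_block = defaultdict(list)
    for ts, firewall, src in events:
        block = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
        by_block[block].append((ts, firewall))

    flagged = set()
    for block, hits in by_block.items():
        hits.sort()
        start = 0
        for end in range(len(hits)):
            # Shrink the window until it spans at most WINDOW_SECONDS.
            while hits[end][0] - hits[start][0] > WINDOW_SECONDS:
                start += 1
            firewalls = {fw for _, fw in hits[start:end + 1]}
            if len(firewalls) >= MIN_FIREWALLS:
                flagged.add(str(block))
                break
    return flagged

# Hypothetical alerts using documentation address ranges.
events = [
    (0, "fw1", "198.51.100.5"), (600, "fw2", "198.51.100.9"),
    (1200, "fw3", "198.51.100.77"), (1700, "fw4", "198.51.100.200"),
    (0, "fw1", "192.0.2.1"), (4000, "fw2", "192.0.2.2"),
]
print(scanning_net_blocks(events))
```

The rule only fires for the exact situation it encodes: a scan spread over 31 minutes, or over three firewalls, passes unnoticed, which is the limitation the paragraph above describes.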
An approach we find far more promising is baselining.
In our case, this approach is used in addition to the
rule-based system [15]. Its advantage is that it does not
watch for anything specific; rather, it attempts to identify
any patterns of security events that are unusual compared to
previously baselined performance. Interestingly, baselining is
based on the metadata from events, not the events themselves,
although the events themselves can be retrieved based on the
metadata.
An example best illustrates the value of anomaly
detection. Anomaly detection flagged an unusual FTP pattern
on a server (compared to the previous month's baseline). A
quick FTP connection attempt confirmed that FTP was indeed
responding on the server, and a few more minutes of sleuthing
determined that a Windows reboot had restarted the FTP
service a few hours earlier. Within an hour, a hacker had
initiated a brute-force attack on the admin password. Because
the unusual pattern was noticed, the security incident was
thwarted before it had any impact.
This is the core of the approach: automatically detecting
anomalous behavior by looking for unexpected statistical
deviations from an ever-evolving “baseline”.
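A statistical deviation test of this kind can be sketched as follows. The paper does not name the statistic it uses, so this example assumes a simple z-score against the mean and standard deviation of the baseline period; the function name `is_anomalous`, the threshold of 3 and the sample counts are hypothetical.

```python
from statistics import mean, stdev

def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it lies more than `threshold` standard
    deviations away from the mean of the baseline `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical daily FTP connection counts for the baseline period.
ftp_history = [0, 1, 0, 2, 1, 0, 1, 1]
print(is_anomalous(ftp_history, 1))    # an ordinary day
print(is_anomalous(ftp_history, 250))  # a brute-force burst
```

Nothing in the check mentions FTP or brute force: the same function applies to any counted metric, which is what distinguishes it from a correlation rule written for one known situation.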
Since FAUNA [3] has a Java-based API for
customization, we used Java for the implementation. FAUNA
currently has a rule-based correlation engine with the
following features:
• Support for MVEL and Java
• JSR 94 compatible
• In-memory correlation
• Single point of correlation rules source
• Multi point of correlation rules source
• Negative condition rules
• Context-based correlation
• Rule base
• Complex Event Processing (CEP)
• Forward chaining
• Backward chaining

With this adaptation, FAUNA will have a correlation
engine with the following features:
• Pre-defined pattern matching
• Statistical analysis (anomaly detection)
• Basic conditional Boolean logic statements
• Contextually relevant and/or enhanced data set
with Boolean logic
Complex Event Processing, or CEP, is primarily an event
processing concept that deals with the task of processing
multiple events with the goal of identifying the meaningful
events within the event cloud. CEP employs techniques such
as detection of complex patterns of many events, event
correlation and abstraction, event hierarchies, and
relationships between events such as causality, membership,
and timing, and event-driven processes.
The vision of FAUNA as a behavioral modeling platform
can only be achieved by moving away from narrow modeling
perspectives that see only rules, or processes, or events as
their main modeling concept. To achieve the flexibility and
power of behavioral modeling, a platform must treat all of
these concepts as primary concepts and allow them to
leverage each other's strengths.
FAUNA, in this scenario, is an independent module, but
still completely integrated with the rest of the platform, that
adds a set of features enabling it to:
• understand and handle events as first class
citizens of the platform
• select a set of interesting events in a cloud or
stream of events
• detect the relevant relationships (patterns) among
these events
• take appropriate actions based on the patterns
Events as first-class citizens: an event is a special entity
that records a significant change of state in the
application domain. Events have several unique and
distinguishing characteristics, such as usually being
immutable and having strong temporal constraints and
relationships. FAUNA understands events for what they are and
allows users to model business rules, queries and processes
depending on their occurrence or absence.
Support for asynchronous multi-threaded streams: events may
arrive at any time and from multiple sources (or streams).
They can also be stored in cloud-like structures. FAUNA
supports working with both streams and clouds of events. In
the case of streams, it supports asynchronous, multi-threaded
feeding of events.
Support for temporal reasoning: events usually have
strong temporal relationships and constraints. FAUNA adds a
complete set of temporal operators to allow modeling and
reasoning over temporal relationships between events.
Support for event garbage collection: events grow old,
quickly or slowly, but they do grow old. FAUNA is able to
identify the events that are no longer needed and dispose of
them, freeing resources and scaling well with growing
volumes.
Support for reasoning over the absence of events: just as
it is necessary to model rules and processes that react to the
presence of events, it is necessary to model rules and
processes that react to their absence. Example: "If the
temperature goes over the threshold and no contention
measure is taken within 10 seconds, then sound the alarm".
FAUNA leverages the capabilities of the Drools Expert
engine, allowing complete and flexible reasoning over the
absence of events, including transparently delaying rules
for events that require a waiting period before the absence
can be established.
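The temperature example can be worked through in Python to show why a waiting period is needed before an absence can be reported. This is a sketch of the idea only, not how FAUNA or Drools evaluates such rules: the function name `missing_responses`, the event kinds and the tuple layout are hypothetical.

```python
def missing_responses(events, timeout=10.0, now=None):
    """events: time-ordered (timestamp_seconds, kind) pairs, where kind
    is "over_threshold" or "countermeasure".

    Returns the times at which an alarm should sound because no
    countermeasure followed a threshold violation within `timeout`.
    """
    alarms = []
    pending = None  # timestamp of a violation that is still unanswered
    for ts, kind in events:
        # The absence is only known once the waiting period has passed.
        if pending is not None and ts - pending > timeout:
            alarms.append(pending + timeout)
            pending = None
        if kind == "over_threshold" and pending is None:
            pending = ts
        elif kind == "countermeasure":
            pending = None
    # Optionally check a violation that is still open at time `now`.
    if pending is not None and now is not None and now - pending > timeout:
        alarms.append(pending + timeout)
    return alarms

# Hypothetical stream: the first violation is handled in time,
# the second one is answered only after 15 seconds.
stream = [(0, "over_threshold"), (4, "countermeasure"),
          (20, "over_threshold"), (35, "countermeasure")]
print(missing_responses(stream))
```

Here the first violation raises no alarm, while the second one raises an alarm at second 30, ten seconds after the unanswered violation at second 20.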
Support for sliding windows: an especially common
scenario in event processing applications is the requirement
to do calculations on moving windows of interest, be they
temporal or length-based windows. FAUNA has complete
support for sliding windows, providing out-of-the-box
aggregation functions as well as leveraging the pluggable
function framework to allow the use of user-defined
custom functions.
The FAUNA correlation engine automatically retracts an
event when it is no longer needed. For example, if you are
calculating an average using sliding windows, as soon as the
event falls out of the sliding window and does not match any
other rule, it is automatically retracted, provided FAUNA is
operating in Stream Mode.
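The interplay between a temporal sliding window and event retraction can be illustrated with a short Python sketch. It imitates the behavior described above outside any rule engine; the class name `SlidingAverage` and its methods are hypothetical and do not correspond to a FAUNA or Drools API.

```python
from collections import deque

class SlidingAverage:
    """Average over the events of the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, value), oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Insert a new event, then drop everything that has expired."""
        self.events.append((timestamp, value))
        self.total += value
        self._retract(timestamp)

    def _retract(self, now):
        # Events that are `window` seconds old or older leave the window
        # and are discarded, which keeps memory use bounded.
        while self.events and now - self.events[0][0] >= self.window:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def average(self):
        return self.total / len(self.events) if self.events else None

avg = SlidingAverage(window_seconds=60)
avg.add(0, 10)
avg.add(30, 20)
print(avg.average())   # both events are inside the window
avg.add(70, 30)
print(avg.average())   # the event from second 0 has been retracted
```

Keeping a running total means each insertion costs only as much work as the number of events it pushes out of the window, rather than a rescan of the whole window.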

To persist the internal state, we implemented our own
mechanism, which stores the average and reinitializes the
state when needed.
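One way such a persistence mechanism might look is sketched below. The paper only states that the average is stored and the state reinitialized; storing the sample count next to the average, the JSON format, and the class name `RunningMean` are assumptions made so that the restored state can keep being updated.

```python
import json

class RunningMean:
    """Incrementally updated mean that can be saved and restored."""

    def __init__(self, count=0, mean=0.0):
        self.count = count
        self.mean = mean

    def update(self, value):
        # Incremental mean: no need to keep the individual events.
        self.count += 1
        self.mean += (value - self.mean) / self.count

    def dump(self):
        """Serialize the state, e.g. before an engine restart."""
        return json.dumps({"count": self.count, "mean": self.mean})

    @classmethod
    def load(cls, text):
        """Reinitialize the state from a previously stored snapshot."""
        state = json.loads(text)
        return cls(count=state["count"], mean=state["mean"])

m = RunningMean()
for value in (10, 20, 30):
    m.update(value)
snapshot = m.dump()

restored = RunningMean.load(snapshot)
restored.update(40)
print(restored.count, restored.mean)
```

Without the count, a restored average could not be combined correctly with new observations, which is why the sketch stores both values.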
Baseline updating is achieved by rule/model chaining. In a
typical scenario, a control rule triggers a model and generates
an output. This output can either be validated directly using
rules or by starting other, more complex inference chains.
Eventually, one or more of the inferred facts is consumed by
an evaluator, which in turn triggers another control rule, and
the process is restarted, as shown below.
rule "Trigger"
when
    $input : Data( <preconditions> )
    $model : ?Model( ... )
then
    insert( new Result( $model.invoke( $input ) ) );
end
rule "Eval"
when
    Result( value pm eval target )
then
    insert( new Data( ... ) );
end
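The chaining cycle between a control rule, a model and an evaluator can be imitated in plain Python to make the control flow visible. This is a loose sketch, not a rule engine: the fact classes `Data` and `Result` mirror the names in the pseudocode, while `run_chain`, the halving model, the "greater than target" evaluation and the step limit are hypothetical choices for the example.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Data:
    value: float

@dataclass(frozen=True)
class Result:
    value: float

def run_chain(initial, model, target, max_steps=100):
    """Process facts until no rule fires or `max_steps` is reached.

    "Trigger": a Data fact invokes the model and inserts a Result.
    "Eval":    a Result above `target` inserts new Data, restarting
               the cycle; otherwise the chain stops.
    """
    agenda = deque([initial])
    results = []
    steps = 0
    while agenda and steps < max_steps:
        fact = agenda.popleft()
        steps += 1
        if isinstance(fact, Data):        # rule "Trigger"
            agenda.append(Result(model(fact.value)))
        elif isinstance(fact, Result):    # rule "Eval"
            results.append(fact.value)
            if fact.value > target:
                agenda.append(Data(fact.value))
    return results

# Hypothetical model that halves its input on every invocation.
print(run_chain(Data(40), lambda v: v / 2, target=5))
```

The step limit is there because a chain in which every result triggers new data has no guaranteed end; here the example stops on its own once the result reaches the target.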
To address the existing problem of network security
event management, this paper proposed a security event
correlation algorithm based on baselining and an algorithm
for the updating lifecycle of baselines. Additionally, we
listed predefined baselines and baseline parameters from
real-world cases.
So, when you think SIEM, don’t just think “how many
rules?”; think “what other methods for real-time and
historical event analysis do they use?”.
Samples of the rules that can be developed with the help of
this enhancement to rule-based systems:
1. Monitor and compare with baseline “count unique port
per source per hour/day”
2. Monitor and compare with baseline “protocol per hour
per sensor”
3. Monitor and compare with baseline “hits on port 80 per
week”
4. Monitor my five external firewalls and tell me if you see
port scans from the same public net block on more than 3
of my firewalls in the same 30-minute period.
5. Monitor five attempts to log in to a system within one
minute using the same user account.
6. Monitor/associate system navigation with non-standard
working hours.
7. Monitor traffic source:
o (IF username=root AND (ToD>10:00PM OR
ToD<7:00AM) AND Source_Country=China,
THEN ALERT “Root Login During Non-Biz
Hours from Foreign IP”)
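Two of the sample rules (items 5 and 7) can be sketched in Python as follows. The event field names, function names and data layout are hypothetical. For rule 7 the night-time condition is treated as "after 10:00 PM or before 7:00 AM", since a single time of day cannot satisfy both comparisons at once.

```python
from collections import defaultdict, deque
from datetime import datetime, time

def is_off_hours(ts, start=time(22, 0), end=time(7, 0)):
    """True if `ts` falls after 10:00 PM or before 7:00 AM."""
    t = ts.time()
    return t > start or t < end

def root_login_alert(event):
    """Rule 7: root login outside business hours from a given country."""
    if (event["username"] == "root"
            and is_off_hours(event["timestamp"])
            and event["source_country"] == "China"):
        return "Root Login During Non-Biz Hours from Foreign IP"
    return None

def accounts_with_rapid_logins(logins, attempts=5, window_seconds=60):
    """Rule 5: accounts with `attempts` logins inside one time window.

    logins: iterable of (timestamp_seconds, username).
    """
    recent = defaultdict(deque)
    flagged = set()
    for ts, user in sorted(logins):
        queue = recent[user]
        queue.append(ts)
        # Forget attempts that are older than the window.
        while ts - queue[0] > window_seconds:
            queue.popleft()
        if len(queue) >= attempts:
            flagged.add(user)
    return flagged

night = {"username": "root", "timestamp": datetime(2012, 3, 11, 23, 30),
         "source_country": "China"}
noon = dict(night, timestamp=datetime(2012, 3, 11, 12, 0))
print(root_login_alert(night), root_login_alert(noon))

logins = [(t, "alice") for t in (0, 10, 20, 30, 40)]
logins += [(t, "bob") for t in (0, 100, 200, 300, 400)]
print(accounts_with_rapid_logins(logins))
```

Both functions hard-code their thresholds; in the approach described in this paper, values such as the working hours or the normal login rate would instead come from the corresponding baseline.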
As a result, the developed system has the following features:
• Pre-defined pattern matching
• Statistical analysis (anomaly detection)
• Basic conditional Boolean logic statements
• Contextually relevant and/or enhanced data set
with Boolean logic




Gordon, L. A., Loeb, M. P., Lucyshyn, W. and Richardson, R. 2006 CSI/FBI Computer Crime and Security Survey. Accessed 12 Oct
Stefan Axelsson, Ulf Lindqvist, Ulf Gustafson, and Erland Jonsson. An Approach to UNIX Security Logging. In Proc. 21st NIST-NCSC National Information Systems Security Conference, pages 62–75, 1998.
Raffael Marty. Applied Security Visualization. Pearson Education, Inc., 501 Boylston Street, Suite 900, Boston, MA, 2008.
D.E. Denning. An Intrusion Detection Model. IEEE Transactions on Software Engineering, 13(2):222–232, February 1987.
A.K. Ghosh, J. Wanken, and F. Charron. Detecting Anomalous and Unknown Intrusions Against Programs. In Proceedings of the Annual Computer Security Applications Conference (ACSAC'98), pages 259–267, Scottsdale, AZ, December 1998.
C. Ko, M. Ruschitzka, and K. Levitt. Execution Monitoring of Security-Critical Programs in Distributed Systems: A Specification-based Approach. In Proceedings of the 1997 IEEE Symposium on Security and Privacy, pages 175–187, May 1997.
T. Lane and C.E. Brodley. Temporal sequence learning and data reduction for anomaly detection. In Proceedings of the 5th ACM Conference on Computer and Communications Security, pages 150–158. ACM Press,
K. Ilgun, R.A. Kemmerer, and P.A. Porras. State Transition Analysis: A Rule-Based Intrusion Detection System. IEEE Transactions on Software Engineering, 21(3):181–199, March 1995.
W. Lee, S. Stolfo, and K. Mok. Mining in a Data-flow Environment: Experience in Network Intrusion Detection. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '99), San Diego, CA, August
H. S. Javitz and A. Valdes. The SRI IDES Statistical Anomaly Detector. In Proceedings of the IEEE Symposium on Security and Privacy, May 1991.
S. Forrest. A Sense of Self for UNIX Processes. In Proceedings of the IEEE Symposium on Security and Privacy, pages 120–128, Oakland, CA, May
W. Lee and S. Stolfo. A Framework for Constructing Features and Models for Intrusion Detection Systems. ACM Transactions on Information and System Security, 3(4), November 2000.
C. Kruegel, T. Toth, and E. Kirda. Service Specific Anomaly Detection for Network Intrusion Detection. In Symposium on Applied Computing (SAC). ACM Scientific Press, March 2002.
D. Plonka. FlowScan: A Network Traffic Flow Reporting and Visualization Tool. Proc. of the XIVth LISA Conference, December 2000.
Daniela Brauckhoff. Network Traffic Anomaly Detection and Evaluation,
Augustin Soule, Kavé Salamatian, and Nina Taft. Combining filtering and statistical methods for anomaly detection. In IMC’05: Proceedings of the 5th Conference on Internet Measurement 2005, Berkeley, California, USA, October 19-21, 2005, pages 331–344. USENIX Association, 2005.
Animesh Patcha and Jung-Min Park. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput. Netw., 51(12):3448–3470, 2007.
Northcutt, S., Novak, J.: Network Intrusion Detection: An Analyst’s Handbook. New Riders Publishing, Thousand Oaks (2002)
Kruegel, C., Valeur, F., Vigna, G.: Intrusion Detection and Correlation: Challenges and Solutions. AIS, vol. 14. Springer, Heidelberg (2005)

Ertuğrul Akbaş is currently working with
ANET. He received his Ph.D. in Computer Science with a thesis on Intelligent
Systems from Gebze Institute of Technology in 2005. He previously worked as
a research scientist at TÜBİTAK-Marmara Research Center and as a project manager
at the National Research Institute of Electronics and Cryptology. His professional
interests include network security, information security, network management
and monitoring, software components and object-oriented technology.

Enhancing SIEM Correlation Rules Through Baselining

  • 1. ENHANCING SIEM CORRELATION RULES THROUGH BASELINING 1 Ertuğrul Akbaş Enhancing SIEM Correlation Rules Through Baselining Abstract— In this paper, we describe research into the use of baselining for enhancing SIEM Correlation rules. Enterprise grade software has been updated with a capability that identifies anomalous events based on baselines as well as rule based correlation engine, and alerts administrators when such events are identified. To reduce the number of false positive alerts we have investigated the use of different baseline training techniques and introduce the use of 3 different training approaches for baseline detection and updating lifecycle. Index Terms—Log, SIEM, Correlation, Rule Base, Baseline, Anomaly Detection, Baseline C I. INTRODUCTION omputer crime continues to be problematic for both public and private sectors not only in Turkey but at an international level. Over 50% of respondents to the 2006 Computer Security Institute/FBI Computer Crime and Security Survey [1] reported unauthorized use of computer systems. The field of computer forensics has been rapidly expanding in the past twenty years in an effort to combat the continuing increase in the incidence of criminal activity involving computers. This area is normally defined by the identification, securing and analysis of evidence. Generally accepted definition of computer forensics can be made that simply attempts to detect, secure and analyze evidence from computer systems. Also some few cases will end at the court. This may be done by an organization, for instance, in response to a security incident, internal or external The surveys show that the most common types of criminal activity are the results of virus, worm or Trojan infections. 
Insider abuse of Internet access, email or computer system resources, however, is the third most common type of misuse in the United States of America [1] Insider misuse can be defined as the performance of activities where computers and networks in an organization are deliberately misused by those who are authorized to use them. Some activities which can be categorized as insider misuse include: unauthorized access to information which is an abuse of privileges unauthorized use of software or applications for purposes other than carrying out one’s duties theft or breach of proprietary or confidential information theft or unauthorized use of staff or customer’s access credentials computer facilitated financial fraud Logging – the act of recording events occurring in a system – is a mandatory security measure in any moderately complex digital environment. Logs can be used to detect and investigate incidents after they have occurred, but also to assist in the prevention of harmful incidents, by revealing use Manuscript received March 11, 2012. Ertuğrul Akbaş is with the ANET Software Turkey (corresponding author to provide phone: 90-216-3540580; fax: 90-216-3540580; e-mail: ertugrul.akbas@  and abuse patterns and increasing situational awareness [2] The most efficient method for incident detection is using SIEM solution. All of the SIEM solutions are rule based systems but enhancing this rule based systems by comparing current performance to a historical metric is needed. And we need to define a methodology for updating this historical metric. A good security tool should make comparisons with historical data, clearly presenting deviations from the norm [3] This paper reports on work aimed at detecting anomalous events that may be indicators of misuse. First of all, we attempt to detect baselining parameters. Then we will prose an update mechanism for selected baselines. We suggest three possible approaches for updating the baselines. 
The first approach is to use a static or constant window baseline. With this approach detection of alerts is carried out on a weekly basis after the training period and the baseline remains the same for each week of testing. The drawback of this method is: decisions are always based on that initial baseline. The second approach proposed for updating baseline is to use an extended window, where the newly added data from last baseline calculated time will be added to the baseline updating cycle. With this approach, the baseline becomes dynamic and captures changes in a baseline’s behavior or in the environment. A possible problem with this approach is that the baseline may retain too much related with history. A third approach is to use a fixed sliding time window for the baseline where the width of the time window remains fixed but data within windows always renewed. After training the baseline and making decisions based on the data from this week, the baseline is recalculated by removing the oldest week of baseline data and adding the new week of data. This approach is dynamic in nature and prune to historical data affects over baseline. II.RELATED WORK Baseline detection relies on models of the intended behavior of users, applications and networks and interprets deviations from this normal' behavior [4, 5, 6, 7]. This approach is complementary with respect to misuse detection, where a number of attack descriptions (usually in the form of signatures) are matched against the stream of audited events, looking for evidence that one of the modeled attacks is occurring [8]. A basic assumption underlying anomaly detection is that attack patterns differ from normal behavior. In addition, anomaly detection assumes that this `difference' can be expressed quantitatively. Under these assumptions, many techniques have been proposed to analyze different data streams, such as data mining for network traffic [9],
  • 2. ENHANCING SIEM CORRELATION RULES THROUGH BASELINING statistical analysis for audit records [10], and sequence analysis for operating system calls [11]. Of particular relevance to the work described here are techniques that learn the detection parameters from the analyzed data. For instance, the framework developed by Lee et al. [12] Provides guidelines to extract features that are useful for building intrusion classification models. The approach uses labeled data to derive which is the best set of features to be used in intrusion detection. The approach described in this paper is similar to Lee's because it relies on a set of selected features. The learning process is purely based on past data, as, for example, in [13] but we propose new learning schema. Also we borrowed techniques from intrusion detection and anomaly detection field. III. LOG CORRELATION As logs are sent to the correlation engine, they are normalized into a high level event type and a more specific unique event and processed by correlation algorithms. The quality of a correlation depends on accuracy and speed. The speed depicts how many correlations can be found in a certain amount of time. The accuracy depicts how many of the identified correlations represent real existing relations between these alerts. Due to more complex attacks and large scale networks, the amount of alerts increases significantly which yields the requirement for improved quality of the correlation. The correlation accuracy depends on the used correlation algorithm. IV. DETECTION AND LEARNING PHASE OF BASELINES Main approach for identifying those baselines is to interview a group of experienced network administrators and ask them how they identify and tackle problems. The authors have selected different kinds of administrators ranging from small, single OS (e.g. mostly Apple or Windows) networks to large heterogeneous, multi vendor networks. Below it is summarized the original results of the survey: 1. 
Concentrate on core services. DNS, routing services are critical in IP networks. If core services fail then everything fail 2. Statistical analysis that deals with extracting information from data is important for baseline detection. The statistical analysis of error status over the time is another good way for detecting baselines. 3. Monitor non-unicast traffic. Non-unicast traffic that is causing high utilization of the switch. or misconfigured windows hosts that have a misspelled workgroup name create a significant amount of broadcast traffic and can be easily detected. 4. Traffic analysis is very important for baseline detection. Unwanted traffic is key concept for network monitoring. There is some heuristics [16] that have an efficient way to detect unwanted network protocols. 5. Active monitoring: Rejected connection rate can be used for TCP services. ICMP port unreachable can be used for UDP services. 6. Monitor protocol distribution. 7. Client are most vulnerable than servers. Because they 2 Fig. 1. DOS attack affect on network traffic. are not maintained by the network administrators. Monitor clients’ behavior 8. Monitor protocols that have weakness or implementation flaws like FTP, IRC and DNS. Some other techniques are available in the literature [17,18,19] Since baselines are mostly statistically defined elements, they need to be updated. As your SIEM usage progresses, you need to add an interim values periodically. Although there is some previously done researches in the literature [20,21]. First of all we suggest detection methods of baselines then we list the baselines for enterprise network using case studies. As a last we suggest three possible new approaches for learning phase of the baseline detection. first approach is to use a static or constant window baseline. With this approach detection of alerts is carried out on a weekly basis after the training period and the baseline remains the same for each week of testing. 
The drawback of this method is: decisions are always based on that initial baseline. The second approach proposed for updating baseline is to use an extended window, where the newly added data from last baseline calculated time will be added to the baseline updating cycle. With this approach, the baseline becomes dynamic and captures changes in a baseline’s behavior or in the environment. A possible problem with this approach is that the baseline may retain too much related with history. A third approach is to use a fixed sliding time window for the baseline where the width of the time window remains fixed but data within windows always renewed. After training the baseline and making decisions based on the data from this week, the baseline is recalculated by removing the oldest week of baseline data and adding the new week of data. This approach is dynamic in nature and prune to historical data affects over baseline. V. BASELINES FOR CORRELATION RULES The concept of a baseline is not new or complex. In general, a baseline is a well-defined, well-documented reference that serves as the foundation for other activities. We will call this baseline from now on. Understanding the network devices, applications, services etc. deployed in your network and their normal status is key to establishing a baseline. A. Identifying Baselines The method is marked by comparing current performance to a historical metric. Detecting baselines for correlation engine is important. One can increase the number of baselines detected as many as needed. But detecting sufficient
  • 3. ENHANCING SIEM CORRELATION RULES THROUGH BASELINING baselines and baseline parameters will be critical in a manner of system response. For an enterprise grade SIEM solution in memory correlation is critical. You will need to understand 'good traffic' for your network during the 'initial tuning' phase, use white lists and blacklists of signatures, and then build heuristic analysis if you want to alert on deviation from the normal. In our approach we offer methods for baseline detection in section IV. Also we analyzed some real world case studies from FAUNA installations [14]. Results of those methods are: • Port Usage Baseline o Hits on port 80 over the last week o Count of new ports hit on a firewall • User Baseline o Software Usage Baseline o Internet Usage Baseline o Intranet Usage Baseline • EPS Baseline o Normal Events per second (NE) :The NE metric will represent the normal number of events usage time for a device, or for your Log or Event Management scope. o Peak Events per second (PE) :The PE metric will represent the peak number of events usage time for a device, or for your Log or Event Management scope. The PE represent abnormal activities on devices you create temporary peaks of EPS, for example DoS, ports scanning, mass SQL injections attempts, etc. PE metric is the more important cause it will determine your real EPS requirements. 
• Traffic Baseline o Number of hosts touching each server per hour • Network Baseline o Protocols o System Access o Event Types • System Baseline o Login/Logout Success/Failure o CPU per day o Memory per day o Disk availability per day o User Logins to Server X per day o Process Starts o Use of “su” command per hour of day o Configuration Changes o User Rights o Event Log settings o Restricted Groups o System Services o File Permissions o Registry Permissions • Application Baseline o Data Access Types 3 o User Data Changes o Client • Database Baseline o Data Access Types o User Data Changes o Client As it is mentioned before that detecting baselines and baseline parameters are very critical, we did a search on use cases of an available enterprise software FAUNA [14] within real world cases then detect above baselines and baseline parameters VI. SECURITY INCIDENT DETECTION LEVERAGING PROFILING The challenge with correlation rules is that in a sense they are “signature based” in that you largely have to know the situation you are trying to detect. For example, “monitor my five external firewalls and tell me if you see port scans from the same public net block on more than 3 of my firewalls in the same 30 minute period.” An approach we find far more promising is Baselining. This approach will be used addition to rule based system in our case [15]. Its advantage is that it doesn’t watch for anything specific, rather it attempts to identify any patterns of security events which are unusual based on previously base lined performance. Interestingly, baselining is based on the meta-data from events — not the events themselves – although the events themselves can be retrieved based on the meta data. An example that will best illustrates the value of Anomaly Detection: A quick FTP connection attempt confirmed that FTP was indeed responding on the server and a few more minutes of sleuthing determined that a Windows reboot had restarted the FTP service a few hours prior. 
Within an hour a hacker had initiated a brute force admin password attack on the server. Anomaly Detection noted the unusual FTP pattern (as compared to the previous months baseline) and thwarted the security incident before any impact. Automatically detecting anomalous behavior by looking for unexpected statistical deviations from an ever evolving “baseline” VII. IMPLEMENTATION DETAILS Since FAUNA [3] has a JAVA based API for customization, we used JAVA for implementation. FAUNA has RULE based correlation engine currently with the features of: • Support Mvel and java • JSR94 compatible • In-memory correlation • Single point of correlation rules source • Multi point of correlation rules source • Negative condition rules • Context based correlation • Rule Base. • Complex Event Processing(CEP).
• Forward chaining
• Backward chaining

With this adaptation, FAUNA will have a correlation engine with the following features:
• Pre-defined pattern matching
• Statistical analysis (anomaly detection)
• Basic conditional Boolean logic statements
• Contextually relevant and/or enhanced data set with Boolean logic

Complex Event Processing, or CEP, is primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud. CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, relationships between events such as causality, membership, and timing, and event-driven processes.

FAUNA's vision of a behavioral modeling platform can only be achieved by moving away from narrow modeling perspectives that treat only rules, or processes, or events as the main modeling concept. To effectively achieve the flexibility and power of behavioral modeling, a platform must understand all of these concepts as primary concepts and allow them to leverage each other's strengths. FAUNA, in this scenario, is an independent module, still completely integrated with the rest of the platform, that adds a set of features enabling it to:
• understand and handle events as first-class citizens of the platform
• select a set of interesting events in a cloud or stream of events
• detect the relevant relationships (patterns) among these events
• take appropriate actions based on the patterns detected

Events as first-class citizens: an event is a special entity that records a significant change of state in the application domain. Events have several unique and distinguishing characteristics, such as being usually immutable and having strong temporal constraints and relationships.
FAUNA understands events for what they are and allows users to model business rules, queries and processes that depend on their occurrence or absence.

Support for asynchronous multi-threaded streams: events may arrive at any time and from multiple sources (or streams). They can also be stored in cloud-like structures. FAUNA supports working with both streams and clouds of events. In the case of streams, it supports asynchronous, multi-threaded feeding of events.

Support for temporal reasoning: events usually have strong temporal relationships and constraints. FAUNA adds a complete set of temporal operators to allow modeling and reasoning over temporal relationships between events.

Support for event garbage collection: events grow old, quickly or slowly, but they do grow old. FAUNA is able to identify the events that are no longer needed and dispose of them, freeing resources and scaling well with growing volumes.

Support for reasoning over the absence of events: just as it is necessary to model rules and processes that react to the presence of events, it is necessary to model rules and processes that react to their absence. Example: "If the temperature goes over the threshold and no contention measure is taken within 10 seconds, then sound the alarm". FAUNA leverages the capabilities of the Drools Expert engine, which allows complete and flexible reasoning over the absence of events, including the transparent delaying of rules for events that require a waiting period before their absence can be established.

Support for sliding windows: an especially common scenario in event processing applications is the requirement to perform calculations over moving windows of interest, be they temporal or length-based windows. FAUNA has complete support for sliding windows, providing out-of-the-box aggregation functions as well as leveraging the pluggable function framework to allow the use of user-defined custom functions.
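The sliding-window idea described above can be illustrated with a minimal Java sketch. It is not FAUNA's or Drools' API: the class name `SlidingWindowBaseline`, the window length, and the deviation threshold are assumptions introduced here purely for illustration. The sketch keeps the most recent samples of a metric (for example, hourly login counts), evicts the oldest sample as the window slides, and flags a new value that deviates from the window mean by more than a chosen number of standard deviations.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; not part of FAUNA or Drools.
class SlidingWindowBaseline {
    private final int capacity;       // number of most recent samples kept
    private final double threshold;   // allowed deviation, in standard deviations
    private final Deque<Double> window = new ArrayDeque<>();

    SlidingWindowBaseline(int capacity, double threshold) {
        if (capacity < 2) {
            throw new IllegalArgumentException("capacity must be at least 2");
        }
        this.capacity = capacity;
        this.threshold = threshold;
    }

    // Adds a sample; once the window is full, the oldest sample is evicted.
    void add(double value) {
        window.addLast(value);
        if (window.size() > capacity) {
            window.removeFirst();
        }
    }

    // Mean of the samples currently in the window (0 if the window is empty).
    double mean() {
        if (window.isEmpty()) return 0.0;
        double sum = 0.0;
        for (double v : window) sum += v;
        return sum / window.size();
    }

    // Population standard deviation of the samples in the window.
    double stdDev() {
        if (window.isEmpty()) return 0.0;
        double m = mean();
        double squares = 0.0;
        for (double v : window) squares += (v - m) * (v - m);
        return Math.sqrt(squares / window.size());
    }

    // True if the window is full and the value lies more than
    // `threshold` standard deviations away from the window mean.
    boolean isAnomalous(double value) {
        if (window.size() < capacity) return false; // baseline still training
        return Math.abs(value - mean()) > threshold * stdDev();
    }

    public static void main(String[] args) {
        SlidingWindowBaseline logins = new SlidingWindowBaseline(5, 3.0);
        for (double hourlyCount : new double[] {10, 12, 11, 9, 13}) {
            logins.add(hourlyCount);
        }
        System.out.println("baseline mean: " + logins.mean());
        System.out.println("12 anomalous? " + logins.isAnomalous(12));
        System.out.println("40 anomalous? " + logins.isAnomalous(40));
    }
}
```

Whether a value flagged as anomalous should still be added to the window is a design choice: adding it lets the baseline adapt to a lasting change in behavior, while withholding it keeps a single outlier from inflating the baseline.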
The FAUNA correlation engine automatically retracts an event once it is no longer needed. For example, when an average is calculated over a sliding window, an event is retracted as soon as it falls out of the window and does not match any other rule, provided FAUNA is operating in Stream Mode. To persist the internal state, we implemented our own mechanism, which stores the average and reinitializes the state when needed.

Baseline updating is achieved by rule/model chaining. In a typical scenario, a control rule triggers a model and generates an output. This output can be validated either directly using rules or by starting other, more complex inference chains. Eventually, one or more of the inferred facts is consumed by an evaluator, which in turn triggers another control rule, and the process is restarted, as shown below.

    // Control rule: invokes a model on matching input data
    rule "Trigger"
    when
        $input : Data( <preconditions> )
        $model : ?Model( ... )
    then
        insert( new Result( $model.invoke( $input ) ) );
    end

    // Evaluator rule: consumes the result and feeds new data back into the chain
    rule "Eval"
    when
        Result( value pm eval target )
    then
        insert( new Data( ... ) );
    end

VIII. CONCLUSION

To address the existing problems of current network security
event management, this paper proposed a security event correlation algorithm based on baselining, together with an algorithm for the update lifecycle of baselines. Additionally, we listed predefined baselines and baseline parameters drawn from real-world cases. So, when you think SIEM, don’t just think “how many rules?”; think “what other methods for real-time and historical event analysis does it use?”

Samples of the rules that can be developed with the help of this enhancement to rule-based systems:
1. Monitor and compare with baseline “count unique port per source per hour/day”
2. Monitor and compare with baseline “protocol per hour per sensor”
3. Monitor and compare with baseline “hits on port 80 per week”
4. Monitor my five external firewalls and tell me if you see port scans from the same public net block on more than 3 of my firewalls in the same 30 minute period.
5. Monitor five attempts to log in to a system within one minute using the same user account
6. Monitor/associate system navigation with non-standard working hours.
7. Monitor traffic source:
   o (IF username=root AND (ToD>10:00PM OR ToD<7:00AM) AND Source_Country=China, THEN ALERT “Root Login During Non-Biz Hours from Foreign IP”)

As a result, the developed system has the following features:
• Pre-defined pattern matching
• Statistical analysis (anomaly detection)
• Basic conditional Boolean logic statements
• Contextually relevant and/or enhanced data set with Boolean logic

REFERENCES

[1] Gordon, L. A., Loeb, M. P., Lucyshyn, W. and Richardson, R. 2006 CSI/FBI computer crime and security survey. Accessed 12 Oct 2007.
[2] Stefan Axelsson, Ulf Lindqvist, Ulf Gustafson, and Erland Jonsson. An Approach to UNIX Security Logging. In Proc. 21st NIST-NCSC National Information Systems Security Conference, pages 62–75, 1998.
[3] Raffael Marty. Applied Security Visualization. Pearson Education, Inc., 501 Boylston Street, Suite 900, Boston, MA, 2008.
[4] D.E. Denning. An Intrusion Detection Model. IEEE Transactions on Software Engineering, 13(2):222–232, February 1987.
[5] A.K. Ghosh, J. Wanken, and F. Charron. Detecting Anomalous and Unknown Intrusions Against Programs. In Proceedings of the Annual Computer Security Applications Conference (ACSAC'98), pages 259–267, Scottsdale, AZ, December 1998.
[6] C. Ko, M. Ruschitzka, and K. Levitt. Execution Monitoring of Security-Critical Programs in Distributed Systems: A Specification-based Approach. In Proceedings of the 1997 IEEE Symposium on Security and Privacy, pages 175–187, May 1997.
[7] T. Lane and C.E. Brodley. Temporal sequence learning and data reduction for anomaly detection. In Proceedings of the 5th ACM Conference on Computer and Communications Security, pages 150–158. ACM Press, 1998.
[8] K. Ilgun, R.A. Kemmerer, and P.A. Porras. State Transition Analysis: A Rule-Based Intrusion Detection System. IEEE Transactions on Software Engineering, 21(3):181–199, March 1995.
[9] W. Lee, S. Stolfo, and K. Mok. Mining in a Data-Flow Environment: Experience in Network Intrusion Detection. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '99), San Diego, CA, August 1999.
[10] H. S. Javitz and A. Valdes. The SRI IDES Statistical Anomaly Detector. In Proceedings of the IEEE Symposium on Security and Privacy, May 1991.
[11] S. Forrest. A Sense of Self for UNIX Processes. In Proceedings of the IEEE Symposium on Security and Privacy, pages 120–128, Oakland, CA, May 1996.
[12] W. Lee and S. Stolfo. A Framework for Constructing Features and Models for Intrusion Detection Systems. ACM Transactions on Information and System Security, 3(4), November 2000.
[13] C. Kruegel, T. Toth, and E. Kirda. Service Specific Anomaly Detection for Network Intrusion Detection. In Symposium on Applied Computing (SAC). ACM Scientific Press, March 2002.
[14] FAUNA.
[15] FAUNA.
[16] D. Plonka. FlowScan: A Network Traffic Flow Reporting and Visualization Tool. Proc. of the XIVth LISA Conference, December 2000.
[17] Daniela Brauckhoff. Network Traffic Anomaly Detection and Evaluation. ETH Zurich, 2010.
[18] Augustin Soule, Kavé Salamatian, and Nina Taft. Combining filtering and statistical methods for anomaly detection. In IMC’05: Proceedings of the 5th Conference on Internet Measurement 2005, Berkeley, California, USA, October 19-21, 2005, pages 331–344. USENIX Association, 2005.
[19] Animesh Patcha and Jung-Min Park. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput. Netw., 51(12):3448–3470, 2007.
[20] Northcutt, S., Novak, J.: Network Intrusion Detection: An Analyst’s Handbook. New Riders Publishing, Thousand Oaks (2002)
[21] Kruegel, C., Valeur, F., Vigna, G.: Intrusion Detection and Correlation: Challenges and Solutions. AIS, vol. 14. Springer, Heidelberg (2005)

Ertuğrul Akbaş is currently working with ANET. He received his Ph.D. in Computer Science with a thesis on Intelligent Systems from Gebze Institute of Technology in 2005. He previously worked as a research scientist at TÜBİTAK-Marmara Research Center and as a project manager at the National Research Institute of Electronics and Cryptology. His professional interests include network security, information security, network management and monitoring, software components and object-oriented technology.