HCFI: Holistic Control Flow Integrity
A kernel-level approach to enforce CFI
M.Golyani, S.Niksefat
2017
Abstract—While Control Flow Integrity is one of the most
powerful methods used to prevent attackers from obtaining
control of a process, the CFI systems presented so far still have
shortcomings in several respects. In this paper, we propose
a new CFI system that works alongside other
protection schemes without requiring the program’s source
code, specific hardware, or binary rewriting. Our proposed
work uses kernel facilities as well as performance counters in the
processor to monitor the execution of the protected applications
and detects any violation of the correct execution flow. In this
CFI system, the CFI policy is generated once on a single machine
and used on other machines as well. We have implemented this
system on a Linux box and evaluation results indicate that this
CFI system is completely practical with low overhead and is able
to detect various kinds of attacks.
I. INTRODUCTION
Many mechanisms have been developed to date to
provide security for operating systems and running processes.
Among them, Control Flow Integrity (CFI) [2] is one of the
most reliable techniques. CFI-based solutions generally work
in two steps. In the first step, also known as the off-line phase,
a valid control flow graph is constructed for each binary. In
the on-line phase, while the operating system is executing
the binary, a CFI enforcement mechanism compares
the current execution flow with the saved one. If
the CFI system finds any violation of the valid control flow
graph, it raises an alarm and takes appropriate action.
Implemented successfully, CFI is one of the most suitable
protection schemes because it is not restricted to a
specific type of attack and detects any deviation from
the valid execution flow.
The practicality of existing CFI mechanisms has
been discussed in many other papers. Although some of
these papers state that the studied CFI mechanisms
cannot provide suitable protection for the system, it should
be noted that most of these CFI mechanisms resemble
each other in their detection process: a set of valid targets
for indirect branches is generated, and at run time each
indirect branch instruction is checked against the list
separately. While this
approach may be bypassed [14], [3], [8], [17], a holistic CFI
mechanism, which we introduce later on, can still provide
suitable protection for the system.
The CFI mechanisms presented to date are categorized
into two groups, fine-grained and coarse-grained CFI systems,
although a third approach between the two is possible.
In this paper, we present a new CFI
system that uses a holistic approach to check control flow
integrity over a period of execution and not only at a single
point in time. In our
proposed system, alongside analyzing the valid targets for
branches inside a program, the whole execution flow of the
executed process is also monitored (a holistic CFI system),
so any violation of the control flow graph is detected
immediately.
Furthermore, our proposed work also addresses the ability to deploy
the CFI system alongside other protection mechanisms such as
ASLR and DEP. This characteristic has
not been appropriately considered in existing CFI systems.
The proposed CFI system protects both statically and
dynamically linked executable files without requiring source
code, recompilation, instrumentation, or any specific
hardware.
Contributions:
In summary, the contributions of this paper are as follows:
• We propose a CFI system which detects and enforces the
CFG by considering the sequence of branches made till
a certain point. Using a sequence of branches at a time
instead of a single branch at a time provides a holistic
view of the program’s execution flow.
• Our presented system is designed in a way that can be
used alongside other protection schemes like ASLR.
• It is the only CFI system which works in a centralized
way. Computation of CFG for each binary is performed
in a central system and the computed CFG can be used
on other systems.
• We have built a working prototype of the presented
system as a kernel module on Ubuntu 14.04 LTS. It
protects binaries on a system
that has ASLR and exec-shield enabled, with
compile-time protections in use as well.
The rest of this paper is organized as follows. The “Background”
section provides information about the history of attacks
and the existing CFI systems, and reviews
the advances made in both attack techniques and
defence mechanisms. Section III presents the security model
we considered in designing and implementing the proposed
work. Section IV gives an overview of the proposed work,
and Sections V and VI provide an in-depth view
of its two main operation phases. Section VII
presents the evaluation results of an implementation
of the proposed work in terms of security and performance.
Related work, although discussed throughout the paper, is
presented in Section VIII, and Section IX concludes the
paper.
II. BACKGROUND
Since Elias Levy’s 1996 article “Smashing the stack for fun
and profit” [23], many attacks have been introduced to
take control of processes, and security solutions have been
presented alongside them. One of the
most well-known attacks is the stack buffer overflow, which
exploits the lack of bounds checking in some programming
languages to overwrite sensitive data in memory. This attack
was first introduced to exploit stack buffers but was later extended
to the heap as well.
To defend against overflow-based attacks, security mechanisms
such as canary-based protections were introduced, in which
a particular value, known as a canary, is placed
right beside the critical pointer, in the path of the overflow.
With this technique, an overflow has to overwrite
the canary before it can reach the critical pointer, so the
protection system
detects the change of the canary and raises an alarm.
While this technique works fine against overflow-based
attacks, it has no effect on other types of attacks like format
string-based attacks. Format string attacks were proposed after
buffer overflow attacks and using them, an attacker is able to
overwrite a sensitive pointer in memory, e.g., return address,
and transfer the control of the process execution into his/her
own injected shellcode.
When an attacker is able to put some code into the mem-
ory, he/she can use some techniques like format-string-based
attacks to execute his/her injected code. In general, these types
of attacks are known as Code injection attacks in which a
piece of code is first injected into the memory and executed
afterward. To counter code injection attacks, data execution
prevention [41] systems were introduced, which distinguish
between code and data memory regions so that
only code regions have permission to be executed, not data
regions. Security systems like W^X, Exec-shield and DEP are
based on this protection mechanism. Alongside these software
solutions, processor manufacturers such as Intel and AMD
implemented facilities to enforce this type of protection
at the hardware level as well. Intel introduced the Execute Disable
(XD) flag and AMD introduced the No Execute (NX) flag in
their processors.
Although the mentioned protection techniques stop code
injection attacks effectively, they are ineffective against some
other types of attacks like return into libc (a.k.a. ret2libc).
In the ret2libc technique, the attacker overwrites a pointer (e.g., the
return address) with the location of a function within the libc library.
As the address space of libc is marked as executable and normal
execution flow is often transferred there to execute a function,
the attacker is able to execute a whole libc function,
providing its arguments, and hence bypass the above
protections.
To prevent the ret2libc attack, which is in fact the first
and most basic type of code reuse attack, randomization
techniques come into play. Protection mechanisms like ASLR
(Address Space Layout Randomization) [42] in Linux, and
concepts like Position Independent Executable [43] files, are
some of the mechanisms that can be used against this type
of attack. However, techniques have also been proposed to
bypass these protection schemes, including heap
spray [44], ASLR brute force [39], and return into
non-randomized regions [45].
Two of the most advanced methods that attackers
use are Return Oriented Programming (ROP) and Jump Oriented
Programming (JOP). In these methods, the attacker overwrites a pointer
with a pointer to a small part of the program’s own code. The
final attack is formed by arranging these small parts of
code, called gadgets, in the proper order. It has been shown
that by executing gadgets in the proper order, an attacker can
obtain a Turing-complete system to execute
whatever he/she likes [1].
Presenting an efficient and reliable technique to
counter code reuse attacks is currently a main concern of the
academic community. kBouncer [28] was one of the first attempts
at a practical solution, although it has been proven to
be inefficient in practice [19]. Another prominent
work in this field is ROPecker [6], but methods to bypass it
have been found as well [4]. Randomization-based
protections like Isomeron [13], protections based on removing
gadgets from the binary through recompilation like DROP [5],
hardware-based approaches like SIGDROP [34] and Gadge me
if you can [16], protections based on binary instrumentation
like ROPDefender [15], and many other works have tried
to mitigate code reuse attacks in different ways. Unfortunately,
almost all of them rely on a particular characteristic
of these attacks as a means of detection, such as
the length of the gadget chain or the length of
the gadget itself. Hence, to bypass these protection
mechanisms, attackers can devise new attack
techniques simply by changing these characteristics in their
attacks.
CFI protections, on the other hand, target neither
a specific attack nor a particular characteristic of an attack, and are
able to detect and prevent a wide variety of attacks. From code
injection attacks to ROP and JOP attacks, if the attack affects
the running program’s execution in a way that it differs from the
predefined control flow, the attack is detected and prevented.
CFI enforcement mechanisms usually work in two phases:
first, in an off-line phase, the control flow graph (CFG) of
a binary is obtained, and then, in an on-line phase, any violation
of this CFG is identified and considered an attack.
CCFI [26], CFIMon [36], CONVERSE [21], OCFI [27], and HCFI
[7] are some of the most prominent CFI systems presented
so far. Intel has recently presented an overview of its
Control-flow Enforcement Technology (CET) [22], which provides shadow stack
and indirect branch tracking capabilities to counter ROP-like
attacks. Although the CFI concept seems flawless at
first glance, the implementations presented so far all
have shortcomings, as noted in recent research
that challenges the functionality of these schemes and
presents techniques to bypass these implementations
[14], [3], [8], [17]. It should be noted that each of these attack
techniques targets a specific implementation of the CFI concept
and not the CFI concept itself. Considering this background, a
practical protection system against different attack techniques
is still needed.
In this paper, we present a new protection scheme that
enforces CFI differently from existing mechanisms.
Our proposed work compares the current execution flow of the
protected process with the correct CFG, derived from the off-line
phase, considering a sequence of branches rather than one at
a time. This system can be deployed alongside other
protection schemes and does not need access to source code
in order to operate correctly. It makes no modification
to binaries and performs no instrumentation.
III. SECURITY MODEL
Nowadays, various protection mechanisms are used in
operating systems. Canary-based protections, data execution
prevention, and randomization-based protections
are the most popular ones. When fewer
protection schemes are active in a system, an attacker can take
over control of the system more easily and the burden on each
active protection scheme is heavier. In other words, there
is an inverse relationship between the number of protection
schemes and the number of security tasks each protection
must perform. Accordingly, a security evaluation
made with only one protection system enabled is
more rigorous than an evaluation of the same
system with more protection schemes enabled. Of course,
the security scheme under evaluation
should also operate correctly when other protection
schemes are enabled, and the activity of other protection
systems should not interfere with the operation of the system under
evaluation.
Accordingly, in our security model, the SSP protection
module, which the GCC compiler applies at compile time, has
been disabled. This protection module is used solely to detect
the overwriting of sensitive pointers like the saved EIP and has no
role in detecting or preventing code execution once the return
address is overwritten. Therefore enabling or disabling this
module has no effect on the functionality of the proposed
system; disabling the SSP simply makes it easier to
attack the system.
Data Execution Prevention and Address Space Layout Randomization,
on the other hand, are related to what happens
after an attacker takes over control of the execution.
These protection schemes do not stop the attacker
from overwriting the return address, but prevent them
from executing their own code. Disabling these
schemes therefore only decreases the complexity of attacks. Nevertheless,
to ensure correct functionality, we evaluate our work in
both situations. In other words, in our security model, the
functionality of the proposed system is checked in both the
presence and the absence of ASLR and DEP.
To summarize, in our security model
SSP protection is disabled, but we evaluate our work with
ASLR and Exec-shield both enabled and disabled. In
our threat model, an attacker is able to overwrite a
sensitive pointer arbitrarily and is also able to execute his/her
own pieces of code in the program’s address space, either by
injecting the code directly (ASLR, exec-shield disabled), or
by using more advanced attack techniques like ROP (ASLR,
exec-shield enabled).
IV. OVERVIEW OF THE PROPOSED WORK
Like other CFI systems, our proposed system creates the
control flow graph of the protected binaries in an off-line
phase and enforces this CFG in an on-line phase.
In our proposed work, the generation and
enforcement of the CFG are done using kernel features
for each binary under the system’s protection.
Enforcing the CFG is done using the Kprobes facility in the
Linux kernel in conjunction with the LBR (Last Branch Record)
model-specific registers. Since the execution path of a process is determined
by the branch instructions it executes, we
built a kernel module to record the executed branches. In
other words, by monitoring all branch instructions executed in
a process, one can determine the valid execution flow graph.
Each branch instruction, regardless of its type, has a specific
source and destination address at execution time. In our proposed
work, the distance between the source and the destination of a
branch is used as an identifier for that branch.
In this system, we use the LBR model-specific registers to
obtain the distance between the source and the destination
of a branch instruction. These registers, which come in 16
pairs, store, when configured properly, the source and destination
address of each user-space branch instruction executed in the
system in a ring buffer. Although it is possible to detect
the correct execution path of a process using these registers,
monitoring all branch instructions executed in a process incurs
high overhead. Therefore, in our proposed work, the contents
of the LBR registers are accessed only when a system call is made.
Almost all existing CFI systems work by computing
and analyzing the valid destinations for branch instructions.
In these systems, each analysis step examines a single branch
instruction: the valid destination addresses for that
instruction are identified and compared with the current execution
flow. In our proposed work, by contrast, each analysis step examines
the last 16 executed branch instructions, which makes it
possible to check the integrity of the execution flow over a period.
In this system, whenever a sensitive system call is executed,
the 16 branch instructions before this system call are identified,
the distance between the source and destination of each of these branches
is calculated, and this table of 16 branch distances is then
compared with the table received from the off-line phase. Any
mismatch between these two tables is treated as an unauthorized
attempt to redirect the control flow, so the execution of
that process is stopped immediately. Since this scheme uses the
distance between the source and destination of each branch
and not the absolute addresses, it is possible to deploy our
proposed work in conjunction with other protection schemes
like ASLR, which changes the address where the binary
is loaded in memory on each execution. In addition,
because of a special design in our system, which we discuss
later, the table of valid branch distances (TVBD) that is
computed on a specific OS version for a specific binary is
usable on other systems running the same binary on the same
OS.
Fig. 1. The overview of the proposed system: 1. The system is triggered 2.
Current LBR contents loaded 3. Current LBR is compared with the offline
table 4. The decision to stop or continue the execution of the protected
program is made
A general overview of our proposed work is depicted in
Fig. 1. As shown, the system is triggered by a sensitive
system call made from a protected application (1). At this
point, the detection module loads the current LBR contents (2)
and computes the distance between the destination and source
of each branch trace record in a specific way, which is
discussed later in the Offline Analysis section. After that, it
compares the resulting table of the 16 branches executed
just before the sensitive system call with the table derived from
the offline analysis (3). Based on this comparison,
the system stops the process immediately if
a violation is detected; otherwise, the execution of the
program continues (4).
V. ONLINE PHASE
In our implementation of the proposed work, the list of
applications that the CFI system should protect is passed
to the kernel module through a device file. In
kernel space, the installed kernel module processes this
list and enforces the CFI checks only for the applications
mentioned in it. The CFI enforcement
module is activated by each invocation of a sensitive system
call. It then checks the executed branch instructions and their
destination addresses and compares them with the table of
valid branch distances (TVBD) received from the off-line phase;
any difference between these two tables is considered an
attack. The proposed system therefore works in three main
steps: first, the system is configured so that any
invocation of a sensitive system call triggers the detection
module; second, the branch instructions executed so far are
analyzed; and third, a decision is made about whether to
stop the execution or not.
A. Hooking sensitive system calls
In this stage, which runs just after the installation of the
proposed system, we use kernel probes to intercept the sensitive
system calls. Using these probes, it is possible to dynamically
insert breakpoints inside any desired kernel routine and
collect performance or debug information as needed. Before
the introduction of kernel probes, in kernel version 2.6 and
earlier, one would need to alter the sys_call_table array inside
the kernel to do this job. Since the introduction of kernel probes,
this array is marked as read-only, and it is possible
to intercept functions and routines without breaking the
integrity of the kernel, simply by using kernel probes.
Currently, there are three types of kernel probes available:
kprobe, jprobe, and kretprobe. In our implementation of the
proposed work, we use jprobes. A jprobe
can be set on the entry point of a kernel function, and
the arguments of the called function are accessible inside
the probe. Using jprobes, it is possible not only to intercept
sensitive system calls and perform CFI checks but also to
analyze the passed arguments as a means of detecting the valid
execution flow.
In our proposed work, we use jprobes to intercept sensitive
system calls like exec, fork, and so on. It is also possible
to set a jprobe on any desired point in the system. For
example, in systems using the sysenter mechanism, one can
set a jprobe at the start of sysenter_do_call and identify
the called function by examining the arguments passed to
it. In any case, in our proposed work, the branch
instructions executed up to the invocation of a sensitive system call in
the protected application are identified and compared with the
TVBD received from the off-line phase.
B. Tracking the branches
In our proposed work, the LBR model-specific registers are
used to analyze the executed branch instructions in the running
process. The LBR registers are 16 pairs of MSRs that
are found in Intel processors from the Nehalem microarchitecture
onwards. When a branch instruction executes on
a processor with LBR enabled, the source and the destination
address of the branch instruction are stored in one of the 16 LBR
register pairs. Since this job is performed by the hardware,
there will be no added overhead for the system.
When more than 16 branch instructions have been executed,
the old LBR contents are overwritten in ring-buffer order,
oldest record first. Hence, there must always
be an index that points to the last filled LBR register. This
pointer is called TOS (top of stack). In this way, it is possible at any time
to identify the last executed branch instruction by examining
the LBR TOS register. It is also possible to restrict the LBR
facility to record only user-space branches, so that these 16
registers are used more economically.
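The ring-buffer behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the names `order_by_age`, `ring` and `tos`, and the sample addresses, are hypothetical and are not taken from the kernel module; the model assumes, as stated in the text, that TOS indexes the most recently written pair.

```python
LBR_DEPTH = 16  # number of LBR register pairs stated in the text


def order_by_age(records, tos):
    """Return the records oldest-first, given the index of the newest one."""
    if len(records) != LBR_DEPTH:
        raise ValueError("expected one record per LBR register pair")
    # The slot right after TOS holds the oldest record; wrap around the ring.
    return [records[(tos + 1 + i) % LBR_DEPTH] for i in range(LBR_DEPTH)]


# Simulate 20 branches being written into a 16-slot ring.
ring = [None] * LBR_DEPTH
tos = -1
for n in range(20):
    tos = (tos + 1) % LBR_DEPTH           # hardware advances TOS, then writes
    ring[tos] = (0x1000 + n, 0x2000 + n)  # hypothetical (source, destination)

ordered = order_by_age(ring, tos)
```

After 20 simulated branches the four oldest records have been overwritten, and `ordered` holds branches 4 to 19 in execution order, which is the form needed for a sequence comparison.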
Accordingly, in our implementation of the proposed work,
after enabling LBR in the processor and using its filtering facility,
the source and the destination addresses of executed user-space
branches are accessible through these registers.
Therefore, after the interception of sensitive system calls,
whenever a jprobe is activated, the contents of the LBR registers
are read and sorted by time of execution. In this way,
the 16 branch instructions executed
just before the sensitive system call are analyzed and compared
with the TVBD received from the off-line phase.
C. Enforcing the CFI
The 16 records of saved branch instructions
received from the off-line phase are compared with the 16 records
of branch instructions executed in the current process. If any
mismatch is found between these two tables, the execution of the
current process is interrupted; otherwise, if the two tables
are the same, the execution proceeds.
In our implementation of the proposed work, we use signals
to stop the running process. If any violation of the CFG is
detected, a kill signal is sent to the running process immediately,
causing the protected application to stop forcefully. Although
more appropriate actions could be taken
when an attack is identified, we chose signals for the sake of
simplicity. When the behavior is normal and
the executed branch instructions in the
running process conform to the TVBD, the normal execution flow
continues by calling jprobe_return().
Note that this CFI system uses a table
of the 16 latest branch instructions executed in the running
process just before the sensitive system call to identify an
attack. In existing CFI protection mechanisms, by contrast,
each branch instruction is handled separately, checking the
destination of that particular branch instruction against a list
of valid destinations. Using a table of 16 branch instructions
instead of just one branch instruction at a time improves
the security of our CFI system and stops many
known attack techniques, as we discuss in the
evaluation section of this paper.
VI. OFFLINE ANALYSIS
In this phase, generating the Control Flow Graph is the main
operation. Many techniques have been introduced in
existing CFI systems to compute the CFG, and any of
them can be used here. Some
existing CFI systems use static analysis of binary files to
generate the CFG, and others use dynamic analysis and
emulated runs to do so.
The most challenging task in generating the CFG, in most
of the CFI systems, is how to handle the indirect branches.
In these systems, each indirect branch instruction is handled
separately and possible destinations for that specific branch
are identified. Any failure in the correct identification of these
valid destinations, or the vast range of possible destinations,
makes these systems vulnerable to some advanced attacks like
ROP. That is because by increasing the number of valid desti-
nations for a specific branch instruction, or by identifying an
invalid address as a valid destination for a branch instruction, it
would be possible that a malicious branch instruction, related
to a ROP attack gadget, be considered as a valid branch and
hence, the chance of success for the attacker is increased
accordingly.
To avoid that, our proposed work does not handle a branch instruction
separately. This means that even if an indirect
branch instruction in the current execution of the protected
binary is executed exactly according to the CFG, it is not
yet considered a valid branch. A valid branch in our
proposed work is one that not only conforms to the
CFG itself, but also belongs to a set of 16 branches
in which no branch violates the CFG
and whose order of execution
in the protected binary is exactly the same
as in the TVBD. In addition, in our proposed work,
instead of using fixed destination addresses to identify
branch instructions, we use the distance between the source
and the destination of a branch as a characteristic of that
branch instruction. Therefore, for a branch instruction to be
considered valid, three conditions must be met:
1) The distance between the source and the destination of
the branch should conform to the CFG.
2) In the set of 16 branches which the current branch is
part of, all 16 branches should conform to the CFG.
3) The sequence of these 16 executed branch instructions
till now, should be exactly the same as the sequence of
the entries in the TVBD.
In our implementation of the proposed work, all three
conditions are checked in the on-line phase by comparing the
table of 16 valid branch distances received from the off-line phase
with the 16 branch distances executed in the running process.
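The three conditions can be sketched as a single check over one window of 16 distances. This is a minimal illustration, not the kernel module's code: the function name `is_valid_window`, the separate set `cfg_distances`, and the sample values are assumptions made for the example. When the TVBD row is itself drawn from the CFG, the final sequence comparison already implies the first two conditions, which is consistent with checking all three through one table comparison.

```python
WINDOW = 16  # number of branches checked together, as stated in the text


def is_valid_window(observed, tvbd_row, cfg_distances):
    """Return True only if the observed window satisfies all three conditions."""
    if len(observed) != WINDOW or len(tvbd_row) != WINDOW:
        return False
    # Conditions 1 and 2: every distance in the window is known to the CFG.
    if any(distance not in cfg_distances for distance in observed):
        return False
    # Condition 3: the order matches the recorded sequence exactly.
    return list(observed) == list(tvbd_row)


# Hypothetical TVBD row with 16 distinct distances.
tvbd_row = [0x010 + 3 * i for i in range(WINDOW)]
cfg_distances = set(tvbd_row)

same = list(tvbd_row)
reordered = list(tvbd_row)
reordered[0], reordered[1] = reordered[1], reordered[0]  # valid branches, wrong order
foreign = list(tvbd_row)
foreign[5] = 0xFFE                                       # distance unknown to the CFG
```

The `reordered` window shows why the sequence matters: every individual branch in it is valid, yet the window as a whole is rejected.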
Although any desired method can be used to generate
the CFG, in our implementation we use emulation. In this
method, we run the binary in an isolated system and fill the
TVBD for this binary under different execution scenarios for a
while. As a result, a table of 16 valid distances is available for
each sensitive system call executed in the program code. This
table is available for each protected binary separately. Note
that the tables of valid distances produced
for a specific application on a particular operating system are
usable on other systems running the same application on the
same version of the operating system. This is because of the
way we construct the table of valid branch distances,
which takes the PAGE_SHIFT concept into account.
A. PAGE_SHIFT
In the Linux kernel paging operation, each address
is made up of two parts. The most significant part
identifies the page, and the least significant part is
an offset inside the page. On x86 systems, for example, where the
page size is set to 4 kilobytes (4096 bytes), 12 bits are needed
to address all entries of a page. Therefore, on these
systems, the 12 least significant bits of each address in the paging
operation hold the offset inside the page. Accordingly,
by shifting an address right by 12 bits, the part of
the address that indexes inside the page
is discarded and the remaining, most significant part is
the address of the page itself.
TABLE I
THE PAGE SIZE FOR THE DIFFERENT ARCHITECTURES IN THE LINUX
KERNEL
Architecture   PAGE_SIZE                   PAGE_SHIFT
x86            4096                        12
Alpha          8192                        13
ARM            4096                        12
AVR32          4096                        12
IA64           4096, 8192, 16384, 65536    12, 13, 14, 16
M68k           4096, 8192                  12, 13
Sparc          4096                        12
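As a worked illustration of the split described above for a 4096-byte page, the following sketch separates an address into its page part and its in-page offset. The helper name `split_address` and the sample address are hypothetical.

```python
PAGE_SHIFT = 12               # x86 value from Table I
PAGE_SIZE = 1 << PAGE_SHIFT   # 4096 bytes


def split_address(addr):
    """Split an address into (page number, offset inside the page)."""
    page_number = addr >> PAGE_SHIFT
    offset = addr & (PAGE_SIZE - 1)
    return page_number, offset


page_number, offset = split_address(0x08048ABC)  # hypothetical address
```

Here the upper hexadecimal digits (`0x08048`) select the page and the lowest three digits (`0xABC`) are the offset inside it.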
This number of shifted bits, which is 12 on x86 systems,
is known as PAGE_SHIFT in the Linux kernel.
At the time of writing, the size of each page,
which is named PAGE_SIZE in the kernel, is calculated from
the PAGE_SHIFT value. The PAGE_ALIGN macro inside the
kernel uses this calculated size. The page sizes in the
Linux kernel for different architectures are listed in Table I.
In ASLR, on the other hand, after the randomization
process the generated address is aligned according
to the page size, so randomization is performed
on the address of each page, not inside the page. In Linux
systems, this alignment is done after the randomization process
(i.e. after get_random_int) through the PAGE_ALIGN macro
inside the kernel, and therefore the generated random value is
aligned to a page boundary.
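The effect of this alignment can be modelled in user space. The sketch below assumes the usual round-up-to-page-boundary behaviour of PAGE_ALIGN; the Python function `page_align` and the seeded generator, which stands in for the kernel's random source, are illustrative only.

```python
import random

PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = ~(PAGE_SIZE - 1)


def page_align(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & PAGE_MASK


rng = random.Random(0)  # stand-in for the kernel's random source
aligned = [page_align(rng.getrandbits(32)) for _ in range(1000)]
low_bits = {value & (PAGE_SIZE - 1) for value in aligned}
```

Every aligned value has its 12 low bits cleared, so randomization of this kind can move a mapping only in whole pages.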
B. Calculation of the TVBD
In our implementation of the proposed work on a Linux
system, considering what was mentioned above, we do not use
the address of the page itself to produce the valid offsets.
Instead, we use the offset inside the page to calculate the
distance between the source and the destination of a branch. In
other words, the valid distance for each branch instruction takes
one of 4096 possible values, and the exact value is recorded during
the off-line phase. To do so, after extracting the addresses
from the LBR registers, we subtract the source address from the
destination address and use the three least significant
hexadecimal digits of the result as the valid offset, which
ranges up to FFF. In this way, after the off-line phase, we
have sets of 16 offsets for each sensitive system call executed
in the protected application.
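The offset computation just described can be sketched as follows. The function names and sample addresses are hypothetical. The text does not say how a negative difference (a backward branch) is reduced to three digits; the sketch assumes the difference is taken modulo 4096, which is what masking with 0xFFF yields in Python.

```python
PAGE_OFFSET_MASK = 0xFFF  # three least significant hexadecimal digits


def branch_offset(src, dst):
    """Distance from source to destination, reduced to the in-page part."""
    # Assumption: backward branches are reduced modulo 4096.
    return (dst - src) & PAGE_OFFSET_MASK


def offsets_for_window(records):
    """Map a list of (source, destination) pairs to their offsets."""
    return [branch_offset(src, dst) for src, dst in records]


forward = branch_offset(0x08048ABC, 0x080491F0)   # difference is 0x734
backward = branch_offset(0x080491F0, 0x08048ABC)  # difference is -0x734
window = offsets_for_window([(0x08048ABC, 0x080491F0), (0x080491F0, 0x08048ABC)])
```

Applied to the 16 ordered LBR records, `offsets_for_window` would produce one row of the kind stored in the TVBD.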
Using this method, it is possible to deploy the proposed
work alongside other protection schemes like ASLR,
and it is also possible to calculate the TVBD once on one
operating system and use that TVBD for the same application
running on different machines with the same operating system.
That is because the randomization in the current system is
done per memory page and not inside the pages; hence the
offsets inside the pages remain the same, and the calculated
TVBD always stays the same. It is possible to construct a
central system that collects the TVBDs calculated on other
systems for different applications on different operating
system versions and stores them in a database. This central
system can then give the proper table to other systems on
demand, according to the OS-application combination of each
system. Therefore each system could have an up-to-date
database of TVBDs for the applications it is protecting,
without needing to calculate these offset tables itself. In
other words, the operation of calculating the CFG is performed
once on one system and the resulting tables are used on other
systems as well.
Fig. 2. Replacement of the pages does not affect the offset inside the page,
in two different runs.
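The central system described above is essentially a lookup keyed by the OS-application combination. The sketch below shows one possible shape for it. The class name, the method names, the key format and the placeholder offsets are all assumptions made for illustration, since the paper does not specify a storage format.

```python
from typing import Dict, FrozenSet, Optional, Tuple

OffsetSet = Tuple[int, ...]               # one set of 16 in-page offsets
Tvbd = Dict[str, FrozenSet[OffsetSet]]    # system call name -> valid sets
Key = Tuple[str, str]                     # (OS version, application)


class TvbdRepository:
    """In-memory stand-in for the central TVBD database (hypothetical)."""

    def __init__(self) -> None:
        self._tables: Dict[Key, Tvbd] = {}

    def publish(self, os_version: str, app: str, tvbd: Tvbd) -> None:
        """Store a table that was computed once on a base system."""
        self._tables[(os_version, app)] = tvbd

    def fetch(self, os_version: str, app: str) -> Optional[Tvbd]:
        """Return the table for this OS/application pair, or None."""
        return self._tables.get((os_version, app))


repo = TvbdRepository()
example_set: OffsetSet = tuple(range(16))   # placeholder offsets, not real data
repo.publish("ubuntu-14.04", "nginx", {"fork": frozenset({example_set})})
table = repo.fetch("ubuntu-14.04", "nginx")
```

A client that receives None would have to fall back to computing the table locally in its own off-line phase.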
To verify this, we executed an altered version of the BET
program, introduced in [3], under two different conditions.
First, we executed BET 110 times on a single system and
observed the distance between the source and the destination
addresses of the executed branch instructions in every
execution. Because the location of the loaded memory pages
differs on almost every execution (mostly because of ASLR),
we obtained 86 different tables of distances.
Afterwards, we extracted the 3 least significant hexadecimal
digits of each recorded distance, which form the offset inside
the page, and observed that these 3 digits are always the same.
The resulting distances of 10 of the 110 executions
are listed in Table II. As shown in the table, the
distances between the source and the destination of the branch
instructions vary across executions, but the last 3 digits
are always the same; therefore they can be used as a measure
to form the CFG for each program.
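This observation can be checked directly against the values in Table II. The sketch below takes the two distances that change between runs (the 6th and 12th entries of runs 1 to 3) and confirms that masking to the low 12 bits yields the same pair every time. The function name is an assumption introduced here.

```python
def in_page_offsets(distances):
    """Keep only the three least significant hex digits of each distance."""
    return tuple(d & 0xFFF for d in distances)


# The two run-dependent distances from runs 1, 2 and 3 of Table II.
runs = [
    (0x09fc7319, 0x09f835e9),
    (0x09fc8319, 0x09f845e9),
    (0x09fca319, 0x09f865e9),
]

distinct_distance_rows = set(runs)
distinct_offset_rows = {in_page_offsets(run) for run in runs}
print(len(distinct_distance_rows), len(distinct_offset_rows))
```

Three different rows of distances collapse into a single row of offsets, which is the property the TVBD relies on.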
Secondly, we executed the BET program 10 times on
two separate ASLR-enabled machines and recorded the
distance between the source and the destination of each of the
16 branch instructions before a specific sensitive system call
(fork in this example). The resulting addresses and offsets are
listed in Table III.
As listed in Table III, the offsets inside the pages in
two different executions of the BET program on two separate
machines are identical, and hence result in the same
TVBD on both machines. Therefore we can calculate the
TVBD for each binary on a specific operating system once on a
base system and use it on the other machines running the same
combination of application and operating system. The
two systems used in this evaluation are A: Lenovo ThinkPad
T420, and B: HP Pavilion g6.
TABLE II
THE RESULTED DISTANCE TABLE OF 10 DIFFERENT RUNS ON A SINGLE
MACHINE
Index  Resulted distances
1      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fc7319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f835e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
2      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fc8319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f845e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
3      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fca319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f865e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
4      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fcf319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f8b5e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
5      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fd2319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f8e5e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
6      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fd6319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f925e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
7      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fd7319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f935e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
8      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fd8319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f945e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
9      0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fd9319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f955e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
10     0x0010efcf 0x00043d54 0x00000003
       0x000b4df8 0x000b4dfa 0x09fe1319
       0x00043d97 0x00000228 0x0000026c
       0x000710c3 0x000710c5 0x09f9d5e9
       0x0012416b 0x00006335 0x00004962
       0x00000795
Offsets inside pages (always the same, in the order of the distances)
       0xfcf 0xd54 0x003 0xdf8
       0xdfa 0x319 0xd97 0x228
       0x26c 0x0c3 0x0c5 0x5e9
       0x16b 0x335 0x962 0x795
TABLE III
THE RESULTING OFFSETS OF THE LAST 16 BRANCH INSTRUCTIONS ON
SYSTEM A (UPPER RECORD) VS SYSTEM B (LOWER RECORD)
Indx  System  Source      Destination  Dst-Src     Offset
              Address     Address      Distance    in page
1     A       0xb7631445  0xb7740414   0x0010efcf  0xfcf
      B       0xb76ce445  0xb77dd414   0x0010efcf
2     A       0xb774a015  0xb774a7aa   0x00000795  0x795
      B       0xb77e7015  0xb77e77aa   0x00000795
3     A       0xb774a81f  0xb774f181   0x00004962  0x962
      B       0xb77e781f  0xb77ec181   0x00004962
4     A       0xb774f1cb  0xb7755500   0x00006335  0x335
      B       0xb77ec1cb  0xb77f2500   0x00006335
5     A       0xb775550b  0xb76313a0   0xffedbe95  0xe95
      B       0xb77f250b  0xb76ce3a0   0xffedbe95
6     A       0xc1652989  0xb76313a0   0xf5fdea17  0xa17
      B       0xc1652989  0xb76ce3a0   0xf607ba17
7     A       0xb76313a6  0xb76a246b   0x000710c5  0x0c5
      B       0xb76ce3a6  0xb773f46b   0x000710c5
8     A       0xb76a246e  0xb76313ab   0xfff8ef3d  0xf3d
      B       0xb773f46e  0xb76ce3ab   0xfff8ef3d
9     A       0xb76313bc  0xb7631628   0x0000026c  0x26c
      B       0xb76ce3bc  0xb76ce628   0x0000026c
10    A       0xb763162f  0xb7631407   0xfffffdd8  0xdd8
      B       0xb76ce62f  0xb76ce407   0xfffffdd8
11    A       0xb7631407  0xb75ed670   0xfffbc269  0x269
      B       0xb76ce407  0xb768a670   0xfffbc269
12    A       0xc1652989  0xb75ed670   0xf5f9ace7  0xce7
      B       0xc1652989  0xb768a670   0xf6037ce7
13    A       0xb75ed671  0xb76a246b   0x000b4dfa  0xdfa
      B       0xb768a671  0xb773f46b   0x000b4dfa
14    A       0xb76a246e  0xb75ed676   0xfff4b208  0x208
      B       0xb773f46e  0xb768a676   0xfff4b208
15    A       0xb75ed69a  0xb75ed69d   0x00000003  0x003
      B       0xb768a69a  0xb768a69d   0x00000003
16    A       0xb75ed6b8  0xb763140c   0x00043d54  0xd54
      B       0xb768a6b8  0xb76ce40c   0x00043d54
C. Static analysis and various execution states
Although we used emulated runs to construct the table of
valid branch distances, it is also possible to draw the CFG
by static analysis of the binaries. In this case, considering that
we use sets of 16 branches at once, the question arises of
how conditional and indirect branches can be handled
in the static analysis.
To answer this question, note that if static analysis
handles each instruction separately, the only way to identify
a threat is to build a table of valid destinations for each
individual branch. In other words, constructing the table of
valid distances is the only solution. When an indirect branch
is analyzed among the other branches in a set, however, the
position of the executed branch within this set can be used as
an extra characteristic of that branch.
This means that if, in the set of 16 valid branch instructions
just before a sensitive system call obtained from
the off-line phase, an indirect branch is located at the
eleventh entry, then in the actual execution of this application
this particular indirect branch instruction should be located
at exactly the same position among the set of 16 branches
before the same sensitive system call. This kind of analysis can
prevent attacks like ROP. In these attacks, the attacker
constructs a gadget chain and, to bypass the detection
mechanisms, uses indirect branches that can
take multiple destinations. By using these branch instructions to
construct the gadget chain, the attacker is able to direct the
execution flow anywhere. When the
branch instructions are considered and checked separately, the
only way to detect the threat is to check the destination.
When these indirect branch instructions are checked among
the other 15 branch instructions, however, not only the
destination of the branch instruction but also its position
in the set can be used to detect the violation.
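The difference between per-branch checking and set-based checking can be made concrete with a short sketch. It is a simplified model under stated assumptions: the recorded set uses the 16 offsets of Table II, the per-branch check is reduced to plain membership of each offset, and the function names are introduced here for illustration only.

```python
from typing import FrozenSet, Sequence, Tuple

SET_SIZE = 16
OffsetSet = Tuple[int, ...]

# Offsets recorded in the off-line phase (the 16 offsets of Table II).
RECORDED: OffsetSet = (
    0xfcf, 0xd54, 0x003, 0xdf8, 0xdfa, 0x319, 0xd97, 0x228,
    0x26c, 0x0c3, 0x0c5, 0x5e9, 0x16b, 0x335, 0x962, 0x795,
)


def per_branch_ok(observed: Sequence[int], allowed: FrozenSet[int]) -> bool:
    """Simplified separate check: every offset is individually allowed."""
    return all(offset in allowed for offset in observed)


def set_ok(observed: Sequence[int], valid_sets: FrozenSet[OffsetSet]) -> bool:
    """Set-based check: values and their positions must both match."""
    return len(observed) == SET_SIZE and tuple(observed) in valid_sets


# The same offsets, but with the entries at positions 11 and 12 swapped.
reordered = list(RECORDED)
reordered[10], reordered[11] = reordered[11], reordered[10]

accepted_separately = per_branch_ok(reordered, frozenset(RECORDED))
accepted_as_set = set_ok(reordered, frozenset({RECORDED}))
print(accepted_separately, accepted_as_set)
```

The reordered history passes the separate check but fails the set-based one, because position within the set is part of what is compared.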
Another question that may come to mind is that, viewed at a
high level, an application may have various execution
flow graphs, so how is it possible to construct the set of 16
branches before a sensitive call? To answer this question, note
that "various execution flow graphs" is
a high-level term. At the low level, the only way an application
can have various execution flows is through conditional
branch instructions. Each conditional branch instruction
has two valid destinations, one of which is chosen at
execution time according to the condition.
Furthermore, in our proposed work we do not handle all
of the branch instructions executed in an application; we only
consider the 16 branch instructions before a sensitive system
call. Hence, in the worst case, there would be 16 conditional
branch instructions to analyze (i.e., $2^{16}$ different states
will be analyzed). However, a situation in which all 16 branch
instructions before a sensitive system call are conditional
is highly unlikely. An analysis of the Apache web server
indicates that, on average, there are only 6 conditional branches
in the set of 16 branches before sensitive system calls. That
means $2^6 = 64$ different valid states in the TVBD for each
system call, which is completely feasible to check at run-time.
Therefore, static binary analysis can also be used to construct
the TVBD and draw the CFG in the off-line phase.
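The state count follows from taking one choice per branch slot. As a rough sketch, assuming each of the 16 slots is described by the tuple of offsets it may legally produce (one value for a fixed branch, two for a conditional one), the valid sets can be enumerated as a Cartesian product. The slot values below are hypothetical placeholders, not measured data.

```python
from itertools import product


def enumerate_valid_sets(slots):
    """Expand per-slot alternatives into every valid 16-offset set."""
    return [tuple(choice) for choice in product(*slots)]


# Hypothetical layout: 10 fixed branches and 6 conditional branches.
fixed_slots = [(0x010 + i,) for i in range(10)]
conditional_slots = [(0x100 + i, 0x200 + i) for i in range(6)]
slots = fixed_slots + conditional_slots

valid_sets = enumerate_valid_sets(slots)
print(len(valid_sets))
```

With 6 conditional slots the product has 64 members, and with all 16 slots conditional it would have 65,536, matching the worst case given in the text.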
VII. EVALUATION
To evaluate our proposed work, we implemented this system
as a kernel module on an Ubuntu 14.04 LTS box with an Intel
Core i7 CPU, and analyzed its security against different types
of attacks as well as the performance overhead it imposes
on the system.
TABLE IV
THE EVALUATION OF THE SYSTEM’S SECURITY IN DETECTING VARIOUS
TYPES OF ATTACKS.
Attack type          DEP status  ASLR status  Result
shellcode injection  OFF         OFF          Prevented
return into libc     ON          OFF          Prevented
ROP                  ON          ON           Prevented
ROP                  OFF         OFF          Prevented
TABLE V
THE PROPOSED SYSTEM’S EFFECTIVENESS AGAINST REAL WORLD
EXPLOITS.
Application  EDB/CVE id     Result
unrar        EDB-ID 17611   Prevented
nginx        CVE-2013-2028  Prevented
A. Security
To evaluate the security of this system, we first used a
modified version of the BET program and exploited it using
shellcode injection, return into libc, and ROP techniques. To
check the functionality of our proposed work, we repeated
these attacks in the presence of ASLR and DEP protections;
the results of the test are listed in Table IV. In the case of
the ROP attack, we also used knowledge of the process address
space to bypass ASLR, and in this case too the proposed
system prevented the attack successfully.
Next, to check the reliability of our proposed
CFI system against real-world attacks, we used two publicly
available exploits against the nginx and unrar applications,
both of which were detected and prevented by the system,
as listed in Table V.
B. Performance
To evaluate the performance overhead incurred by the
proposed system, we examined the execution time of the BET
program in the presence and absence of the protection system,
using valgrind. The results show that in a worst-case scenario,
in which the matching entry in the TVBD is the last one and the
table itself contains 128 different sets of 16 valid branches,
the incurred overhead is negligible (less than 1%).
We also examined the performance of our system in
a real-world scenario. To do so, we analyzed the
performance of the nginx web server in terms of the number
of connection requests per second and the resulting mean
response time per request, using the apachebench tool. First, we
ran apachebench against the web server on a bare system, and
then we repeated the test in the presence of our proposed work.
The result, as depicted in Figure 3, shows that the performance
in both cases is almost identical.
Fig. 3. Performance overhead of the system in nginx web server (mean
response time per request in ms versus number of requests per second,
without and with the CFI system).
VIII. RELATED WORKS
Much research has been done to date to propose
an efficient CFI system, and some of the results
are known to be practical to some extent; however,
each solution carries some limitations. Some of
these solutions require special hardware peripherals or binary
instrumentation, others need access to the program’s
source code in order to protect it, and many of them
are not able to operate beside other protection mechanisms
like ASLR.
RAGuard [38], SOFIA [10], HAFIX [12], CONVERSE [21]
and the system proposed in [30] are protection mechanisms
that need special hardware support, such as a customized CPU
instruction set, to operate. Intel has recently presented a
hardware facility, named CET, to provide CFI at the hardware
level, but at the time of writing it has not yet been
implemented in its processors.
Some other protection systems like CCFIR [37], S-D CFI
[25], Lockdown [29], O-CFI [27], and the work proposed in
[40] use binary rewriting, instrumentation and internal hooks
to enforce the CFI policy. These mechanisms perform some
extra checks before indirect branches and validate the execution
path of the protected application. To do that, they usually
inject some instructions into the binary of the application. The
instrumentation technique has been used in the kernel itself as
well [9].
Another method used to enforce CFI is to change the compiler
or the protected application’s source code and recompile
it. In this method, the compiler or the program’s source
code is modified so that more security checks are performed
at run-time. Obviously, systems based
on this approach need access to the program’s source code and
require recompilation as well. CCFI [26] and the work proposed
in [32] are of this type.
The above-mentioned approaches and other protection systems
like [35], [33], [31], [11], [18], and [24] are some of
the endeavours to propose an effective, practical CFI system.
Another way to classify CFI systems is to categorize them
into fine-grained and coarse-grained systems. While most of
the systems presented so far fall into one of these
classes, our proposed work takes a third approach, which we
call the semi-holistic approach: the indirect branches
are not analyzed individually; instead, the system validates them
in sets of 16 branch instructions at a time. Doing so increases
the accuracy of the system, while the use of jprobes in the
Linux kernel keeps the performance overhead low and allows
the general behavior of programs to be monitored. Accordingly,
a comparison of various CFI systems and our proposed work is
presented in Table VI.
TABLE VI
COMPARISON OF DIFFERENT CFI SYSTEMS, BASED ON THEIR
REQUIREMENTS (* = REQUIRED, - = NOT REQUIRED)
CFI system     Release  Specific  Recompilation  Binary
               Date     Hardware                 Alteration
CFIMon         2012     -         -              -
CCFIR          2013     -         -              *
CONVERSE       2014     *         -              -
Tice et al.    2014     -         *              -
S-D CFI        2014     -         -              *
Lockdown       2015     -         -              *
O-CFI          2015     -         -              *
CCFI           2015     -         *              -
HAFIX          2015     *         -              -
HCFI           2016     *         -              -
RAGuard        2017     *         -              -
PT-CFI         2017     -         -              -
Proposed work  2017     -         -              -
As shown in the table, our proposed work, as well as
CFIMon and PT-CFI, does not have any special requirements
to operate; however, there are some differences between our
proposed work and those two systems. In CFIMon,
the BTS registers are used to track the branch instructions
and detect violations of the valid execution path. The incurred
overhead is reported to be 6%. This CFI system works in
two phases. In the offline phase, a set of valid destinations
for each branch instruction is collected using static analysis
of binaries, and in the online phase, using the BTS registers,
the executed branches are monitored and checked against
specific rules. In the offline phase of this system, the call_set
contains the addresses of all instructions at the beginning
of functions, and the ret_set contains the addresses of the
instructions just after call instructions in the program. The
valid destinations for indirect branch instructions are stored in
the train_set, using a learning mechanism.
At runtime, this CFI system checks every branch instruction
against these sets. Because this system uses absolute
addresses for branch destinations, it cannot be used beside
randomization mechanisms like ASLR. Processing all branch
instructions in a program, on the other hand, incurs heavy
overhead, and because each branch instruction is validated
separately in this CFI mechanism, an attacker may
exploit the ret_set by using call-preceded gadgets in a
ROP attack and bypass the system.
In the PT-CFI system, a recently introduced Intel processor
facility named PT (Processor Trace) is used to check
control flow integrity. As mentioned in [20], using PT can
generate up to hundreds of megabytes of information per second
for each processing core. Moreover, because processing all
PT information packets is slow,
the PT-CFI system uses only TIP (Target IP) packets to detect
violations of the valid execution flow; when a violation
is detected, further analysis of the PT information takes place
in what the authors call deep inspection. Although PT-CFI
is similar to our proposed work in general, the techniques used
there are different. On the other hand, the ability to use
PT-CFI beside other protection mechanisms has not been studied
yet.
IX. CONCLUSION
Although various CFI systems have been presented to
date, these systems validate executed branch instructions
separately. This approach lacks a holistic view of the program’s
execution flow. Besides, a practical CFI system that acts
accurately without the need for the program’s source code,
special hardware peripherals, binary alteration, or compiler
modification, and that is able to operate beside other protection
mechanisms, is still needed.
In this paper, we proposed a new CFI system which addresses
the above-mentioned characteristics. This CFI system
is able to operate alongside other protection schemes like
DEP and ASLR. In our proposed work, the computation of
the CFG is done once on one system, and the resulting policy is
usable on other systems running the same OS/APP as well,
using the PAGE_SHIFT concept in the Linux kernel. We also
used the LBR model-specific registers to compute the distance
between the source and the destination of the executed branch
instructions. To get closer to a holistic view of the program’s
execution flow, at each act of analysis we study the
characteristics of the executed branch instructions, as the only
means to direct the execution flow, in sets of 16 branches just
before each sensitive system call; doing so, we are able
to detect any violation of the CFG in the program’s execution
flow. Using the kprobe facility of the
Linux kernel, we are able to enforce the CFG with negligible
performance overhead.
To evaluate our proposed work, we implemented this system
on an Ubuntu 14.04 LTS box as a single LKM (Linux Kernel
Module) that interacts with the user space to get the CFG.
The results of our evaluations show that the proposed work is
able to detect various types of attacks with a low performance
overhead, alongside other protection systems like
ASLR and DEP.
REFERENCES
[1] Microgadgets: Size does matter in turing-complete return-oriented pro-
gramming. In Presented as part of the 6th USENIX Workshop on
Offensive Technologies, Bellevue, WA, 2012. USENIX.
[2] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti.
Control-flow integrity principles, implementations, and applications. ACM Trans.
Inf. Syst. Secur., 13(1):4:1–4:40, November 2009.
[3] Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and
Thomas R. Gross. Control-flow bending: On the effectiveness of control-
flow integrity. In 24th USENIX Security Symposium (USENIX Security
15), pages 161–176, Washington, D.C., 2015. USENIX Association.
[4] Nicholas Carlini and David Wagner. Rop is still dangerous: Breaking
modern defenses. In 23rd USENIX Security Symposium (USENIX Security
14), pages 385–399, San Diego, CA, 2014. USENIX Association.
[5] Ping Chen, Hai Xiao, Xiaobin Shen, Xinchun Yin, Bing Mao, and Li Xie.
DROP: Detecting Return-Oriented Programming Malicious Code, pages
163–177. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.
[6] Yueqiang Cheng, Zongwei Zhou, Miao Yu, Xuhua Ding, and Robert H.
Deng. Ropecker: A generic and practical approach for defending against
rop attacks. In NDSS. The Internet Society, 2014.
[7] Nick Christoulakis, George Christou, Elias Athanasopoulos, and Sotiris
Ioannidis. Hcfi: Hardware-enforced control-flow integrity. In Proceedings
of the Sixth ACM Conference on Data and Application Security and
Privacy, CODASPY ’16, pages 38–49, New York, NY, USA, 2016. ACM.
[8] Mauro Conti, Stephen Crane, Lucas Davi, Michael Franz, Per Larsen,
Marco Negro, Christopher Liebchen, Mohaned Qunaibit, and Ahmad-
Reza Sadeghi. Losing control: On the effectiveness of control-flow
integrity under stack attacks. In Proceedings of the 22nd ACM SIGSAC
Conference on Computer and Communications Security, CCS ’15, pages
952–963, New York, NY, USA, 2015. ACM.
[9] John Criswell, Nathan Dautenhahn, and Vikram Adve. Kcofi: Complete
control-flow integrity for commodity operating system kernels. In
Proceedings of the 2014 IEEE Symposium on Security and Privacy,
SP ’14, pages 292–307, Washington, DC, USA, 2014. IEEE Computer
Society.
[10] R. d. Clercq, R. D. Keulenaer, B. Coppens, B. Yang, P. Maene,
K. d. Bosschere, B. Preneel, B. d. Sutter, and I. Verbauwhede. Sofia:
Software and control flow integrity architecture. In 2016 Design,
Automation & Test in Europe Conference & Exhibition (DATE),
pages 1172–1177, March 2016.
[11] Sanjeev Das, Wei Zhang, and Yang Liu. A fine-grained control flow
integrity approach against runtime memory attacks for embedded systems.
IEEE Trans. Very Large Scale Integr. Syst., 24(11):3193–3207, November
2016.
[12] L. Davi, M. Hanreich, D. Paul, A. R. Sadeghi, P. Koeberl, D. Sullivan,
O. Arias, and Y. Jin. Hafix: Hardware-assisted flow integrity extension.
In 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC),
pages 1–6, June 2015.
[13] Lucas Davi, Christopher Liebchen, Ahmad-Reza Sadeghi, Kevin Z.
Snow, and Fabian Monrose. Isomeron: Code randomization resilient to
(just-in-time) return-oriented programming. In 22nd Annual Network
and Distributed System Security Symposium, NDSS 2015, San Diego,
California, USA, February 8-11, 2015, 2015.
[14] Lucas Davi, Ahmad-Reza Sadeghi, Daniel Lehmann, and Fabian Mon-
rose. Stitching the gadgets: On the ineffectiveness of coarse-grained
control-flow integrity protection. In 23rd USENIX Security Symposium
(USENIX Security 14), pages 401–416, San Diego, CA, 2014. USENIX
Association.
[15] Lucas Davi, Ahmad-Reza Sadeghi, and Marcel Winandy. Ropdefender:
A detection tool to defend against return-oriented programming attacks.
In Proceedings of the 6th ACM Symposium on Information, Computer
and Communications Security, ASIACCS ’11, pages 40–51, New York,
NY, USA, 2011. ACM.
[16] Lucas Vincenzo Davi, Alexandra Dmitrienko, Stefan Nürnberger, and
Ahmad-Reza Sadeghi. Gadge me if you can: Secure and efficient ad-
hoc instruction-level randomization for x86 and arm. In Proceedings
of the 8th ACM SIGSAC Symposium on Information, Computer and
Communications Security, ASIA CCS ’13, pages 299–310, New York,
NY, USA, 2013. ACM.
[17] Isaac Evans, Fan Long, Ulziibayar Otgonbaatar, Howard Shrobe, Martin
Rinard, Hamed Okhravi, and Stelios Sidiroglou-Douskos. Control jujutsu:
On the weaknesses of fine-grained control flow integrity. In Proceedings
of the 22nd ACM SIGSAC Conference on Computer and Communications
Security, CCS ’15, pages 901–913, New York, NY, USA, 2015. ACM.
[18] X. Ge, N. Talele, M. Payer, and T. Jaeger. Fine-grained control-flow
integrity for kernel software. In 2016 IEEE European Symposium on
Security and Privacy (EuroS&P), pages 179–194, March 2016.
[19] Enes Göktas, Elias Athanasopoulos, Herbert Bos, and Georgios
Portokalidis. Out of control: Overcoming control-flow integrity. In Proceedings
of the 2014 IEEE Symposium on Security and Privacy, SP ’14, pages
575–589, Washington, DC, USA, 2014. IEEE Computer Society.
[20] Yufei Gu, Qingchuan Zhao, Yinqian Zhang, and Zhiqiang Lin. Pt-cfi:
Transparent backward-edge control flow violation detection using intel
processor trace. In Proceedings of the Seventh ACM on Conference on
Data and Application Security and Privacy, CODASPY ’17, pages 173–
184, New York, NY, USA, 2017. ACM.
[21] Z. Guo, R. Bhakta, and I. G. Harris. Control-flow checking for
intrusion detection via a real-time debug interface. In 2014 International
Conference on Smart Computing Workshops, pages 87–92, Nov 2014.
[22] Intel. Control-flow Enforcement Technology Preview. Intel Corporation,
2016.
[23] Elias Levy. Smashing the stack for fun and profit. Phrack Magazine,
49, 1996.
[24] Yan Lin, Xiaoxiao Tang, Debin Gao, and Jianming Fu. Control flow
integrity enforcement with dynamic code optimization. In Matt Bishop
and Anderson C A Nascimento, editors, Information Security: 19th
International Conference, ISC 2016, Honolulu, HI, USA, September 3-6,
2016. Proceedings, pages 366–385, Cham, 2016. Springer International
Publishing.
[25] X. Liu, Q. Wei, and Z. Ye. Static-dynamic control flow integrity. In
2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and
Internet Computing, pages 189–196, Nov 2014.
[26] Ali Jose Mashtizadeh, Andrea Bittau, Dan Boneh, and David Mazières.
Ccfi: Cryptographically enforced control flow integrity. In Proceedings of
the 22nd ACM SIGSAC Conference on Computer and Communications
Security, CCS ’15, pages 941–951, New York, NY, USA, 2015. ACM.
[27] Vishwath Mohan, Per Larsen, Stefan Brunthaler, Kevin W. Hamlen, and
Michael Franz. Opaque control-flow integrity. In NDSS, 2015.
[28] Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis.
Transparent rop exploit mitigation using indirect branch tracing. In
Presented as part of the 22nd USENIX Security Symposium (USENIX
Security 13), pages 447–462, Washington, D.C., 2013. USENIX.
[29] Mathias Payer, Antonio Barresi, and Thomas R. Gross. Fine-grained
control-flow integrity through binary hardening. In Magnus Almgren,
Vincenzo Gulisano, and Federico Maggi, editors, Detection of Intrusions
and Malware, and Vulnerability Assessment: 12th International Confer-
ence, DIMVA 2015, Milan, Italy, July 9-10, 2015, Proceedings, Cham,
2015. Springer International Publishing.
[30] Dean Sullivan, Orlando Arias, Lucas Davi, Per Larsen, Ahmad-Reza
Sadeghi, and Yier Jin. Strategy without tactics: Policy-agnostic hardware-
enhanced control-flow integrity. In Proceedings of the 53rd Annual
Design Automation Conference, DAC ’16, pages 163:1–163:6, New York,
NY, USA, 2016. ACM.
[31] Jiaqi Tan, Hui Jun Tay, Utsav Drolia, Rajeev Gandhi, and Priya
Narasimhan. Pcfire: Towards provable preventative control-flow integrity
enforcement for realistic embedded software. In Proceedings of the 13th
International Conference on Embedded Software, EMSOFT ’16, pages
19:1–19:10, New York, NY, USA, 2016. ACM.
[32] Caroline Tice, Tom Roeder, Peter Collingbourne, Stephen Checkoway,
Úlfar Erlingsson, Luis Lozano, and Geoff Pike. Enforcing forward-edge
control-flow integrity in gcc & llvm. In Proceedings of the 23rd
USENIX Conference on Security Symposium, SEC’14, pages 941–955,
Berkeley, CA, USA, 2014. USENIX Association.
[33] X. Wang and R. Karri. Reusing hardware performance counters to detect
and identify kernel control-flow modifying rootkits. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 35(3):485–
498, March 2016.
[34] Xueyang Wang and Jerry Backer. SIGDROP: signature-based ROP
detection using hardware performance counters. CoRR, abs/1609.02667,
2016.
[35] Xueyang Wang and Ramesh Karri. Numchecker: Detecting kernel
control-flow modifying rootkits by using hardware performance counters.
In Proceedings of the 50th Annual Design Automation Conference, DAC
’13, pages 79:1–79:7, New York, NY, USA, 2013. ACM.
[36] Yubin Xia, Yutao Liu, Haibo Chen, and Binyu Zang. Cfimon: Detecting
violation of control flow integrity using performance counters. In
Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference
on Dependable Systems and Networks (DSN), DSN ’12, pages 1–12,
Washington, DC, USA, 2012. IEEE Computer Society.
[37] Chao Zhang, Tao Wei, Zhaofeng Chen, Lei Duan, Laszlo Szekeres,
Stephen McCamant, Dawn Song, and Wei Zou. Practical control flow
integrity and randomization for binary executables. In Proceedings of the
2013 IEEE Symposium on Security and Privacy, SP ’13, pages 559–573,
Washington, DC, USA, 2013. IEEE Computer Society.
[38] J. Zhang, R. Hou, J. Fan, K. Liu, K. Zhang, and S. A. McKee. Raguard:
A hardware based mechanism for backward-edge control-flow integrity. In
ACM International Conference on Computing Frontiers 2017, Siena, Italy,
2017. ACM.
[39] Shacham, Hovav and Page, Matthew and Pfaff, Ben and Goh, Eu-Jin and
Modadugu, Nagendra and Boneh, Dan. On the Effectiveness of Address-
space Randomization. In Proceedings of the 11th ACM Conference on
Computer and Communications Security, pages 298–307, New York, NY,
USA, 2004. ACM.
[40] Mingwei Zhang and R. Sekar. Control flow integrity for cots binaries.
In Presented as part of the 22nd USENIX Security Symposium (USENIX
Security 13), pages 337–352, Washington, D.C., 2013. USENIX.
[41] S. Andersen and V. Abella. Changes to functionality in microsoft
windows xp service pack 2, part 3: Memory protection technologies,
Data Execution Prevention, In Microsoft TechNet Library, September
2004. http://technet.microsoft.com/en-us/library/bb457155.aspx.
[42] PaX Team, "Address Space Layout Randomization (ASLR)", 2003.
https://pax.grsecurity.net/docs/aslr.txt.
[43] RedHat, "Position Independent Executables (PIE)", In Redhat Customer Portal,
November 2012. https://access.redhat.com/blogs/766093/posts/1975793.
[44] Alexander Sotirov, "Heap Feng Shui in JavaScript", In BlackHat Europe, 2007.
https://www.blackhat.com/presentations/bh-europe-07/Sotirov/Presentation/bh-eu-07-sotirov-apr19.pdf.
[45] Tilo Muler, Computer Science "ASLR Smack and Laugh Reference",
In Seminar on Advanced Exploitation Techniques, RWTH
Aachen, Germany, February 2008.
https://pdfs.semanticscholar.org/440e/61ecb744e55d0425cdb648fe24e4ff999686.pdf.

A holistic Control Flow Integrity

  • 1. HCFI: Holistic Control Flow Integrity A kernel-level approach to enforce CFI M.Golyani, S.Niksefat 2017 Abstract—While Control Flow Integrity is one of the most powerful methods used to prevent attackers from obtaining control of a process, there are still some shortcomings in different aspects of yet presented CFI systems. In this paper, we propose a new CFI system which is able to work alongside with other protection schemes, without the need of the program’s source code, specific hardware, and binary rewriting. Our proposed work uses kernel facilities as well as performance counters in the processor to monitor the execution of the protected applications and detects any violation of the correct execution flow. In this CFI system, the CFI policy is generated once on a single machine and used on other machines as well. We have implemented this system on a Linux box and evaluation results indicate that this CFI system is completely practical with low overhead and is able to detect various kinds of attacks. I. INTRODUCTION Up to now, lots of mechanisms have been developed to provide security for operating systems and running processes. Among them, Control Flow Integrity (CFI) [2] is one of the most reliable techniques. In CFI-based solutions, the overall process is that in the first step, also known as the offline phase, a valid control flow graph is depicted for each binary, and in the on-line phase, when the operating system is executing the binary, a CFI enforcement mechanism takes place which compares the current execution flow with the saved one. If the CFI system finds any violation of valid control flow graph, it raises an alarm and takes proper action accordingly. Implemented successfully, CFI is one of the most proper protection schemes due to the fact that it is not restricted to a specific type of attack and detects any kind of violation from the valid execution flow. 
The practicability of yet presented CFI mechanisms has been discussed in many other papers. Although in some of these papers it is stated that the studied CFI mechanisms can’t provide a suitable protection for the system, it should be noticed that most of these CFI mechanisms are similar to each other in case of the detection process. In these CFI mechanisms a set of valid targets for indirect branches is generated and during the run-time, each indirect branch instruction is checked against the list separately. While this approach may be bypassed [14], [3], [8], [17], a holistic CFI mechanism, which we will introduce later on, can still provide a suitable protection for the system. The CFI mechanisms presented until today, are categorized into two groups of fine-grained and coarse-grained CFI sys- tems, while there can be a third approach between fine-grained and coarse-grained CFI. In this paper, we present a new CFI system that uses a holistic approach to check control flow integrity in a period and not only at a specific time. In our proposed system, alongside analyzing the valid targets for branches inside a program, the whole execution flow of the executed process is also monitored (a holistic CFI system), therefore any violation from control flow graph is detected immediately. Furthermore, in our proposed work, the ability to implement the CFI system alongside other protection mechanisms like ASLR and DEP is addressed as well. This characteristic has not been appropriately considered in existing CFI systems yet. This proposed CFI system protects both statically and dynamically linked executable files without the need to source or recompilation as well as instrumentation and any specific hardware equipment. Contributions: In summary, the contributions of this paper are as follows: • We propose a CFI system which detects and enforces the CFG by considering the sequence of branches made till a certain point. 
Using a sequence of branches at a time instead of a single branch at a time provides a holistic view of the program’s execution flow. • Our presented system is designed in a way that can be used alongside other protection schemes like ASLR. • It is the only CFI system which works in a centralized way. Computation of CFG for each binary is performed in a central system and the computed CFG can be used on other systems. • We have constructed a working prototype of the presented system in an Ubuntu 14.0.4 LTS operating system as a kernel module which protects binaries on a system which has ASLR and exec-shield enabled and compile- time protections are used as well. The rest of this paper is organized as follows. In the ”Back- ground” section some information about the history of attacks and the existing CFI systems is provided. In this section we study the advances made in both the attack techniques and the defence mechanisms. Section III presents the security model we considered in designing and implementing the proposed work. In section IV an overview of the proposed work is depicted and in sections V and VI we provide an in-depth view of the two main operation phases in the proposed work. Sec- tion VII presents the evaluation results of an implementation of the proposed work in terms of security and performance.
  • 2. Related works even though discussed throughout the paper, is presented in Section VIII, and in section IX we conclude the paper. II. BACKGROUND Since Elias Levy’s article titled ”Smashing the stack for fun and profit” at 1996 [23], lots of attacks have been introduced to take the control of processes and alongside with these attacks, security solutions have been presented as well. One of the most well-known attacks is stack buffer overflow attack which exploits the lack of boundary checking in some programming languages to overwrite sensitive data in memory. This attack was first introduced to exploit stack buffers but later expanded to heap area as well. To defend against overflow-based attacks, security mecha- nisms like canary based protections were introduced, in which a particular value, known as canary is placed in the buffer, right beside critical pointer and at the way of the overflow. Using this technique, whenever overflow occurs, just before the overflow overwrites the critical pointer, it has to overwrite the canary to reach the pointer and hence the protection system detects the change of the canary and raises an alarm. While this technique works fine against overflow-based attacks, it has no effect on other types of attacks like format string-based attacks. Format string attacks were proposed after buffer overflow attacks and using them, an attacker is able to overwrite a sensitive pointer in memory, e.g., return address, and transfer the control of the process execution into his/her own injected shellcode. When an attacker is able to put some code into the mem- ory, he/she can use some techniques like format-string-based attacks to execute his/her injected code. In general, these types of attacks are known as Code injection attacks in which a piece of code is first injected into the memory and executed afterward. 
To counter code injection attacks, data execution prevention [41] systems were introduced in which a distinction between code and data were made in memory regions and only code region has the permission to get executed, not data region. Security systems like WˆX, Exec-shield and DEP are based on this protection mechanism. Alongside with software solutions, hardware processor producers like Intel and AMD, implemented some facilities to enforce this type of protection in hardware level as well. Intel introduced the Execute disable (XD) flag and AMD introduced the No execute (NX) flag in their processors. Although the mentioned protection techniques stop code injection attacks effectively, they are ineffective against some other types of attacks like return into libc (a.k.a. ret2libc). In ret2libc technique, the attacker overwrites a pointer (e.g., return address) to the location of a function within libc library. As address space of libc is marked as executable and normal execution flow is often transferred there to execute a function, the attacker will be able to execute a whole function in libc library, providing its arguments and hence bypass the above protections. To prevent the ret2libc attack which in fact is the first and most basic type of Code reuse attacks, randomization techniques come into play. Protection mechanisms like ASLR (Address Space Layout Randomization) [42] in Linux, and concepts like Position Independent Executable [43] files are some of the mechanisms that can be used against this type of attack, but alongside with these protection schemes, some other techniques have been proposed to bypass them. Heap spray [44], ASLR brute force [39], and return into non- randomized regions [45] are some of these techniques. Two of the most cutting-edge methods which attackers use are Return Oriented Programming and Jump Oriented Programming. In these methods, attacker overwrites a pointer with a pointer to a small part of the program’s own code. 
The final attack is formed by arranging these small parts of the code in proper order. These small parts of the code are called Gadgets. By executing gadgets in proper order, it is proved that attacker can gain a Turing-complete system to execute whatever he/she likes [1]. Nowadays, presenting an efficient and reliable technique to counter code reuse attacks effectively is the main concern in academic society. kBouncer [28] was one of the first attempts to gain a practical solution, although it has been proven to be inefficient in practice [19]. One of the other prominent works in this field is ROPecker [6], but there have been found some methods to bypass it as well [4]. Randomization-based protections like Isomeron [13], protections based on omitting gadgets from binary thorough recompilation like DROP [5], hardware-based approaches like SIGDROP [34], Gadge me if you can [16], protections based on binary instrumentation like ROPDefender [15] and lots of other works have tried to mitigate code reuse attacks in different ways, but unfor- tunately almost all of them consider a special characteristic of these attacks as a mean of detection. Examples for these characteristics are the length of the gadget chain, length of the gadget itself, and so on. Hence, to bypass these protection mechanisms, attackers always come up with some new attack techniques, just by changing these characteristics in their attacks. CFI protections, On the other hand, without targeting neither a specific attack nor a particular characteristic of an attack, are able to detect and prevent a vast variety of attacks. From code injection attacks to ROP and JOP attacks, if the attack affects the running program’ execution in a way that it differs from the predefined control flow, the attack is detected and prevented. 
CFI enforcement mechanisms usually work in two phases, first, in an off-line phase the control flow graph, a.k.a., CFG, of a binary is obtained and then in an online phase, any violation from this CFG is identified and considered as an attack. CCFI [26], CFIMon [36], CONVERSE [21], OCFI [27], HCFI [7] are some of the most prominent CFI systems presented yet. Intel has recently presented the control flow enforcement technology overview, CET, [22] which provides shadow stack and indirect branch tracking capabilities to counter ROP like attacks. CFI concept although seems to be flawless at the first glance, but the implementations presented up to now, all
  • 3. have some shortcomings which is noted in recent researches that challenge the functionality of these schemes and have presented some techniques to bypass these implementations [14], [3], [8], [17]. It should be noticed that each of these attack techniques target a specific implementation of CFI concept and not the CFI concept itself. Considering this background, a practical protection system against different attack techniques is still needed. In this paper, we present a new protection scheme which enforces CFI in a different way from existing mechanisms. Our proposed work, compares the current execution flow of the protected process with the correct CFG, driven from the off- line phase, considering a sequence of branches not just one at a time. This system can be implemented alongside with other protection schemes and does not need access to source code in order to operate correctly. In this system, no modification is made in binaries and no instrumentation is made neither. III. SECURITY MODEL Nowadays, there are various protection mechanisms used in operating systems. Canary based protections, Data execution prevention protections, and randomization-based protections are the most popular ones. When there is fewer number of protection schemes activated in a system, an attacker can take over the control of the system more easily and the task of activated protection schemes is heavier. In other words, there is an inverse relationship between the number of protection schemes and the number of security task each protection should perform. Accordingly, in a system, if a security eval- uation is made by only enabling one protection system, this evaluation would be more rigorous than evaluating the same system with more protection schemes enabled. 
Of course, it should be noticed that the security scheme under evaluation should be able to operate correctly when other protection schemes are enabled as well, and activity of other protection systems, should not interfere the operation of the system under evaluation. Accordingly, in our security model, the SSP protection module which is used by GCC compiler at compile time, has been disabled. This protection module is solely used to detect the rewrite of sensitive pointers like saved EIP and have no role in detection or prevention of code execution if the return address is overwritten. Therefore enabling or disabling this module has no effect on the functionality of the proposed system, and by disabling the SSP, it will be just easier to attack the system. Data Execution Prevention and Address Space Layout Ran- domization, on the other hand, are related to what happens after an attacker takes over the control of the execution. These protection schemes are not about stopping the attacker from overwriting the return address, but to prevent them from executing their own code. Therefore by disabling these schemes, only the complexity of attacks is decreased, although to ensure the correct functionality, we evaluate our job in both situations. In other words, in our security model, the functionality of the proposed system is checked in both the presence and the absence of the ASLR and the DEP. According to what stated above, in our security model, the SSP protection is disabled, but we evaluate our work in both ASLR and Exec-shield enabled and disabled states. Accord- ingly, in our threat model, an attacker is able to overwrite a sensitive pointer arbitrarily and is also able to execute his/her own pieces of code in program’s address space, either by injecting the code directly (ASLR, exec-shield disabled), or by using more advanced attack techniques like ROP (ASLR, exec-shield enabled). IV. 
OVERVIEW OF THE PROPOSED WORK

Like other CFI systems, our proposed system creates the control flow graph of the protected binaries in an off-line phase and enforces this CFG in an on-line phase. In our proposed work, the generation and enforcement of the CFG are done using kernel features for each binary under the system's protection. The CFG is enforced using the Kprobe facility of the Linux kernel in conjunction with the LBR Model Specific Registers.

Since the execution path of a process is determined by the branch instructions it executes, we built a kernel module to record the executed branches. In other words, by monitoring all branch instructions executed in a process, one can determine the valid execution flow graph. Each branch instruction, regardless of its type, has a specific source and destination address at execution time. In our proposed work, the distance between the source and the destination of a branch is used as an identifier for that branch.

We use the LBR model specific registers to obtain the distance between the source and the destination of a branch instruction. These registers, which come in 16 pairs, store, when configured properly, the source and destination address of each user-space branch instruction executed in the system in a ring buffer. Although it is possible to detect the correct execution path of a process using these registers, monitoring all branch instructions executed in a process incurs high overhead. Therefore, in our proposed work, the contents of the LBR registers are accessed only when a system call is made.

Almost all of the existing CFI systems work by computing and analyzing the valid destinations for branch instructions.
In these systems, each analysis step examines one specific branch instruction: the valid destination addresses for that instruction are identified and compared with the current execution flow. In our proposed work, by contrast, each analysis step examines the 16 most recently executed branch instructions, and hence it is possible to check the integrity of the execution flow over a period. Whenever a sensitive system call is executed, the 16 branch instructions preceding it are identified, the distance between the source and the destination of each of these branches is calculated, and the resulting table of 16 branch distances is compared with the table received from the off-line phase. Any mismatch between these two tables is treated as an unauthorized attempt to redirect the control flow, and the execution of that process is stopped immediately. Since in this scheme, the
distance between the source and the destination of each branch is used and not the absolute addresses, it is possible to deploy our proposed work in conjunction with other protection schemes like ASLR, which changes the address at which the binary is loaded in memory on each execution. Moreover, because of a special design in our system, which we will discuss later, the table of valid branch distances (TVBD) computed on a specific OS version for a specific binary is usable on other systems running the same binary on the same OS.

A general overview of our proposed work is depicted in Fig. 1. As shown there, the system is triggered by a sensitive system call made from a protected application (1). At this point, the detection module loads the current LBR contents (2) and computes the distance between the destination and the source of each branch trace record in a specific way, which is discussed later in the Offline Analysis section. It then compares the resulting table of the 16 branches executed just before the sensitive system call with the table derived from the offline analysis (3). If this comparison reveals a violation, the system stops the process immediately; otherwise, the execution of the program continues (4).

Fig. 1. The overview of the proposed system: (1) the system is triggered; (2) the current LBR contents are loaded; (3) the current LBR contents are compared with the offline table; (4) the decision to stop or continue the execution of the protected program is made.

V. ONLINE PHASE

In our implementation of the proposed work, the list of applications which the CFI system should protect is announced to the kernel module through a device file. In kernel space, the installed kernel module processes this list, and the CFI checks are enforced only for the applications mentioned in it.
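As an illustration only, the four steps of Fig. 1 together with the protected-application list can be modelled in a few lines of user-space Python. This is a sketch under stated assumptions and not the kernel module itself: the names `to_offsets` and `allow_syscall`, the dictionary layout of the TVBD, and the choice to refuse a sensitive call for which no table exists are all hypothetical. Branch distances are reduced to their low 12 bits, as explained in the Offline Analysis section.

```python
from typing import Dict, Sequence, Set, Tuple

BranchRecord = Tuple[int, int]   # (source address, destination address)
OffsetTable = Tuple[int, ...]    # in-page offsets, oldest branch first


def to_offsets(records: Sequence[BranchRecord]) -> OffsetTable:
    """Reduce each branch to the low 12 bits of (destination - source)."""
    return tuple((dst - src) & 0xFFF for src, dst in records)


def allow_syscall(program: str,
                  syscall: str,
                  lbr_records: Sequence[BranchRecord],
                  protected: Set[str],
                  tvbd: Dict[Tuple[str, str], Set[OffsetTable]]) -> bool:
    """Return True to let the system call proceed, False to stop the process."""
    if program not in protected:
        # Step 1: only applications on the protected list are checked.
        return True
    # Step 2: take the current LBR contents and turn them into offsets.
    current = to_offsets(lbr_records)
    # Assumption: a protected program with no table for this call is refused.
    valid = tvbd.get((program, syscall), set())
    # Steps 3 and 4: compare with the offline tables, then decide.
    return current in valid
```

Because only the in-page part of each distance is compared, shifting every address by a whole number of pages, as page-aligned randomization does, leaves the decision unchanged.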
In this system, the CFI enforcement module is activated by each invocation of a sensitive system call. It then checks the executed branch instructions and their destination addresses, compares them with the table of valid branch distances (TVBD) received from the off-line phase, and considers any difference between these two tables an attack. The proposed system therefore works in three main phases: first, the system is configured so that any invocation of a sensitive system call triggers the detection module; second, the branch instructions executed so far are analyzed; and third, the decision is made about whether to stop the execution or not.

A. Hooking sensitive system calls

In this phase, which runs just after the installation of the proposed system, we use kernel probes to intercept the sensitive system calls. Using these probes, it is possible to dynamically insert breakpoints inside any desired kernel routine and collect performance or debug information as needed. Before the introduction of kernel probes, in kernel version 2.6 and before, one would need to alter the sys_call_table array inside the kernel to do this job. With the introduction of kernel probes, and with this array now marked as read-only, it is possible to intercept functions and routines without breaking the integrity of the kernel, simply by using kernel probes.

Currently, there are three types of kernel probes available: kprobe, jprobe, and kretprobe. In our implementation of the proposed work, we use jprobes. A jprobe can be set on the entry point of a kernel function, and the arguments of the called function are accessible inside the probe. Using jprobes, it is possible not only to intercept sensitive system calls and perform CFI checks but also to analyze the passed arguments as a means of validating the execution flow. In our proposed work, we use jprobes to intercept sensitive system calls like exec, fork, and so on.
A jprobe can also be set at other points in the system. For example, on systems using the sysenter mechanism, it is possible to set a jprobe at the start of sysenter_do_call and identify the called function by examining the arguments passed to it. In either case, the branch instructions executed up to the invocation of a sensitive system call in the protected application are identified and compared with the TVBD received from the off-line phase.

B. Tracking the branches

In our proposed work, the LBR model specific registers are used to analyze the branch instructions executed in the running process. The LBR registers are 16 pairs of MSRs which are found in Intel processors from the Nehalem microarchitecture onwards. When a branch instruction executes on a processor with LBR activated, the source and the destination address of the branch are stored in one of the 16 LBR register pairs. Since this job is performed by the hardware, it adds no overhead to the system. When more than 16 branch instructions are executed, the old LBR contents are overwritten in ring-buffer order, starting with the oldest record. Hence, there must always be an index pointing to the last filled LBR register. This pointer is called TOS. In this way, it is possible at any time to identify the last executed branch instruction by examining
the LBR TOS register. It is also possible to restrict the LBR facility to record only user-space branches, so that the 16 registers are used more economically. Accordingly, in our implementation of the proposed work, after enabling LBR in the processor and using its filtering facility, the source and the destination addresses of the branches executed in user space are accessible through these registers. After the sensitive system calls have been intercepted, whenever a jprobe is activated, the contents of the LBR registers are read and sorted by time of execution. In this way, the 16 branch instructions executed just before the sensitive system call are analyzed and compared with the TVBD received from the off-line phase.

C. Enforcing the CFI

The 16 records of saved branch instructions received from the off-line phase are compared with the 16 records of branch instructions executed in the current process. If any mismatch is found between these two tables, the execution of the current process is interrupted; otherwise, if the two tables are the same, the execution proceeds.

In our implementation of the proposed work, we use signals to stop the running process. If any violation of the CFG is detected, a kill signal is sent to the running process immediately, forcing the protected application to stop. Although more appropriate actions could be taken when an attack is identified, we chose signals for the sake of simplicity. When the behavior is normal and the executed branch instructions conform to the TVBD, the normal execution flow continues by calling jprobe_return. Note that this CFI system identifies an attack using a table of the 16 latest branch instructions executed in the running process just before the sensitive system call.
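The time ordering of the LBR records described in Section V-B can be sketched as follows. The sketch assumes that all 16 entries have already been filled and that TOS holds the index of the most recently written entry; `order_by_age` and `LBR_DEPTH` are illustrative names, and reading the actual MSRs is outside its scope.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")
LBR_DEPTH = 16  # number of LBR register pairs described in the text


def order_by_age(lbr: Sequence[T], tos: int) -> List[T]:
    """Return the LBR entries from oldest to newest.

    `tos` is the index of the most recently written entry, so the entry
    right after it (wrapping around) is the oldest one still in the buffer.
    """
    if len(lbr) != LBR_DEPTH or not 0 <= tos < LBR_DEPTH:
        raise ValueError("expected 16 entries and a TOS index in 0..15")
    start = (tos + 1) % LBR_DEPTH
    return list(lbr[start:]) + list(lbr[:start])
```

For instance, if 20 branches have been recorded into the ring, TOS ends at index 3 and the ordered result holds branches 4 through 19, that is, the 16 most recent ones.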
Whereas our system checks a table of 16 branches, existing CFI protection mechanisms handle each branch instruction separately, checking the destination of that particular branch against a list of valid destinations. Using a table of 16 branch instructions instead of just one branch instruction at a time improves the security of our CFI system and stops many known attack techniques, as we discuss in the evaluation section of this paper.

VI. OFFLINE ANALYSIS

In this phase, generating the Control Flow Graph is the main operation. Many techniques for computing the CFG have been introduced in existing CFI systems, and any of them can be used here. Some of the existing CFI systems use static analysis of binary files to generate the CFG, while others use dynamic analysis and emulated runs.

The most challenging task in generating the CFG, in most CFI systems, is handling the indirect branches. In these systems, each indirect branch instruction is handled separately and the possible destinations for that specific branch are identified. Any failure to identify these valid destinations correctly, or an overly wide range of possible destinations, makes these systems vulnerable to advanced attacks like ROP. The reason is that, as the number of valid destinations for a specific branch instruction grows, or when an invalid address is identified as a valid destination, a malicious branch belonging to a ROP gadget may be accepted as a valid branch, and the attacker's chance of success increases accordingly.

To avoid that, in our proposed work a branch instruction is not handled separately. This means that even if an indirect branch instruction in the current execution of the protected binary is executed exactly according to the CFG, it is not yet considered a valid branch.
In our proposed work, a branch is valid only if it conforms to the CFG, if no branch in the set of 16 branches it belongs to violates the CFG, and if the order of the branch instructions executed in the protected binary is exactly the same as in the TVBD. Moreover, instead of using fixed destination addresses to identify the branch instructions, we use the distance between the source and the destination of a branch as a characteristic of that branch instruction. Therefore, for a branch instruction to be considered valid, three conditions must be met:

1) The distance between the source and the destination of the branch should conform to the CFG.
2) In the set of 16 branches which the current branch is part of, all 16 branches should conform to the CFG.
3) The sequence of these 16 executed branch instructions should be exactly the same as the sequence of the entries in the TVBD.

In our implementation of the proposed work, all three conditions are checked in the on-line phase by comparing the table of 16 valid branch distances received from the off-line phase with the 16 branch distances executed in the running process.

Although any desired method can be used to generate the CFG, in our implementation we use emulation. In this method, we run the binary in an isolated system for a while and fill the TVBD for this binary under different execution scenarios. As a result, a table of 16 valid distances is available for each sensitive system call executed in the program code. This table is available for each protected binary separately. Note that the tables of valid distances produced for a specific application on a particular operating system are usable on other systems running the same application on the same version of the operating system.
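A minimal sketch of how emulated runs could be folded into a TVBD is given below. It assumes that each training run yields pairs of a sensitive system call name and the 16 in-page offsets recorded just before it; the function name `build_tvbd` and the dictionary-of-sets layout are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set, Tuple

OffsetTable = Tuple[int, ...]  # 16 in-page offsets, oldest branch first


def build_tvbd(observations: Iterable[Tuple[str, Sequence[int]]]
               ) -> Dict[str, Set[OffsetTable]]:
    """Collect the distinct 16-offset sequences seen before each system call."""
    tvbd: Dict[str, Set[OffsetTable]] = defaultdict(set)
    for syscall, offsets in observations:
        if len(offsets) != 16:
            raise ValueError("each observation must hold exactly 16 offsets")
        # Repeated observations of the same sequence collapse into one entry.
        tvbd[syscall].add(tuple(offsets))
    return dict(tvbd)
```

Because whole ordered tuples are stored, a single membership test such as `tuple(current) in tvbd[syscall]` covers all three conditions listed above: every distance, all 16 entries, and their order must match.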
This portability follows from the way we construct the table of valid branch distances, which relies on the PAGE_SHIFT concept.

A. PAGE_SHIFT

In the Linux kernel paging operation, each address is made up of two parts. The most significant part is a pointer to the whole page, and the least significant part is an offset inside the page. In x86 systems, for example, where the page size is set to 4 kilobytes (4096 bytes), to be able to address
all entries of a page, 12 bits are needed. Therefore, in these systems, the 12 least significant bits of each address in the paging operation hold the offset inside the page. Accordingly, by shifting an address right by 12 bits, the part of the address which indexes inside the page is discarded, and the remaining most significant part is the address of the page itself. This number of shifted bits, which is 12 in x86 systems, is known as PAGE_SHIFT in the Linux kernel. At the time of writing this article, the size of each page, which is named PAGE_SIZE in the kernel, is calculated from the PAGE_SHIFT value. The PAGE_ALIGN macro inside the kernel uses this calculated size. The page sizes in the Linux kernel for different architectures are listed in Table I.

TABLE I
THE PAGE SIZE FOR THE DIFFERENT ARCHITECTURES IN THE LINUX KERNEL

Architecture  PAGE_SIZE                  PAGE_SHIFT
X86           4096                       12
Alpha         8192                       13
ARM           4096                       12
AVR32         4096                       12
IA64          4096, 8192, 16384, 65536   12, 13, 14, 16
M68k          4096, 8192                 12, 13
Sparc         4096                       12

In ASLR, the randomly generated address is aligned to the page size, so randomization applies to the address of each page and not to positions inside the page. In Linux systems, this alignment is done after the randomization step (i.e., after get_random_int) through the PAGE_ALIGN macro inside the kernel, and therefore the generated random value is aligned to a page boundary.

B. Calculation of the TVBD

In our implementation of the proposed work on a Linux system, considering the above, we do not use the address of the page itself to produce the valid offsets; instead, we use the offset inside the page to calculate the distance between the source and the destination of a branch. In other words, the valid distance recorded for each branch instruction is confined to the range of one page (4096 values), and the exact value is recorded during the off-line phase.
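The arithmetic behind PAGE_SHIFT can be checked with a short Python sketch. The constants mirror the x86 row of Table I; the helper names `split_address` and `in_page_distance` are illustrative, and the sample addresses are one branch from Table III (entry 6) as recorded on system A and on system B.

```python
PAGE_SHIFT = 12                  # x86 value from Table I
PAGE_SIZE = 1 << PAGE_SHIFT      # 4096 bytes
OFFSET_MASK = PAGE_SIZE - 1      # 0xFFF, the low 12 bits


def split_address(addr: int):
    """Split an address into (page number, offset inside the page)."""
    return addr >> PAGE_SHIFT, addr & OFFSET_MASK


def in_page_distance(src: int, dst: int) -> int:
    """Low PAGE_SHIFT bits of the branch distance dst - src."""
    return (dst - src) & OFFSET_MASK


# Entry 6 of Table III: the same branch on system A and on system B.
offset_a = in_page_distance(0xC1652989, 0xB76313A0)
offset_b = in_page_distance(0xC1652989, 0xB76CE3A0)
print(hex(offset_a), hex(offset_b))  # prints 0xa17 0xa17
```

The two destination addresses lie in different pages (0xb7631 and 0xb76ce) but share the in-page offset 0x3a0, which is why the full distances differ while their low 12 bits agree.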
Concretely, after extracting the addresses from the LBR registers, we subtract the source address from the destination address and use the three least significant hexadecimal digits of the result as the valid offset, which may range from 1 to FFF. After the off-line phase, we thus have sets of 16 offsets for each sensitive system call executed in the protected application. Using this method, it is possible to deploy the proposed work alongside other protection schemes like ASLR, and also to calculate the TVBD once on an operating system and use that TVBD for the same application running on different machines with the same operating system. That is because the randomization in the current system is done per memory page and not inside the pages; hence the offsets inside the pages remain the same, and the calculated TVBD always stays the same.

Fig. 2. Replacement of the pages does not affect the offset inside the page, in two different runs.

It is possible to construct a central system that collects the TVBDs calculated on other systems for different applications on different operating system versions and stores them in a database. This central system can then give the proper table to other systems on demand, according to the OS-application combination of that system. Each system could therefore have an up-to-date database of TVBDs for the applications it is protecting, without needing to calculate these offset tables itself. In other words, the operation of calculating the CFG is performed once on one system and the resulting tables are used on other systems as well.

To demonstrate this, we executed an altered version of the BET program, introduced in [3], under two different conditions. First, we executed BET 110 times on a single system and observed the distance between the source and the destination addresses of the executed branch instructions in every execution.
Because the location of the loaded memory pages differs on almost every execution (mostly because of ASLR), we obtained 86 different tables of distances. Afterwards, we extracted the 3 least significant hexadecimal digits of each recorded distance, which are the offset inside the page, and observed that these 3 digits are always the same. The resulting distances of 10 of the 110 executions are listed in Table II. As the table shows, the distance between the source and the destination of the branch instructions varies across executions, but the last 3 digits are always the same; therefore, they can be used as a measure to form the CFG for each program.

Second, we executed the BET program 10 times on two separate ASLR-enabled machines and recorded the distance between the source and the destination of each of the 16 branch instructions before a specific sensitive system call (fork in this example). The resulting addresses and offsets are listed in Table III. As listed in Table III, the offsets inside the pages in two different executions of the BET program on two separate
machines are identical, and hence result in the same TVBD on both machines. Therefore, we can calculate the TVBD for each binary on a specific operating system once on a base system and use it on the other machines running the same combination of application and operating system. The two systems used in this evaluation are A: Lenovo ThinkPad T420, and B: HP Pavilion g6.

TABLE II
THE RESULTING DISTANCE TABLE OF 10 DIFFERENT RUNS ON A SINGLE MACHINE

Index  Resulting distances
1      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fc7319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f835e9 0x0012416b 0x00006335 0x00004962 0x00000795
2      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fc8319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f845e9 0x0012416b 0x00006335 0x00004962 0x00000795
3      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fca319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f865e9 0x0012416b 0x00006335 0x00004962 0x00000795
4      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fcf319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f8b5e9 0x0012416b 0x00006335 0x00004962 0x00000795
5      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fd2319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f8e5e9 0x0012416b 0x00006335 0x00004962 0x00000795
6      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fd6319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f925e9 0x0012416b 0x00006335 0x00004962 0x00000795
7      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fd7319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f935e9 0x0012416b 0x00006335 0x00004962 0x00000795
8      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fd8319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f945e9 0x0012416b 0x00006335 0x00004962 0x00000795
9      0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fd9319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f955e9 0x0012416b 0x00006335 0x00004962 0x00000795
10     0x0010efcf 0x00043d54 0x00000003 0x000b4df8 0x000b4dfa 0x09fe1319 0x00043d97 0x00000228
       0x0000026c 0x000710c3 0x000710c5 0x09f9d5e9 0x0012416b 0x00006335 0x00004962 0x00000795

Offsets inside pages (always the same):
       0xfcf 0xd54 0x003 0xdf8 0xdfa 0x319 0xd97 0x228
       0x26c 0x0c3 0x0c5 0x5e9 0x16b 0x335 0x962 0x795

TABLE III
THE RESULTING OFFSETS OF THE LAST 16 BRANCH INSTRUCTIONS ON SYSTEM A (UPPER RECORD) VS SYSTEM B (LOWER RECORD)

Index  System  Source address  Destination address  Dst-Src distance  Offset in page
1      A       0xb7631445      0xb7740414           0x0010efcf        0xfcf
       B       0xb76ce445      0xb77dd414           0x0010efcf
2      A       0xb774a015      0xb774a7aa           0x00000795        0x795
       B       0xb77e7015      0xb77e77aa           0x00000795
3      A       0xb774a81f      0xb774f181           0x00004962        0x962
       B       0xb77e781f      0xb77ec181           0x00004962
4      A       0xb774f1cb      0xb7755500           0x00006335        0x335
       B       0xb77ec1cb      0xb77f2500           0x00006335
5      A       0xb775550b      0xb76313a0           0xffedbe95        0xe95
       B       0xb77f250b      0xb76ce3a0           0xffedbe95
6      A       0xc1652989      0xb76313a0           0xf5fdea17        0xa17
       B       0xc1652989      0xb76ce3a0           0xf607ba17
7      A       0xb76313a6      0xb76a246b           0x000710c5        0x0c5
       B       0xb76ce3a6      0xb773f46b           0x000710c5
8      A       0xb76a246e      0xb76313ab           0xfff8ef3d        0xf3d
       B       0xb773f46e      0xb76ce3ab           0xfff8ef3d
9      A       0xb76313bc      0xb7631628           0x0000026c        0x26c
       B       0xb76ce3bc      0xb76ce628           0x0000026c
10     A       0xb763162f      0xb7631407           0xfffffdd8        0xdd8
       B       0xb76ce62f      0xb76ce407           0xfffffdd8
11     A       0xb7631407      0xb75ed670           0xfffbc269        0x269
       B       0xb76ce407      0xb768a670           0xfffbc269
12     A       0xc1652989      0xb75ed670           0xf5f9ace7        0xce7
       B       0xc1652989      0xb768a670           0xf6037ce7
13     A       0xb75ed671      0xb76a246b           0x000b4dfa        0xdfa
       B       0xb768a671      0xb773f46b           0x000b4dfa
14     A       0xb76a246e      0xb75ed676           0xfff4b208        0x208
       B       0xb773f46e      0xb768a676           0xfff4b208
15     A       0xb75ed69a      0xb75ed69d           0x00000003        0x003
       B       0xb768a69a      0xb768a69d           0x00000003
16     A       0xb75ed6b8      0xb763140c           0x00043d54        0xd54
       B       0xb768a6b8      0xb76ce40c           0x00043d54

C.
Static analysis and various execution states

Though we used emulated runs to construct the table of valid branch distances, it is also possible to derive the CFG by static analysis of the binaries. In this case, considering that we use sets of 16 branches at once, one may ask how the conditional and the indirect branches can be handled in static analysis. To answer this question, note that if static analysis handles each instruction separately, then the only way to identify a threat is to build a table of valid destinations for each individual branch. In other words, constructing the table of valid distances is the only solution. When an indirect branch is analyzed among the other branches in a set, however, the position of the executed branch within this set can be used as an extra characteristic of that branch.
This means that if, in the set of 16 valid branch instructions just before a sensitive system call received from the off-line phase, an indirect branch is located at the eleventh entry, then in the actual execution of this application this particular indirect branch instruction must be located at exactly the same position among the 16 branches before the same sensitive system call. This kind of analysis can prevent attacks like ROP, because in these attacks the attacker constructs a gadget chain and, to bypass the detection mechanisms, uses indirect branches which can take multiple destinations. Using these branch instructions to construct the gadget chain, he/she is able to direct the execution flow wherever he/she wants. When branch instructions are considered and checked separately, the only way to detect the threat is to check the destination; but when an indirect branch instruction is checked among the other 15 branch instructions, not only the destination of that branch but also its position in the set can be used to identify the threat.

Another question which may come to mind is that, from a high-level perspective, an application may have various execution flow graphs, so how is it possible to construct the set of 16 branches before a sensitive call? To answer this question, one must note that "various execution flow graphs" is a high-level term. At the low level, the only way an application can have various execution flows is through conditional branch instructions. Each conditional branch instruction can have two valid destinations, one of which is chosen at execution time according to the conditions.
Furthermore, in our proposed work we do not handle all of the branch instructions executed in an application; we only consider the 16 branch instructions before a sensitive system call. Hence, in the worst case, there would be 16 conditional branch instructions to analyze (i.e., 2^16 different states will be analyzed). However, a situation in which all 16 branch instructions before a sensitive system call are conditional is highly unlikely. An analysis of the Apache web server indicates that, on average, there are only 6 conditional branches in the set of 16 branches before sensitive system calls. That means 2^6 = 64 different valid states of the TVBD for each system call, which is completely feasible to check at run-time. Therefore, static binary analysis can also be used to construct the TVBD and derive the CFG in the off-line phase.

VII. EVALUATION

To evaluate our proposed work, we implemented this system as a kernel module on an Ubuntu 14.0.4 LTS box with an Intel Core i7 CPU, and analyzed its security against different types of attacks as well as the performance overhead it incurs.

TABLE IV
THE EVALUATION OF THE SYSTEM'S SECURITY IN DETECTING VARIOUS TYPES OF ATTACKS

Attack type          DEP status  ASLR status  Result
shellcode injection  OFF         OFF          Prevented
return into libc     ON          OFF          Prevented
ROP                  ON          ON           Prevented
ROP                  OFF         OFF          Prevented

TABLE V
THE PROPOSED SYSTEM'S EFFECTIVENESS AGAINST REAL WORLD EXPLOITS

Application  EDB/CVE id     Result
unrar        EDB-ID 17611   Prevented
nginx        CVE-2013-2028  Prevented

A. Security

To evaluate the security of this system, we first used a modified version of the BET program and exploited it using shellcode injection, return-into-libc, and ROP techniques. To check the functionality of our proposed work, we repeated these attacks in the presence of the ASLR and DEP protections; the results are listed in Table IV.
In the case of the ROP attack, we also used knowledge of the process address space to bypass ASLR, and in this case too the proposed system prevented the attack successfully. Then, to check the reliability of our proposed CFI system against real-world attacks, we used two publicly available exploits against the nginx and unrar applications, both of which were detected and prevented by the system, as listed in Table V.

B. Performance

To evaluate the performance overhead incurred by the proposed system, we measured the execution time of the BET program in the presence and absence of the protection system, using valgrind. The results show that in a worst-case scenario, in which the matching entry in the TVBD is the last one and the table itself contains 128 different sets of 16 valid branches, the incurred overhead is negligible (less than 1%). We also examined the performance of our system in a real-world scenario. To do so, we analyzed the performance of the nginx web server in terms of the mean response time per request under different numbers of connection requests per second, using the apachebench tool. First, we ran apachebench against the web server on a bare system, and then we repeated the test in the presence of our proposed work. The result, depicted in Fig. 3, shows that the performance in both cases is almost identical.

VIII. RELATED WORKS

Much research has been done to date to propose an efficient CFI system, and the results of some of these
efforts are known to be practical to some extent; however, each solution carries some limitations. Some of these solutions require special hardware or binary instrumentation, while others need access to the program's source code in order to protect it, and many of them are not able to operate beside other protection mechanisms like ASLR.

Fig. 3. Performance overhead of the system in nginx web server (horizontal axis: number of requests per second; vertical axis: mean response time per request in ms; series: without and with the CFI system).

RAGuard [38], SOFIA [10], HAFIX [12], CONVERSE [21] and the system proposed in [30] are protection mechanisms which need special hardware support, like a customized CPU instruction set, to operate. Intel has recently presented a hardware facility, named CET, to provide CFI at the hardware level, but it has not been implemented in its processors yet.

Some other protection systems like CCFIR [37], S-D CFI [25], Lockdown [29], O-CFI [27], and the work proposed in [40] use binary rewriting, instrumentation and internal hooks to enforce the CFI policy. These mechanisms perform extra checks before indirect branches and validate the execution path of the protected application. To do that, they usually inject instructions into the binary of the application. The instrumentation technique has been used in the kernel itself as well [9].

Another method used to enforce CFI is to change the compiler or the protected application's source code and compile it again. In this method, the compiler or the program's source code is modified so that more security checks are performed during the run-time of the program. Obviously, systems based on this approach need access to the program's source code and recompilation as well. CCFI [26] and the work proposed in [32] are of this type.
The above-mentioned approaches and other protection systems like [35], [33], [31], [11], [18], and [24] are some of the endeavours to propose an effective, practical CFI system.

Another classification of CFI systems is to categorize them into fine-grained and coarse-grained systems. While most of the systems presented so far fall into one of these classes, our proposed work takes a third approach, which we call the semi-holistic approach, in which the indirect branches are not analyzed individually; instead, the system validates them in sets of 16 branch instructions at a time. Doing so increases the accuracy of the system, while the use of jprobes in the Linux kernel decreases the performance overhead and allows monitoring the general behavior of programs.

Based on the above, a comparison of various CFI systems and our proposed work is presented in Table VI. As shown in the table, our proposed work, like CFIMon and PT-CFI, has no special requirements to operate; however, there are some differences between our proposed work and those two systems.

TABLE VI
COMPARISON OF DIFFERENT CFI SYSTEMS, BASED ON THEIR REQUIREMENTS

CFI system     Release date  Specific hardware  Recompilation  Binary alteration
CFIMon         2012          -                  -              -
CCFIR          2013          -                  -              *
CONVERSE       2014          *                  -              -
Tiec et al     2014          -                  *              -
S-D CFI        2014          -                  -              *
LockDown       2015          -                  -              *
OCFI           2015          -                  -              *
CCFI           2015          -                  *              -
HAFIX          2015          *                  -              -
HCFI           2016          *                  -              -
RAGuard        2017          *                  -              -
PT-CFI         2017          -                  -              -
Proposed work  2017          -                  -              -

In CFIMon, the BTS registers are used to track the branch instructions and detect violations of the valid execution path. The reported overhead is 6%. This CFI system works in two phases. In the offline phase, a set of valid destinations for each branch instruction is collected using static analysis of binaries, and in the online phase, using the BTS registers, the executed branches are monitored and checked against specific rules.
In this system, in the offline phase, the call set contains the addresses of all instructions at the beginning of functions, and the ret set contains the addresses of the instructions just after call instructions in the program. The valid destinations for indirect branch instructions are stored in the train set, using a learning mechanism. At runtime, this CFI system checks every branch instruction against these sets. Because this system uses absolute addresses for branch destinations, it cannot be used alongside randomization mechanisms like ASLR. Processing all branch instructions in a program, on the other hand, incurs heavy overhead on the system, and because in this CFI mechanism
each branch instruction is validated separately, an attacker may exploit the ret set by using call-preceded gadgets in a ROP attack and bypass the system.

In the PT-CFI system, a newly introduced Intel processor facility, named PT (Processor Trace), is used to check the control flow integrity. Using PT, as mentioned in [20], could cause up to hundreds of megabytes of information per second to be generated for each processing core. Moreover, because processing all of the PT information packets is slow, the PT-CFI system only uses TIP (Target IP) packets to detect violations of the valid execution flow; when a violation is detected, further analysis of the PT information takes place, which its authors call deep inspection. Although PT-CFI is similar to our proposed work in general, the techniques used there are different. On the other hand, the ability to use PT-CFI alongside other protection mechanisms has not been studied yet.

IX. CONCLUSION

Although various CFI systems have been presented to date, these systems validate executed branch instructions separately. This approach lacks a holistic view of the program’s execution flow. Besides, a practical CFI system which acts accurately without the need for the program’s source code, special hardware peripherals, binary alteration, or compiler modification, and with the ability to operate alongside other protection mechanisms, is still needed.

In this paper, we proposed a new CFI system which addresses the above-mentioned characteristics. This CFI system is able to operate alongside other protection schemes like DEP and ASLR. In our proposed work, the computation of the CFG is done once on a system, and the resulting policy is usable on other systems running the same OS/APP as well, using the PAGE_SHIFT concept in the Linux kernel. We also used the LBR (Last Branch Record) model-specific registers to compute the distance between the source and the destination of the executed branch instructions.
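The position-independent check summarized above can be illustrated with a minimal Python sketch. It is not the kernel module itself: the Branch record, the signature format (source-to-destination distance plus the page offset of the source), and the policy set are hypothetical names introduced only for illustration, and 4 KiB pages (PAGE_SHIFT = 12) are assumed. The sketch relies on the fact that ASLR relocates mappings in whole pages, so both quantities stay the same as long as the source and the destination lie in the same randomized mapping.

```python
from typing import NamedTuple, Sequence, Set, Tuple

PAGE_SHIFT = 12                      # assumption: 4 KiB pages, as on x86 Linux
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # selects the offset inside a page
WINDOW = 16                          # branches validated together, as in the text


class Branch(NamedTuple):
    """One recorded branch (hypothetical stand-in for an LBR from/to pair)."""
    src: int   # address of the branch instruction
    dst: int   # address that control was transferred to


def signature(branch: Branch) -> Tuple[int, int]:
    """Describe a branch without absolute addresses (hypothetical format).

    The distance dst - src and the page offset of src do not change when
    the whole mapping is moved by a page-aligned amount.
    """
    return (branch.dst - branch.src, branch.src & PAGE_MASK)


def window_is_valid(window: Sequence[Branch],
                    policy: Set[Tuple[int, int]]) -> bool:
    """Accept a set of WINDOW branches only if every signature is in the policy."""
    if len(window) != WINDOW:
        raise ValueError("expected exactly %d branches" % WINDOW)
    return all(signature(b) in policy for b in window)
```

In such a sketch, the policy would be built once from signatures collected on one machine and then reused on another machine where the same binary is loaded at a different page-aligned base. A window containing a branch whose distance is not in the policy would be rejected as a whole.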
To get closer to a holistic view of the program’s execution flow, at each act of analysis we study the characteristics of the executed branch instructions, as the only means of directing the execution flow, in sets of 16 branches just before each sensitive system call. Doing so, we are able to detect any violation of the CFG in the program’s execution flow. Using the kprobes facility of the Linux kernel, we are able to enforce the CFG with a negligible performance overhead.

To evaluate our proposed work, we implemented this system as a single LKM (Linux Kernel Module), which interacts with user space to obtain the CFG, on an Ubuntu 14.04 LTS box. The results of our evaluation show that the proposed work is able to detect various types of attacks with a low performance overhead, alongside other protection systems like ASLR and DEP.

REFERENCES

[1] Microgadgets: Size does matter in Turing-complete return-oriented programming. Presented as part of the 6th USENIX Workshop on Offensive Technologies, Bellevue, WA, 2012. USENIX.
[2] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. Control-flow integrity principles, implementations, and applications. ACM Trans. Inf. Syst. Secur., 13(1):4:1–4:40, November 2009.
[3] Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and Thomas R. Gross. Control-flow bending: On the effectiveness of control-flow integrity. In 24th USENIX Security Symposium (USENIX Security 15), pages 161–176, Washington, D.C., 2015. USENIX Association.
[4] Nicholas Carlini and David Wagner. ROP is still dangerous: Breaking modern defenses. In 23rd USENIX Security Symposium (USENIX Security 14), pages 385–399, San Diego, CA, 2014. USENIX Association.
[5] Ping Chen, Hai Xiao, Xiaobin Shen, Xinchun Yin, Bing Mao, and Li Xie. DROP: Detecting Return-Oriented Programming Malicious Code, pages 163–177. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.
[6] Yueqiang Cheng, Zongwei Zhou, Miao Yu, Xuhua Ding, and Robert H. Deng. ROPecker: A generic and practical approach for defending against ROP attacks. In NDSS. The Internet Society, 2014.
[7] Nick Christoulakis, George Christou, Elias Athanasopoulos, and Sotiris Ioannidis. HCFI: Hardware-enforced control-flow integrity. In Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, CODASPY ’16, pages 38–49, New York, NY, USA, 2016. ACM.
[8] Mauro Conti, Stephen Crane, Lucas Davi, Michael Franz, Per Larsen, Marco Negro, Christopher Liebchen, Mohaned Qunaibit, and Ahmad-Reza Sadeghi. Losing control: On the effectiveness of control-flow integrity under stack attacks. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pages 952–963, New York, NY, USA, 2015. ACM.
[9] John Criswell, Nathan Dautenhahn, and Vikram Adve. KCoFI: Complete control-flow integrity for commodity operating system kernels. In Proceedings of the 2014 IEEE Symposium on Security and Privacy, SP ’14, pages 292–307, Washington, DC, USA, 2014. IEEE Computer Society.
[10] R. d. Clercq, R. D. Keulenaer, B. Coppens, B. Yang, P. Maene, K. d. Bosschere, B. Preneel, B. d. Sutter, and I. Verbauwhede. SOFIA: Software and control flow integrity architecture. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 1172–1177, March 2016.
[11] Sanjeev Das, Wei Zhang, and Yang Liu. A fine-grained control flow integrity approach against runtime memory attacks for embedded systems. IEEE Trans. Very Large Scale Integr. Syst., 24(11):3193–3207, November 2016.
[12] L. Davi, M. Hanreich, D. Paul, A. R. Sadeghi, P. Koeberl, D. Sullivan, O. Arias, and Y. Jin. HAFIX: Hardware-assisted flow integrity extension. In 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), pages 1–6, June 2015.
[13] Lucas Davi, Christopher Liebchen, Ahmad-Reza Sadeghi, Kevin Z. Snow, and Fabian Monrose. Isomeron: Code randomization resilient to (just-in-time) return-oriented programming. In 22nd Annual Network and Distributed System Security Symposium, NDSS 2015, San Diego, California, USA, February 8-11, 2015.
[14] Lucas Davi, Ahmad-Reza Sadeghi, Daniel Lehmann, and Fabian Monrose. Stitching the gadgets: On the ineffectiveness of coarse-grained control-flow integrity protection. In 23rd USENIX Security Symposium (USENIX Security 14), pages 401–416, San Diego, CA, 2014. USENIX Association.
[15] Lucas Davi, Ahmad-Reza Sadeghi, and Marcel Winandy. ROPdefender: A detection tool to defend against return-oriented programming attacks. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, ASIACCS ’11, pages 40–51, New York, NY, USA, 2011. ACM.
[16] Lucas Vincenzo Davi, Alexandra Dmitrienko, Stefan Nürnberger, and Ahmad-Reza Sadeghi. Gadge me if you can: Secure and efficient ad-hoc instruction-level randomization for x86 and ARM. In Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security, ASIA CCS ’13, pages 299–310, New York, NY, USA, 2013. ACM.
[17] Isaac Evans, Fan Long, Ulziibayar Otgonbaatar, Howard Shrobe, Martin Rinard, Hamed Okhravi, and Stelios Sidiroglou-Douskos. Control jujutsu: On the weaknesses of fine-grained control flow integrity. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pages 901–913, New York, NY, USA, 2015. ACM.
[18] X. Ge, N. Talele, M. Payer, and T. Jaeger. Fine-grained control-flow integrity for kernel software. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pages 179–194, March 2016.
[19] Enes Göktas, Elias Athanasopoulos, Herbert Bos, and Georgios Portokalidis. Out of control: Overcoming control-flow integrity. In Proceedings of the 2014 IEEE Symposium on Security and Privacy, SP ’14, pages 575–589, Washington, DC, USA, 2014. IEEE Computer Society.
[20] Yufei Gu, Qingchuan Zhao, Yinqian Zhang, and Zhiqiang Lin. PT-CFI: Transparent backward-edge control flow violation detection using Intel Processor Trace. In Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, CODASPY ’17, pages 173–184, New York, NY, USA, 2017. ACM.
[21] Z. Guo, R. Bhakta, and I. G. Harris. Control-flow checking for intrusion detection via a real-time debug interface. In 2014 International Conference on Smart Computing Workshops, pages 87–92, Nov 2014.
[22] Intel. Control-flow Enforcement Technology Preview. Intel Corporation, 2016.
[23] Elias Levy. Smashing the stack for fun and profit. Phrack Magazine, 49, 1996.
[24] Yan Lin, Xiaoxiao Tang, Debin Gao, and Jianming Fu. Control flow integrity enforcement with dynamic code optimization. In Matt Bishop and Anderson C A Nascimento, editors, Information Security: 19th International Conference, ISC 2016, Honolulu, HI, USA, September 3-6, 2016. Proceedings, pages 366–385, Cham, 2016. Springer International Publishing.
[25] X. Liu, Q. Wei, and Z. Ye. Static-dynamic control flow integrity. In 2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pages 189–196, Nov 2014.
[26] Ali Jose Mashtizadeh, Andrea Bittau, Dan Boneh, and David Mazières. CCFI: Cryptographically enforced control flow integrity. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pages 941–951, New York, NY, USA, 2015. ACM.
[27] Vishwath Mohan, Per Larsen, Stefan Brunthaler, Kevin W. Hamlen, and Michael Franz. Opaque control-flow integrity. In NDSS, 2015.
[28] Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis. Transparent ROP exploit mitigation using indirect branch tracing. Presented as part of the 22nd USENIX Security Symposium (USENIX Security 13), pages 447–462, Washington, D.C., 2013. USENIX.
[29] Mathias Payer, Antonio Barresi, and Thomas R. Gross. Fine-grained control-flow integrity through binary hardening. In Magnus Almgren, Vincenzo Gulisano, and Federico Maggi, editors, Detection of Intrusions and Malware, and Vulnerability Assessment: 12th International Conference, DIMVA 2015, Milan, Italy, July 9-10, 2015, Proceedings, Cham, 2015. Springer International Publishing.
[30] Dean Sullivan, Orlando Arias, Lucas Davi, Per Larsen, Ahmad-Reza Sadeghi, and Yier Jin. Strategy without tactics: Policy-agnostic hardware-enhanced control-flow integrity. In Proceedings of the 53rd Annual Design Automation Conference, DAC ’16, pages 163:1–163:6, New York, NY, USA, 2016. ACM.
[31] Jiaqi Tan, Hui Jun Tay, Utsav Drolia, Rajeev Gandhi, and Priya Narasimhan. PCFIRE: Towards provable preventative control-flow integrity enforcement for realistic embedded software. In Proceedings of the 13th International Conference on Embedded Software, EMSOFT ’16, pages 19:1–19:10, New York, NY, USA, 2016. ACM.
[32] Caroline Tice, Tom Roeder, Peter Collingbourne, Stephen Checkoway, Úlfar Erlingsson, Luis Lozano, and Geoff Pike. Enforcing forward-edge control-flow integrity in GCC & LLVM. In Proceedings of the 23rd USENIX Conference on Security Symposium, SEC’14, pages 941–955, Berkeley, CA, USA, 2014. USENIX Association.
[33] X. Wang and R. Karri. Reusing hardware performance counters to detect and identify kernel control-flow modifying rootkits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(3):485–498, March 2016.
[34] Xueyang Wang and Jerry Backer. SIGDROP: Signature-based ROP detection using hardware performance counters. CoRR, abs/1609.02667, 2016.
[35] Xueyang Wang and Ramesh Karri. NumChecker: Detecting kernel control-flow modifying rootkits by using hardware performance counters. In Proceedings of the 50th Annual Design Automation Conference, DAC ’13, pages 79:1–79:7, New York, NY, USA, 2013. ACM.
[36] Yubin Xia, Yutao Liu, Haibo Chen, and Binyu Zang. CFIMon: Detecting violation of control flow integrity using performance counters. In Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), DSN ’12, pages 1–12, Washington, DC, USA, 2012. IEEE Computer Society.
[37] Chao Zhang, Tao Wei, Zhaofeng Chen, Lei Duan, Laszlo Szekeres, Stephen McCamant, Dawn Song, and Wei Zou. Practical control flow integrity and randomization for binary executables. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, SP ’13, pages 559–573, Washington, DC, USA, 2013. IEEE Computer Society.
[38] J. Zhang, R. Hou, J. Fan, K. Liu, K. Zhang, and S. A. McKee. RAGuard: A hardware based mechanism for backward-edge control-flow integrity. In ACM International Conference on Computing Frontiers 2017, Siena, Italy, 2017. ACM.
[39] Hovav Shacham, Matthew Page, Ben Pfaff, Eu-Jin Goh, Nagendra Modadugu, and Dan Boneh. On the effectiveness of address-space randomization. In Proceedings of the 11th ACM Conference on Computer and Communications Security, pages 298–307, New York, NY, USA, 2004. ACM.
[40] Mingwei Zhang and R. Sekar. Control flow integrity for COTS binaries. Presented as part of the 22nd USENIX Security Symposium (USENIX Security 13), pages 337–352, Washington, D.C., 2013. USENIX.
[41] S. Andersen and V. Abella. Changes to functionality in Microsoft Windows XP Service Pack 2, part 3: Memory protection technologies, Data Execution Prevention. In Microsoft TechNet Library, September 2004. http://technet.microsoft.com/en-us/library/bb457155.aspx.
[42] PaX Team. "Address Space Layout Randomization (ASLR)", 2003. https://pax.grsecurity.net/docs/aslr.txt.
[43] RedHat. "Position Independent Executables (PIE)". In Redhat Customer Portal, November 2012. https://access.redhat.com/blogs/766093/posts/1975793.
[44] Alexander Sotirov. "Heap Feng Shui in JavaScript". In BlackHat Europe, 2007. https://www.blackhat.com/presentations/bh-europe-07/Sotirov/Presentation/bh-eu-07-sotirov-apr19.pdf.
[45] Tilo Muler, Computer Science. "ASLR Smack and Laugh Reference". In Seminar on Advanced Exploitation Techniques, RWTH Aachen, Germany, February 2008. https://pdfs.semanticscholar.org/440e/61ecb744e55d0425cdb648fe24e4ff999686.pdf.