Handover Parameters Self-optimization by Q-Learning in 4G Networks
Realized by: Mohamed Raafat OMRI
Supervised by: PhD. Maissa BOUJELBEN
July 12, 2016
Handover Parameters Self-optimization by Q-Learning in 4G networks ESPRIT Tech.
Dedication
I dedicate my dissertation work to my family and my friends. A special feeling of
gratitude to my loving parents, Salah and Sghaira Omri whose words of
encouragement and push for tenacity ring in my ears.
My sisters Kaouther, Lamia, Soumaya, Leila
and my brother Lotfi have never left
my side and are very special.
I dedicate this dissertation to my friends
who have supported me throughout the process.
I will always appreciate all they have done.
I also dedicate this work and give special
thanks to my lovely fiancée Safa.
ACKNOWLEDGEMENTS
I would like to thank my supervisor PhD. Maissa Boujelben for her help and guidance
throughout my progress in this project.
I would like to acknowledge and thank Mr. Walid Douagi, head of Telecom Department,
PhD. Talel Zouari, my school ESPRIT and ESPRIT TECH for allowing me to conduct my
project and providing the requested assistance.
Special thanks go to the members of the jury.
I must acknowledge as well the many friends, colleagues and teachers who assisted, advised,
and supported my engineering studies and writing efforts over the years.
Finally I would like to acknowledge my family for their unlimited support and help.
Abstract
With more and more customers using mobile communications, it is important for service
providers to give their customers the best Quality of Service (QoS) they can. Many
providers have therefore been improving their networks to make them more appealing to
customers. One such improvement is to enhance the reliability of the network, meaning
that customers' calls are less likely to be dropped.
This dissertation explores improving the reliability of a 4G network by optimizing the
parameters used in handover. The process of handover within mobile communication
networks is very important since it allows users to move around freely while still staying
connected to the network. The most important parameters used in the handover process are
the Time-to-Trigger (TTT) and the Hysteresis margin (hys). These parameters determine
whether a neighbouring base station is better than the serving base station by a large
enough margin, and for long enough, to warrant a handover. The challenge in optimizing the
handover parameters is that a fine balance must be struck between calls being dropped
because a handover fails and the connection switching back and forth unnecessarily between
two base stations, which wastes network resources. In this project, we propose to use a
machine learning technique known as Q-Learning to optimize the handover parameters by
generating a policy that can be followed to adjust the parameters as needed. The
implemented Q-Learning algorithm was found to improve handover performance by minimizing
the chosen handover-related Key Performance Indicators (KPIs).
Key words: LTE-Advanced, Handover, Q-learning Algorithm, Hysteresis margin,
Time-To-Trigger, Self-Optimization Network.
Table of contents
General Introduction
Chapter I: LTE-Advanced Overview
Introduction
1.1. Requirements and Targets for LTE-Advanced
1.2. LTE Enabling Technologies
1.2.1. Downlink OFDMA (Orthogonal Frequency Division Multiple Access)
1.2.2. Uplink SC-FDMA (Single Carrier Frequency Division Multiple Access)
1.2.3. LTE-A Channel Bandwidths and resource elements
1.3. LTE-Advanced Network Architecture
1.3.1. The Core Network: Evolved Packet Core (EPC)
1.3.2. The Access Network E-UTRAN
1.3.3. The User Equipment (UE)
1.4. E-UTRAN Network Interfaces
1.4.1. X2 Interface
1.4.2. S1 Interface
1.5. LTE Protocol Architecture
1.5.1. User Plane
1.5.2. Control Plane
1.5.2.1. Radio Resource Control (RRC)
1.5.2.2. Radio Resource Control States
1.6. Self-Organizing Networks
Conclusion
Chapter II: Handover in LTE-Advanced
Introduction
2.1. Handover Definition and Characteristics
2.1.1. Seamless Handover
2.1.2. Lossless Handover
2.2. Types of Handover
2.2.1. Intra LTE Handover: Horizontal Handover
2.2.2. Vertical Handover
2.3. Handover Techniques
2.3.1. Soft handover, Make-Before-Break
2.3.2. Hard handover, Break-Before-Make
2.4. Handover Procedure
2.5. Handover Measurements
2.6. Handover Parameters
2.7. Time To Trigger & Hysteresis
Conclusion
Chapter III: Machine Learning and Handover Parameter Optimization simulation
Introduction
3.1. Q-Learning overview
3.1.1. Machine Learning
3.1.2. Reinforcement Learning
3.1.3. Q-Learning
3.2. Proposed Approach for HO optimization
3.2.1. Set of states
3.2.2. Set of actions
3.2.3. Reward
3.3. Simulation & Performance evaluation
3.3.1. Simulation parameters
3.3.2. Simulation results
Conclusion
General Conclusion
References
List of Figures
Figure 1: Orthogonal Frequency Division Multiple Access
Figure 2: LTE SAE Evolved Packet Core
Figure 3: E-UTRAN Architecture
Figure 4: Functional Split between E-UTRAN and EPC
Figure 5: Protocol stack for the user-plane and control-plane at X2 interface
Figure 6: Protocol stack for the user-plane and control-plane at S1 interface
Figure 7: E-UTRAN Protocol Stack
Figure 8: The RRC States
Figure 9: Decision on Handover Type
Figure 10: Intra-MME/Serving Gateway Handover
Figure 11: Handover Timing
Figure 12: Downlink reference signal structure for LTE-Advanced
Figure 13: Handover measurement filtering and reporting
Figure 14: Handover triggering procedure
Figure 15: State 157 possible actions
Figure 16: Illustration of Coverage within the Simulation Area
Figure 17: Illustration of how the TTT values changed over time for large values when UE travelling at walking speeds
Figure 18: Comparison of TTT Optimization for Walking Speeds (Starting Point 5.12s)
Figure 19: Graph of Optimized vs. Non-Optimized Results for Starting Point TTT=0s hys.=0dB when UE traveling at walking speeds
Figure 20: Graph of Optimized vs. Non-Optimized Results for Starting Point TTT=0.256s hys.=5dB when UE traveling at walking speeds
List of Tables
Table 1: LTE-Advanced development history
Table 2: Number of PRBs
Table 3: Operational benefits by SON
Table 4: Table of the different LTE hys. values
Table 5: Table of the different LTE TTT values
Table 6: Table of the different LTE Trigger types and their criteria
Table 7: Set of states
Table 8: Simulation parameters
Abbreviations
3G 3rd Generation (Cellular Systems)
3GPP Third Generation Partnership Project
4G 4th Generation (Cellular Systems)
AC Admission Control
ACK Acknowledgement (in ARQ protocols)
AI Artificial Intelligence
AM Acknowledged mode
AGWA Access Gateway
AS Access Stratum
BS Base Station
CDF Cumulative Distribution Function
CDMA Code Division Multiple Access
CQI Channel Quality Indicator
CS Circuit-Switched
dB Decibel
DFT Discrete Fourier Transform
DL Downlink
DRB Data Radio Bearer
eNodeB Evolved Node B (3GPP Base Station)
EPC Evolved Packet Core
E-UTRAN Evolved Universal Terrestrial Radio Access Network
FDD Frequency Division Duplex
GPRS General Packet Radio Service
GSM Global System for Mobile communications
HO Handover
HOM HO margin
HSDPA High Speed Downlink Packet Access
HSS Home Subscriber Server
HYS Hysteresis
IMS IP Multimedia Subsystem
IP Internet Protocol
ITU International Telecommunication Union
ITU-T ITU Telecommunication Standardization Sector
LTE-Advanced Long Term Evolution Advanced
MAC Medium Access Control
MME Mobility Management Entity
NACK Negative Acknowledgement
NAS Non-Access Stratum
NGMN Next Generation Mobile Networks
OFDM Orthogonal Frequency Division Multiplexing
OFDMA Orthogonal Frequency Division Multiple Access
OTP Optimum Trigger Point
PAPR Peak-to-Average Power Ratio
PCRF Policy and Charging Rules Function
PDCP Packet-Data Convergence Protocol
PDN Packet Data Network
PDU Protocol Data Unit
PGW PDN Gateway
QoS Quality of Service
RAN Radio Access Network
RB Resource Block
RF Radio Frequency
RLC Radio Link Control
RNC Radio Network Controller
ROHC RObust Header Compression
RRC Radio Resource Control
RRM Radio Resource Management
RSRP Reference Signal Received Power
RSRQ Reference Signal Received Quality
RSS Received Signal Strength
RSSI Received Signal Strength Indicator
SAE System Architecture Evolution
S1 The interface between eNodeB and Access Gateway
S1AP S1 Application Part
SC-FDMA Single Carrier - Frequency Division Multiple Access
SGW Serving Gateway
SINR Signal-to-Interference-plus-Noise Ratio
SIR Signal-to-Interference Ratio
SN Sequence Number
SON Self-Organizing Network
SRB Signaling Radio Bearers
TE Terminal Equipment
TM Transparent Mode
TTI Transmission Time Interval
TTT Time-to-Trigger
UE User Equipment, the 3GPP name for the mobile terminal
UL Uplink
UM Unacknowledged Mode
UMTS Universal Mobile Telecommunications System
USIM Universal Subscriber Identity Module
VoIP Voice over IP
X2 Interface between eNodeBs
General Introduction
In recent years, mobile telecommunications traffic has grown enormously, in line with the
rapid spread of smartphones. Cellular networks are evolving to meet future requirements for
data rate, coverage and capacity. LTE-Advanced is a mobile communication standard and a
major enhancement of the Long Term Evolution (LTE) standard. It was formally submitted as a
candidate 4G system to ITU-T in late 2009 as meeting the requirements of the IMT-Advanced
standard, and was standardized by the 3rd Generation Partnership Project (3GPP) in March
2011 as 3GPP Release 10. One of the important benefits of LTE-Advanced is its ability to
exploit advanced network topologies: optimized heterogeneous networks that mix macrocells
with low-power nodes such as picocells, femtocells and new relay nodes. The next
significant performance leap in wireless networks will come from making the most of
topology, bringing the network closer to the user by adding many of these low-power nodes.
LTE-Advanced further improves capacity and coverage, and ensures user fairness. It also
introduces multicarrier operation so that very wide bandwidths, up to 100 MHz of spectrum,
can be used to support very high data rates. Mobility is another important aspect of LTE,
since the system should support terminals moving at speeds of up to 350 km/h, or even up to
500 km/h. At higher speeds, handovers become more frequent and must be completed faster;
handover performance therefore becomes more crucial, especially for real-time
services [11].
One of the main goals of LTE-Advanced, or of any wireless system for that matter, is to
provide fast and seamless handover from one cell (the source cell) to another (the target
cell). The service should be maintained during the handover procedure, and data transfer
should be neither delayed nor lost; otherwise performance will be dramatically degraded.
This is especially relevant for LTE-Advanced because of the distributed nature of the LTE
radio access network architecture, which consists of just one type of node, the base
station, known in LTE-Advanced as the eNodeB [7].
In LTE-Advanced there are also some predefined handover conditions for triggering the
handover procedure as well as some goals regarding handover design and optimization such
as decreasing the total number of handovers in the whole system by predicting the handover,
decreasing the number of ping pong handovers, and having fast and seamless handover.
Hence, optimizing the handover procedure to achieve the required performance is considered
an important issue in LTE-Advanced networks [11].
Many studies have sought to improve LTE-Advanced handover, using different HO algorithms
that involve several stages and address different cases. All of them aim at an optimum
handover mechanism that provides smooth handover at the cell boundaries of the
LTE-Advanced network.
The main objective of this project is to develop a Q-learning algorithm to self-optimize the
parameters used in the handover process of 4G networks.
This report is organized in three chapters. The first chapter gives an overview of LTE
technology: the main characteristics and functionalities of the system are described, as
well as the enabling technologies, network architecture and protocols. In the second
chapter, we introduce the general concepts of handover and describe the whole HO procedure;
the optimization and design principles, the variables used as inputs and the different HO
parameters are also explained. Finally, the third chapter discusses our proposed approach.
First, we present machine learning, and in particular reinforcement learning and
Q-Learning. Then we discuss the handover parameter optimization. Finally, we present the
simulation parameters and the obtained results.
Chapter I: LTE-Advanced Overview
Introduction:
In LTE-Advanced networks, the focus is on higher capacity: the driving force for developing
LTE further towards LTE-Advanced (LTE Release 10) was to provide higher bit rates in a
cost-efficient way and, at the same time, to completely fulfill the requirements set by the
ITU for IMT-Advanced, also referred to as 4G.
In this chapter, we will present the LTE-Advanced technologies, resource elements and the
network architecture by citing the different key components.
1.1. Requirements and Targets for LTE-Advanced:
3GPP has completed the process of defining LTE-Advanced for radio access, so that its
technology remains competitive in the future. 3GPP identified a set of high-level
requirements, which have since been exceeded.
The following target requirements were agreed among operators and vendors in the project
that defined the evolution of 3G networks.
Table 1: LTE-Advanced development history.
                                   | WCDMA (UMTS) | HSPA (HSDPA/HSUPA)         | HSPA+     | LTE           | LTE-A
Max downlink speed (bps)           | 384 K        | 14 M                       | 28 M      | 100 M         | 1 G
Max uplink speed (bps)             | 128 K        | 5.7 M                      | 11 M      | 50 M          | 100 M
Latency round trip time (approx.)  | 150 ms       | 100 ms                     | 50 ms max | ~10 ms        | Less than 5 ms
3GPP releases                      | Rel 99/5     | Rel 5/6                    | Rel 7     | Rel 8/9       | Rel 10
Approx. years of initial roll out  | 2003/4       | 2005/6 HSDPA, 2007/8 HSUPA | 2008/9    | 2009/10       | 2011
Access methodology                 | CDMA         | CDMA                       | CDMA      | OFDMA/SC-FDMA | OFDMA/SC-FDMA
Some of the key LTE-Advanced requirements related to data rate, throughput, latency, and
mobility are provided below [3]:
• Peak data rate:
  o A 1 Gbps data rate will be achieved by 4-by-4 MIMO and a transmission bandwidth
    wider than approximately 70 MHz.
• Peak spectrum efficiency:
  o DL: Rel. 8 LTE satisfies the IMT-Advanced requirement: 30 bps/Hz.
  o UL: Need to double from Release 8 to satisfy the IMT-Advanced
    requirement: 15 bps/Hz and 30 bps/Hz in Rel. 10.
• Capacity and cell-edge user throughput:
  o The target for LTE-Advanced was set considering a gain of 1.4 to 1.6 over Release 8
    LTE performance.
• Spectrum flexibility:
  In addition to the bands currently defined for LTE Release 8, TR 36.913
  identifies the following new bands:
  o 450–470 MHz band
  o 698–862 MHz band
  o 790–862 MHz band
  o 2.3–2.4 GHz band
  o 3.4–4.2 GHz band
  o 4.4–4.99 GHz band
Some of these bands are now formally included in the 3GPP Release 9 and Release 10
specifications. Note that frequency bands are considered release independent features, which
means that it is acceptable to deploy an earlier release product in a band not defined until a
later release. LTE-Advanced is designed to operate in spectrum allocations of different sizes,
including allocations wider than the 20 MHz in Release 8, in order to achieve higher
performance and target data rates. Although it is desirable to have bandwidths greater than 20
MHz deployed in adjacent spectrum, the limited availability of spectrum means that
aggregation from different bands is necessary to meet the higher bandwidth requirements.
This option has been allowed for in the IMT-Advanced specifications.
• Mobility:
  o E-UTRAN should be optimized for low mobile speeds from 0 to 15 km/h.
  o Higher mobile speeds between 15 and 120 km/h should be supported with high
    performance.
  o Mobility across the cellular network shall be maintained at speeds from 120 km/h
    to 350 km/h (or even up to 500 km/h depending on the frequency band).
• Coverage:
  o Throughput, spectrum efficiency and mobility targets above should be met for 5
    km cells, and with a slight degradation for 30 km cells. Cell ranges up to 100 km
    should not be precluded.
  o Available for paired and unpaired spectrum arrangements.
1.2. LTE Enabling Technologies:
LTE introduced a number of new technologies compared with previous cellular systems. They
enable LTE-Advanced to use spectrum more efficiently and to provide the much higher data
rates that are now required.
A major difference between LTE-Advanced and its 3GPP predecessors is the radio interface:
Orthogonal Frequency Division Multiple Access (OFDMA) and Single Carrier Frequency Division
Multiple Access (SC-FDMA) are used as the radio access schemes for the downlink and uplink
respectively [6].
1.2.1. Downlink OFDMA (Orthogonal Frequency Division Multiple Access):
OFDMA is a variant of OFDM (Orthogonal Frequency Division Multiplexing) and it is the
downlink access technology. One of the most important advantages is the intrinsic
orthogonality provided by OFDMA to the users within a cell, which translates into an almost
null level of intra-cell interference. Therefore, inter-cell interference is the limiting factor
when high reuse levels are intended. In this case, cell-edge users are especially susceptible to
the effects of inter-cell interference. OFDMA divides the wide available bandwidth into many
narrow and mutually orthogonal subcarriers and transmits the data in parallel streams. The
smallest transmission unit in the downlink LTE-Advanced system is known as a Physical
Resource Block (PRB).
Figure 1: Orthogonal Frequency Division Multiple Access [5].
A resource block contains 12 subcarriers, regardless of the overall LTE-Advanced signal
bandwidth, and spans one slot in time. Different LTE-Advanced signal bandwidths therefore
contain different numbers of resource blocks.
Table 2: Number of PRBs.
Channel Bandwidth (MHz) 1.4 3 5 10 15 20
Number of PRBs 6 15 25 50 75 100
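The relationship between Table 2 and the 12-subcarrier resource block can be checked with a few lines of arithmetic. The Python sketch below is purely illustrative: the names PRBS_PER_BANDWIDTH and occupied_bandwidth_mhz are our own, and the 15 kHz subcarrier spacing is the value quoted in this section.

```python
# Illustrative sketch: names are hypothetical, values come from Table 2
# and from the 12-subcarrier / 15 kHz figures quoted in this section.
SUBCARRIERS_PER_PRB = 12
SUBCARRIER_SPACING_KHZ = 15

# Channel bandwidth (MHz) -> number of PRBs, as listed in Table 2.
PRBS_PER_BANDWIDTH = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def occupied_bandwidth_mhz(channel_bw_mhz):
    """Bandwidth actually filled by PRBs for a given channel bandwidth."""
    n_prb = PRBS_PER_BANDWIDTH[channel_bw_mhz]
    return n_prb * SUBCARRIERS_PER_PRB * SUBCARRIER_SPACING_KHZ / 1000.0


if __name__ == "__main__":
    for bw, n_prb in PRBS_PER_BANDWIDTH.items():
        print(f"{bw:>4} MHz: {n_prb:>3} PRBs, "
              f"{n_prb * SUBCARRIERS_PER_PRB:>4} subcarriers, "
              f"{occupied_bandwidth_mhz(bw):.2f} MHz occupied")
```

In this sketch a 20 MHz channel carries 100 × 12 = 1200 subcarriers; the part of the channel not filled by PRBs is left as guard band.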
The OFDM signal used in LTE-Advanced comprises a maximum of 2048 different subcarriers with
a spacing of 15 kHz. Although mobiles must be able to receive all 2048 subcarriers, the
base station (eNodeB) does not need to transmit all of them: it only needs to support the
transmission of 72 subcarriers. In this way, all mobiles are able to communicate with any
base station.
1.2.2. Uplink SC-FDMA (Single Carrier Frequency Division Multiple Access):
For the LTE-Advanced uplink, a different concept is used for the access technique. Although
still using a form of OFDMA technology, the implementation is called Single Carrier
Frequency Division Multiple Access (SC-FDMA). The main task of this scheme is to assign
communication resources to multiple users. The major difference to other schemes is that it
performs a Discrete Fourier Transform (DFT) on the time-domain modulated data before the
OFDM modulation stage.
One of the key parameters that affects all mobiles is battery life. Even though battery
performance is improving all the time, it is still necessary to ensure that mobiles use as
little battery power as possible. Because the RF power amplifier, which transmits the radio
frequency signal via the antenna to the base station, is the highest power consumer within
the mobile, it must operate in as efficient a mode as possible. Its efficiency is
significantly affected by the form of radio frequency modulation and the signal format.
Signals that have a high peak-to-average power ratio (PAPR) and require linear
amplification do not lend themselves to the use of efficient RF power amplifiers; the DFT
precoding of SC-FDMA lowers the PAPR compared with plain OFDMA, which is why it is used in
the uplink [5].
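The effect of the DFT stage on the peak-to-average power ratio can be illustrated numerically. The following Python sketch is a toy model under our own assumptions, not the LTE transmitter chain: it uses QPSK symbols, a naive DFT, localized mapping onto the first subcarriers, and no cyclic prefix or pulse shaping. All function names are hypothetical.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse is scaled by 1/N."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def papr(signal):
    """Peak-to-average power ratio (linear scale, not dB)."""
    powers = [abs(s) ** 2 for s in signal]
    return max(powers) / (sum(powers) / len(powers))


def map_to_subcarriers(symbols, n_fft):
    """Assumed localized mapping: fill the first len(symbols) subcarriers."""
    return list(symbols) + [0j] * (n_fft - len(symbols))


def ofdma_symbol(data, n_fft):
    """OFDMA: data symbols go straight onto subcarriers, then IDFT."""
    return dft(map_to_subcarriers(data, n_fft), inverse=True)


def scfdma_symbol(data, n_fft):
    """SC-FDMA: DFT-precode the data first, then map and IDFT."""
    return dft(map_to_subcarriers(dft(data), n_fft), inverse=True)


def random_qpsk(n, rng):
    """Return n unit-power QPSK symbols."""
    points = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    return [rng.choice(points) / abs(1 + 1j) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_fft, n_data, trials = 64, 16, 50
    mean_ofdma = mean_scfdma = 0.0
    for _ in range(trials):
        data = random_qpsk(n_data, rng)
        mean_ofdma += papr(ofdma_symbol(data, n_fft)) / trials
        mean_scfdma += papr(scfdma_symbol(data, n_fft)) / trials
    print(f"mean PAPR  OFDMA: {mean_ofdma:.2f}  SC-FDMA: {mean_scfdma:.2f}")
```

In the limiting case where the data fills every subcarrier, the DFT and IDFT cancel and the transmitted samples are the QPSK symbols themselves, so the envelope is constant. With a partial allocation the SC-FDMA average would be expected to stay below the OFDMA one, although the exact figures depend on the assumptions listed above.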
1.2.3. LTE-A Channel Bandwidths and resource elements:
One of the key parameters associated with the use of OFDM within LTE-Advanced is the
choice of bandwidth. The available bandwidth influences a variety of decisions including the
number of carriers that can be accommodated in the OFDM signal and in turn this influences
elements including the symbol length and so forth [6].
LTE supports six channel bandwidths: 1.4 MHz, 3 MHz, 5 MHz, 10 MHz, 15 MHz and 20 MHz. The
higher the bandwidth, the greater the channel capacity.
In addition, the subcarriers are spaced 15 kHz apart. To maintain orthogonality, this gives
a symbol duration of 1 / 15 kHz = 66.7 µs. Each subcarrier is therefore able to carry data
at a maximum rate of 15 ksps (kilosymbols per second). For a 20 MHz system, which has
100 PRBs × 12 = 1200 subcarriers, this gives a raw symbol rate of 18 Msps. In turn, this
provides a raw data rate of 108 Mbps, as each 64QAM symbol represents six bits.
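The figures quoted above can be reproduced step by step. The short Python sketch below is only a worked check of that arithmetic; the function names are our own, and the result is a raw rate that ignores overheads such as the cyclic prefix, channel coding and control signalling.

```python
# Illustrative arithmetic only; names are hypothetical.
SUBCARRIER_SPACING_HZ = 15_000
SUBCARRIERS_PER_PRB = 12
BITS_PER_SYMBOL = {"QPSK": 2, "16QAM": 4, "64QAM": 6}


def symbol_duration_us():
    """Useful OFDM symbol duration: the reciprocal of the spacing."""
    return 1e6 / SUBCARRIER_SPACING_HZ


def raw_symbol_rate_sps(n_prb):
    """Aggregate symbol rate if every subcarrier carries 15 ksps."""
    return n_prb * SUBCARRIERS_PER_PRB * SUBCARRIER_SPACING_HZ


def raw_bit_rate_bps(n_prb, modulation):
    """Raw bit rate for the given modulation, before any overhead."""
    return raw_symbol_rate_sps(n_prb) * BITS_PER_SYMBOL[modulation]


if __name__ == "__main__":
    n_prb = 100  # 20 MHz channel, from Table 2
    print(f"symbol duration : {symbol_duration_us():.1f} us")
    print(f"raw symbol rate : {raw_symbol_rate_sps(n_prb) / 1e6:.0f} Msps")
    print(f"raw bit rate    : {raw_bit_rate_bps(n_prb, '64QAM') / 1e6:.0f} Mbps")
```

Changing the modulation key or the PRB count shows how the raw rate scales with the channel bandwidths listed in Table 2.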
1.3. LTE-Advanced Network Architecture:
LTE-A has been designed to support only packet switched services, in contrast to the circuit-
switched model of previous cellular systems. It aims to provide seamless Internet Protocol
(IP) connectivity between User Equipment (UE) and the Packet Data Network (PDN), without
any disruption to the end users’ applications during mobility [2].
While the term “LTE” encompasses the evolution of the Universal Mobile
Telecommunications System (UMTS) radio access through the Evolved UTRAN (E-
UTRAN), it is accompanied by an evolution of the non-radio aspects under the term “System
Architecture Evolution” (SAE).
Together LTE-Advanced and SAE comprise the Evolved Packet System (EPS). This EPS in
turn includes the EPC (Evolved Packet Core) on the core side and E-UTRAN (Evolved
UMTS Terrestrial Radio Access Network) on the access side [2].
In addition to these two components, User Equipment (UE) and Services Domain are also
very important subsystems of LTE architecture.
1.3.1. The Core Network: Evolved Packet Core (EPC):
The core network is responsible for the overall control of the UE and the establishment of
the bearers. The Evolved Packet Core is the main element of the LTE-Advanced SAE network.
It consists of four main elements and connects to the eNodeBs, as shown in the diagram
below.
Figure 2: LTE-Advanced SAE Evolved Packet Core [6].
• Mobility Management Entity (MME):
The MME is the main control node for the LTE SAE access network. It handles a number of
features and therefore provides a considerable level of overall control functionality. The
protocols running between the UE and the core network (CN) are known as the Non-Access
Stratum (NAS) protocols. The main functions supported by the MME can be classified as:
❖ Functions related to bearer management: this includes the establishment,
maintenance and release of the bearers, and is handled by the session management
layer in the NAS protocol.
❖ Functions related to connection management: this includes the establishment of
the connection and security between the network and the UE, and is handled by the
connection or mobility management layer in the NAS protocol.
• Serving Gateway (SGW):
The Serving Gateway (SGW) is a data plane element within the LTE SAE. Its main purpose is
to manage user plane mobility, and it also acts as the main border between the Radio
Access Network (RAN) and the core network. The SGW also maintains the data paths between
the eNodeBs and the PDN Gateways. In this way, the SGW forms an interface for the data
packet network at the E-UTRAN.
• PDN Gateway (PGW):
The LTE SAE PDN (Packet Data Network) gateway provides connectivity for the UE to
external packet data networks, fulfilling the function of entry and exit point for UE data.
The UE may have connectivity with more than one PGW for accessing multiple PDNs.
• Home Subscriber Server (HSS):
The HSS is a database server located in the operator's premises. All the user
subscription information is stored in the HSS. The HSS also contains the records of the
user location and holds the original copy of the user subscription profile. The HSS
interacts with the MME, and it needs to be connected to all the MMEs in the network that
control the UE.
1.3.2. The Access Network E-UTRAN:
The E-UTRAN is the access network of LTE and simply consists of a network of eNodeBs that
are connected to each other via the X2 interface, as illustrated in Figure 3. The eNodeBs
are also connected to the EPC via the S1 interface: more specifically, to the MME by means
of the S1-MME interface and to the S-GW by means of the S1-U interface.
Figure 3: E-UTRAN Architecture [9].
• eNodeB:
The eNodeB is the radio base station of an LTE network and controls all radio-related
functions in the fixed part of the system. These radio base stations are distributed
throughout the coverage region, and each of them is placed near a radio antenna. One of the
biggest differences between the LTE network and legacy 3G mobile communication systems lies
in the base station.
Practically, an eNodeB provides bridging between the UE and EPC. All the radio protocols
that are used in the access link are terminated in the eNodeB. The eNodeB does
ciphering/deciphering in the user plane as well as IP header compression/decompression. The
eNodeB also has some responsibilities in the control plane such as radio resource
management and performing control over the usage of radio resources.
The E-UTRAN is responsible for all radio-related functions. The main features that it
supports are the following:
 Radio Resource Management:
The RRM objective is to make the mobility feasible in cellular wireless networks so that the
network with the help of the UE takes care of the mobility without user intervention. RRM
covers all functions related to the radio bearers, such as radio bearer control, radio admission
control, radio mobility control, scheduling and dynamic allocation of resources to UEs in both
uplink and downlink.
 IP Header Compression:
This helps to ensure efficient use of the radio interface by compressing the IP packet headers
which could otherwise represent a significant overhead, especially for small packets such as
VoIP.
One of the main functions of PDCP (Packet Data Convergence Protocol) is header
compression using the Robust Header Compression (ROHC) protocol defined by the IETF. In
LTE, header compression is very important because there is no support for the transport of
voice services via the Circuit-Switched (CS) domain.
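The overhead argument can be made concrete with a small calculation. The sketch below is illustrative only: the 40-byte header is the usual IPv4 (20) + UDP (8) + RTP (12) total, while the 32-byte voice payload and the 3-byte compressed header are assumed example values, not figures taken from this report.

```python
# Illustrative header-overhead calculation for a small VoIP packet.
IPV4_HEADER = 20  # bytes
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes


def header_overhead(payload_bytes: int, header_bytes: int) -> float:
    """Return the fraction of the packet that is occupied by headers."""
    if payload_bytes < 0 or header_bytes < 0:
        raise ValueError("sizes must be non-negative")
    total = payload_bytes + header_bytes
    if total == 0:
        raise ValueError("empty packet")
    return header_bytes / total


# Assumed example values (hypothetical, chosen only for illustration).
VOICE_PAYLOAD = 32      # bytes per voice frame
COMPRESSED_HEADER = 3   # bytes left after header compression

uncompressed = header_overhead(VOICE_PAYLOAD, IPV4_HEADER + UDP_HEADER + RTP_HEADER)
compressed = header_overhead(VOICE_PAYLOAD, COMPRESSED_HEADER)
print(f"overhead without compression: {uncompressed:.1%}")
print(f"overhead with compression:    {compressed:.1%}")
```

Under these assumed sizes the headers account for more than half of the uncompressed packet, which is why compression matters most for small packets.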
 Security:
Security is a very important feature of all 3GPP radio access technologies. LTE provides
security in a similar way to its predecessors UMTS and GSM. Because of the sensitivity of
the signaling messages exchanged between the eNodeB and the terminal, or between the
MME and the terminal, all of this information is protected against eavesdropping and
alteration.
The implementation of security architecture of LTE is carried out by two functions: Ciphering
of both control plane (RRC) data and user plane data, and Integrity Protection which is used
for control plane (RRC) data only. Ciphering is used in order to protect the data streams from
being received by a third party, while Integrity Protection allows the receiver to detect packet
insertion or replacement. RRC always activates both functions together, either following
connection establishment or as part of the handover to LTE.
 Connectivity to the EPC:
This function consists of the signaling towards the MME and the bearer path towards the
S-GW. All of the above-mentioned functions are concentrated in the eNodeB, since in LTE all
the radio controller functions are gathered there. This concentration lets the different
protocol layers interact more closely, which reduces latency and improves efficiency.
On the network side, all of these functions reside in the eNodeB’s, each of which can be
responsible for managing multiple cells. Unlike some of the previous second and third
generation technologies, LTE integrates the radio controller function into the eNodeB. This
allows tight interaction between the different protocol layers of the radio access network
(RAN), thus reducing latency and improving efficiency. Furthermore, as LTE does not
support soft handover there is no need for a centralized data-combining function in the
network. One consequence of the lack of a centralized controller node is that, as the UE
moves, the network must transfer all information related to a UE, that is, the UE context,
together with any buffered data, from one eNodeB to another. Mechanisms are therefore
needed to avoid data loss during handover.
Figure 4: Functional Split between E-UTRAN and EPC [5].
1.3.3. The User Equipment (UE):
The end user communicates using a UE. The UE can be a handheld device like a smart phone
or it can be a device which is embedded in a laptop. The UE is divided into two parts: the
Universal Subscriber Identity Module (USIM) and the rest of the UE, which is called
Terminal Equipment (TE).
The USIM is an application with the purpose of identification and authentication of the user
for obtaining security keys. This application is placed into a removable smart card called a
universal integrated circuit card (UICC).
The UE in general is the end-user platform that, by signaling with the network, sets
up, maintains, and removes the necessary communication links. The UE also assists in
the handover procedure and sends reports about the terminal location to the network.
1.4. E-UTRAN Network Interfaces:
Two interfaces are involved in the handover procedure in LTE for UEs in active mode:
the X2 and S1 interfaces. Both can be used in handover procedures, but for
different purposes.
1.4.1. X2 Interface:
The X2 interface has a key role in the intra-LTE handover operation. The source eNodeB will
use the X2 interface to send the Handover Request message to the target eNodeB. If the X2
interface does not exist between the two eNodeB’s in question, then procedures need to be
initiated to set one up before handover can be achieved [3].
The Handover Request message prompts the target eNodeB to reserve resources; assuming
resources are found, the target eNodeB replies with the Handover Request Acknowledgement
message. Several information elements (some optional) are provided in the Handover Request
message, such as:
 Requested SAE bearers to be handed over.
 Handover restrictions list, which may restrict following handovers for the UE.
 Last visited cells the UE has been connected to, if the UE historical information
collection functionality is enabled. This has been considered to be useful in avoiding
the Ping-Pong effects between different cells when the target eNodeB is given
information on how the serving eNodeB has been changing in the past. Thus actions
can be taken to limit frequent back-and-forth handovers.
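As a rough illustration of how the list of last visited cells could help a target eNodeB recognise a Ping-Pong situation, consider the sketch below. The `VisitedCell` record, the `is_ping_pong` helper and the 5-second minimum stay are hypothetical choices made for this example; they are not taken from the X2 specification or from this report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VisitedCell:
    cell_id: str
    time_in_cell_s: float  # how long the UE stayed (or has stayed) in the cell


def is_ping_pong(history: List[VisitedCell], target_cell_id: str,
                 min_stay_s: float = 5.0) -> bool:
    """Flag a handover that would send the UE straight back to the cell it just left.

    `history` is ordered oldest first, and its last entry is the current
    serving cell (a convention assumed for this sketch).
    """
    if len(history) < 2:
        return False
    previous, current = history[-2], history[-1]
    # Ping-Pong: the target is the previous cell and the UE has barely
    # stayed in the current one.
    return previous.cell_id == target_cell_id and current.time_in_cell_s < min_stay_s


history = [VisitedCell("A", 30.0), VisitedCell("B", 1.2)]
print(is_ping_pong(history, "A"))  # the UE left A only 1.2 s ago
print(is_ping_pong(history, "C"))  # C was not visited recently
```

A real implementation would probably look further back in the history and adapt the time window; the point here is only that the history makes such a check possible.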
Figure 5: Protocol stack for the user-plane and control-plane at X2 interface [3].
1.4.2. S1 Interface:
The radio network signaling over S1 consists of the S1 Application Part (S1AP). The S1AP
protocol handles all procedures between the EPC and E-UTRAN. It is also capable of
carrying messages transparently between the EPC and the UE. Over the S1 interface the S1AP
protocol primarily supports general E-UTRAN procedures from the EPC, transparently
transfers Non-Access Stratum (NAS) signaling and performs the mobility function. Figure 6
shows the protocol stack for the user-plane and control-plane at S1 interface [3].
Figure 6: Protocol stack for the user-plane and control-plane at S1 interface [3].
1.5. LTE Protocol Architecture:
The overall radio interface protocol architecture for LTE can be divided into User Plane
Protocols and Control Plane Protocols. The E-UTRAN protocol stack is depicted in
Figure 7.
Figure 7: E-UTRAN Protocol Stack [8].
1.5.1. User Plane:
An IP packet is tunneled between the P-GW and the eNodeB to be transmitted towards the
UE. Different tunneling protocols can be used. The tunneling protocol used by 3GPP is called
the GPRS tunneling protocol (GTP) [8].
The LTE Layer 2 user-plane protocol stack is composed of three sub layers: Packet Data
Convergence Protocol (PDCP), Radio Link Control (RLC) and Medium Access Control
(MAC). These sub layers are terminated in the eNodeB on the network side.
1.5.2. Control Plane:
Control plane and User plane have common protocols which perform the same functions
except that for the control plane protocols there is no header compression. In the access
stratum protocol stack and above the PDCP, there is the Radio Resource Control (RRC)
protocol which is considered as a “Layer 3” protocol. RRC sends signaling messages between
the eNodeB and UE for establishing and configuring the radio bearers of all lower layers in
the access stratum.
1.5.2.1. Radio Resource Control (RRC):
The RRC (Radio Resource Control) layer is a key signaling protocol which supports many
functions between the terminal and the eNodeB. The RRC protocol enables the transfer of
common NAS information which is applicable to all UEs as well as dedicated NAS
information which is applicable only to a specific UE. In addition, for UEs in RRC_IDLE,
RRC supports notification of incoming calls.
The key features of RRC are the following:
 Broadcast of System Information: Handles the broadcasting of system
information, which includes NAS common information. Some of the system
information is applicable only for UE’s in RRC-IDLE while other system
information is also applicable for UEs in RRC-CONNECTED.
 RRC Connection Management: Covers all procedures related to the
establishment, modification and release of an RRC connection, including paging,
initial security activation, establishment of Signaling Radio Bearers (SRB’s) and of
radio bearers carrying user data (Data Radio Bearers, DRB’s), handover within LTE
(including transfer of UE RRC context information), configuration of the lower
protocol layers, access class barring and radio link failure.
 Establishment and release of radio resources: this relates to the allocation of resources
for the transport of signaling messages or user data between the terminal and eNodeB.
 Paging: this is performed through the PCCH logical control channel. The prominent
usage of paging is to page the UE’s that are in RRC-IDLE. Paging can also be used to
notify UE’s both in RRC-IDLE and RRC-CONNECTED modes about system
information changes or SIB10 and SIB11 transfers.
 Transmission of signaling messages to and from the EPC: these messages (known as
NAS for Non Access Stratum) are transferred to and from the terminal via the RRC;
they are, however, treated by RRC as transparent messages.
 Handover: the handover is triggered by the eNodeB, based on the received
measurement reports from the UE. Handover is classified in different types based on
the origin and destination of the handover. The handover can start and end in the E-
UTRAN, it can start in the E-UTRAN and end in another Radio Access Technology
(RAT), or it can start from another RAT and end in E-UTRAN.
The RRC also supports a set of functions related to end-user mobility for terminals in
RRC Connected state. This includes:
 Measurement control: This refers to the configuration of measurements to be
performed by the terminal as well as the method to report them to the eNodeB.
 Support of inter-cell mobility procedures: these are also known as handovers.
 User context transfer: the UE context is transferred between eNodeBs at handover.
1.5.2.2. Radio Resource Control States:
The main function of the RRC protocol is to manage the connection between the terminal and
the EUTRAN access network. To achieve this, RRC protocol states have been defined and
they are depicted in the figure below. Each of them actually corresponds to the states of the
connection, and describes how the network and the terminal shall handle special functions
like terminal mobility, paging message processing and network system information
broadcasting [14].
In E-UTRAN, the RRC state machine is very simple and limited to two states only: RRC-
IDLE, and RRC-CONNECTED.
Figure 8: The RRC States [14]
In the RRC-IDLE state, there is no connection between the terminal and the eNodeB,
meaning that the terminal is actually not known by the E-UTRAN Access Network. The
terminal user is inactive from an application level perspective, which does not mean at all that
nothing happens at the radio interface level. Nevertheless, the terminal behavior is specified in
order to save as much battery power as possible and is actually limited to three main items:
 Periodic decoding of System Information Broadcast by E-UTRAN: this process is
required in case the information is dynamically updated by the network.
 Decoding of paging messages: so that the terminal can further connect to the network
in case of an incoming session.
 Cell reselection: the terminal periodically evaluates the best cell it should camp on
through its own radio measurements and based on network System Information
parameters. When the condition is reached, the terminal autonomously performs a
selection of a new serving cell.
In the RRC-CONNECTED state, there is an active connection between the terminal and the
eNodeB, which implies a communication context being stored within the eNodeB for this
terminal. Both sides can exchange user data and/or signaling messages over logical channels.
Unlike the RRC-IDLE state, the terminal location is known at the cell level. Terminal
mobility is under the control of the network using the handover procedure, whose decision is
based on many possible criteria, including measurements reported by the terminal or by the
physical layer of the eNodeB itself.
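The two-state machine described in this section can be sketched in a few lines. This is a minimal illustration: the event names `connection_established` and `connection_released` are hypothetical labels chosen for the example, not identifiers from the RRC specification.

```python
from enum import Enum


class RrcState(Enum):
    IDLE = "RRC-IDLE"
    CONNECTED = "RRC-CONNECTED"


# Event names are hypothetical labels used only in this sketch.
_TRANSITIONS = {
    (RrcState.IDLE, "connection_established"): RrcState.CONNECTED,
    (RrcState.CONNECTED, "connection_released"): RrcState.IDLE,
}


def next_state(state: RrcState, event: str) -> RrcState:
    """Return the state after `event`; unknown events leave the state unchanged."""
    return _TRANSITIONS.get((state, event), state)


def mobility_procedure(state: RrcState) -> str:
    """Mobility mechanism that the text associates with each state."""
    if state is RrcState.IDLE:
        return "cell reselection (performed autonomously by the terminal)"
    return "handover (controlled by the network)"


state = RrcState.IDLE
state = next_state(state, "connection_established")
print(state.value, "->", mobility_procedure(state))
```

Keeping the transitions in a table makes it obvious that only two moves exist, which mirrors the simplicity of the E-UTRAN RRC state machine compared with earlier systems.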
1.6. Self-Organizing Networks:
A Self-Organizing Network (SON) is an automation technology designed to make the
planning, configuration, management, optimization and healing of mobile radio access
networks simpler and faster. SON functionality and behavior have been defined and specified
in generally accepted mobile industry recommendations produced by organizations such as
3GPP and the NGMN.
SON has been codified within 3GPP Release 8 and subsequent specifications in a series of
standards including 36.902, as well as public white papers outlining use cases from the
NGMN. The first technology making use of SON features will be Long Term Evolution
(LTE), but the technology has also been retro-fitted to older radio access technologies such as
Universal Mobile Telecommunications System (UMTS). The LTE specification inherently
supports SON features like Automatic Neighbor Relation (ANR) detection, which is the 3GPP
LTE Rel. 8 flagship feature.
Newly added base stations should be self-configured in line with a "plug-and-play" paradigm,
while all operational base stations will regularly self-optimize parameters and algorithmic
behavior in response to observed network performance and radio conditions. Furthermore,
self-healing mechanisms can be triggered to temporarily compensate for a detected equipment
outage, while awaiting a more permanent solution.
Self-organizing network functionalities are commonly divided into three major sub-functional
groups, each containing a wide range of decomposed use cases:
 Self-configuration functions:
Self-configuration strives towards the "plug-and-play" paradigm, in which new base
stations are automatically configured and integrated into the network. This covers both
connectivity establishment and the download of configuration parameters and software.
Self-configuration is typically supplied as part of the software delivery with each radio cell by
equipment vendors. When a new base station is introduced into the network and powered on,
it is immediately recognized and registered by the network. The neighboring base stations
then automatically adjust their technical parameters (such as emission power, antenna tilt,
etc.) in order to provide the required coverage and capacity while, at the same time, avoiding
interference.
 Self-optimization functions:
Every base station contains hundreds of configuration parameters that control various aspects
of the cell site. Each of these can be altered to change network behavior, based on
observations of both the base station itself, and measurements at the mobile station or handset.
One of the first SON features establishes neighbor relations automatically (ANR), while
others optimize random access parameters or mobility robustness in terms of handover
oscillations. A very illustrative use case is the automatic switch-off of a percentage of base
stations during the night hours. The neighboring base stations would then re-configure their
parameters in order to keep the entire area covered by signal. In case of a sudden growth in
connectivity demand for any reason, the "sleeping" base stations "wake up" almost
instantaneously. This mechanism leads to significant energy savings for operators.
 Self-healing functions:
When some nodes in the network become inoperative, self-healing mechanisms aim at
reducing the impacts from the failure, for example by adjusting parameters and algorithms in
adjacent cells so that other nodes can support the users that were supported by the failing
node. In legacy networks, failing base stations are at times hard to identify, and a
significant amount of time and resources is required to fix them. This SON function makes it
possible to spot a failing base station immediately, take further measures, and ensure no
or insignificant degradation of service for the users.
Table 3: Operational benefits by SON.
Self-Configuration:
 Flexibility in logistics (eNodeB not site specific).
 Reduced site / parameter planning.
 Simplified installation; less prone to errors.
 No/minimum drive tests.
 Faster rollout.
Self-Optimization:
 Increased network quality and performance.
 Parameter optimization reduced maintenance, site visits.
Self-Healing:
 Error self-detection and mitigation.
 Speed up maintenance.
 Reduce outage time.
Conclusion:
In LTE-Advanced, the focus is on higher capacity, which is the driving force behind the further
development of LTE towards LTE-Advanced. LTE-Advanced provides higher bitrates in a
cost-efficient way and, at the same time, completely fulfills the requirements set by the ITU for
IMT-Advanced, also referred to as 4G. In the next chapter, we will pay particular attention to
the handover.
Chapter II: Handover in LTE-Advanced
Introduction:
Mobility is an essential component of mobile cellular communication systems because it
offers clear benefits to the end users: low delay services such as voice or real time video
connections can be maintained while moving even in high speed trains.
Handover is one of the key procedures for ensuring that the users move freely through the
network while still being connected and being offered quality services. Since its success
rate is a key indicator of user satisfaction, it is vital that this procedure happens as fast and as
seamlessly as possible. Hence, optimizing the handover procedure to get the required
performance is considered an important issue in LTE networks.
In this context, this chapter studies the handover, its characteristics and its different types.
2.1. Handover Definition and Characteristics:
The process of handover is very important in mobile telecommunications. It involves moving
the resource allocation for a mobile phone or a piece of UE from one base station to another.
This process is used to provide better Quality-of-Service (QoS) to customers by allowing
them to continue to use provided services even after moving out of range of the original
serving base station. It is important that handovers are performed quickly, cause little-to-no
disruption to the user's experience and are completed with a very high success rate. If a
handover is unsuccessful, it is likely that an on-going call will be dropped, either because there
are not enough resources available on a base station (known as an eNodeB in LTE) or
because the Received Signal Strength (RSS) at the UE drops below the threshold needed to
maintain the call. This threshold, in LTE, is known as the noise floor and has a value of -97.5 dB.
Handovers are stated to take roughly 0.25 seconds to complete after the decision has been
made for a handover to take place [17].
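The two drop conditions described above can be written as a simple predicate. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, and only the -97.5 dB threshold is taken from the text.

```python
RSS_THRESHOLD_DB = -97.5  # threshold value quoted in the text


def call_dropped(base_station_has_resources: bool, rss_db: float,
                 threshold_db: float = RSS_THRESHOLD_DB) -> bool:
    """Return True if either drop condition from the text applies.

    A call is considered dropped when the base station cannot provide
    resources, or when the received signal strength falls below the
    threshold needed to maintain the call.
    """
    return (not base_station_has_resources) or rss_db < threshold_db


print(call_dropped(True, -90.0))    # resources available, signal above threshold
print(call_dropped(True, -100.0))   # signal below threshold
print(call_dropped(False, -80.0))   # no resources at the base station
```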
Depending on the required QoS, a seamless handover or a lossless handover is performed as
appropriate for each radio bearer. The descriptions of each of them are presented below.
2.1.1. Seamless Handover:
The objective of seamless handover is to provide a given QoS when the UE moves from the
coverage of one cell to the coverage of another cell. In LTE seamless handover is applied to
all radio bearers carrying control plane data and to user plane radio bearers mapped on
RLC-UM. These types of data are typically reasonably tolerant of losses but less tolerant of
delay (e.g. voice services). Seamless handover should therefore minimize complexity and delay,
although some SDUs might be lost [4].
In the seamless handover, PDCP entities including the header compression contexts are reset,
and the COUNT values are set to zero. As a new key is anyway generated at handover, there
is no security reason to keep the COUNT values. On the UE side, all the PDCP SDUs that
have not been transmitted yet will be sent to the target cell after handover. PDCP SDUs for
which the transmission has not been started can be forwarded via X2 interface towards the
target eNodeB. Unacknowledged PDCP SDUs will be lost. This minimizes the handover
complexity because no context (i.e. configuration information) has to be transferred between
the source and the target eNodeB.
2.1.2. Lossless Handover:
Lossless handover means that no data should be lost during handover. This is achieved by
performing retransmission of PDCP PDUs for which reception has not been acknowledged by
the UE before the UE detaches from the source cell to make a handover. In lossless handover,
in-sequence delivery during handover can be ensured by using PDCP Data PDUs sequence
numbers. Lossless handover is well suited to delay-tolerant services such as file
downloads, for which the loss of PDCP SDUs can drastically reduce the data rate because of
the reaction of TCP to packet loss.
Lossless handover is applied for user plane and for some control plane radio bearers that are
mapped on RLC-AM. In lossless handover, on the UE side the header compression protocol is
reset because its context is not forwarded from the source eNodeB to the target eNodeB, but
the PDCP SDUs' sequence numbers and the COUNT values are not reset [4]. To ensure
lossless handover in the uplink, the PDCP PDUs stored in the PDCP retransmission buffer are
retransmitted by the RLC protocol based on the PDCP SNs, which are maintained during the
handover, and are delivered to the gateway in the correct sequence.
In order to ensure lossless handover in the downlink, the source eNodeB
forwards the uncompressed PDCP SDUs for which reception has not yet been
acknowledged by the UE to the target eNodeB for retransmission in the
downlink.
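The differences between the two modes can be summarised in a small lookup. The sketch below is a simplification that keys only on the RLC mode of a user plane bearer (UM or AM) and ignores control plane bearers; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdcpHandoverBehavior:
    mode: str
    reset_header_compression: bool
    reset_count: bool
    forward_unacknowledged_sdus: bool


def pdcp_behavior(rlc_mode: str) -> PdcpHandoverBehavior:
    """Map the RLC mode of a user plane bearer to the handover behavior in the text."""
    mode = rlc_mode.strip().upper()
    if mode == "UM":
        # Seamless: contexts are reset, COUNT restarts from zero,
        # unacknowledged SDUs are lost.
        return PdcpHandoverBehavior("seamless", True, True, False)
    if mode == "AM":
        # Lossless: header compression is reset, but SNs and COUNT are kept
        # and unacknowledged SDUs are forwarded to the target eNodeB.
        return PdcpHandoverBehavior("lossless", True, False, True)
    raise ValueError(f"unsupported RLC mode: {rlc_mode!r}")


for rlc in ("UM", "AM"):
    print(rlc, pdcp_behavior(rlc))
```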
2.2. Types of Handover:
The handover is triggered by the eNodeB, based on the received measurement reports from
the UE. Handover is classified in different types based on the origination and destination of
the handover. The handover can start and end in the E-UTRAN, it can start in the E-UTRAN
and end in another Radio Access Technology (RAT), or it can start from another RAT and
end in E-UTRAN [15].
Handover is classified as:
 Intra-frequency intra-LTE handover.
 Inter-frequency intra-LTE handover.
 Inter-RAT towards LTE handover.
 Inter-RAT towards UTRAN handover.
 Inter-RAT towards GERAN handover.
 Inter-RAT towards cdma2000 system handover.
2.2.1. Intra LTE Handover: Horizontal Handover:
In intra-LTE handover, which is the focus of this project, both the origination and destination
eNodeB’s are within the LTE system. In this type of handover, the RRC connection
reconfiguration message acts as a handover command. The interface between eNodeB’s is an
X2 interface. Upon handover, the source eNodeB sends an X2 handover request message to
the target eNodeB in order to prepare it for the coming handover.
2.2.2. Vertical Handover:
The last decade has seen tremendous breakthroughs in the evolution of wireless
communication networks. The complex nature of the wireless environment makes it difficult
for a single network to provide users efficiently with the high data rates and good Quality of
Service (QoS) they require. To meet these demands, fourth generation (4G) wireless systems
combine heterogeneous wireless technologies so that users can be connected
anywhere and at all times. This heterogeneity involves the integration
of diverse radio access technologies (RAT) such as LTE/LTE-Advanced, UMTS, HSPA,
GPRS, GSM, WiMAX and WiFi. The purpose of integrating these independent networks is to
meet the demand for high data rates and good QoS in support of multimedia streaming.
Consequently, seamless handover, high QoS support, resource allocation,
mobility management and security must be appropriately addressed before these
requirements can be met. The handover mechanism is one of the strategies for achieving this
purpose; it can be defined as the process of reassigning resources as a result of the movement of
the mobile user equipment (UE), for example when it switches from one technology to another.
An intra-technology handover, which is based mainly on the received signal strength (RSS)
levels, is known as Horizontal Handover (HHO) and occurs when the UE switches access points
(APs) or eNodeBs while remaining in the same network. On the other hand, a UE switching its
connection to a network that uses a different access technology performs a Vertical
Handover (VHO). This has become possible because of the emergence of a multitude of
overlapping wireless networks, which makes the handover process more complex.
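The distinction between the two handover types reduces to comparing the access technology on each side. A minimal sketch, assuming that each network is identified by a plain RAT name string (a convention chosen only for this example):

```python
def classify_handover(source_rat: str, target_rat: str) -> str:
    """Classify a handover as horizontal (same RAT) or vertical (different RAT)."""
    same_technology = source_rat.strip().upper() == target_rat.strip().upper()
    return "horizontal" if same_technology else "vertical"


print(classify_handover("LTE", "LTE"))    # eNodeB to eNodeB
print(classify_handover("LTE", "UMTS"))   # change of access technology
```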
2.3. Handover Techniques:
Handover can be categorized as soft handover and hard handover, also known as
Make-Before-Break and Break-Before-Make respectively.
2.3.1. Soft handover, Make-Before-Break:
Soft handover is a category of handover procedures where the radio links are added and
abandoned in such manner that the UE always keeps at least one radio link to the UTRAN.
Soft and softer handover were introduced in WCDMA architecture. There is a centralized
controller called Radio Network Controller (RNC) to perform handover control for each UE
in the architecture of WCDMA. It is possible for a UE to simultaneously connect to two or
more cells (or cell sectors) during a call. If the cells the UE is connected to belong to the same
physical site, this is referred to as softer handover [10].
Soft handover is suitable for maintaining an active session and for preventing voice call
drops and packet session resets. However, soft handover requires much
more complicated signaling, procedures and system architecture, such as in the WCDMA
network.
2.3.2. Hard handover, Break-Before-Make:
Hard handover is a category of handover procedures where all the old radio links in the UE
are abandoned before the new radio links are established. The hard handover is commonly
used when dealing with handovers in the legacy wireless systems. The hard handover requires
a user to break the existing connection with the current cell (source cell) and make a new
connection to the target cell [10].
In LTE only hard handover is supported, meaning that there is a short interruption in service
when the handover is performed.
2.4. Handover Procedure:
Depending on whether any EPC entity is involved in preparing and executing of a handover
between a source eNodeB and a target eNodeB or not, an LTE handover can be either X2
handover using X2 interface or S1 handover using S1 interface.
Figure 9 shows how a source eNodeB decides on a handover type, X2 or S1, when a handover
is triggered.
Figure 9: Decision on Handover Type.
Handover procedure in LTE can be divided into three phases: handover preparation, handover
execution and handover completion [4]. The procedure starts with the measurement reporting
of a handover event by the User Equipment (UE) to the serving evolved Node B (eNodeB).
The Evolved Packet Core (EPC) is not involved in handover procedure for the control plane
handling, i.e. preparation messages are directly exchanged between the eNodeB’s [1]. This is
the case when the X2 interface is deployed; otherwise the MME is used for HO signaling.
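The choice between the two handover types can be sketched as follows. This is a simplified illustration that considers only whether an X2 interface exists between the source and target eNodeB; an actual eNodeB may take further conditions into account, and the function and parameter names are hypothetical.

```python
from typing import Set


def choose_handover_type(x2_neighbors: Set[str], target_enb_id: str) -> str:
    """Return "X2" if the source eNodeB has an X2 link to the target, else "S1".

    `x2_neighbors` holds the identifiers of the eNodeBs to which the source
    eNodeB has an established X2 interface (assumed representation).
    """
    return "X2" if target_enb_id in x2_neighbors else "S1"


neighbors = {"enb-2", "enb-3"}
print(choose_handover_type(neighbors, "enb-2"))  # direct X2 signaling
print(choose_handover_type(neighbors, "enb-9"))  # signaling goes via the MME
```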
The handover procedure with the basic handover scenario is depicted in Figure 10.
Figure 10: Intra-MME/Serving Gateway handover [9].
 Handover preparation:
During the handover preparation, data flows between UE and the core network as usual. This
phase includes messaging such as measurement control, which defines the UE measurement
parameters, followed by the measurement report, which is sent once the triggering criteria are
satisfied. The handover decision is then made at the serving eNodeB, which requests a handover
to the target cell. The target eNodeB performs admission control and then acknowledges the
handover request.
 Handover execution:
Handover execution phase is started when the source eNodeB sends a handover command to
UE. During this phase, data is forwarded from the source to the target eNodeB, which buffers
the packets. UE then needs to synchronize to the target cell and perform a random access to
the target cell to obtain UL allocation and timing advance as well as other necessary
parameters. Finally, the UE sends a handover confirm message to the target eNodeB after
which the target eNodeB can start sending the forwarded data to the UE [1].
 Handover completion:
In the final phase, the target eNodeB informs the MME that the user plane path has changed.
S-GW is then notified to update the user plane path. At this point, the data starts flowing on
the new path to the target eNodeB. Finally all radio and control plane resources are released in
the source eNodeB.
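The three phases and the order of their main steps can be captured in a short data structure. The step labels below paraphrase the description above and are not official message names.

```python
from enum import Enum
from typing import List, Tuple


class Phase(Enum):
    PREPARATION = 1
    EXECUTION = 2
    COMPLETION = 3


# Step labels paraphrase the text; they are not official message names.
HANDOVER_STEPS: List[Tuple[Phase, str]] = [
    (Phase.PREPARATION, "measurement control"),
    (Phase.PREPARATION, "measurement report"),
    (Phase.PREPARATION, "handover decision"),
    (Phase.PREPARATION, "handover request"),
    (Phase.PREPARATION, "admission control"),
    (Phase.PREPARATION, "handover request acknowledge"),
    (Phase.EXECUTION, "handover command"),
    (Phase.EXECUTION, "data forwarding"),
    (Phase.EXECUTION, "synchronization and random access"),
    (Phase.EXECUTION, "handover confirm"),
    (Phase.COMPLETION, "path change notification to MME"),
    (Phase.COMPLETION, "user plane path update at S-GW"),
    (Phase.COMPLETION, "resource release at source eNodeB"),
]


def steps_in(phase: Phase) -> List[str]:
    """Return the step labels that belong to the given phase, in order."""
    return [name for p, name in HANDOVER_STEPS if p is phase]


def phases_are_ordered(steps: List[Tuple[Phase, str]]) -> bool:
    """Check that no step of an earlier phase appears after a later phase."""
    values = [phase.value for phase, _ in steps]
    return values == sorted(values)


for phase in Phase:
    print(phase.name, "->", ", ".join(steps_in(phase)))
```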
A more detailed description of the intra-MME/Serving Gateway HO procedure is given
below:
1. Based on the area restriction information, the source eNodeB configures the UE
measurement procedure.
2. MEASUREMENT REPORT is sent by the UE after it is triggered based on some
rules.
3. The decision for handover is taken by the source eNodeB based on the
MEASUREMENT REPORT and RRM information.
4. HANDOVER REQUEST message is sent to the target eNodeB by the source eNodeB
containing all the necessary information to prepare the HO at the target side.
5. Admission control may be performed by the target eNodeB, depending on the received
E-RAB QoS information. Its purpose is to increase the likelihood of a
successful HO, in that the target eNodeB decides if the resources can be granted or
not. In case the resources can be granted, the target eNodeB configures the required
resources according to the received E-RAB QoS information and then reserves a Cell
Radio Network Temporary Identifier (C-RNTI) and a RACH preamble for the UE.
6. The target eNodeB prepares HO and then sends the HANDOVER REQUEST
ACKNOWLEDGE to the source eNodeB. There is a transparent container in the
HANDOVER REQUEST ACKNOWLEDGE message which is aimed to be sent to
the UE as an RRC message for performing the handover. The container includes a new
C-RNTI, target eNodeB security algorithm identifiers for the selected security
algorithms, may include a dedicated RACH preamble, and possibly some other
parameters like RNL/TNL information for the forwarding tunnels. If there is a need
for data forwarding, the source eNodeB can start forwarding the data to the target
eNodeB as soon as it sends the handover command towards the UE.
Steps 7 to 16 are designed to avoid data loss during HO:
7. To perform the handover the target eNodeB generates the RRC message, i.e. RRC
Connection Reconfiguration message including the mobility Control Information. This
message is sent towards the UE by the source eNodeB.
8. The SN STATUS TRANSFER message is sent by the source eNodeB to the target
eNodeB. In that message, the information about uplink PDCP SN receiver status and
the downlink PDCP SN transmitter status of E-RABs are provided. The PDCP SN of
the first missing UL SDU is included in the uplink PDCP SN receiver status. The next
PDCP SN that the target eNodeB shall assign to the new SDUs is indicated by the
downlink PDCP SN transmitter status.
At this point, data forwarding of user plane downlink packets can use either a
“seamless mode” minimizing the interruption time during the move of the UE, or a
“lossless mode” not tolerating packet loss at all. The source eNodeB may decide to
operate one of these two modes on a per EPS bearer basis, based on the QoS received
over X2 for this bearer.
9. After reception of the RRC Connection Reconfiguration message including the
mobility Control Information by the UE, the UE tries to perform synchronization to
the target eNodeB and to access the target cell via RACH. If a dedicated RACH
preamble was assigned for the UE, it can use a contention free procedure; otherwise it
shall use a contention based procedure. In the sense of security, the target eNodeB
specific keys are derived by the UE and the selected security algorithms are
configured to be used in the target cell.
10. The target eNodeB responds with the uplink allocation and timing advance.
11. After the UE has successfully accessed the target cell, it sends the RRC Connection
Reconfiguration Complete message to confirm the handover. The target eNodeB verifies
the C-RNTI sent in this message and can then begin sending data to the UE.
12. A PATH SWITCH message is sent to MME by the target eNodeB to inform that the
UE has changed cell.
13. UPDATE USER PLANE REQUEST message is sent by the MME to the Serving
Gateway.
14. The Serving Gateway switches the downlink data path to the target eNodeB and sends
one or more "end marker" packets on the old path to the source eNodeB to indicate that
no more packets will be transmitted on this path. The U-plane/TNL resources towards
the source eNodeB can then be released.
15. An UPDATE USER PLANE RESPONSE message is sent to the MME by the Serving
Gateway.
16. The MME sends the PATH SWITCH ACKNOWLEDGE message to confirm the
PATH SWITCH message.
17. After receiving the PATH SWITCH ACKNOWLEDGE message from the MME, the target
eNodeB sends UE CONTEXT RELEASE to the source eNodeB to inform it that the
handover succeeded.
18. After the source eNodeB receives the UE CONTEXT RELEASE message, it can
release the radio and C-plane related resources. If there is ongoing data forwarding it
can continue.
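The named messages in the steps above can be summarized as a small data structure. The following Python sketch is only an illustration of the sequence described in this section; the `Message` type and the helper function are hypothetical names introduced here, and steps that do not name a message are omitted.

```python
from typing import NamedTuple

class Message(NamedTuple):
    step: int       # step number in the list above
    sender: str
    receiver: str
    name: str

# Named messages of the intra-MME/Serving Gateway handover, in order.
HANDOVER_MESSAGES = [
    Message(2, "UE", "source eNodeB", "MEASUREMENT REPORT"),
    Message(4, "source eNodeB", "target eNodeB", "HANDOVER REQUEST"),
    Message(6, "target eNodeB", "source eNodeB", "HANDOVER REQUEST ACKNOWLEDGE"),
    Message(7, "source eNodeB", "UE", "RRC Connection Reconfiguration"),
    Message(8, "source eNodeB", "target eNodeB", "SN STATUS TRANSFER"),
    Message(11, "UE", "target eNodeB", "RRC Connection Reconfiguration Complete"),
    Message(12, "target eNodeB", "MME", "PATH SWITCH"),
    Message(13, "MME", "Serving Gateway", "UPDATE USER PLANE REQUEST"),
    Message(15, "Serving Gateway", "MME", "UPDATE USER PLANE RESPONSE"),
    Message(16, "MME", "target eNodeB", "PATH SWITCH ACKNOWLEDGE"),
    Message(17, "target eNodeB", "source eNodeB", "UE CONTEXT RELEASE"),
]

def messages_between(a, b):
    """Names of the messages exchanged between nodes a and b, in order."""
    return [m.name for m in HANDOVER_MESSAGES if {m.sender, m.receiver} == {a, b}]

between_enodebs = messages_between("source eNodeB", "target eNodeB")
```

Filtering by node pair makes it easy to see, for instance, that four of the named messages travel directly between the two eNodeBs.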
Figure 11: Handover Timing [8]
2.5. Handover Measurements:
The handover procedure in LTE-Advanced, which is a part of the RRM, is based on the UE’s
measurements. Handover decisions are usually based on the downlink channel measurements
which consist of Reference Signal Received Power (RSRP) and Reference Signal Received
Quality (RSRQ), made by the UE and reported to the eNodeB regularly [12]. Each is
described below:
 Reference Signal Received Power (RSRP):
The RSRP measurement provides cell-specific signal strength metric. This measurement is
used mainly to rank different LTE-Advanced candidate cells according to their signal strength
and is used as an input for handover and cell reselection decisions. RSRP is defined for a
specific cell as the linear average received power (in Watts) of the signals that carry cell-
specific Reference Signals (RS) within the considered measurement frequency bandwidth [4].
 Reference Signal Received Quality (RSRQ):
This measurement is intended to provide a cell-specific signal quality metric. Similarly to
RSRP, this metric is used mainly to rank different LTE candidate cells according to their
signal quality. This measurement is used as an input for handover and cell reselection
decisions, for example in scenarios for which RSRP measurements do not provide sufficient
information to perform reliable mobility decisions.
The RSRQ is defined as:
$$RSRQ = \frac{N \cdot RSRP}{RSSI} \tag{1}$$
Where N is the number of Resource Blocks (RBs) of the LTE-Advanced carrier RSSI
measurement bandwidth. The measurements in the numerator and denominator are made over
the same set of resource blocks. While RSRP is an indicator of the wanted signal strength,
RSRQ additionally takes the interference level into account due to the inclusion of RSSI.
RSRQ therefore enables the combined effect of signal strength and interference to be reported
in an efficient way [4].
Besides RSRP/RSRQ, handover decisions can use other criteria, such as:
 Signal-to-Noise Ratio (SNR):
The SNR is a measurement that compares the level of a desired signal to the level of
background noise (unwanted signal). It is defined as the ratio of signal power and the noise
power. A ratio higher than 1:1 indicates more signal than noise.
$$SNR = \frac{P_{signal}}{P_{noise}} \tag{2}$$
Where P is average power. Both signal and noise power must be measured at the same or
equivalent points in a system, and within the same system bandwidth [16].
 Carrier-to-Interference Ratio (CIR):
CIR, expressed in decibels (dB), measures signaling effectiveness; it is defined as the
ratio of the carrier power to the power of the interfering signal.
 Signal-to-Interference-plus-Noise Ratio (SINR):
This metric is used to optimize the transmit power level for a target quality of service
and to assist with handover decisions. Accurate SINR estimation yields a more efficient
system and a higher user-perceived quality of service.
SINR is defined as the ratio of signal power to the combined noise and interference power:
$$SINR = \frac{P_{signal}}{P_{noise} + P_{interference}} \tag{3}$$
Where P is the average power. Values are commonly quoted in dB.
 Received Signal Strength Indicator (RSSI):
The LTE carrier RSSI is defined as the total received wideband power observed by the UE
from all sources, including co-channel serving and non-serving cells, adjacent channel
interference and thermal noise within the measurement bandwidth specified by the 3GPP.
LTE-Advanced carrier RSSI is not reported as a measurement in its own right, but is used as
an input to the LTE-Advanced RSRQ measurement [4].
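As a minimal sketch of how these quantities relate, the following Python snippet evaluates Equations 1 and 3 from linear power values and converts the results to dB. The function names and the numerical inputs are illustrative assumptions, not measurements from this study.

```python
import math

def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

def rsrq(n_rb, rsrp_w, rssi_w):
    """Equation 1: RSRQ = N * RSRP / RSSI (linear ratio, powers in watts)."""
    return n_rb * rsrp_w / rssi_w

def sinr(p_signal_w, p_noise_w, p_interference_w):
    """Equation 3: SINR = P_signal / (P_noise + P_interference) (linear ratio)."""
    return p_signal_w / (p_noise_w + p_interference_w)

# Assumed example values, chosen only to give round results.
example_rsrq_db = to_db(rsrq(n_rb=50, rsrp_w=1e-12, rssi_w=5e-10))
example_sinr_db = to_db(sinr(p_signal_w=1e-9, p_noise_w=1e-11, p_interference_w=9e-11))
```

With these assumed inputs the RSRQ is about -10 dB and the SINR about 10 dB, which shows how the RSSI in the denominator brings interference into the RSRQ value.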
As mentioned earlier, handover measurements in LTE-Advanced are done at the downlink
reference symbols in the frame structure as shown in Figure 12. However, handover decision
can also be based on the uplink measurements. This study focuses on downlink handover
measurements.
Figure 12: Downlink reference signal structure for LTE-Advanced.
Fast fading is averaged over all the reference symbols at Layer 1, hence the name
L1 filtering (Figure 13). Because LTE uses scalable bandwidth, handover measurements
can be made over different bandwidths.
Figure 13: Handover measurement filtering and reporting [10].
2.6. Handover Parameters:
The handover procedure has different parameters which are used to enhance its performance
and setting these parameters to the optimal values is a very important task. In LTE the
triggering of handover is usually based on measurement of link quality and some other
parameters in order to improve the performance. The most important ones include [13]:
 Handover initiation threshold level RSRP and RSRQ:
This level is used for handover initiation. When the handover threshold decreases, the
probability of a late handover decreases and the Ping-Pong effect increases. It can be varied
according to different scenarios and propagation conditions to balance these trade-offs and
obtain better performance.
 Hysteresis margin:
The hysteresis margin, also called the HO margin, is the main parameter that governs the HO
algorithm between two eNodeBs. The handover is initiated if the link quality of another cell
is better than the current link quality by the hysteresis value. It is used to avoid ping-pong
effects. However, it can increase handover failures, since it can also prevent necessary handovers.
 Time-to-Trigger (TTT):
When applying Time-to-Trigger, the handover is initiated only if the triggering requirement is
fulfilled for a given time interval. This parameter can decrease the number of unnecessary
handovers and effectively avoid Ping-Pong effects. However, it can also delay the handover,
which in turn increases the probability of handover failures.
 The length and shape of averaging window:
The effect of the channel variation due to fading should be minimized in handover decision.
Averaging window can be used to filter it out. Both the length and the shape of the window
can affect the handover initiation. Long windows reduce the number of handovers but
increase the delay. The shape of the windows, e.g. rectangular or exponential shape, can also
affect the number of handovers and probability of unnecessary handovers.
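To illustrate the last point, the sketch below compares a rectangular window with an exponential one on a short series of RSRP samples. It is an illustration only: the function names and sample values are assumptions, and the averaging is applied directly to dBm values for simplicity.

```python
from collections import deque

def rectangular_average(samples, length):
    """Moving average over the last `length` samples (rectangular window)."""
    window = deque(maxlen=length)
    out = []
    for x in samples:
        window.append(x)
        out.append(sum(window) / len(window))
    return out

def exponential_average(samples, a):
    """Exponential window: F_n = (1 - a) * F_(n-1) + a * M_n, with 0 < a <= 1."""
    out = []
    filtered = None
    for x in samples:
        filtered = x if filtered is None else (1 - a) * filtered + a * x
        out.append(filtered)
    return out

rsrp_dbm = [-90, -96, -88, -95, -89, -97]   # assumed fading samples
short_window = rectangular_average(rsrp_dbm, 2)
long_window = rectangular_average(rsrp_dbm, 4)
smoothed = exponential_average(rsrp_dbm, 0.25)
```

The longer rectangular window and the small exponential coefficient both produce a smoother series, at the cost of reacting more slowly to a genuine change in signal level.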
The listed parameters will affect directly the handover initiations and hence they can be tuned
according to certain design goals. However there are other parameters like the measurement
report period which can also have an impact on the handover initiations.
Figure 14: Handover triggering procedure [11].
In summary, the starting point of the handover triggering procedure is the measurements
performed by the UE. These are done periodically, as defined by the measurement period
parameter configured at the eNodeB. When the serving cell RSRP falls below the measured
neighbor cell RSRP by at least the configured HO offset, usually 2-3 dB, a timer is started.
If this condition lasts for the Time to Trigger (TTT) value, a measurement report is sent to
the eNodeB, which initiates the handover by sending a handover command to the UE. If the
reporting conditions change and no longer satisfy the triggering conditions before the timer
reaches the TTT value, no measurement report is sent, and new measurement calculations
and timers are started [11].
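The triggering logic summarized above can be sketched as a small function. This is a simplified illustration under stated assumptions: the function name, the use of integer milliseconds, the sample values, and the choice to restart the timer after each report are all assumptions of this sketch, not details taken from the simulation.

```python
def report_times_ms(serving_dbm, neighbor_dbm, offset_db, ttt_ms, period_ms):
    """Return the sample times (ms) at which a measurement report is sent.

    The condition neighbor > serving + offset must hold continuously for
    at least ttt_ms; the timer is reset whenever the condition stops holding.
    """
    reports = []
    timer_start = None
    for i, (s, n) in enumerate(zip(serving_dbm, neighbor_dbm)):
        t = i * period_ms
        if n > s + offset_db:
            if timer_start is None:
                timer_start = t          # condition just became true
            if t - timer_start >= ttt_ms:
                reports.append(t)
                timer_start = None       # assumption: re-evaluate after reporting
        else:
            timer_start = None           # condition broken: discard the timer
    return reports

serving = [-90, -91, -92, -93, -94, -95, -96, -97]   # assumed RSRP samples (dBm)
neighbor = [-92, -88, -88, -91, -89, -90, -90, -90]
with_ttt = report_times_ms(serving, neighbor, offset_db=3, ttt_ms=80, period_ms=40)
without_ttt = report_times_ms(serving, neighbor, offset_db=3, ttt_ms=0, period_ms=40)
```

In this example the condition is first met at 80 ms but is lost again before the TTT expires, so with a TTT of 80 ms the only report is sent at 240 ms, whereas with a TTT of 0 a report is sent at every sample that satisfies the condition.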
2.7. Time To Trigger & Hysteresis:
In this project, two main parameters of the LTE handover process are studied: the
Time-to-Trigger (TTT) and the Hysteresis (hys). The hys defines how much better the
received signal strength (RSS) of a neighboring base station must be than that of the
serving base station for a handover to be considered. The values of hys are defined in
decibels (dB) and range from 0 to 10 dB in 0.5 dB increments, giving 21 different values.
The full range of hys values can be seen in Table 4.
Table 4: The different LTE hys values.
Index   hys (dB)      Index   hys (dB)
0       0.0           11      5.5
1       0.5           12      6.0
2       1.0           13      6.5
3       1.5           14      7.0
4       2.0           15      7.5
5       2.5           16      8.0
6       3.0           17      8.5
7       3.5           18      9.0
8       4.0           19      9.5
9       4.5           20      10.0
10      5.0
The TTT is a length of time, defined in seconds, that specifies how long a neighboring
base station must remain better than the serving base station before a handover is
considered. There are 16 different values of TTT, ranging from 0 to 5.12 seconds. Unlike
hys, the TTT values do not increase linearly; they grow roughly exponentially, with small
steps at the lower values and larger steps at the higher values. The full list of TTT values
can be seen in Table 5 and a graph of how the TTT values increase can be seen in Figure 11.
Table 5: The different LTE TTT values.
Index   TTT (s)       Index   TTT (s)
0       0.0           9       0.48
1       0.04          10      0.512
2       0.064         11      0.64
3       0.08          12      1.024
4       0.1           13      1.280
5       0.128         14      2.56
6       0.16          15      5.12
7       0.256
8       0.32
There are 21 × 16 = 336 different combinations of TTT and hys values. With such a large
range, one pair may require a neighboring eNodeB to be better by a large hys but only for a
short TTT, while another requires a small hys for a long TTT. Which pairs work best
therefore depends on the environment.
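The size of this search space can be checked with a few lines of Python. The list names below are illustrative; the values are those of Tables 4 and 5.

```python
from itertools import product

# hys values of Table 4: 0.0 to 10.0 dB in 0.5 dB steps
HYS_DB = [0.5 * i for i in range(21)]

# TTT values of Table 5, in seconds
TTT_S = [0.0, 0.04, 0.064, 0.08, 0.1, 0.128, 0.16, 0.256,
         0.32, 0.48, 0.512, 0.64, 1.024, 1.28, 2.56, 5.12]

# Every (TTT, hys) pair that can be configured
pairs = list(product(TTT_S, HYS_DB))
```

Here `len(pairs)` equals 336, one entry per TTT-hys combination.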
In LTE there are eight different triggers defined for initiating handovers. Table 6 shows
different trigger events and how they are defined [18].
Table 6: The different LTE trigger types and their criteria.
Event Type   Trigger Criteria
A1           Serving becomes better than a threshold.
A2           Serving becomes worse than a threshold.
A3           Neighbor becomes offset better than Primary Cell (PCell).
A4           Neighbor becomes better than threshold.
A5           PCell becomes worse than threshold1 and neighbor becomes better than threshold2.
A6           Neighbor becomes offset better than Secondary Cell (SCell).
B1           Inter RAT neighbor becomes better than threshold.
B2           PCell becomes worse than threshold1 and inter RAT neighbor becomes better than threshold2.
Of the eight triggers, the A3 event is the most common. It requires that a neighboring
eNodeB give the UE a better Reference Signal Received Power (RSRP) than the serving
eNodeB by an amount defined by the hys, for a length of time defined by the TTT [19].
The A3 condition can be represented by the following inequality:
$$RSRP_{neighboring} > RSRP_{serving} + Hys \tag{4}$$
When a handover event is triggered, a measurement report is sent from the UE to the Serving
eNodeB. The measurement report contains the information required for the Serving eNodeB
to decide whether to initiate a handover. The full, high-level procedure for an LTE handover
is as follows:
1. If a Neighboring eNodeB is found to be better than the Serving eNodeB a
measurement report is sent by the UE to the Serving eNodeB.
2. The Serving eNodeB considers the information in the measurement report and decides
whether or not a handover should take place.
3. If it is decided that a handover should take place then a message is sent to the
Neighboring eNodeB to prepare resources for the UE.
4. Once the resources are ready for the UE the new Serving eNodeB sends a message to
the old eNodeB to release the resources it previously had for the UE.
5. Finally a message is sent to the MME to finalize the handover process.
Conclusion:
The handover parameters need to be optimized for good performance. Handover offset and
TTT values that are too low result, in fading conditions, in back-and-forth ping-pong
handovers between cells. Values that are too high can cause call drops during handovers,
because radio conditions in the serving cell become too poor for transmission.
In the last chapter, we will explain our proposed solution to optimize the Handover
parameters.
Chapter III: Machine Learning and Handover Parameter
Optimization simulation
Introduction:
Optimizing handover is a major activity in network operations, with Hysteresis and
Time-to-Trigger as the main control parameters. For each HO, depending on the Hys-TTT
tuple (also called the trigger point), one of three outcomes occurs: a success, a Ping-Pong,
or a Radio Link failure.
In this chapter, we describe Q-Learning, present our proposed approach for Handover
optimization and finish with simulation results.
3.1. Q-Learning overview:
3.1.1. Machine Learning:
Machine learning is a form of Artificial Intelligence (AI) that involves designing and studying
systems and algorithms with the ability to learn from data. This field of AI has many
applications within research (such as system optimization), products (such as image
recognition) and advertising (such as adverts that use a user's browsing history). There are
many different paradigms that machine learning algorithms use. Algorithms can use training
sets to train an algorithm to give appropriate outputs; other algorithms look for patterns in
data; while others use the notion of rewards to find out if an action could be considered
correct or not [20]. Three of the most popular types of machine learning algorithms are:
 Supervised learning is where an algorithm is trained using a training set of data.
This set of data includes inputs and the known outputs for those inputs. The training
set is used to fine-tune the parameters in the algorithm. The purpose of this kind of
algorithm is to learn a general mapping between inputs and outputs so that the
algorithm can give an accurate result for an input with an unknown output. This
type of algorithm is generally used in classification systems.
 Unsupervised learning algorithms only know about the inputs they are given. The
goal of such an algorithm is to try and find patterns or structure within the input
data. Such an algorithm would be given inputs and any patterns that are contained
would become more and more visible the more inputs the algorithm is given.
 Reinforcement learning uses an intelligent agent to perform actions within an
environment. Any such action will yield a reward to the agent and the agent's goal
is to learn about how the environment reacts to any given action. The agent then
uses this knowledge to try and maximize its reward gains.
3.1.2. Reinforcement Learning:
In reinforcement learning an intelligent agent is learning what action to do at any given time
to maximize the notion of a reward. In the beginning the agent has no knowledge of what
action it should take from any state within the learning environment. It must instead learn
through trial and error, exploring all possible actions and finding the ones that perform the
best.
The trade-off between exploration and exploitation is one of the main features of
reinforcement learning and can greatly affect the performance of a chosen algorithm. A
reinforcement learning algorithm must weigh whether to exploit an action that resulted in a
large reward or to explore other actions with the possibility of receiving a greater reward.
Another main feature of reinforcement learning is that the problem in question is considered
as a whole. This differs from other types of machine learning algorithms, which do not
consider how the results of any sub-problems may affect the problem as a whole.
The basic elements required for reinforcement learning are as follows:
 A Model (M) of the environment that consists of a set of States (S) and Actions
(A).
 A reward function (R).
 A value function (V).
 A policy (P).
The model of the environment is used to mimic the behavior of the environment, such as
predicting the next state and reward from a state and taken action. Models are generally used
for planning by deciding what action to take while considering future rewards.
The reward function defines how good or bad an action is from a state. It is also used to
define the immediate reward the agent can expect to receive. Generally a mapping between a
state-action pair and a numerical value is used to define the reward that the agent would gain.
The reward values are used to define the policy where the best value of state-action pair is
used to define the action to take from a state.
While the reward function defines the immediate reward that can be gained from a state, the
value function defines how good a state will be long-term. This difference can create possible
conflicts of interest for an agent; so while its goal is to collect as much reward as possible, it
has to weigh up the options of picking a state that may provide a lot of up front reward but not
much future reward against a state with a lot of future reward but not much immediate reward.
The policy is a mapping between a state and the best action to be taken from that state at any
given time. Policies can be simple or complex; with a simple policy consisting of a lookup
table, while more complex policies can involve search processes. In general most policies
begin stochastic so that the agent can start to learn what actions are more optimal. [11]
3.1.3. Q-Learning:
Q-Learning is a type of reinforcement learning algorithm where an agent tries to discover an
optimal policy from its history of interactions within an environment. What makes
Q-Learning so powerful is that it will always learn the optimal policy (which action a to take
from a state s) for a problem regardless of the policy it follows, as long as there is no limit on
the number of times the agent can try an action. Because the policy it learns does not
depend on the policy it follows while exploring, Q-Learning is known as an Off-Policy
learner. The history of interactions of an agent can be shown as a sequence of
State-Action-Rewards:
< s0, a0, r1, s1, a1, r2, s2, a2... >
This can be read as: the agent was in State 0, did Action 0, received Reward 1 and
transitioned into State 1; then did Action 1, received Reward 2 and transitioned into State 2;
and so on.
The history of interactions can be treated as a sequence of experiences, with each experience
being a tuple:
< s, a, r, s′ >
The tuple means that the agent was in State s, did Action a, received Reward r and
transitioned into State s′. The experiences are what the agent uses to determine the optimal
action to take at a given time.
The basic process of a Q-Learning algorithm can be seen in Algorithm 3.1. The general
process requires that the learning agent is given a set of states, a set of actions, a discount
factor γ and step size α. The agent also keeps a table of Q-Values, denoted by Q(s,a) where s
is a state and a is an action from that state. A Q-Value is an average of all the experiences
the agent has had with a specific state-action pair. This allows good and bad experiences to
be averaged out, giving a reasonable estimate of the actual value of a state-action pair.
The process of averaging out experiences is done using Temporal Differences.
It could be said that the best way to estimate the next value in a list is to take the average of
all the previous values. Equation 5 shows this process.
$$A_k = \frac{v_1 + \cdots + v_k}{k} \tag{5}$$
Therefore
$$k A_k = v_1 + \cdots + v_k = (k - 1) A_{k-1} + v_k \tag{6}$$
Then dividing by k gives:
$$A_k = \left(1 - \frac{1}{k}\right) A_{k-1} + \frac{v_k}{k} \tag{7}$$
Then let α_k = 1/k:
$$A_k = (1 - \alpha_k) A_{k-1} + \alpha_k v_k = A_{k-1} + \alpha_k (v_k - A_{k-1}) \tag{8}$$
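The derivation can be checked numerically. The short Python sketch below applies the update of Equation 8 with a step size of 1/k and compares the result with the ordinary mean; the function name and the sample values are assumptions made for illustration.

```python
def incremental_mean(values):
    """Apply A_k = A_(k-1) + alpha_k * (v_k - A_(k-1)) with alpha_k = 1/k."""
    estimate = 0.0
    for k, v in enumerate(values, start=1):
        td_error = v - estimate      # difference between new value and old estimate
        estimate += td_error / k     # step size alpha_k = 1/k
    return estimate

samples = [4.0, 8.0, 6.0, 2.0]       # assumed sample values
running = incremental_mean(samples)
batch = sum(samples) / len(samples)
```

Both `running` and `batch` evaluate to 5.0, which confirms that the incremental form computes the same average without storing the earlier values.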
The difference v_k − A_{k−1} in Equation 8 is known as the Temporal Difference Error, or
TD error. It shows how far the old estimate A_{k−1} is from the new value v_k. The new
estimate, A_k, is then the old estimate, A_{k−1}, plus the TD error times α_k. The Q-Values
are therefore defined using temporal differences, and Equation 9 shows the formula used to
calculate them, where α is a variable between 0 and 1 that defines the step size of the
algorithm. If the step size were 0, the algorithm would ignore any rewards received; if the
step size were 1, the new experience would completely replace the previous estimate for that
state-action pair. The discount factor γ is also a variable between 0 and 1 and defines how
much less future rewards are worth compared to the current reward. If the discount factor
were 0, future rewards would not be considered at all. If the discount factor were 1, future
rewards would be worth as much as the current reward. The possible future reward,
max_{a′} Q(s′, a′), is the maximum of the Q-Values over all actions available from the new
state s′.
$$Q[s, a] = Q[s, a] + \alpha \left( r + \gamma \max_{a'} Q[s', a'] - Q[s, a] \right) \tag{9}$$
The table of Q-Values can either be initialized as empty or with some values pre-set to try and
lead the agent to a specific goal state. Once the agent has initialized these parameters it
observes the starting state. The starting state can either be chosen at random or be a
pre-determined start state for the problem. The agent will then choose an action. Actions are
chosen either stochastically or by a policy. Once an action has been chosen the agent will
carry out the action and receive a reward. This reward is used to update the table of Q-Values
using Equation 9. Finally the agent moves into the new state and repeats until termination,
which occurs either when the agent discovers a goal state or after a certain number of actions
have been taken.
Algorithm 3.1: Q-Learning
Require:
    S is a set of states
    A is a set of actions
    γ is the discount factor
    α is the learning rate (step size)
procedure Q-Learning(S, A, γ, α)
    real array Q[S, A]
    previous state s
    previous action a
    initialize Q[S, A] arbitrarily
    observe current state s
    repeat
        select and carry out an action a
        observe reward r and state s′
        Q[s, a] ← Q[s, a] + α (r + γ max_a′ Q[s′, a′] − Q[s, a])
        s ← s′
    until termination
end procedure
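The procedure above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this project: the toy five-state chain, the purely random action selection during learning, and all function names are assumptions introduced here.

```python
import random

def q_learning(states, actions, step, gamma, alpha, n_steps, start, seed=0):
    """Tabular Q-learning; step(s, a) returns (reward, next_state).

    Assumption: actions are chosen uniformly at random while learning,
    which is allowed because Q-learning is an off-policy method.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in states for a in actions}
    s = start
    for _ in range(n_steps):
        a = rng.choice(actions)
        r, s_next = step(s, a)
        best_next = max(q[(s_next, b)] for b in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s_next
    return q

def greedy_policy(q, states, actions):
    """Map each state to the action with the highest Q-value."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}

def chain_step(s, a):
    """Toy chain 0-1-2-3-4: arriving in state 4 pays 1, anything else pays 0."""
    s_next = min(4, max(0, s + a))
    return (1.0 if s_next == 4 else 0.0), s_next

states, actions = list(range(5)), [-1, 1]
q = q_learning(states, actions, chain_step, gamma=0.9, alpha=0.5,
               n_steps=5000, start=0)
policy = greedy_policy(q, states, actions)
```

Although the agent only ever acts at random here, the greedy policy extracted from the learned table moves right in every state of the toy chain, which illustrates the off-policy property described earlier.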
After a Q-Learning algorithm has finished exploring the model of the environment, it creates
a policy. The policy is generated by searching across all actions available from each state and
selecting the one that leads to the next state with the greatest value. The policy is therefore a
lookup table that maps each state to the best possible next state. The policy can then be used
to solve the problem that the Q-Learning agent was exploring [22].
3.2. Proposed Approach for HO optimization:
3.2.1. Set of states:
The approach taken for optimizing the handover parameters in LTE-Advanced uses a
Q-Learning algorithm based on the process given in Section 3.1. In this approach, the model
of the environment has a state for every combination of TTT and hys, giving a total of
21 × 16 = 336 states.
Table 7: Set of states.
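HYS index:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
HYS (dB):   0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10
Each row below gives the TTT index, the TTT (s), then the state number for each HYS index: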
0 0.0 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020
1 0.04 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041
2 0.064 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062
3 0.08 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083
4 0.1 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 100 101 102 103 104
5 0.128 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
6 0.16 126 127 128 129 130 131 132 133 134 135(6) 136(5) 137(8) 138 139 140 141 142 143 144 145 146
7 0.256 147 148 149 150 151 152 153 154 155 156(4) 157 158(1) 159 160 161 162 163 164 165 166 167
8 0.32 168 169 170 171 172 173 174 175 176 177(7) 178(2) 179(3) 180 181 182 183 184 185 186 187 188
9 0.48 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209
10 0.512 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230
11 0.64 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251
12 1.024 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272
13 1.280 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293
14 2.56 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314
15 5.12 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335
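The numbering used in Table 7 follows a simple row-major rule, which the following Python sketch makes explicit. The function names are illustrative assumptions; the constants come from Tables 4 and 5.

```python
N_HYS = 21   # hys indices 0..20 (Table 4)
N_TTT = 16   # TTT indices 0..15 (Table 5)

def state_id(ttt_idx, hys_idx):
    """State number for a (TTT index, hys index) pair, as laid out in Table 7."""
    return ttt_idx * N_HYS + hys_idx

def indices(state):
    """Inverse mapping: state number -> (TTT index, hys index)."""
    return divmod(state, N_HYS)
```

For example, `state_id(7, 10)` gives 157, the state with a TTT of 0.256 s and a hys of 5.0 dB, and `indices(157)` returns `(7, 10)`.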
3.2.2. Set of actions:
An action moves the agent to any other state that differs by one of the following changes to
the handover parameters (numbered as in Table 7 and Figure 15):
1. A single value increase of hys. (1)
2. A single value increase of TTT. (2)
3. A single value increase of both TTT and hys. (3)
4. A single value decrease of hys. (4)
5. A single value decrease of TTT. (5)
6. A single value decrease of both TTT and hys. (6)
7. A single value increase of TTT and a single value decrease of hys. (7)
8. A single value increase of hys and a single value decrease of TTT. (8)
For example, if the learning agent is in state 157, where the TTT equals 0.256 s and the hys
equals 5.0 dB, and performs action 3 from the list above (a single value increase of both
TTT and hys), then the new TTT equals 0.32 s and the new hys equals 5.5 dB: state 179.
The possible next states for state 157, with the corresponding action in parentheses, are:
{135(6), 136(5), 137(8), 156(4), 158(1), 179(3), 178(2), 177(7)}
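[Diagram: state S157 (TTT 0.256 s, hys 5.0 dB) surrounded by its eight neighboring states
S135, S136, S137, S156, S158, S177, S178 and S179, on a grid with HYS (4.5, 5.0, 5.5 dB)
on the horizontal axis and TTT (0.16, 0.256, 0.32 s) on the vertical axis; each arrow is
labeled with its action number, 1 to 8.]
Figure 15: State 157 possible actions.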
The full list of hys values can be seen in Table 4 and the full list of TTT values in Table 5.
Having the actions change the TTT and hys values by only one step at a time not only
allows for more refined optimization of the parameters, it also ensures that no large changes
can suddenly happen.
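The set of reachable states can be computed directly from the state numbering. The sketch below is an illustration under one stated assumption: moves that would leave the grid of valid TTT and hys indices are simply not available, a boundary rule that is not specified in the text. The names are hypothetical.

```python
N_TTT, N_HYS = 16, 21

# (change in TTT index, change in hys index) for the eight single-step moves
MOVES = [(dt, dh) for dt in (-1, 0, 1) for dh in (-1, 0, 1) if (dt, dh) != (0, 0)]

def next_states(state):
    """States reachable from `state` by one action, in ascending order."""
    ttt, hys = divmod(state, N_HYS)
    reachable = []
    for dt, dh in MOVES:
        t, h = ttt + dt, hys + dh
        # Assumption: a move that leaves the grid is not offered to the agent.
        if 0 <= t < N_TTT and 0 <= h < N_HYS:
            reachable.append(t * N_HYS + h)
    return sorted(reachable)

neighbors_157 = next_states(157)
```

For state 157 this yields the eight states 135, 136, 137, 156, 158, 177, 178 and 179, while a corner state such as 0 has only three neighbors under the assumed boundary rule.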
3.2.3. Reward:
Due to the nature of this kind of problem, the reward gained by an action is dynamic and is
likely to be different each time the action is taken. Rewards are based on the number of drops
and ping-pongs accumulated in the simulation for the current state in the environment model.
The rewards are defined by the following equation:
$$Reward = \frac{Handover_{successful}}{10 \cdot Drops + 2 \cdot PingPongs} \tag{10}$$
The coefficients in Equation 10 are 10 for drops and 2 for ping-pongs. Drops are extremely
bad for the QoS of a communication system, so they are given a large weight. Ping-pongs
are multiplied by 2 in order to cancel the successful handover that each Ping-Pong caused
and to give the agent a penalty. The reward is given to the agent and the Q-Value for that
state is updated just before the agent selects the next action to take. The agent then selects
new actions in discrete time steps, which allows the simulation to run for fixed periods of
time with the TTT-hys pair specified by a state in the environment model.
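Equation 10 translates into a very small function. In the sketch below, the function name and the handling of the case with no drops and no ping-pongs are assumptions; Equation 10 itself is undefined when the denominator is zero, and the text does not say how that case is treated.

```python
def reward(successful_handovers, drops, ping_pongs):
    """Equation 10: successful handovers divided by the weighted failure count."""
    penalty = 10 * drops + 2 * ping_pongs
    if penalty == 0:
        # Assumption of this sketch: with no drops or ping-pongs,
        # fall back to the raw number of successful handovers.
        return float(successful_handovers)
    return successful_handovers / penalty

example = reward(successful_handovers=40, drops=1, ping_pongs=5)
```

With 40 successful handovers, 1 drop and 5 ping-pongs, the penalty is 10 + 10 = 20 and the reward is 2.0.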
After the agent has been given enough time to try every action at least once the Q-Learning
agent generates a policy. This policy can then be used to attempt to optimize the handover
parameters by changing the TTT and hys. values after a call is dropped or the connection
ping-pongs between base stations. The Q-Learning agent still receives rewards every time a
call is dropped or the connection ping-pongs while following the generated policy. Doing this
allows for the system to always be learning; even after the initial learning process that
generated the policy.
3.3. Simulation & Performance evaluation:
The simulation is a very important part of the project. It is required to provide the basic
functionality of an LTE network. For simplicity the simulation was broken down into two
main components: the mobile (UE) and the base station (eNodeB). Because the project
revolves around the handover process in LTE, it made sense for these to be the two main
components: it is the mobile that triggers the measurement report and the base station that
decides whether a handover should take place. Each base station is also given its own
Q-Learning agent, since each base station is unique. Since the A3 event trigger (Table 6) is
the most common it was decided that it would
  • 6. Table of contents
General Introduction
Chapter I: LTE-Advanced Overview
    Introduction
    1.1. Requirements and Targets for LTE-Advanced
    1.2. LTE Enabling Technologies
        1.2.1. Downlink OFDMA (Orthogonal Frequency Division Multiple Access)
        1.2.2. Uplink SC-FDMA (Single Carrier Frequency Division Multiple Access)
        1.2.3. LTE-A Channel Bandwidths and resource elements
    1.3. LTE-Advanced Network Architecture
        1.3.1. The Core Network: Evolved Packet Core (EPC)
        1.3.2. The Access Network E-UTRAN
        1.3.3. The User Equipment (UE)
    1.4. E-UTRAN Network Interfaces
        1.4.1. X2 Interface
        1.4.2. S1 Interface
    1.5. LTE Protocol Architecture
        1.5.1. User Plane
        1.5.2. Control Plane
            1.5.2.1. Radio Resource Control (RRC)
            1.5.2.2. Radio Resource Control States
    1.6. Self-Organizing Networks
    Conclusion
Chapter II: Handover in LTE-Advanced
    Introduction
    2.1. Handover Definition and Characteristics
        2.1.1. Seamless Handover
        2.1.2. Lossless Handover
    2.2. Types of Handover
        2.2.1. Intra LTE Handover: Horizontal Handover
        2.2.2. Vertical Handover
    2.3. Handover Techniques
  • 7.
        2.3.1. Soft handover, Make-Before-Break
        2.3.2. Hard handover, Break-Before-Make
    2.4. Handover Procedure
    2.5. Handover Measurements
    2.6. Handover Parameters
    2.7. Time To Trigger & Hysteresis
    Conclusion
Chapter III: Machine Learning and Handover Parameter Optimization simulation
    Introduction
    3.1. Q-Learning overview
        3.1.1. Machine Learning
        3.1.2. Reinforcement Learning
        3.1.3. Q-Learning
    3.2. Proposed Approach for HO optimization
        3.2.1. Set of states
        3.2.2. Set of actions
        3.2.3. Reward
    3.3. Simulation & Performance evaluation
        3.3.1. Simulation parameters
        3.3.2. Simulation results
    Conclusion
General Conclusion
References
  • 8. List of Figures
Figure 1: Orthogonal Frequency Division Multiple Access
Figure 2: LTE SAE Evolved Packet Core
Figure 3: E-UTRAN Architecture
Figure 4: Functional Split between E-UTRAN and EPC
Figure 5: Protocol stack for the user-plane and control-plane at X2 interface
Figure 6: Protocol stack for the user-plane and control-plane at S1 interface
Figure 7: E-UTRAN Protocol Stack
Figure 8: The RRC States
Figure 9: Decision on Handover Type
Figure 10: Intra-MME/Serving Gateway Handover
Figure 11: Handover Timing
Figure 12: Downlink reference signal structure for LTE-Advanced
Figure 13: Handover measurement filtering and reporting
Figure 14: Handover triggering procedure
Figure 15: State 157 possible actions
Figure 16: Illustration of Coverage within the Simulation Area
Figure 17: Illustration of how the TTT values changed over time for large values when UE travelling at walking speeds
Figure 18: Comparison of TTT Optimization for Walking Speeds (Starting Point 5.12s)
Figure 19: Graph of Optimized vs. Non-Optimized Results for Starting Point TTT=0s hys.=0dB when UE traveling at walking speeds
Figure 20: Graph of Optimized vs. Non-Optimized Results for Starting Point TTT=0.256s hys.=5dB when UE traveling at walking speeds
  • 9. List of Tables
Table 1: LTE-Advanced development history
Table 2: Number of PRBs
Table 3: Operational benefits by SON
Table 4: Table of the different LTE hys. values
Table 5: Table of the different LTE TTT values
Table 6: Table of the different LTE Trigger types and their criteria
Table 7: Set of states
Table 8: Simulation parameters
  • 10. Abbreviations
3G          3rd Generation (Cellular Systems)
3GPP        Third Generation Partnership Project
4G          4th Generation (Cellular Systems)
AC          Admission Control
ACK         Acknowledgement (in ARQ protocols)
AI          Artificial Intelligence
AM          Acknowledged Mode
AGWA        Access Gateway
AS          Access Stratum
BS          Base Station
CDF         Cumulative Distribution Function
CDMA        Code Division Multiple Access
CQI         Channel Quality Indicator
CS          Circuit-Switched
dB          Decibel
DFT         Discrete Fourier Transform
DL          Downlink
DRB         Data Radio Bearer
eNodeB      Evolved Node B (3GPP Base Station)
EPC         Evolved Packet Core
E-UTRAN     Evolved Universal Terrestrial Radio Access Network
FDD         Frequency Division Duplex
GPRS        General Packet Radio Service
GSM         Global System for Mobile communications
HO          Handover
HOM         HO margin
HSDPA       High Speed Downlink Packet Access
HSS         Home Subscriber Server
HYS         Hysteresis
IMS         IP Multimedia Subsystem
IP          Internet Protocol
  • 11.
ITU           International Telecommunication Union
ITU-T         ITU Telecommunication Standardization Sector
LTE-Advanced  Long Term Evolution Advanced
MAC           Medium Access Control
MME           Mobility Management Entity
NACK          Negative Acknowledgement
NAS           Non-Access Stratum
NGMN          Next Generation Mobile Networks
OFDM          Orthogonal Frequency Division Multiplexing
OFDMA         Orthogonal Frequency Division Multiple Access
OTP           Optimum Trigger Point
PAPR          Peak-to-Average Power Ratio
PCRF          Policy and Charging Rules Function
PDCP          Packet Data Convergence Protocol
PDN           Packet Data Network
PDU           Protocol Data Unit
PGW           PDN Gateway
QoS           Quality of Service
RAN           Radio Access Network
RB            Resource Block
RF            Radio Frequency
RLC           Radio Link Control
RNC           Radio Network Controller
ROHC          RObust Header Compression
RRC           Radio Resource Control
RRM           Radio Resource Management
RSRP          Reference Signal Received Power
RSRQ          Reference Signal Received Quality
RSS           Received Signal Strength
RSSI          Received Signal Strength Indicator
SAE           System Architecture Evolution
S1            The interface between eNodeB and Access Gateway
S1AP          S1 Application Part
SC-FDMA       Single Carrier Frequency Division Multiple Access
  • 12.
SGW     Serving Gateway
SINR    Signal-to-Interference-plus-Noise Ratio
SIR     Signal-to-Interference Ratio
SN      Sequence Number
SON     Self-Organizing Network
SRB     Signaling Radio Bearer
TE      Terminal Equipment
TM      Transparent Mode
TTI     Transmission Time Interval
TTT     Time-to-Trigger
UE      User Equipment, the 3GPP name for the mobile terminal
UL      Uplink
UM      Unacknowledged Mode
UMTS    Universal Mobile Telecommunications System
USIM    Universal Subscriber Identity Module
VoIP    Voice over IP
X2      Interface between eNodeBs
  • 13. General Introduction
In recent years, mobile telecommunications traffic has grown enormously, in line with the rapid spread of smartphones. Cellular networks are evolving to meet future requirements for data rate, coverage and capacity. LTE-Advanced is a mobile communication standard and a major enhancement of the Long Term Evolution (LTE) standard. It was formally submitted as a candidate 4G system to ITU-T in late 2009 as meeting the requirements of the IMT-Advanced standard, and was standardized by the 3rd Generation Partnership Project (3GPP) in March 2011 as 3GPP Release 10.
One of the important benefits of LTE-Advanced is the ability to take advantage of advanced network topologies: optimized heterogeneous networks that mix macrocells with low-power nodes such as picocells, femtocells and new relay nodes. The next significant performance leap in wireless networks will come from making the most of topology, bringing the network closer to the user by adding many of these low-power nodes. LTE-Advanced further improves capacity and coverage and ensures user fairness. It also introduces multicarrier operation in order to use ultra-wide bandwidth, up to 100 MHz of spectrum, supporting very high data rates.
Mobility enhancement is an important aspect of Long Term Evolution, since the system should support mobility at speeds up to 350 km/h or even 500 km/h. At higher speeds, handovers become more frequent and must complete faster; handover performance therefore becomes more crucial, especially for real-time services [11].
One of the main goals of LTE-Advanced, or of any wireless system for that matter, is to provide fast and seamless handover from one cell (a source cell) to another (a target cell). The service should be maintained during the handover procedure: data transfer should be neither delayed nor lost, otherwise performance will be dramatically degraded. This is especially applicable to LTE-Advanced systems because of the distributed nature of the LTE radio access network architecture, which consists of just one type of node, the base station, known in LTE-Advanced as the eNodeB [7].
LTE-Advanced also defines conditions for triggering the handover procedure, as well as goals for handover design and optimization, such as decreasing the total number of handovers in the whole system by predicting the handover, decreasing the number of ping-pong handovers, and achieving fast and seamless handover.
  • 14. Hence, optimizing the handover procedure to achieve the required performance is an important issue in LTE-Advanced networks [11]. Many studies have sought to improve LTE-Advanced handover with different HO algorithms, which proceed in several stages and address different cases, but all of them aim at handover mechanisms that handle handover smoothly at the cell boundaries of the LTE-Advanced network.
The main objective of this project is to develop a Q-learning algorithm to self-optimize the parameters used in the handover process of 4G networks.
This report has three chapters. The first chapter gives an overview of LTE technology: it describes the main characteristics and functionalities of the system as well as the enabling technologies, network architecture and protocols. The second chapter introduces the general concepts of handover and describes the whole HO procedure; it also explains the optimization and design principles, the variables used as inputs and the different HO parameters. The third chapter discusses our proposed approach. First, we present machine learning, and within it reinforcement learning and Q-learning. Then we discuss the handover parameter optimization. Finally, we present the simulation parameters and the results obtained.
  • 15. Chapter I: LTE-Advanced Overview
Introduction:
In LTE-Advanced networks, the focus is on higher capacity. The driving force behind developing LTE further towards LTE-Advanced (LTE Release 10) was to provide higher bit rates in a cost-efficient way and, at the same time, to completely fulfill the requirements set by ITU for IMT-Advanced, also referred to as 4G.
In this chapter, we present the LTE-Advanced technologies, resource elements and network architecture, citing the different key components.
1.1. Requirements and Targets for LTE-Advanced:
3GPP completed the process of defining LTE-Advanced for radio access so that the technology remains competitive in the future. 3GPP has identified a set of high-level requirements that have already been exceeded. The following target requirements were agreed among operators and vendors in the project to define the evolution of 3G networks.
Table 1: LTE-Advanced development history.
                                  | WCDMA (UMTS) | HSPA (HSDPA/HSUPA)         | HSPA+     | LTE           | LTE-A
Max downlink speed (bps)          | 384 K        | 14 M                       | 28 M      | 100 M         | 1 G
Max uplink speed (bps)            | 128 K        | 5.7 M                      | 11 M      | 50 M          | 100 M
Latency round trip time (approx.) | 150 ms       | 100 ms                     | 50 ms max | ~10 ms        | Less than 5 ms
3GPP releases                     | Rel 99/5     | Rel 5/6                    | Rel 7     | Rel 8/9       | Rel 10
Approx. years of initial roll out | 2003/4       | 2005/6 HSDPA, 2007/8 HSUPA | 2008/9    | 2009/10       | 2011
Access methodology                | CDMA         | CDMA                       | CDMA      | OFDMA/SC-FDMA | OFDMA/SC-FDMA
Some of the key LTE-Advanced requirements related to data rate, throughput, latency, and mobility are provided below [3]:
  • 16.
- Peak data rate:
    o 1 Gbps data rate will be achieved by 4-by-4 MIMO and transmission bandwidth wider than approximately 70 MHz.
- Peak spectrum efficiency:
    o DL: Rel. 8 LTE satisfies IMT-Advanced requirement: 30 bps/Hz.
    o UL: Need to double from Release 8 to satisfy IMT-Advanced requirement: 15 bps/Hz and 30 bps/Hz in Rel 10.
- Capacity and cell-edge user throughput:
    o Target for LTE-Advanced was set considering gain of 1.4 to 1.6 from Release 8 LTE performance.
- Spectrum flexibility: In addition to the bands currently defined for LTE Release 8, TR 36.913 identifies the following new bands:
    o 450–470 MHz band
    o 698–862 MHz band
    o 790–862 MHz band
    o 2.3–2.4 GHz band
    o 3.4–4.2 GHz band
    o 4.4–4.99 GHz band
  Some of these bands are now formally included in the 3GPP Release 9 and Release 10 specifications. Note that frequency bands are considered release-independent features, which means that it is acceptable to deploy an earlier release product in a band not defined until a later release.
  LTE-Advanced is designed to operate in spectrum allocations of different sizes, including allocations wider than the 20 MHz in Release 8, in order to achieve higher performance and target data rates. Although it is desirable to have bandwidths greater than 20 MHz deployed in adjacent spectrum, the limited availability of spectrum means that aggregation from different bands is necessary to meet the higher bandwidth requirements. This option has been allowed for in the IMT-Advanced specifications.
- Mobility:
    o E-UTRAN should be optimized for low mobile speed from 0 to 15 km/h.
    o Higher mobile speed between 15 and 120 km/h should be supported with high performance.
  • 17.
    o Mobility across the cellular network shall be maintained at speeds from 120 km/h to 350 km/h (or even up to 500 km/h depending on the frequency band).
- Coverage:
    o Throughput, spectrum efficiency and mobility targets above should be met for 5 km cells, and with a slight degradation for 30 km cells. Cell ranges up to 100 km should not be precluded.
    o Available for paired and unpaired spectrum arrangements.
1.2. LTE Enabling Technologies:
LTE has introduced a number of new technologies compared to previous cellular systems. They enable LTE-Advanced to use spectrum more efficiently and to provide the much higher data rates that are now required. A major difference between LTE-Advanced and its 3GPP ancestors is the radio interface: Orthogonal Frequency Division Multiple Access (OFDMA) and Single Carrier Frequency Division Multiple Access (SC-FDMA) are used as radio access schemes for the downlink and uplink respectively [6].
1.2.1. Downlink OFDMA (Orthogonal Frequency Division Multiple Access):
OFDMA is a variant of OFDM (Orthogonal Frequency Division Multiplexing) and is the downlink access technology. One of its most important advantages is the intrinsic orthogonality it provides to the users within a cell, which translates into an almost null level of intra-cell interference. Inter-cell interference is therefore the limiting factor when high reuse levels are intended, and cell-edge users are especially susceptible to its effects.
OFDMA divides the wide available bandwidth into many narrow, mutually orthogonal subcarriers and transmits the data in parallel streams. The smallest transmission unit in the downlink LTE-Advanced system is known as a Physical Resource Block (PRB).
  • 18. Figure 1: Orthogonal Frequency Division Multiple Access [5].
A resource block contains 12 subcarriers, regardless of the overall LTE-Advanced signal bandwidth, and covers one slot in the time frame. This means that different LTE-Advanced signal bandwidths have different numbers of resource blocks.
Table 2: Number of PRBs.
Channel Bandwidth (MHz) | 1.4 | 3  | 5  | 10 | 15 | 20
Number of PRBs          | 6   | 15 | 25 | 50 | 75 | 100
The OFDM signal used in LTE-Advanced comprises a maximum of 2048 different subcarriers with a spacing of 15 kHz. Although mobiles must be able to receive all 2048 subcarriers, the base station (eNodeB) does not need to transmit all of them; it only needs to support the transmission of 72 subcarriers. In this way all mobiles will be able to talk to any base station.
1.2.2. Uplink SC-FDMA (Single Carrier Frequency Division Multiple Access):
For the LTE-Advanced uplink, a different concept is used for the access technique. Although it still uses a form of OFDMA technology, the implementation is called Single Carrier Frequency Division Multiple Access (SC-FDMA). The main task of this scheme is to assign communication resources to multiple users. The major difference to other schemes is that it
  • 19. performs a DFT (Discrete Fourier Transform) operation on time-domain modulated data before OFDM modulation.
One of the key parameters that affects all mobiles is battery life. Even though battery performance is improving all the time, it is still necessary to ensure that mobiles use as little battery power as possible. Since the RF power amplifier, which transmits the radio frequency signal via the antenna to the base station, is the highest-power item within the mobile, it must operate as efficiently as possible. Its efficiency is significantly affected by the form of radio frequency modulation and the signal format. Signals that have a high peak-to-average ratio and require linear amplification do not lend themselves to the use of efficient RF power amplifiers [5].
1.2.3. LTE-A Channel Bandwidths and resource elements:
One of the key parameters associated with the use of OFDM within LTE-Advanced is the choice of bandwidth. The available bandwidth influences a variety of decisions, including the number of carriers that can be accommodated in the OFDM signal, which in turn influences elements such as the symbol length [6].
LTE supports six channel bandwidths: 1.4 MHz, 3 MHz, 5 MHz, 10 MHz, 15 MHz and 20 MHz. A higher bandwidth yields greater channel capacity.
In addition, the subcarriers are spaced 15 kHz apart. To maintain orthogonality, this gives a symbol duration of 1 / 15 kHz = 66.7 µs.
Each subcarrier is able to carry data at a maximum rate of 15 ksps (kilo symbols per second). A 20 MHz system has 100 PRBs of 12 subcarriers each, that is 1200 subcarriers, which gives a raw symbol rate of 1200 × 15 ksps = 18 Msps. In turn this provides a raw data rate of 108 Mbps, as each symbol using 64QAM represents six bits.
1.3. LTE-Advanced Network Architecture:
LTE-A has been designed to support only packet-switched services, in contrast to the circuit-switched model of previous cellular systems. It aims to provide seamless Internet Protocol (IP) connectivity between the User Equipment (UE) and the Packet Data Network (PDN), without any disruption to the end users' applications during mobility [2].
While the term "LTE" encompasses the evolution of the Universal Mobile Telecommunications System (UMTS) radio access through the Evolved UTRAN (E-UTRAN), it is accompanied by an evolution of the non-radio aspects under the term "System Architecture Evolution" (SAE).
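The raw-rate arithmetic of Section 1.2.3 can be reproduced with a short sketch. It assumes only what the text states: the PRB counts of Table 2, 12 subcarriers per PRB and a 15 kHz subcarrier spacing. The function name `raw_rates` and the constant names are illustrative and do not come from any library. The result ignores cyclic prefix, reference signals and control overhead, so it is an upper bound rather than an achievable throughput.

```python
# Sketch only: reproduces the raw symbol-rate arithmetic of Section 1.2.3.
# PRB counts are taken from Table 2; all names here are illustrative.
SUBCARRIERS_PER_PRB = 12
SUBCARRIER_SPACING_HZ = 15_000  # also the per-subcarrier symbol rate (15 ksps)
PRBS_BY_BANDWIDTH_MHZ = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def raw_rates(bandwidth_mhz, bits_per_symbol=6):
    """Return (subcarriers, raw symbol rate in sps, raw bit rate in bps).

    bits_per_symbol=6 corresponds to 64QAM. All overheads are ignored.
    """
    subcarriers = PRBS_BY_BANDWIDTH_MHZ[bandwidth_mhz] * SUBCARRIERS_PER_PRB
    symbol_rate = subcarriers * SUBCARRIER_SPACING_HZ
    return subcarriers, symbol_rate, symbol_rate * bits_per_symbol


if __name__ == "__main__":
    for bw in PRBS_BY_BANDWIDTH_MHZ:
        n, sps, bps = raw_rates(bw)
        print(f"{bw:>4} MHz: {n:4d} subcarriers, "
              f"{sps / 1e6:5.2f} Msps, {bps / 1e6:6.2f} Mbps")
```

For 20 MHz this yields 1200 subcarriers, 18 Msps and 108 Mbps, matching the figures in the text; for 1.4 MHz it yields the 72 subcarriers mentioned in Section 1.2.1.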
  • 20. Together, LTE-Advanced and SAE comprise the Evolved Packet System (EPS). The EPS in turn includes the Evolved Packet Core (EPC) on the core side and the E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) on the access side [2]. In addition to these two components, the User Equipment (UE) and the Services Domain are also very important subsystems of the LTE architecture.
1.3.1. The Core Network: Evolved Packet Core (EPC):
The core network is responsible for the overall control of the UE and the establishment of the bearers. The Evolved Packet Core is the main element of the LTE-Advanced SAE network. It consists of four main elements and connects to the eNodeBs as shown in the diagram below.
Figure 2: LTE-Advanced SAE Evolved Packet Core [6].
- Mobility Management Entity (MME):
The MME is the main control node for the LTE SAE access network. It handles a number of features and therefore provides a considerable level of overall control functionality. The protocols running between the UE and the core network (CN) are known as the Non-Access Stratum (NAS) protocols. The main functions supported by the MME can be classified as:
    o Functions related to bearer management: this includes the establishment, maintenance and release of the bearers and is handled by the session management layer in the NAS protocol.
  • 21.
    o Functions related to connection management: this includes the establishment of the connection and security between the network and the UE and is handled by the connection or mobility management layer in the NAS protocol layer.
- Serving Gateway (SGW):
The Serving Gateway (SGW) is a data plane element within the LTE SAE. Its main purpose is to manage user plane mobility, and it also acts as the main border between the Radio Access Network (RAN) and the core network. The SGW also maintains the data paths between the eNodeBs and the PDN Gateways. In this way the SGW forms an interface for the data packet network at the E-UTRAN.
- PDN Gateway (PGW):
The LTE SAE PDN (Packet Data Network) gateway provides connectivity for the UE to external packet data networks, fulfilling the function of entry and exit point for UE data. The UE may have connectivity with more than one PGW for accessing multiple PDNs.
- Home Subscriber Server (HSS):
The HSS is a database server located on the operator's premises. All the user subscription information is stored in the HSS. The HSS also contains the records of the user location and holds the original copy of the user subscription profile. The HSS interacts with the MME and needs to be connected to all the MMEs in the network that control the UE.
1.3.2. The Access Network E-UTRAN:
The E-UTRAN is the access network of LTE and simply consists of a network of eNodeBs that are connected to each other via the X2 interface, as illustrated in Figure 3. The eNodeBs are also connected to the EPC via the S1 interface, more specifically to the MME by means of the S1-MME interface and to the S-GW by means of the S1-U interface.
Figure 3: E-UTRAN Architecture [9].
  • 22.
- eNodeB:
The eNodeB is the radio base station of an LTE network and controls all radio-related functions in the fixed part of the system. These radio base stations are distributed throughout the coverage region, and each of them is placed near a radio antenna. One of the biggest differences between an LTE network and the legacy 3G mobile communication system is the base station.
In practice, an eNodeB bridges the UE and the EPC. All the radio protocols used in the access link are terminated in the eNodeB. The eNodeB performs ciphering/deciphering in the user plane as well as IP header compression/decompression. It also has responsibilities in the control plane, such as radio resource management and control over the usage of radio resources.
The E-UTRAN is responsible for all radio-related functions. The main features it supports are the following:
- Radio Resource Management:
The objective of RRM is to make mobility feasible in cellular wireless networks, so that the network, with the help of the UE, takes care of mobility without user intervention. RRM covers all functions related to the radio bearers, such as radio bearer control, radio admission control, radio mobility control, scheduling and dynamic allocation of resources to UEs in both uplink and downlink.
- IP Header Compression:
This helps to ensure efficient use of the radio interface by compressing the IP packet headers, which could otherwise represent a significant overhead, especially for small packets such as VoIP. One of the main functions of PDCP (Packet Data Convergence Protocol) is header compression using the Robust Header Compression (ROHC) protocol defined by the IETF. In LTE, header compression is very important because there is no support for the transport of voice services via the Circuit-Switched (CS) domain.
- Security:
Security is a very important feature of all 3GPP radio access technologies. LTE provides security in a similar way to its predecessors UMTS and GSM. Because the signaling messages exchanged between the eNodeB and the terminal, or between the MME and the terminal, are sensitive, all of this information is protected against eavesdropping and alteration.
  • 23. The LTE security architecture is implemented by two functions: ciphering of both control plane (RRC) data and user plane data, and integrity protection, which is used for control plane (RRC) data only. Ciphering protects the data streams from being received by a third party, while integrity protection allows the receiver to detect packet insertion or replacement. RRC always activates both functions together, either following connection establishment or as part of the handover to LTE.
- Connectivity to the EPC:
This function consists of the signaling towards the MME and the bearer path towards the S-GW. All of the above-mentioned functions are concentrated in the eNodeB, since in LTE all the radio controller functions are gathered there. This concentration helps the different protocol layers interact with each other better, which decreases latency and increases efficiency.
On the network side, all of these functions reside in the eNodeBs, each of which can be responsible for managing multiple cells. Unlike some of the previous second- and third-generation technologies, LTE integrates the radio controller function into the eNodeB. This allows tight interaction between the different protocol layers of the radio access network (RAN), thus reducing latency and improving efficiency.
Furthermore, as LTE does not support soft handover, there is no need for a centralized data-combining function in the network. One consequence of the lack of a centralized controller node is that, as the UE moves, the network must transfer all information related to a UE (the UE context), together with any buffered data, from one eNodeB to another. Mechanisms are therefore needed to avoid data loss during handover.
Figure 4: Functional Split between E-UTRAN and EPC [5].
  • 24. 1.3.3. The User Equipment (UE):
The end user communicates using a UE. The UE can be a handheld device such as a smartphone, or a device embedded in a laptop. The UE is divided into two parts: the Universal Subscriber Identity Module (USIM) and the rest of the UE, which is called the Terminal Equipment (TE).
The USIM is an application whose purpose is to identify and authenticate the user in order to obtain security keys. This application is placed on a removable smart card called a universal integrated circuit card (UICC). The UE in general is the end-user platform that, by signaling with the network, sets up, maintains and removes the necessary communication links. The UE also assists in the handover procedure and sends reports about the terminal location to the network.
1.4. E-UTRAN Network Interfaces:
Two interfaces are involved in the handover procedure in LTE for UEs in active mode: the X2 and S1 interfaces. Both can be used in handover procedures, but for different purposes.
1.4.1. X2 Interface:
The X2 interface has a key role in the intra-LTE handover operation. The source eNodeB uses the X2 interface to send the Handover Request message to the target eNodeB. If the X2 interface does not exist between the two eNodeBs in question, then procedures need to be initiated to set one up before handover can be achieved [3].
The Handover Request message prompts the target eNodeB to reserve resources, and the target eNodeB replies with the Handover Request Acknowledgement message, assuming resources are found.
Different information elements (some optional) are provided in the Handover Request message, such as:
- Requested SAE bearers to be handed over.
- Handover restrictions list, which may restrict following handovers for the UE.
- Last visited cells the UE has been connected to, if the UE historical information collection functionality is enabled. This is considered useful for avoiding the Ping-Pong effect between different cells, because the target eNodeB is given information on how the serving eNodeB has been changing in the past. Thus actions can be taken to limit frequent cell changes.

Figure 5: Protocol stack for the user-plane and control-plane at X2 interface [3].

1.4.2. S1 Interface:

The radio network signaling over S1 consists of the S1 Application Part (S1AP). The S1AP protocol handles all procedures between the EPC and E-UTRAN. It is also capable of carrying messages transparently between the EPC and the UE. Over the S1 interface, the S1AP protocol primarily supports general E-UTRAN procedures from the EPC, transfers transparent non-access signaling and performs the mobility function. The figure below shows the protocol stack for the user-plane and control-plane at the S1 interface [3].

Figure 6: Protocol stack for the user-plane and control-plane at S1 interface [3].
1.5. LTE Protocol Architecture:

The overall radio interface protocol architecture for LTE can be divided into User Plane protocols and Control Plane protocols. The E-UTRAN protocol stack is depicted in Figure 7.

Figure 7: E-UTRAN Protocol Stack [8].

1.5.1. User Plane:

An IP packet is tunneled between the P-GW and the eNodeB to be transmitted towards the UE. Different tunneling protocols can be used. The tunneling protocol used by 3GPP is called the GPRS Tunneling Protocol (GTP) [8]. The LTE Layer 2 user-plane protocol stack is composed of three sublayers: Packet Data Convergence Protocol (PDCP), Radio Link Control (RLC) and Medium Access Control (MAC). These sublayers are terminated in the eNodeB on the network side.

1.5.2. Control Plane:

The control plane and the user plane have common protocols which perform the same functions, except that there is no header compression for the control plane protocols. In the access stratum protocol stack, above the PDCP, there is the Radio Resource Control (RRC) protocol, which is considered a "Layer 3" protocol. RRC sends signaling messages between
the eNodeB and UE for establishing and configuring the radio bearers of all lower layers in the access stratum.

1.5.2.1. Radio Resource Control (RRC):

The RRC (Radio Resource Control) layer is a key signaling protocol which supports many functions between the terminal and the eNodeB. The RRC protocol enables the transfer of common NAS information, which is applicable to all UEs, as well as dedicated NAS information, which is applicable only to a specific UE. In addition, for UEs in RRC-IDLE, RRC supports notification of incoming calls. The key features of RRC are the following:

- Broadcast of System Information: Handles the broadcasting of system information, which includes NAS common information. Some of the system information is applicable only to UEs in RRC-IDLE, while other system information is also applicable to UEs in RRC-CONNECTED.
- RRC Connection Management: Covers all procedures related to the establishment, modification and release of an RRC connection, including paging, initial security activation, establishment of Signaling Radio Bearers (SRBs) and of radio bearers carrying user data (Data Radio Bearers, DRBs), handover within LTE (including transfer of UE RRC context information), configuration of the lower protocol layers, access class barring and radio link failure.
- Establishment and release of radio resources: This relates to the allocation of resources for the transport of signaling messages or user data between the terminal and the eNodeB.
- Paging: This is performed through the PCCH logical control channel. The main use of paging is to page UEs that are in RRC-IDLE. Paging can also be used to notify UEs in both RRC-IDLE and RRC-CONNECTED modes about system information changes or SIB10 and SIB11 transfers.
- Transmission of signaling messages to and from the EPC: These messages (known as NAS, for Non-Access Stratum) are transferred to and from the terminal via the RRC; they are, however, treated by RRC as transparent messages.
- Handover: The handover is triggered by the eNodeB, based on the measurement reports received from the UE. Handover is classified into different types based on its origin and destination. The handover can start and end in the E-UTRAN, it can start in the E-UTRAN and end in another Radio Access Technology (RAT), or it can start from another RAT and end in E-UTRAN.

The RRC also supports a set of functions related to end-user mobility for terminals in the RRC-CONNECTED state. This includes:

- Measurement control: This refers to the configuration of measurements to be performed by the terminal, as well as the method used to report them to the eNodeB.
- Support of inter-cell mobility procedures, which are also known as handover.
- User context transfer between eNodeBs at handover.

1.5.2.2. Radio Resource Control States:

The main function of the RRC protocol is to manage the connection between the terminal and the E-UTRAN access network. To achieve this, RRC protocol states have been defined; they are depicted in the figure below. Each of them corresponds to a state of the connection and describes how the network and the terminal shall handle special functions such as terminal mobility, paging message processing and network system information broadcasting [14]. In E-UTRAN, the RRC state machine is very simple and limited to two states only: RRC-IDLE and RRC-CONNECTED.

Figure 8: The RRC States [14]

In the RRC-IDLE state, there is no connection between the terminal and the eNodeB, meaning that the terminal is not known by the E-UTRAN access network. The terminal user is inactive from an application-level perspective, which does not mean that nothing happens at the radio interface level. Nevertheless, the terminal behavior is specified so as to save as much battery power as possible and is limited to three main items:

- Periodic decoding of System Information broadcast by E-UTRAN: this process is required in case the information is dynamically updated by the network.
- Decoding of paging messages: so that the terminal can further connect to the network in case of an incoming session.
- Cell reselection: the terminal periodically evaluates the best cell it should camp on through its own radio measurements and based on network System Information parameters. When the condition is reached, the terminal autonomously performs a selection of a new serving cell.

In the RRC-CONNECTED state, there is an active connection between the terminal and the eNodeB, which implies that a communication context is stored within the eNodeB for this terminal. Both sides can exchange user data and/or signaling messages over logical channels. Unlike in the RRC-IDLE state, the terminal location is known at the cell level. Terminal mobility is under the control of the network using the handover procedure, whose decision is based on many possible criteria, including measurements reported by the terminal or by the physical layer of the eNodeB itself.

1.6. Self-Organizing Networks:

A Self-Organizing Network (SON) is an automation technology designed to make the planning, configuration, management, optimization and healing of mobile radio access networks simpler and faster. SON functionality and behavior have been defined and specified in generally accepted mobile industry recommendations produced by organizations such as 3GPP and the NGMN. SON has been codified within 3GPP Release 8 and subsequent specifications in a series of standards including 36.902, as well as in public white papers outlining use cases from the NGMN. The first technology making use of SON features will be Long Term Evolution (LTE), but the technology has also been retrofitted to older radio access technologies such as Universal Mobile Telecommunications System (UMTS). The LTE specification inherently supports SON features like Automatic Neighbor Relation (ANR) detection, which is the 3GPP LTE Rel.
8 flagship feature. Newly added base stations should be self-configured in line with a "plug-and-play" paradigm, while all operational base stations will regularly self-optimize parameters and algorithmic behavior in response to observed network performance and radio conditions. Furthermore, self-healing mechanisms can be triggered to temporarily compensate for a detected equipment outage, while awaiting a more permanent solution. Self-organizing network functionalities are commonly divided into three major sub-functional groups, each containing a wide range of decomposed use cases:
- Self-configuration functions: Self-configuration strives towards the "plug-and-play" paradigm, in that new base stations shall automatically be configured and integrated into the network. This covers both connectivity establishment and the download of configuration parameters and software. Self-configuration is typically supplied as part of the software delivery with each radio cell by equipment vendors. When a new base station is introduced into the network and powered on, it is immediately recognized and registered by the network. The neighboring base stations then automatically adjust their technical parameters (such as emission power, antenna tilt, etc.) in order to provide the required coverage and capacity and, at the same time, avoid interference.
- Self-optimization functions: Every base station contains hundreds of configuration parameters that control various aspects of the cell site. Each of these can be altered to change network behavior, based on observations of the base station itself and on measurements at the mobile station or handset. One of the first SON features establishes neighbor relations automatically (ANR), while others optimize random access parameters or mobility robustness in terms of handover oscillations. A very illustrative use case is the automatic switch-off of a percentage of base stations during the night hours. The neighboring base stations then reconfigure their parameters in order to keep the entire area covered by signal. In case of a sudden growth in connectivity demand for any reason, the "sleeping" base stations "wake up" almost instantaneously. This mechanism leads to significant energy savings for operators.
- Self-healing functions: When some nodes in the network become inoperative, self-healing mechanisms aim at reducing the impact of the failure, for example by adjusting parameters and algorithms in adjacent cells so that other nodes can support the users that were supported by the failing node. In legacy networks, failing base stations are at times hard to identify, and a significant amount of time and resources is required to fix them. This SON function makes it possible to spot such failing base stations immediately in order to take further measures, and to ensure no or insignificant degradation of service for the users.
Table 3: Operational benefits of SON.

Self-Configuration:
- Flexibility in logistics (eNodeB not site specific).
- Reduced site / parameter planning.
- Simplified installation; less prone to errors.
- No/minimum drive tests.
- Faster rollout.

Self-Optimization:
- Increased network quality and performance.
- Parameter optimization, reduced maintenance and site visits.

Self-Healing:
- Error self-detection and mitigation.
- Faster maintenance.
- Reduced outage time.

Conclusion:

In LTE-Advanced the focus is on higher capacity, which is the driving force for further developing LTE towards LTE-Advanced. LTE-Advanced provides higher bitrates in a cost-efficient way and, at the same time, completely fulfills the requirements set by the ITU for IMT-Advanced, also referred to as 4G. In the next chapter, we will pay particular attention to the handover.
Chapter II: Handover in LTE-Advanced

Introduction:

Mobility is an essential component of mobile cellular communication systems because it offers clear benefits to the end users: low-delay services such as voice or real-time video connections can be maintained while moving, even in high-speed trains. Handover is one of the key procedures for ensuring that users can move freely through the network while staying connected and being offered quality services. Since its success rate is a key indicator of user satisfaction, it is vital that this procedure happens as fast and as seamlessly as possible. Hence, optimizing the handover procedure to get the required performance is considered an important issue in LTE networks. In this context, we study in this chapter the handover, its characteristics and its different types.

2.1. Handover Definition and Characteristics:

The process of handover is very important in mobile telecommunications. It involves moving the resource allocation for a mobile phone or other piece of UE from one base station to another. This process is used to provide better Quality-of-Service (QoS) to customers by allowing them to continue to use the provided services even after moving out of range of the original serving base station. It is important that handovers are performed quickly, cause little to no disruption to the user's experience and are completed with a very high success rate. If a handover is unsuccessful, the ongoing call is likely to be dropped, either because not enough resources are available on a base station (known as an eNodeB in LTE) or because the Received Signal Strength (RSS) at the UE drops below the threshold needed to maintain the call. This threshold, in LTE, is known as the noise floor and has a value of -97.5 dB.
Handovers are stated to take roughly 0.25 seconds to complete after the decision has been made for a handover to take place [17]. Depending on the required QoS, a seamless handover or a lossless handover is performed as appropriate for each radio bearer. Each of them is described below.

2.1.1. Seamless Handover:

The objective of seamless handover is to provide a given QoS when the UE moves from the coverage of one cell to the coverage of another cell. In LTE seamless handover is applied to
all radio bearers carrying control plane data and to user plane radio bearers mapped on RLC-UM. These types of data are typically reasonably tolerant of losses but less tolerant of delay (e.g. voice services). Therefore seamless handover should minimize complexity and delay, although some SDUs might be lost [4]. In seamless handover, the PDCP entities, including the header compression contexts, are reset, and the COUNT values are set to zero. As a new key is generated at handover anyway, there is no security reason to keep the COUNT values. On the UE side, all the PDCP SDUs that have not been transmitted yet will be sent to the target cell after handover. PDCP SDUs for which transmission has not been started can be forwarded via the X2 interface towards the target eNodeB. Unacknowledged PDCP SDUs will be lost. This minimizes the handover complexity because no context (i.e. configuration information) has to be transferred between the source and the target eNodeB.

2.1.2. Lossless Handover:

Lossless handover means that no data should be lost during handover. This is achieved by retransmitting the PDCP PDUs for which reception has not been acknowledged by the UE before the UE detaches from the source cell to make a handover. In lossless handover, in-sequence delivery during handover can be ensured by using the sequence numbers of the PDCP Data PDUs. Lossless handover is well suited to delay-tolerant services such as file downloads, for which the loss of PDCP SDUs can drastically decrease the data rate because of the reaction of TCP. Lossless handover is applied for user plane and for some control plane radio bearers that are mapped on RLC-AM. In lossless handover, on the UE side, the header compression protocol is reset because its context is not forwarded from the source eNodeB to the target eNodeB, but the PDCP SDUs' sequence numbers and the COUNT values are not reset [4].
To ensure lossless handover in the uplink, the PDCP PDUs stored in the PDCP retransmission buffer are retransmitted by the RLC protocol based on the PDCP SNs, which are maintained during the handover, and are delivered to the gateway in the correct sequence. To ensure lossless handover in the downlink, the source eNodeB forwards the uncompressed PDCP SDUs for which reception has not yet been acknowledged by the UE to the target eNodeB for retransmission in the downlink.
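
The difference between the two modes for user plane bearers can be summarized in a few lines of Python. This is only a minimal sketch of the rules stated above, not part of the simulator used in this project: the names `RlcMode`, `PdcpHandoverBehaviour` and `pdcp_behaviour` are hypothetical, and the sketch covers only user plane radio bearers.

```python
from dataclasses import dataclass
from enum import Enum


class RlcMode(Enum):
    """RLC mode that a user plane radio bearer is mapped on."""
    UM = "unacknowledged"
    AM = "acknowledged"


@dataclass(frozen=True)
class PdcpHandoverBehaviour:
    """What happens to the PDCP state of a bearer at handover (illustrative)."""
    handover_type: str
    reset_header_compression: bool
    reset_count: bool
    retransmit_unacknowledged: bool


def pdcp_behaviour(rlc_mode: RlcMode) -> PdcpHandoverBehaviour:
    """Map the RLC mode of a user plane bearer to its handover behaviour."""
    if rlc_mode is RlcMode.UM:
        # Seamless: PDCP entities are reset, COUNT is set to zero and
        # unacknowledged SDUs may be lost.
        return PdcpHandoverBehaviour("seamless", True, True, False)
    # Lossless: header compression is reset, but sequence numbers and COUNT
    # are kept so that unacknowledged data can be retransmitted in order.
    return PdcpHandoverBehaviour("lossless", True, False, True)


print(pdcp_behaviour(RlcMode.UM).handover_type)  # seamless
print(pdcp_behaviour(RlcMode.AM).handover_type)  # lossless
```

A voice bearer on RLC-UM would therefore follow the seamless branch, while a file download on RLC-AM would follow the lossless one.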
2.2. Types of Handover:

The handover is triggered by the eNodeB, based on the measurement reports received from the UE. Handover is classified into different types based on the origin and destination of the handover. The handover can start and end in the E-UTRAN, it can start in the E-UTRAN and end in another Radio Access Technology (RAT), or it can start from another RAT and end in E-UTRAN [15]. Handover is classified as:

- Intra-frequency intra-LTE handover.
- Inter-frequency intra-LTE handover.
- Inter-RAT towards LTE handover.
- Inter-RAT towards UTRAN handover.
- Inter-RAT towards GERAN handover.
- Inter-RAT towards cdma2000 system handover.

2.2.1. Intra LTE Handover: Horizontal Handover:

In intra-LTE handover, which is the focus of this project, both the origin and destination eNodeBs are within the LTE system. In this type of handover, the RRC connection reconfiguration message acts as the handover command. The interface between eNodeBs is the X2 interface. Upon handover, the source eNodeB sends an X2 handover request message to the target eNodeB in order to prepare it for the coming handover.

2.2.2. Vertical Handover:

Tremendous breakthroughs have been recorded in the last decade in the evolution of wireless communication networks. The complex nature of the wireless environment makes it difficult, or almost impossible, for the network to efficiently provide users with high data rates and good Quality of Service (QoS). To meet these demands, fourth generation (4G) wireless systems combine heterogeneous wireless technologies to allow users to get connected anywhere and at all times. The heterogeneity of the wireless networks involves the integration of diverse radio access technologies (RATs) such as LTE/LTE-Advanced, UMTS, HSPA, GPRS, GSM, WiMAX and WiFi.
The purpose of integrating these independent networks is to meet the demand for high data rates and good QoS in order to support multimedia streaming at precision levels.
Consequently, the issues of seamless handover, high QoS support, resource allocation, mobility management and security must be appropriately addressed before these requirements can be met. As one of the strategies for achieving this purpose, the handover mechanism is introduced. It can be defined as the process of reassigning resources as a result of the movement of a mobile user equipment (UE) when it switches from one technology to another. An intra-technology handover process, mainly based on the received signal strength (RSS) levels, is known as Horizontal Handover (HHO) and occurs when the UE switches access points (APs) or eNodeBs while staying in the same network. On the other hand, when a UE switches its connection to a network based on a different access technology, the process is termed Vertical Handover (VHO). This has become possible because of the emergence of a multitude of overlapping wireless networks, which makes the handover process more complex.

2.3. Handover Techniques:

Handover can be categorized as soft handover and hard handover, also known as Make-Before-Break and Break-Before-Make respectively.

2.3.1. Soft handover, Make-Before-Break:

Soft handover is a category of handover procedures where the radio links are added and abandoned in such a manner that the UE always keeps at least one radio link to the UTRAN. Soft and softer handover were introduced in the WCDMA architecture, in which a centralized controller called the Radio Network Controller (RNC) performs handover control for each UE. It is possible for a UE to be connected to two or more cells (or cell sectors) simultaneously during a call. If the cells the UE is connected to belong to the same physical site, this is referred to as softer handover [10]. From a handover point of view, soft handover is suitable for maintaining an active session and for preventing voice call drops and packet session resets.
However, soft handover requires much more complicated signaling, procedures and system architecture, such as in the WCDMA network.

2.3.2. Hard handover, Break-Before-Make:

Hard handover is a category of handover procedures where all the old radio links in the UE are abandoned before the new radio links are established. The hard handover is commonly used when dealing with handovers in the legacy wireless systems. The hard handover requires
a user to break the existing connection with the current cell (source cell) and make a new connection to the target cell [10]. In LTE only hard handover is supported, meaning that there is a short interruption in service when the handover is performed.

2.4. Handover Procedure:

Depending on whether or not an EPC entity is involved in preparing and executing a handover between a source eNodeB and a target eNodeB, an LTE handover can be either an X2 handover using the X2 interface or an S1 handover using the S1 interface. Figure 9 shows how a source eNodeB decides on the handover type, X2 or S1, when a handover is triggered.

Figure 9: Decision on Handover Type.

The handover procedure in LTE can be divided into three phases: handover preparation, handover execution and handover completion [4]. The procedure starts with the measurement reporting of a handover event by the User Equipment (UE) to the serving evolved Node B (eNodeB). The Evolved Packet Core (EPC) is not involved in the control plane handling of the handover procedure, i.e. preparation messages are exchanged directly between the eNodeBs [1]. That is the case when the X2 interface is deployed; otherwise the MME is used for HO signaling. The handover procedure for the basic handover scenario is depicted in Figure 10.
Figure 10: Intra-MME/Serving Gateway handover [9].
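
The choice between the two handover types can be illustrated with a short sketch. It assumes the only criterion is whether an X2 interface is deployed between the two eNodeBs, as stated in the text; the decision in Figure 9 may involve further conditions that are not modeled here. The function name `select_handover_type` and the eNodeB identifiers are hypothetical.

```python
from typing import FrozenSet, Set


def select_handover_type(source_enb: str, target_enb: str,
                         x2_links: Set[FrozenSet[str]]) -> str:
    """Return "X2" if an X2 interface exists between the two eNodeBs,
    otherwise "S1" (handover signaling then goes through the MME)."""
    if frozenset((source_enb, target_enb)) in x2_links:
        return "X2"
    return "S1"


# Hypothetical deployment: only eNB-A and eNB-B share an X2 interface.
x2_links = {frozenset(("eNB-A", "eNB-B"))}
print(select_handover_type("eNB-A", "eNB-B", x2_links))  # X2
print(select_handover_type("eNB-A", "eNB-C", x2_links))  # S1
```

Storing each link as an unordered pair reflects the fact that an X2 interface can be used in either direction.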
- Handover preparation: During the handover preparation, data flows between the UE and the core network as usual. This phase includes messaging such as measurement control, which defines the UE measurement parameters, and then the measurement report, sent once the triggering criteria are satisfied. The handover decision is then made at the serving eNodeB, which requests a handover to the target cell, and the target eNodeB performs admission control. The handover request is then acknowledged by the target eNodeB.
- Handover execution: The handover execution phase starts when the source eNodeB sends a handover command to the UE. During this phase, data is forwarded from the source to the target eNodeB, which buffers the packets. The UE then needs to synchronize to the target cell and perform a random access to the target cell to obtain UL allocation and timing advance as well as other necessary parameters. Finally, the UE sends a handover confirm message to the target eNodeB, after which the target eNodeB can start sending the forwarded data to the UE [1].
- Handover completion: In the final phase, the target eNodeB informs the MME that the user plane path has changed. The S-GW is then notified to update the user plane path. At this point, the data starts flowing on the new path to the target eNodeB. Finally, all radio and control plane resources are released in the source eNodeB.

A more detailed description of the intra-MME/Serving Gateway HO procedure is given below:

1. Based on the area restriction information, the source eNodeB configures the UE measurement procedure.
2. The MEASUREMENT REPORT is sent by the UE after it is triggered based on some rules.
3. The decision for handover is taken by the source eNodeB based on the MEASUREMENT REPORT and RRM information.
4. The HANDOVER REQUEST message is sent to the target eNodeB by the source eNodeB, containing all the necessary information to prepare the HO at the target side.
5.
Admission control may be performed by the target eNodeB depending on the received E-RAB QoS information. Its purpose is to increase the likelihood of a successful HO, in that the target eNodeB decides whether the resources can be granted or not. If the resources can be granted, the target eNodeB configures the required resources according to the received E-RAB QoS information, then reserves a Cell Radio Network Temporary Identifier (C-RNTI) and a RACH preamble for the UE.
6. The target eNodeB prepares the HO and then sends the HANDOVER REQUEST ACKNOWLEDGE to the source eNodeB. The HANDOVER REQUEST ACKNOWLEDGE message contains a transparent container which is to be sent to the UE as an RRC message for performing the handover. The container includes a new C-RNTI and the target eNodeB security algorithm identifiers for the selected security algorithms; it may also include a dedicated RACH preamble and possibly some other parameters, such as RNL/TNL information for the forwarding tunnels. If there is a need for data forwarding, the source eNodeB can start forwarding the data to the target eNodeB as soon as it sends the handover command towards the UE.

Steps 7 to 16 are designed to avoid data loss during HO:

7. To perform the handover, the target eNodeB generates the RRC message, i.e. the RRC Connection Reconfiguration message including the mobility control information. This message is sent towards the UE by the source eNodeB.
8. The SN STATUS TRANSFER message is sent by the source eNodeB to the target eNodeB. This message provides the uplink PDCP SN receiver status and the downlink PDCP SN transmitter status of the E-RABs. The uplink PDCP SN receiver status includes the PDCP SN of the first missing UL SDU. The downlink PDCP SN transmitter status indicates the next PDCP SN that the target eNodeB shall assign to new SDUs. At this point, data forwarding of user plane downlink packets can use either a "seamless mode", minimizing the interruption time during the move of the UE, or a "lossless mode", not tolerating packet loss at all. The source eNodeB may decide to operate one of these two modes on a per EPS bearer basis, based on the QoS received over X2 for this bearer.
9.
After receiving the RRC Connection Reconfiguration message including the mobility control information, the UE tries to synchronize to the target eNodeB and to access the target cell via RACH. If a dedicated RACH preamble was assigned to the UE, it can use a contention-free procedure; otherwise it shall use a contention-based procedure. In terms of security, the UE derives the target eNodeB specific keys and configures the selected security algorithms to be used in the target cell.
10. The target eNodeB responds with the uplink allocation and timing advance.
11. After the UE has successfully accessed the target cell, it sends the RRC Connection Reconfiguration Complete message to confirm the handover. The C-RNTI sent in
the RRC Connection Reconfiguration Complete message is verified by the target eNodeB, after which the target eNodeB can begin sending data to the UE.
12. A PATH SWITCH message is sent to the MME by the target eNodeB to inform it that the UE has changed cell.
13. An UPDATE USER PLANE REQUEST message is sent by the MME to the Serving Gateway.
14. The Serving Gateway switches the downlink data path to the target eNodeB and sends one or more "end marker" packets on the old path to the source eNodeB to indicate that no more packets will be transmitted on this path. The U-plane/TNL resources towards the source eNodeB can then be released.
15. An UPDATE USER PLANE RESPONSE message is sent to the MME by the Serving Gateway.
16. The MME sends the PATH SWITCH ACKNOWLEDGE message to confirm the PATH SWITCH message.
17. The target eNodeB sends UE CONTEXT RELEASE to the source eNodeB to inform it of the success of the handover. The target eNodeB sends this message after it has received the PATH SWITCH ACKNOWLEDGE from the MME.
18. After the source eNodeB receives the UE CONTEXT RELEASE message, it can release the radio and C-plane related resources. If there is ongoing data forwarding, it can continue.

Figure 11: Handover Timing [8]
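
The eighteen steps above can be condensed into a small lookup table, which is convenient when tracing a handover in a simulation log. The sketch below is illustrative only: the sender and receiver columns are our reading of the step descriptions, the grouping into phases (steps 1 to 6, 7 to 11 and 12 to 18) follows the phase descriptions given earlier in this section, and the names `Step`, `STEPS`, `phase_of` and `rach_procedure` are hypothetical.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    number: int
    sender: str
    receiver: str
    action: str


# Sender and receiver are the same node for purely internal actions.
STEPS: List[Step] = [
    Step(1, "source eNodeB", "UE", "measurement configuration"),
    Step(2, "UE", "source eNodeB", "MEASUREMENT REPORT"),
    Step(3, "source eNodeB", "source eNodeB", "handover decision"),
    Step(4, "source eNodeB", "target eNodeB", "HANDOVER REQUEST"),
    Step(5, "target eNodeB", "target eNodeB", "admission control"),
    Step(6, "target eNodeB", "source eNodeB", "HANDOVER REQUEST ACKNOWLEDGE"),
    Step(7, "source eNodeB", "UE", "RRC Connection Reconfiguration"),
    Step(8, "source eNodeB", "target eNodeB", "SN STATUS TRANSFER"),
    Step(9, "UE", "target eNodeB", "synchronization and RACH access"),
    Step(10, "target eNodeB", "UE", "uplink allocation and timing advance"),
    Step(11, "UE", "target eNodeB", "RRC Connection Reconfiguration Complete"),
    Step(12, "target eNodeB", "MME", "PATH SWITCH"),
    Step(13, "MME", "Serving Gateway", "UPDATE USER PLANE REQUEST"),
    Step(14, "Serving Gateway", "source eNodeB", "path switch and end marker"),
    Step(15, "Serving Gateway", "MME", "UPDATE USER PLANE RESPONSE"),
    Step(16, "MME", "target eNodeB", "PATH SWITCH ACKNOWLEDGE"),
    Step(17, "target eNodeB", "source eNodeB", "UE CONTEXT RELEASE"),
    Step(18, "source eNodeB", "source eNodeB", "release of resources"),
]


def phase_of(step_number: int) -> str:
    """Return the handover phase a step belongs to (assumed grouping)."""
    if 1 <= step_number <= 6:
        return "preparation"
    if 7 <= step_number <= 11:
        return "execution"
    if 12 <= step_number <= 18:
        return "completion"
    raise ValueError(f"unknown step number: {step_number}")


def rach_procedure(dedicated_preamble_assigned: bool) -> str:
    """Step 9: a dedicated preamble allows contention-free access."""
    return "contention-free" if dedicated_preamble_assigned else "contention-based"


for step in STEPS:
    print(f"{step.number:2d} [{phase_of(step.number)}] "
          f"{step.sender} -> {step.receiver}: {step.action}")
```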
2.5. Handover Measurements:

The handover procedure in LTE-Advanced, which is a part of the RRM, is based on the UE's measurements. Handover decisions are usually based on downlink channel measurements, which consist of the Reference Signal Received Power (RSRP) and the Reference Signal Received Quality (RSRQ), made in the UE and sent to the eNodeB regularly [12]. Each of them is described below:

- Reference Signal Received Power (RSRP): The RSRP measurement provides a cell-specific signal strength metric. This measurement is used mainly to rank different LTE-Advanced candidate cells according to their signal strength and is used as an input for handover and cell reselection decisions. RSRP is defined for a specific cell as the linear average received power (in Watts) of the signals that carry cell-specific Reference Signals (RS) within the considered measurement frequency bandwidth [4].
- Reference Signal Received Quality (RSRQ): This measurement is intended to provide a cell-specific signal quality metric. Similarly to RSRP, this metric is used mainly to rank different LTE candidate cells according to their signal quality. This measurement is used as an input for handover and cell reselection decisions, for example in scenarios for which RSRP measurements do not provide sufficient information to perform reliable mobility decisions. The RSRQ is defined as:

$$RSRQ = \frac{N \cdot RSRP}{RSSI} \qquad (1)$$

where N is the number of Resource Blocks (RBs) of the LTE-Advanced carrier RSSI measurement bandwidth. The measurements in the numerator and denominator are made over the same set of resource blocks. While RSRP is an indicator of the wanted signal strength, RSRQ additionally takes the interference level into account due to the inclusion of RSSI. RSRQ therefore enables the combined effect of signal strength and interference to be reported in an efficient way [4].
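
As a worked illustration of equation (1), the sketch below converts RSRP and RSSI from dBm to linear power, applies the ratio and expresses the result in dB. The input values (50 resource blocks, RSRP of -95 dBm, RSSI of -65 dBm) are arbitrary examples chosen for the arithmetic, not measurements from this study, and the function names are hypothetical.

```python
import math


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def rsrq_db(n_rb: int, rsrp_dbm: float, rssi_dbm: float) -> float:
    """RSRQ in dB, computed as N * RSRP / RSSI in linear units (equation 1)."""
    rsrq_linear = n_rb * dbm_to_mw(rsrp_dbm) / dbm_to_mw(rssi_dbm)
    return 10 * math.log10(rsrq_linear)


# Example: N = 50 RBs, RSRP = -95 dBm, RSSI = -65 dBm.
print(round(rsrq_db(50, -95.0, -65.0), 2))  # -13.01
```

In logarithmic form the same result is 10·log10(50) + (-95) - (-65), i.e. about 16.99 - 30 = -13.01 dB.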
Besides RSRP and RSRQ, handover decisions can rely on other criteria, such as:

- Signal-to-Noise Ratio (SNR): The SNR is a measurement that compares the level of a desired signal to the level of background noise (unwanted signal). It is defined as the ratio of the signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise.

$$SNR = \frac{P_{signal}}{P_{noise}} \qquad (2)$$
where P is the average power. Both signal and noise power must be measured at the same or equivalent points in a system, and within the same system bandwidth [16].

• Carrier-to-Interference Ratio (CIR): The CIR, expressed in decibels (dB), is a measurement of signaling effectiveness. It is defined as the ratio of the power in the carrier to the power of the interference signal.

• Signal-to-Interference-plus-Noise Ratio (SINR): This metric is used to optimize the transmit power level for a target quality of service and to assist handover decisions. Accurate SINR estimation provides a more efficient system and a higher user-perceived quality of service. SINR is defined as the ratio of the signal power to the combined noise and interference power:

$$SINR = \frac{P_{signal}}{P_{noise} + P_{interference}} \tag{3}$$

where P is the averaged power. Values are commonly quoted in dB.

• Received Signal Strength Indicator (RSSI): The LTE carrier RSSI is defined as the total received wideband power observed by the UE from all sources, including co-channel serving and non-serving cells, adjacent channel interference and thermal noise, within the measurement bandwidth specified by the 3GPP. The LTE-Advanced carrier RSSI is not reported as a measurement in its own right, but is used as an input to the LTE-Advanced RSRQ measurement [4].

As mentioned earlier, handover measurements in LTE-Advanced are made on the downlink reference symbols in the frame structure, as shown in Figure 12. Handover decisions can also be based on uplink measurements, but this study focuses on downlink handover measurements.
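The ratios in Equations 2 and 3 are defined on linear powers and only then converted to dB. A short sketch of that order of operations follows; the helper names and the power values are hypothetical and serve only to show how interference lowers SINR relative to SNR.

```python
import math


def to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(ratio)


def snr(p_signal: float, p_noise: float) -> float:
    """Equation 2: linear signal-to-noise ratio."""
    return p_signal / p_noise


def sinr(p_signal: float, p_noise: float, p_interference: float) -> float:
    """Equation 3: linear signal-to-interference-plus-noise ratio."""
    return p_signal / (p_noise + p_interference)


# Hypothetical average powers in Watts, measured over the same bandwidth.
P_SIGNAL = 1e-9
P_NOISE = 1e-11
P_INTERFERENCE = 4e-11

snr_db = to_db(snr(P_SIGNAL, P_NOISE))
sinr_db = to_db(sinr(P_SIGNAL, P_NOISE, P_INTERFERENCE))
print(f"SNR = {snr_db:.1f} dB, SINR = {sinr_db:.1f} dB")
```

With these sample values the interference is four times stronger than the noise, so the SINR is about 7 dB lower than the SNR.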
Figure 12: Downlink reference signal structure for LTE-Advanced.

The averaging of fast fading over all the reference symbols is done at Layer 1 and is hence called L1 filtering (Figure 13). The scalable bandwidth of LTE allows the handover measurement to be made over different bandwidths.

Figure 13: Handover measurement filtering and reporting [10].

2.6. Handover Parameters:

The handover procedure has several parameters that are used to enhance its performance, and setting them to their optimal values is a very important task. In LTE, the triggering of a handover is usually based on link quality measurements together with some other parameters that improve performance. The most important ones include [13]:

• Handover initiation threshold level (RSRP and RSRQ): This level is used for handover initiation. When the handover threshold decreases, the probability of a late handover decreases and the Ping-Pong effect increases. It can be varied
according to different scenarios and propagation conditions to balance these trade-offs and obtain better performance.

• Hysteresis margin: The hysteresis margin, also called HO margin, is the main parameter that governs the HO algorithm between two eNodeBs. The handover is initiated if the link quality of another cell is better than the current link quality by the hysteresis value. It is used to avoid ping-pong effects. However, it can increase handover failures since it can also prevent necessary handovers.

• Time-to-Trigger (TTT): When Time-to-Trigger is applied, the handover is initiated only if the triggering requirement is fulfilled for a given time interval. This parameter can decrease the number of unnecessary handovers and effectively avoid Ping-Pong effects. However, it can also delay the handover, which increases the probability of handover failures.

• The length and shape of the averaging window: The effect of channel variation due to fading should be minimized in the handover decision, and an averaging window can be used to filter it out. Both the length and the shape of the window affect handover initiation. Long windows reduce the number of handovers but increase the delay. The shape of the window, e.g. rectangular or exponential, also affects the number of handovers and the probability of unnecessary handovers.

The listed parameters directly affect handover initiation and can therefore be tuned according to certain design goals. Other parameters, such as the measurement report period, can also have an impact on handover initiation.

Figure 14: Handover triggering procedure [11].
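The effect of the window shape on a short fade can be illustrated with two simple filters. This is a sketch under stated assumptions: the function names, the sample RSRP trace, the window length of 3 and the coefficient 0.5 are all hypothetical, and the filters are applied directly to dBm values for simplicity.

```python
from collections import deque


def rectangular_filter(samples, length):
    """Moving average over the last `length` samples (rectangular window)."""
    window = deque(maxlen=length)
    filtered = []
    for sample in samples:
        window.append(sample)
        filtered.append(sum(window) / len(window))
    return filtered


def exponential_filter(samples, a):
    """Exponential window: F_n = (1 - a) * F_(n-1) + a * M_n.

    The first output is seeded with the first measurement.
    """
    filtered = []
    previous = None
    for sample in samples:
        previous = sample if previous is None else (1 - a) * previous + a * sample
        filtered.append(previous)
    return filtered


# Hypothetical RSRP trace in dBm with a single 20 dB fade at the fourth sample.
rsrp = [-90.0, -90.0, -90.0, -110.0, -90.0, -90.0]
rect = rectangular_filter(rsrp, 3)
expo = exponential_filter(rsrp, 0.5)
print(rect)
print(expo)
```

In this trace the rectangular window spreads the fade evenly over three outputs, while the exponential window reacts more strongly at first and then recovers gradually. A longer window or a smaller coefficient would smooth more at the cost of additional delay.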
In summary, the starting point of the handover triggering procedure is the measurements performed by the UE. These are made periodically, as defined by the measurement period parameter configured at the eNodeB. When the serving cell RSRP drops below the measured neighbor cell RSRP by the configured HO offset, usually 2-3 dB, a timer is started. If this condition lasts for the Time-to-Trigger (TTT) value, a measurement report is sent to the eNodeB, which initiates the handover by sending a handover command to the UE. If the triggering condition is no longer satisfied before the timer reaches the TTT value, no measurement report is sent and new measurements and timers are started [11].

2.7. Time To Trigger & Hysteresis:

In this project, two main parameters of the LTE handover process are studied: the Time-to-Trigger (TTT) and the Hysteresis (hys). The hys defines how much better the RSS of a neighboring base station must be than that of the serving base station for a handover to be considered. The values of hys are defined in decibels (dB) and range from 0 to 10 dB in 0.5 dB increments, which gives 21 different values. The full range of hys values can be seen in Table 4.

Table 4: Table of the different LTE hys values.

Index  hys (dB)    Index  hys (dB)
0      0.0         11     5.5
1      0.5         12     6.0
2      1.0         13     6.5
3      1.5         14     7.0
4      2.0         15     7.5
5      2.5         16     8.0
6      3.0         17     8.5
7      3.5         18     9.0
8      4.0         19     9.5
9      4.5         20     10
10     5.0

The TTT is a length of time, defined in seconds, that specifies how long a neighboring base station must remain better than the serving base station before a handover is triggered. There are 16 different values of TTT ranging from 0 to 5.12 seconds.
Unlike hys, the TTT values do not increase linearly; instead they increase roughly exponentially, with smaller steps at the lower values and bigger steps at the larger values. The full list of TTT values can be seen in Table 5 and a graph of how the TTT values increase can be seen in Figure 11.
Table 5: Table of the different LTE TTT values.

Index  TTT (s)     Index  TTT (s)
0      0.0         9      0.48
1      0.04        10     0.512
2      0.064       11     0.64
3      0.08        12     1.024
4      0.1         13     1.280
5      0.128       14     2.56
6      0.16        15     5.12
7      0.256
8      0.32

There are 336 different combinations of TTT and hys values (16 TTT values times 21 hys values). With such a large range of combinations, a pair can require a neighboring eNodeB to be better by a large hys but only for a short TTT, or vice versa. This makes for an interesting dynamic as to which pairs of values work best in any given environment.

In LTE there are eight different triggers defined for initiating handovers. Table 6 shows the different trigger events and how they are defined [18].

Table 6: Table of the different LTE Trigger types and their criteria.

Event Type  Trigger Criteria
A1          Serving becomes better than a threshold.
A2          Serving becomes worse than a threshold.
A3          Neighbor becomes offset better than Primary Cell (PCell).
A4          Neighbor becomes better than threshold.
A5          PCell becomes worse than threshold1 and neighbor becomes better than threshold2.
A6          Neighbor becomes offset better than Secondary Cell (SCell).
B1          Inter RAT neighbor becomes better than threshold.
B2          PCell becomes worse than threshold1 and inter RAT neighbor becomes better than threshold2.

Out of the eight triggers, the A3 event is the most common. It requires that a neighboring eNodeB gives the UE a better Reference Signal Received Power (RSRP) than the serving eNodeB by an amount defined by the hys, for a length of time defined by the TTT [19]. The A3 condition can be represented by the following inequality:

$$RSRP_{neighboring} > RSRP_{serving} + Hys \tag{4}$$

When a handover event is triggered, a measurement report is sent from the UE to the Serving eNodeB. The measurement report contains the information required for the Serving eNodeB
to make a decision on whether to initiate a handover or not. The full, high-level procedure for an LTE handover is as follows:

1. If a Neighboring eNodeB is found to be better than the Serving eNodeB, a measurement report is sent by the UE to the Serving eNodeB.
2. The Serving eNodeB considers the information in the measurement report and decides whether or not a handover should take place.
3. If it is decided that a handover should take place, a message is sent to the Neighboring eNodeB to prepare resources for the UE.
4. Once the resources are ready for the UE, the new Serving eNodeB sends a message to the old eNodeB to release the resources it previously held for the UE.
5. Finally, a message is sent to the MME to finalize the handover process.

Conclusion:

The handover parameters need to be optimized for good performance. Handover offset and TTT values that are too low in fading conditions result in back-and-forth ping-pong handovers between the cells. Values that are too high can cause call drops during handovers, as the radio conditions in the serving cell become too bad for transmission. In the last chapter, we will explain our proposed solution to optimize the handover parameters.
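The A3 trigger of Section 2.7, combined with the Time-to-Trigger timer, can be sketched as a loop over periodic measurements. This is a simplified illustration, not the 3GPP procedure: the function name, the use of integer milliseconds, and the sample RSRP traces are assumptions, and cell-specific offsets are ignored.

```python
def a3_report_time_ms(serving_dbm, neighbor_dbm, hys_db, ttt_ms, period_ms):
    """Return the time (ms) at which a measurement report would be sent, or None.

    serving_dbm, neighbor_dbm: RSRP samples taken every `period_ms` milliseconds.
    The entering condition is neighbor > serving + hys_db, and it must hold
    continuously for at least `ttt_ms` before the report is sent.
    """
    timer_start = None  # time at which the condition first became true
    for i, (serving, neighbor) in enumerate(zip(serving_dbm, neighbor_dbm)):
        now = i * period_ms
        if neighbor > serving + hys_db:
            if timer_start is None:
                timer_start = now
            if now - timer_start >= ttt_ms:
                return now
        else:
            timer_start = None  # condition broken: the timer is reset
    return None


# Hypothetical trace: the serving cell fades by 2 dB per sample, the neighbor is flat.
serving = [-90, -92, -94, -96, -98, -100, -102, -104]
neighbor = [-95] * 8

report_at = a3_report_time_ms(serving, neighbor, hys_db=3.0, ttt_ms=80, period_ms=40)
print(report_at)
```

With a 3 dB hysteresis the condition first holds at 200 ms, so a TTT of 80 ms delays the report to 280 ms, whereas a TTT of 160 ms would not produce a report within this short trace. This mirrors the trade-off described above: larger values suppress ping-pongs but postpone the handover.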
Chapter III: Machine Learning and Handover Parameter Optimization simulation

Introduction:

Optimizing handover is a major activity in network operations, with Hysteresis and Time-to-Trigger as the main control parameters. For each HO, depending on the Hys-TTT tuple (also called the trigger point), one of three outcomes occurs: a success, a Ping-Pong, or a Radio Link Failure. In this chapter, we describe Q-Learning, present our proposed approach for handover optimization and finish with the simulation results.

3.1. Q-Learning overview:

3.1.1. Machine Learning:

Machine learning is a form of Artificial Intelligence (AI) that involves designing and studying systems and algorithms with the ability to learn from data. This field of AI has many applications within research (such as system optimization), products (such as image recognition) and advertising (such as adverts that use a user's browsing history). Machine learning algorithms follow many different paradigms. Some use training sets to learn to give appropriate outputs, others look for patterns in data, and others use the notion of rewards to find out whether an action can be considered correct or not [20]. Three of the most popular types of machine learning algorithms are:

• Supervised learning is where an algorithm is trained using a training set of data. This set of data includes inputs and the known outputs for those inputs. The training set is used to fine-tune the parameters of the algorithm. The purpose of this kind of algorithm is to learn a general mapping between inputs and outputs so that the algorithm can give an accurate result for an input with an unknown output. This type of algorithm is generally used in classification systems.

• Unsupervised learning algorithms only know about the inputs they are given.
The goal of such an algorithm is to try and find patterns or structure within the input data. Such an algorithm would be given inputs, and any patterns they contain would become more and more visible the more inputs the algorithm is given.

• Reinforcement learning uses an intelligent agent to perform actions within an environment. Any such action will yield a reward to the agent and the agent's goal
is to learn how the environment reacts to any given action. The agent then uses this knowledge to try and maximize its reward gains.

3.1.2. Reinforcement Learning:

In reinforcement learning an intelligent agent learns what action to take at any given time to maximize the notion of a reward. In the beginning the agent has no knowledge of what action it should take from any state within the learning environment. It must instead learn through trial and error, exploring all possible actions and finding the ones that perform the best.

The trade-off between exploration and exploitation is one of the main features of reinforcement learning and can greatly affect the performance of a chosen algorithm. A reinforcement learning algorithm must weigh whether to exploit an action that resulted in a large reward or to explore other actions with the possibility of receiving a greater reward. Another main feature of reinforcement learning is that the problem in question is considered as a whole. This is different from other types of machine learning algorithms, which do not consider how the results of any sub-problems may affect the problem as a whole.

The basic elements required for reinforcement learning are as follows:

• A Model (M) of the environment that consists of a set of States (S) and Actions (A).
• A reward function (R).
• A value function (V).
• A policy (P).

The model of the environment is used to mimic the behavior of the environment, such as predicting the next state and reward from a state and a taken action. Models are generally used for planning, by deciding what action to take while considering future rewards.

The reward function defines how good or bad an action is from a state. It is also used to define the immediate reward the agent can expect to receive.
Generally, a mapping between a state-action pair and a numerical value is used to define the reward that the agent would gain. The reward values are used to define the policy: from a given state, the action whose state-action pair has the best value is the one to take.
While the reward function defines the immediate reward that can be gained from a state, the value function defines how good a state will be in the long term. This difference can create conflicts of interest for an agent: while its goal is to collect as much reward as possible, it has to weigh picking a state that provides a lot of immediate reward but not much future reward against a state with a lot of future reward but not much immediate reward.

The policy is a mapping between a state and the best action to be taken from that state at any given time. Policies can be simple or complex: a simple policy may consist of a lookup table, while more complex policies can involve search processes. In general, most policies begin stochastic so that the agent can start to learn which actions are more optimal. [11]

3.1.3. Q-Learning:

Q-Learning is a type of reinforcement learning algorithm where an agent tries to discover an optimal policy from its history of interactions within an environment. What makes Q-Learning so powerful is that it learns the optimal policy (which action a to take from a state s) for a problem regardless of the policy it follows, as long as there is no limit on the number of times the agent can try an action. Because it learns the optimal policy independently of the policy being followed, Q-Learning is known as an Off-Policy learner. The history of interactions of an agent can be shown as a sequence of State-Action-Rewards:

< s0, a0, r1, s1, a1, r2, s2, a2, ... >

This can be read as follows: the agent was in State 0, did Action 0, received Reward 1 and transitioned into State 1; then did Action 1, received Reward 2 and transitioned into State 2; and so on.

The history of interactions can be treated as a sequence of experiences, with each experience being a tuple.
< s, a, r, s′ >

The meaning of the tuple is that the agent was in State s, did Action a, received Reward r and transitioned into State s′. The experiences are what the agent uses to determine the optimal action to take at a given time.

The basic process of a Q-Learning algorithm can be seen in Algorithm 3.1. The general process requires that the learning agent is given a set of states, a set of actions, a discount factor γ and a step size α. The agent also keeps a table of Q-Values, denoted by Q(s,a), where s is a state and a is an action from that state. A Q-Value is an average of all the experiences the agent has had with a specific state-action pair. This allows good and bad experiences to be averaged out, giving a reasonable estimate of the actual value of a state-action pair.
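The view of a Q-Value as a running average can be checked numerically. The sketch below assumes the incremental form A_k = A_{k-1} + α_k (v_k − A_{k-1}) with α_k = 1/k, and a constant step size as the variant used by Q-Learning; the function names and the sample values are hypothetical.

```python
def incremental_mean(values):
    """Running average updated one value at a time with step size 1/k."""
    estimate = 0.0
    for k, value in enumerate(values, start=1):
        alpha = 1.0 / k
        td_error = value - estimate   # how far the new value is from the old estimate
        estimate += alpha * td_error
    return estimate


def constant_step_estimate(values, alpha):
    """Same update with a fixed step size, which weights recent values more."""
    estimate = 0.0
    for value in values:
        estimate += alpha * (value - estimate)
    return estimate


experiences = [4.0, 8.0, 6.0, 2.0]
print(incremental_mean(experiences))             # matches the arithmetic mean
print(constant_step_estimate(experiences, 0.5))  # biased toward recent values
```

With α_k = 1/k the estimate equals the plain average without storing past values, which is why a table of Q-Values only needs one number per state-action pair.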
The process of averaging out experiences is done using Temporal Differences. It could be said that the best way to estimate the next value in a list is to take the average of all the previous values. Equation 5 shows this process.

$$A_k = \frac{v_1 + \dots + v_k}{k} \tag{5}$$

Therefore

$$k A_k = v_1 + \dots + v_{k-1} + v_k = (k - 1) A_{k-1} + v_k \tag{6}$$

Then dividing by k gives:

$$A_k = \left(1 - \frac{1}{k}\right) A_{k-1} + \frac{v_k}{k} \tag{7}$$

Then let α_k = 1/k:

$$A_k = (1 - \alpha_k) A_{k-1} + \alpha_k v_k = A_{k-1} + \alpha_k (v_k - A_{k-1}) \tag{8}$$

The difference v_k − A_{k−1} in Equation 8 is known as the Temporal Difference Error, or TD Error. It shows how different the old estimate A_{k−1} is from the new value v_k. The new value of the estimate, A_k, is then the old estimate, A_{k−1}, plus α_k times the TD error.

The Q-Values, therefore, are defined using temporal differences, and Equation 9 shows the formula to calculate them, where α is a variable between 0 and 1 that defines the step size of the algorithm. If the step size were 0 the algorithm would ignore any rewards received, and if the step size were 1 the most recent experience would completely replace the previous estimate for a state-action pair. The discount factor γ is also a variable between 0 and 1 and defines how much less future rewards are worth compared to the current reward. If the discount factor were 0, future rewards would not be considered at all. If it were 1, future rewards would be worth as much as the current reward. The possible future reward, max_{a′} Q[s′, a′], is the maximum of the Q-Values over all actions available from the new state s′.

$$Q[s, a] \leftarrow Q[s, a] + \alpha \left( r + \gamma \max_{a'} Q[s', a'] - Q[s, a] \right) \tag{9}$$

The table of Q-Values can either be initialized as empty or with some values pre-set to try and lead the agent to a specific goal state.
Once the agent has initialized these parameters it observes the starting state. The starting state can either be chosen at random or be a pre-determined start state for the problem. The agent then chooses an action. Actions are chosen either stochastically or by a policy. Once an action has been chosen the agent carries it out and receives a reward. This reward is used to update the table of Q-Values
using Equation 9. Finally the agent moves into the new state and repeats until termination, which can be either when the agent discovers a goal state or after a certain number of actions have been taken.

Algorithm 3.1: Q-Learning.

Require:
    S is a set of states
    A is a set of actions
    γ is the discount factor
    α is the learning rate (step size)

procedure Q-Learning(S, A, γ, α)
    real array Q[S, A]
    previous state s
    previous action a
    initialize Q[S, A] arbitrarily
    observe current state s
    repeat
        select and carry out an action a
        observe reward r and state s′
        Q[s, a] ← Q[s, a] + α (r + γ max_{a′} Q[s′, a′] − Q[s, a])
        s ← s′
    until termination
end procedure

After a Q-Learning algorithm has finished exploring the model of the environment it creates a policy. The policy is generated by searching across all actions for a state and finding the next state with the greatest value. The policy is therefore a lookup table that maps a state to the best possible next state. The policy created can then be used to solve the problem that the Q-Learning agent was exploring [22].

3.2. Proposed Approach for HO optimization:

3.2.1. Set of states:

The approach taken for optimizing the handover parameters in LTE-Advanced uses a Q-Learning algorithm based on the process given in Section 3.1. In this approach the model of the environment has a state for every combination of TTT and hys, giving a total of 336 states.
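Algorithm 3.1 from Section 3.1.3 can be exercised on a very small problem before it is applied to the 336-state handover model. The following is a minimal sketch under stated assumptions: the five-state chain environment, its reward of 1 at the goal, the purely random behaviour policy and all names are hypothetical and are not the simulator used in this project.

```python
import random

N_STATES = 5         # states 0..4; state 4 is the goal (terminal) state
ACTIONS = (-1, +1)   # index 0: move left, index 1: move right


def step(state, action):
    """Toy environment: deterministic move, reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return reward, next_state


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-Learning following the update rule of Equation 9."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.randrange(len(ACTIONS))   # random exploration only
            reward, next_state = step(state, ACTIONS[action])
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Lookup table: best action index for every non-terminal state."""
    return [max(range(len(ACTIONS)), key=lambda a: q[s][a])
            for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

Although the agent only ever acts at random here, the greedy policy read from the table moves right in every state, which illustrates the off-policy property described in Section 3.1.3. In practice a mix of exploration and exploitation would normally replace the purely random action selection.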
Table 7: Set of states. Rows are TTT values (index, seconds); columns are hys values (index, dB).

HYS index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
HYS (dB):  0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10

TTT index  TTT (s)  States
0   0.0    000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020
1   0.04   021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041
2   0.064  042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062
3   0.08   063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083
4   0.1    084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 100 101 102 103 104
5   0.128  105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
6   0.16   126 127 128 129 130 131 132 133 134 135(6) 136(5) 137(8) 138 139 140 141 142 143 144 145 146
7   0.256  147 148 149 150 151 152 153 154 155 156(4) 157 158(1) 159 160 161 162 163 164 165 166 167
8   0.32   168 169 170 171 172 173 174 175 176 177(7) 178(2) 179(3) 180 181 182 183 184 185 186 187 188
9   0.48   189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209
10  0.512  210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230
11  0.64   231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251
12  1.024  252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272
13  1.280  273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293
14  2.56   294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314
15  5.12   315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335
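Table 7 numbers the states row by row, so a state number can be converted to its TTT-hys pair with integer division. A minimal sketch of that mapping follows; the names `state_index` and `state_params` are hypothetical, while the value lists are those of Tables 4 and 5.

```python
# hys values in dB (Table 4): 0.0 to 10.0 in 0.5 dB steps, 21 values.
HYS_DB = [0.5 * i for i in range(21)]

# TTT values in seconds (Table 5), 16 values.
TTT_S = [0.0, 0.04, 0.064, 0.08, 0.1, 0.128, 0.16, 0.256,
         0.32, 0.48, 0.512, 0.64, 1.024, 1.28, 2.56, 5.12]

N_STATES = len(TTT_S) * len(HYS_DB)


def state_index(ttt_idx: int, hys_idx: int) -> int:
    """State number for a (TTT index, hys index) pair, numbered row by row."""
    return ttt_idx * len(HYS_DB) + hys_idx


def state_params(state: int) -> tuple:
    """Return (TTT in seconds, hys in dB) for a state number."""
    ttt_idx, hys_idx = divmod(state, len(HYS_DB))
    return TTT_S[ttt_idx], HYS_DB[hys_idx]


print(N_STATES)
print(state_params(157))
```

For instance, state 157 sits in row 7 and column 10 of Table 7, which corresponds to a TTT of 0.256 s and a hys of 5.0 dB.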
3.2.2. Set of actions:

An action within the model can move to any other state that differs by one of the following changes to the handover parameters:

1. A single value increase of hys.
2. A single value increase of TTT.
3. A single value increase of both TTT and hys.
4. A single value decrease of hys.
5. A single value decrease of TTT.
6. A single value decrease of both TTT and hys.
7. A single value increase of TTT and a single value decrease of hys.
8. A single value increase of hys and a single value decrease of TTT.

For example, if the learning agent is in state 157, where the TTT equals 0.256 s and the hys equals 5.0 dB, and performs action 3 from the list above (a single value increase of both TTT and hys), then the new TTT equals 0.32 s and the new hys equals 5.5 dB, which is state 179. The possible next states for state 157, with the corresponding action in parentheses, are:

{135(6), 136(5), 137(8), 156(4), 158(1), 179(3), 178(2), 177(7)}

Figure 15: State 157 possible actions (hys axis from 4.5 to 5.5 dB, TTT axis from 0.16 to 0.32 s; state S157 surrounded by S135, S136, S137, S156, S158, S177, S178 and S179).

The full list of hys values can be seen in Table 4 and the full list of TTT values can be seen in Table 5. Having the actions only change the parameters by one increase or decrease of the
TTT and hys values each time not only allows for more refined optimization of the parameters but also ensures that no large changes can suddenly happen.

3.2.3. Reward:

Due to the nature of this kind of problem, the reward gained by an action is dynamic and is likely to be different each time the action is taken. Rewards are based on the number of drops and ping-pongs accumulated in the simulation for the current state in the environment model. The rewards are defined by the following equation:

$$Reward = \frac{Handover_{successful}}{10 \cdot Drops + 2 \cdot PingPongs} \tag{10}$$

The coefficients in Equation 10 are 10 for drops and 2 for ping-pongs. Drops are extremely bad for the QoS of a communication system, so they are given a large weight. Ping-pongs are multiplied by 2 in order to cancel out the successful handover that was caused by the ping-pong and to give the agent a penalty.

The reward is given to the agent and the Q-Value for that state is updated just before the agent selects the next action to take. The agent then selects new actions in discrete time steps, which allows the simulation to run for fixed periods of time with the TTT-hys pair specified by a state in the environment model.

After the agent has been given enough time to try every action at least once, the Q-Learning agent generates a policy. This policy can then be used to attempt to optimize the handover parameters by changing the TTT and hys values after a call is dropped or the connection ping-pongs between base stations. The Q-Learning agent still receives rewards every time a call is dropped or the connection ping-pongs while following the generated policy. This allows the system to keep learning, even after the initial learning process that generated the policy.

3.3. Simulation & Performance evaluation:

The simulation is a very important part of the project.
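The action set of Section 3.2.2 and the reward of Equation 10 translate into a few lines of code. The sketch below is illustrative only and rests on two assumptions that the text does not settle: moves that would leave the 16 by 21 grid are simply unavailable, and a period with no drops and no ping-pongs returns the number of successful handovers instead of dividing by zero. All names are hypothetical.

```python
N_TTT, N_HYS = 16, 21   # grid size: 16 TTT values by 21 hys values

# (TTT index change, hys index change) for the eight single-step moves.
MOVES = [(dt, dh) for dt in (-1, 0, 1) for dh in (-1, 0, 1) if (dt, dh) != (0, 0)]


def neighbor_states(state):
    """States reachable from `state` by one single-step action."""
    ttt_idx, hys_idx = divmod(state, N_HYS)
    reachable = set()
    for dt, dh in MOVES:
        t, h = ttt_idx + dt, hys_idx + dh
        # Assumption: moves that would leave the grid are not available.
        if 0 <= t < N_TTT and 0 <= h < N_HYS:
            reachable.add(t * N_HYS + h)
    return reachable


def reward(successful, drops, ping_pongs):
    """Equation 10: successful handovers over the weighted failure count."""
    penalty = 10 * drops + 2 * ping_pongs
    if penalty == 0:
        # Assumption: the zero-penalty case is not specified in the text.
        return float(successful)
    return successful / penalty


print(sorted(neighbor_states(157)))
print(reward(successful=40, drops=1, ping_pongs=5))
```

State 157 yields the eight neighbors listed in Section 3.2.2, while a corner state such as 0 has only three under the boundary assumption. The sample reward shows that one drop weighs as much as five ping-pongs in the denominator.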
It is required to provide the basic functionality of an LTE network. For simplicity the simulation was broken down into two main components: the mobile (UE) and the base station (eNodeB). Since the project revolves around the handover process in LTE, it made sense for these to be the two main components of the simulation: it is the mobile that triggers the measurement report and the base station that decides whether a handover should take place or not. Each base station is also given its own Q-Learning agent, since each base station is unique. Since the A3 event trigger (Table 6) is the most common, it was decided that it would