Optimal thresholds for Priority Flow Control in long-
range Fibre Channel over Ethernet links
Ahmed Ayman
Master’s Thesis
Universidad Carlos III de Madrid
Avda. Universidad 30, Leganes (Madrid), Spain
ahmed.ayman11100@gmail.com
Abstract—This paper investigates how Fibre Channel over
Ethernet can be used to extend Storage Area Networks (SANs) to a
Metropolitan Area Network (MAN) environment. It studies the
performance of Priority Flow Control (PFC) and Converged
Enhanced Ethernet (CEE) in an attempt to obtain a lossless
Ethernet-based SAN that costs much less than the traditional
Fibre Channel (FC) technologies currently used in SANs. In
particular, this paper provides an analytical model for the
optimal thresholds that regulate the PFC flow control
mechanism employed by FCoE, and evaluates how link distance
affects the buffer requirements of FCoE switches in MAN
scenarios. The model is then validated by means of simulation.
Keywords—Fibre Channel over Ethernet (FCoE), delay,
Priority Flow Control (PFC), Converged Enhanced Ethernet
(CEE), lossless Ethernet, Metropolitan Area Network (MAN).
I. INTRODUCTION
Nowadays, there is a clear trend for enterprises to use
Storage Area Networks (SANs) to manage and store their data,
which has given rise to a high demand for this kind of network
[1]. Data centers have usually deployed more than one type of
technology: Fibre Channel (FC) is commonly used for the
SANs of these data centers, while Ethernet is used to
interconnect the servers in a Local Area Network (LAN) [4].
Although this configuration has been working fine for many
years, deploying multiple technologies has several
disadvantages. For instance, operations, administration and
management (OAM) become harder and more problematic, and
certain devices such as network adapters and switches need to
be replicated because they differ according to the technology
used [2]. This duplication is needed because IEEE 802.3
Ethernet provides a link-layer service that does not guarantee
lossless transmission, as it relies on retransmission by upper
layers (e.g. TCP) to handle lost or dropped frames. All these
issues are among the problems of today's SANs, and they
require additional time, money and effort [1] [2] [3].
Due to these problems, there is a clear interest in a more
cost-effective solution that merges the data and storage
networks of data centers into a single infrastructure offering
both flexibility and low cost, by using a protocol that provides
high performance and reliability without the need for
retransmission when data is lost, since retransmission incurs
too much delay for storage applications.
II. BACKGROUND AND STATE OF THE ART
This section provides some background about Priority Flow
Control (PFC), Converged Enhanced Ethernet (CEE) and Fibre
Channel over Ethernet (FCoE), and gives an overview of the
related work that has been done in this field.
Traditional IEEE 802.3 Ethernet has been used in enterprise
networks for quite some time. However, today’s enterprise data
centers require lossless connection due to SAN requirements
[5]. This is where Converged Enhanced Ethernet (CEE) comes
into play. CEE is an enhancement of classical Ethernet
technology that has been developed to allow the consolidation
of different varieties of applications in data centers into a single
common interconnection. This is very appealing to enterprise
data centers, as it has many advantages. For instance it makes
the administration of the data centers a more easier task, as it
does not have to deal with multiple technologies and the
interconnections between them, but rather a single, common
underlying “fabric”. It also helps enterprises to save a lot of
money and time as now the physical components required in
these data centers belong to the same type of technology and
thus can be unified.
Priority Flow Control (PFC) belongs to this initiative, as it
allows merging different types of traffic onto a single link
without having to stop high-priority traffic when the link is
congested, as occurs with traditional Ethernet flow control. The
main technology that employs PFC is Fibre Channel over
Ethernet (FCoE), which encapsulates Fibre Channel storage
data into Ethernet frames. This makes it possible to introduce
an Ethernet-based SAN into a data center that uses FC without
altering its current data infrastructure, making FCoE an ideal
upgrade path for traditional FC. But the most important
advantage that PFC provides over the technologies used in
today’s data centers is that it ensures a lossless interconnection
by performing hop-by-hop flow control. Thus, CEE enables a
cost-efficient lossless Ethernet service over an infrastructure
that is flexible and reliable, with high performance and low
latency.
The main cause of packet loss in a typical Ethernet network
with multiple switching hops between the sender and the
receiver is congestion, since there is no feedback indicating
that the links between those hops are overloaded. It is therefore
common for transmitters to inject frames into the network
faster than the receiver can forward or process them, which
causes the buffers at the receivers to saturate and eventually
drop the packets that are still in flight on the link, making the
transmission unreliable.
The Data Center Bridging (DCB) task group was formed
by the IEEE to address such problems. One of the standards it
developed is Priority Flow Control (PFC). In particular, the
PFC IEEE 802.1Qbb standard proposes a mechanism that
is based on the basic IEEE 802.3x flow control protocol. More
specifically, it uses two distinct thresholds at which an
Ethernet switch at each hop along the communication path
sends feedback upstream to control the data flow. These are
the buffers’ High Threshold (THigh) and the Low Threshold
(TLow). These thresholds are associated with two types of control
messages, the PAUSE and the RESUME frames. PFC defines
how the system processes the arrival of these control frames in
order to achieve lossless communication while maintaining
an acceptable QoS.
Figure 1 shows how the PAUSE frame used by the PFC
protocol compares with a normal MAC control frame and an
IEEE 802.3x PAUSE frame.
Figure 1: 64-byte MAC control frame used by PFC compared to MAC control
frame and IEEE 802.3x PAUSE frame
As shown in Figure 1, the PFC PAUSE frame carries up to 8
Class of Service (CoS) fields. This is a great advantage, as the
system now has the ability not only to control the flow of all
data on the link, but also to control each specific type of
traffic. This allows high-priority applications (e.g. SANs) to
coexist on the same link with low-priority traffic, by pausing
only the low-priority applications that are causing the buffers
to saturate. This significantly enhances the Quality of Service
(QoS) of CEE [1].
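To make the frame layout concrete, the following sketch (illustrative only, not
taken from this paper) assembles a PFC PAUSE frame following the IEEE 802.1Qbb
layout compared in Figure 1: MAC Control destination address 01-80-C2-00-00-01,
EtherType 0x8808, PFC opcode 0x0101, a priority-enable vector and eight 2-byte
per-priority pause timers, padded to the 64-byte minimum frame size (the FCS is
added by the MAC). The field values used in the example call are hypothetical.

    import struct

    def build_pfc_pause(src_mac: bytes, pause_quanta: list) -> bytes:
        """Build a PFC PAUSE frame; pause_quanta holds 8 per-priority timers
        in units of 512 bit times (0 means that priority may resume)."""
        assert len(src_mac) == 6 and len(pause_quanta) == 8
        dst = bytes.fromhex("0180c2000001")                    # MAC Control multicast address
        header = dst + src_mac + struct.pack("!H", 0x8808)     # EtherType: MAC Control
        enable = sum(1 << i for i, q in enumerate(pause_quanta) if q > 0)
        payload = struct.pack("!HH8H", 0x0101, enable, *pause_quanta)
        frame = header + payload
        return frame + bytes(60 - len(frame))                  # pad; the FCS brings it to 64 bytes

    # Pause only priority 3 (e.g. the FCoE class) for the maximum time:
    frame = build_pfc_pause(bytes(6), [0, 0, 0, 0xFFFF, 0, 0, 0, 0])
    print(len(frame) + 4)                                      # -> 64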
The High Threshold (THigh) is set in such a way that when
the buffer exceeds it, the receiver node sends a PAUSE frame
upstream to the previous nodes to alert them to stop sending
more frames on the link as the buffer is about to be filled.
While calculating the THigh, the latency of the link and data rate
play an important role. While the transmitter is in a pause state,
the buffer of the receiver switch is getting depleted as frames
that were buffered are being processed and forwarded. When
the buffer size goes below the Low Threshold (TLow), another
control frame called RESUME is generated and sent upstream
to tell the sender that now there is enough space in the buffer to
start sending data. A more detailed explanation of these
thresholds, and of how to calculate their optimum values, is
given in the next section.
Figure 2 provides a brief overview of how the PFC
protocol and the different buffer thresholds work together as
a system. At first, the host is generating and sending frames
across the link to the switch. The switch stores the incoming
frames in its buffer until they can be forwarded. When the
occupancy of the buffer exceeds the High Threshold, the
switch sends back a PAUSE frame on the link to notify the
host to stop sending any more frames. However, while the
PAUSE frame travels to the host, the host keeps generating and
sending frames. The frames generated between the moment the
buffer occupancy reaches the High Threshold and the moment
the PAUSE frame reaches the host must be taken into account
when setting the High Threshold of the switch buffer, so that
they can still be stored when they arrive at the
switch. While the host is in the pause state, the switch
continues processing and forwarding the buffered frames. This
continues until the occupied storage in the buffer falls below
the Low Threshold. When this happens, the switch sends back
a RESUME frame to the host, to notify it that it is now safe for
it to start sending frames again. As soon as the RESUME
frame reaches the host, it continues generating and forwarding
the frames normally till it receives a PAUSE frame again.
Figure 2: An overview of how the Priority Flow Control (PFC)
protocol uses THigh and TLow to work together as a system
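The behaviour described above can be summarised as a small state machine per
receive queue. The following minimal model (a simplification written for this
text, not the simulation code used later) counts occupancy in frames and emits
PAUSE when the High Threshold is crossed and RESUME when the occupancy drains
below the Low Threshold.

    class PfcQueue:
        """Toy model of one PFC-controlled receive queue (occupancy in frames)."""
        def __init__(self, t_high, t_low, q_max):
            assert 0 <= t_low < t_high <= q_max
            self.t_high, self.t_low, self.q_max = t_high, t_low, q_max
            self.occupancy = 0
            self.upstream_paused = False
            self.dropped = 0

        def on_arrival(self):
            """A data frame arrives from the upstream node."""
            if self.occupancy >= self.q_max:       # buffer exhausted: frame is lost
                self.dropped += 1
                return
            self.occupancy += 1
            if self.occupancy >= self.t_high and not self.upstream_paused:
                self.upstream_paused = True
                self.send_control("PAUSE")         # ask the previous hop to stop

        def on_departure(self):
            """A buffered frame has been forwarded downstream."""
            if self.occupancy == 0:
                return
            self.occupancy -= 1
            if self.occupancy <= self.t_low and self.upstream_paused:
                self.upstream_paused = False
                self.send_control("RESUME")        # the previous hop may transmit again

        def send_control(self, kind):
            print(kind, "sent upstream at occupancy", self.occupancy)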
III. OPTIMAL BUFFER SIZE FOR CEE PRIORITY FLOW
CONTROL
In this section, we formulate a model to calculate the
optimum High Threshold (THigh) and Low Threshold (TLow) at
which PAUSE and RESUME messages should be generated,
while guaranteeing a lossless connection on Fibre Channel
over Ethernet (FCoE) links using PFC. These equations take
into account the relevant parameters of the links used, such as
the transmission rate, the latency of the link, the size of the
data packets generated, etc.
A. High Threshold Buffer Size (THigh)
The High Threshold is set at the receiver buffer of the
intermediate switches along the path of the traffic between the
source node, for example a host, and the sink node, such as a
server or a storage array. THigh is an essential limit of the
buffer, which is necessarily lower than the maximum buffer
size of the switches used. When THigh is reached, a PAUSE
message is sent back to notify the previous node to stop sending
more traffic, as the buffer is close to being filled. This approach
ensures that there will be no packet loss due to frames being
dropped at the intermediate switches when the receiver buffers
are exhausted.
THigh should satisfy two important constraints:
• The first constraint is that PAUSE frames have to
be sent early enough to ensure that the maximum
buffer size will not be exceeded by the frames
received by the switch. It must take into account
that more frames are being sent by the source and
are in flight on the link, because the source keeps
generating and sending frames during the
propagation time required for the PAUSE message
to reach it and make it stop sending additional
traffic.
• The second constraint that THigh should satisfy is
that it has to be as high as possible, so as not to
severely decrease the throughput of the system.
Since the links are idle and unused during the
pause period, a lower-than-necessary THigh would
be reached faster and hence more PAUSE
messages would be generated, causing the sender
to stop sending traffic more frequently. This would
result in a lower throughput for the whole system.
It is therefore essential to obtain an optimum value of THigh
that guarantees no packet dropping at the switches while not
sacrificing the overall throughput. For that, the following
equations are used.
THigh = QMax - QHigh (1)
Where,
QHigh = LPAUSE + 2*LDATA + 2*TProp*RAB (2)
By using both (1) and (2),
THigh = QMax – (LPAUSE + 2*LDATA + 2*TProp*RAB) (3)
Equation (3) is used to calculate the optimum THigh for any
link segment. QMax is the maximum size of the receiver
buffer. LPAUSE is the size of the PAUSE frame in bits, which is
defined to be 64 bytes (512 bits). LDATA is the Maximum
Transmission Unit (MTU) of the data packets being sent on the
segment, in bits. TProp is the propagation delay of the link in
seconds, which mainly depends on its length. RAB is the
transmission rate of the link in bits per second.
QHigh is a predictor term that defines the buffer space that the
receiver must keep free, given the distance of the link, the
transmission rate, the size of the PAUSE message and the
MTU of the data packets, for the frames that are in flight and
have not yet reached it. Hence, to calculate the actual
threshold, QHigh should be subtracted from QMax, as shown by
equation (1).
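As an illustration (not part of the original model description), equations
(1)-(3) can be transcribed directly into code; lengths are in bits, times in
seconds and rates in bits per second, as in the definitions above.

    L_PAUSE = 64 * 8   # PAUSE frame size: 64 bytes = 512 bits

    def q_high(l_data, t_prop, r_ab):
        """Equation (2): buffer space that must remain free when the PAUSE is sent."""
        return L_PAUSE + 2 * l_data + 2 * t_prop * r_ab

    def t_high(q_max, l_data, t_prop, r_ab):
        """Equations (1) and (3): THigh = QMax - QHigh."""
        return q_max - q_high(l_data, t_prop, r_ab)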
QHigh takes into account the round-trip time of the
PAUSE exchange. When THigh is reached as frames are added
to the buffer, the PAUSE message is sent to the previous hop,
but at the moment the PAUSE message reaches the source, the
source might have just started sending the first bit of a new
data packet. As the aim of this paper is to ensure a lossless
transmission mechanism, the source cannot abort a transmission
once it has started, so that transmission must be completed. The
same situation can occur when sending the PAUSE frame
itself. To cover this worst case, twice the maximum
transmission delay (the 2*LDATA term) is used in the formula.
B. Low Threshold Buffer Size (TLow)
The Low Threshold (TLow) is another important parameter
defined for the PFC buffers of the intermediate nodes along the
path of the FCoE traffic. It is triggered, or reached, while the
system is in the pause state. After a switch’s buffer reaches
THigh, a PAUSE message is sent and the transmission is
temporarily paused, giving the intermediate nodes time to
forward the frames they hold in their buffers. TLow must have a
lower value than THigh, so that as the packets in the buffers are
processed, the buffer is gradually depleted. When the buffer
occupancy hits TLow, a RESUME message is generated and sent
back to the previous node to notify it that the buffer now has
free space to accept more packets. When the RESUME message
reaches the previous hop, that hop leaves the pause state and
continues the transmission as usual. The following equation
gives the optimum value of TLow:
TLow = (LRESUME / RAB + 2*TPropAB) * RBCmax + LDATA (4)
When calculating TLow, it is important to take into account
the data rate of the previous link (RAB) as well as the
maximum rate of the following links (RBCmax), since the rate of
the following link directly affects the speed at which the switch
forwards the buffered frames. The higher the data rate of the
following link, the faster the buffered packets are transmitted,
and hence the sooner there is space available in the buffer to
accept more packets. A higher data rate on the following link
therefore allows the buffer to reach TLow faster, at which point
it makes sense to resume the sending of packets; this increases
efficiency, as the links do not stay idle for extended periods of
time. This makes TLow an important parameter for increasing
the overall throughput of the system, as will be
shown in the next section.
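Equation (4) can be transcribed in the same way (illustration only); LRESUME is
the size of the RESUME frame in bits, which, like the other control frames, is
64 bytes, r_ab is the rate of the upstream link and r_bc_max is the maximum rate
of the downstream links.

    L_RESUME = 64 * 8   # RESUME frame size: 64 bytes = 512 bits

    def t_low(l_data, t_prop_ab, r_ab, r_bc_max):
        """Equation (4): occupancy at which the RESUME frame should be generated."""
        return (L_RESUME / r_ab + 2 * t_prop_ab) * r_bc_max + l_data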
If any buffer size were available, choosing QMax as large as
possible would of course be the best decision. It ensures that
THigh is reached after a longer time, which means that the
system stays in the transmitting state as long as possible, which
in turn increases the overall efficiency and throughput. On the
other hand, there is a minimum value that QMax must take,
which depends on both QHigh and TLow. The following equation
gives the minimum value of QMax:
QMax ≥ QHigh + TLow (5)
The maximum size of the buffer, QMax, must be greater than
or equal to the sum of QHigh and TLow. This constraint is
important, as it guarantees that THigh is always greater than
TLow, and that during the pause period the buffer will
eventually drain below TLow and the RESUME message will be
sent, so that the system does not remain in a permanent pause
state.
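Using the two helper functions sketched above, constraint (5) becomes a one-line
feasibility check (again an illustration, not part of the original text):

    def q_max_is_sufficient(q_max, l_data, t_prop_ab, r_ab, r_bc_max):
        """Constraint (5): QMax must be at least QHigh + TLow for PFC to work
        with the optimum thresholds on this link."""
        return q_max >= (q_high(l_data, t_prop_ab, r_ab)
                         + t_low(l_data, t_prop_ab, r_ab, r_bc_max))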
IV. EVALUATION
In this section, the theoretical model presented in the
previous section is validated by means of simulation,
comparing the simulation results with the theoretical values
given by the equations. After surveying the available
simulation tools, OMNeT++ [7] was chosen to investigate the
performance of PFC, by creating a simple simulation
environment consisting of three main modules, namely a
Host, a Switch and a Storage Server. These modules were
connected using FCoE and FC links. The code was written to
be generic with respect to the case being studied, which allows
multiple scenarios with different parameters to be investigated.
SimSANs [8] was another simulation tool that was initially
considered, but it was replaced by OMNeT++ because it did
not provide the configuration capabilities required to create
a simulation covering our investigation.
In this evaluation, two main points are assessed and their
results presented. The first is how varying the High Threshold
affects packet loss. The second is whether the Low Threshold
has any significant effect on the overall throughput of the
system.
A. Validating the High Threshold formula (THigh)
In order to validate equation (2) while constraint (5)
holds, calculations were performed for several data rates and
propagation delays. In each of these cases the optimum THigh
was calculated prior to the simulations, and a range of THigh
values was then simulated for each case to obtain a graph of
how the system behaves as THigh varies. In each case, the
percentage of packets dropped at the intermediate Switch was
recorded. Figure 3 shows the network used in these
simulations.
The equations proposed in the previous section are intended
to examine the hypothesis that PFC can be used as a lossless
communication protocol so that FCoE can be extended to a
Metropolitan Area Network (MAN) scenario; the code
created is generic and can be extended to multiple nodes and
networks. However, it was decided that the simple network
shown in Figure 3 is enough as a first step for studying the
general problem. The Host continuously generates traffic
and sends it on the link at a rate that can be changed
according to the parameters of each experiment. Also, the
data frame length was fixed at 2240 bytes, as this is the FC
MTU [1].
The simulation was executed for multiple scenarios,
each with different link data rates and propagation delays.
In parallel, equation (2) was used to calculate the
optimum value of THigh in each of these cases. For this
specific experiment, in which the behavior of the system was
checked while varying THigh for different data rates and
propagation delays, the QMax of the intermediate switch’s
buffer was set to 480 KB, as suggested by [1], in an attempt to
make the study match current switch technology. Also, the data
frames used had a fixed length of 2240 bytes and
the control frames were fixed at 64 bytes. Table I shows the
optimum values of THigh for each of the studied cases.
As the frame size used is fixed, it is easy to express
THigh in terms of frames. For a system that uses variable-length
data frames, using the maximum frame length suffices in the
calculation. In Table I, for example, for
“10 Gbps/1 ms” the calculated QHigh is greater than QMax, which
violates constraint (5), and thus THigh is out of range, as it
would exceed the actual buffer size. The THigh values shown in
the table indicate how many frames a switch with a 480 KB
buffer can accept before the PAUSE control frame is sent to the
previous node to pause transmission. For example, in the
case of “8 Gbps/100 us”, the maximum number of frames that a
480 KB buffer can hold is 480000/2240 = 214 frames, and THigh
is calculated to trigger when the buffer is filled with 91
frames. This means that, in this case, memory for 214-91 = 123
frames is reserved for the frames that are in
flight on the link and those that will be generated and sent
during the time it takes the PAUSE frame to reach the
previous node.
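As a quick illustration of this arithmetic (not taken from the paper), the 480 KB
buffer capacity and the headroom of equation (2) can be expressed in 2240-byte
frames; for the 8 Gbps / 100 us case, equation (2) evaluates to roughly 91 frames,
the value listed in Table I for that case.

    FRAME_BITS = 2240 * 8                          # fixed FC frame size used here

    q_max_frames = (480_000 * 8) // FRAME_BITS     # 480 KB buffer -> 214 frames
    q_high_bits = 512 + 2 * FRAME_BITS + 2 * 100e-6 * 8e9   # equation (2), 8 Gbps / 100 us
    print(q_max_frames, round(q_high_bits / FRAME_BITS))     # -> 214 91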
Figure 3: The simple Storage Area Network used in the simulations to
validate the proposed model
TABLE I. THE ANALYTICAL OPTIMUM T_HIGH FOR DIFFERENT DATA RATES
AND PROPAGATION DELAYS (QMAX = 480KB).
Now that the analytical values of THigh are known, multiple
simulations were performed with the same parameters to check
whether the equations are sound. Figure 4 shows the results of
these simulations.
Figure 4 shows the percentage of packets lost at the
intermediate Switch when varying THigh for different data rates
and propagation delays. The optimum THigh is the threshold
below which packets are dropped at the receiver buffer because
not enough memory is reserved for the frames that are in flight
on the link. This means that, starting from the optimum THigh,
the percentage of dropped packets should be zero. For example,
in Figure 4, in the case of “8 Gbps, 100 us”, when THigh is set
to 0 the percentage of lost packets is 30%; when THigh is set to
91 frames, the percentage of frames lost first reaches 0%. This
is expected, because it is the same result given by equations (2)
and (5). These results show that equation (2) is sound, as the
results from Table I and Figure 4 are consistent.
B. Validating the Low Threshold formula (TLow)
As stated in the previous section, the Low Threshold’s
main function in this protocol is to optimize the throughput of
the system, that is, to utilize the links up to the maximum
allowed by the protocol while maintaining its main
characteristic of being a lossless communication protocol.
While the system is in a pause state, the links in the paused
segment of the path are either not used at all or used only to
send control frames; no actual data is being transmitted.
If no Low Threshold is set in the intermediate switches, the
pause state started by a switch remains in effect until all the
frames stored in its buffer have been processed and the buffer
is completely empty (i.e. TLow = 0).
Setting a higher Low Threshold makes the pause
state much shorter. When the system is in a pause state, the
switch processes the frames stored in the buffer and depletes
the buffer until it hits the Low Threshold. When the Low
Threshold is reached, the Switch sends a RESUME control
frame upstream on the link to notify the previous hop that the
buffer has enough free space, so that transmission can start
again. When a paused node receives a RESUME frame, it
resumes sending the packets stored in its queue, or generating
new data packets, as usual, until it receives another PAUSE
frame.
TABLE II. THE ANALYTICAL OPTIMUM T_LOW FOR DIFFERENT DATA RATES
AND PROPAGATION DELAYS.
To validate equation (4), formulated in the previous
section, the equation was used to obtain the optimum Low
Threshold in multiple cases, over a range of link data rates and
delays. The optimum values produced by the equation for each
case were then compared with the simulation results for those
cases. The results in Table II show the optimum Low
Threshold for a host generating packets at 2 Gbps with a
bottleneck link of 1 Gbps, as well as the
optimum Low Threshold for a host generating at 5 Gbps with a
bottleneck link of 4 Gbps, as calculated using
equation (4). In these experiments THigh was set to its optimal
value.
Link data rate, propagation delay | Optimum T_High (2240-byte frames)
10 Gbps, 1 ms                     | QHigh > QMax (214 frames)
8 Gbps, 1 ms                      | QHigh > QMax (214 frames)
4 Gbps, 1 ms                      | QHigh > QMax (214 frames)
10 Gbps, 100 us                   | 113 frames
8 Gbps, 100 us                    | 91 frames
4 Gbps, 100 us                    | 46 frames
10 Gbps, 10 us                    | 13 frames
8 Gbps, 10 us                     | 10 frames
4 Gbps, 10 us                     | 6 frames
10 Gbps, 1 us                     | 3 frames
8 Gbps, 1 us                      | 3 frames
Data rates                                 | Optimum TLow (2240-byte frames)
2 Gbps generating host, 1 Gbps bottleneck  | 112 frames
5 Gbps generating host, 4 Gbps bottleneck  | 447 frames
10 Gbps generating host, 4 Gbps bottleneck | 447 frames
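As a cross-check (illustration only), evaluating equation (4) with the rates
above and an assumed one-way propagation delay of 1 ms, which is not listed in
the table but reproduces its values, yields the same frame counts:

    FRAME_BITS, L_RESUME, T_PROP = 2240 * 8, 64 * 8, 1e-3   # 1 ms delay assumed

    def t_low_frames(r_ab, r_bc_max):
        bits = (L_RESUME / r_ab + 2 * T_PROP) * r_bc_max + FRAME_BITS   # equation (4)
        return int(bits // FRAME_BITS)

    print(t_low_frames(2e9, 1e9))     # -> 112  (2 Gbps host, 1 Gbps bottleneck)
    print(t_low_frames(5e9, 4e9))     # -> 447  (5 Gbps host, 4 Gbps bottleneck)
    print(t_low_frames(10e9, 4e9))    # -> 447  (10 Gbps host, 4 Gbps bottleneck)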
Figure 4: Percentage of frames lost at the receiver switch queue while varying THigh
To validate the analytical results shown in Table II, a
simulation was run with the same parameters used to calculate
the optimum Low Threshold, to check whether the values
produced by the equations were in fact the optimum ones.
Figure 5-a shows the configuration of the network used in the
first case, where the host generates packets at a rate of 2 Gbps,
while Figure 5-b shows the other case, where the host generates
at a rate of 5 Gbps.
The results of these simulations are plotted in Figure 6.
The results from Figure 6 agree with the calculated results
from Table II. Figure 6 shows how the throughput varies as a
range of Low Thresholds is set in the switches. The figure
clearly shows the optimum points after which the throughput
settles at the optimum value for the system. Taking for example
the case of the “2 Gbps generating host”, Table II shows that
the optimum Low Threshold is calculated to be 112 frames.
The results from Figure 6 show that the throughput stabilizes
at 1 Gbps at the point corresponding to a Low Threshold of
113 packets. That the throughput of the system stabilizes at
1 Gbps is expected, as the bottleneck link along the path is a
1 Gbps link [6]. In the case of the “10 Gbps generating host”,
which uses the network in Figure 5-b, the optimal Low
Threshold is the same as in the 5 Gbps case. This is expected,
as the equation takes into account the link rates rather than the
generation rate. Figure 6 shows that in the “10 Gbps generating
host” case the throughput stabilizes at 4 Gbps, which is the
bottleneck rate of the path, at the same optimal value as in the
“5 Gbps generating host” case, although the improvement in
throughput is very small and almost insignificant. This shows
that at higher packet generation rates the effect of the Low
Threshold is less noticeable: since many more packets are
generated, the pause state does not have the same significant
effect on the throughput as it does at lower generation rates,
where fewer packets are generated and stopping for a while has
a clear effect on the overall throughput. These results show that
the equations presented in Section III are valid and can be used
to predict the resources that should be allocated to support a
specific scenario. Hence, enterprises and network
administrators can easily determine the lowest buffer size
needed to satisfy a certain demand.
Figure 5-a: Network and link parameters used for a 2 Gbps generating host
Figure 6: How varying the Low Threshold affects the Throughput
Figure 5-b: Network and link parameters used for a 5 Gbps generating host
Figure 7 was created to give enterprises and network
administrators a broader perspective on the buffer storage
needed in MAN scenarios, either in terms of the distance to be
covered or in terms of delay.
For example, if a network administrator wants to connect two
Data Centers with a 10 Gbps link while using 4 Gbps links
inside the Data Centers, and the inter-DC distance is
150 km, then the storage buffers should not be smaller
than around 9000 KB. For the same configuration, if the inter-
DC delay is 1 ms, then the storage buffers should not be
smaller than 3500 KB for the PFC protocol to function
without problems. Therefore, we believe that PFC can be
successfully used to extend FCoE to MAN scenarios, although
larger buffers will be needed.
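For illustration (a sketch under the stated assumptions, not the tooling used in
this work), the minimum buffer of constraint (5) can be evaluated for the 1 ms
inter-DC case mentioned above (10 Gbps inter-DC link, 4 Gbps links inside the
Data Centers, 2240-byte frames, 64-byte control frames); the result is close to
the roughly 3500 KB figure read from Figure 7.

    L_CTRL, L_DATA = 64 * 8, 2240 * 8            # control and data frame sizes in bits
    T_PROP, R_AB, R_BC = 1e-3, 10e9, 4e9         # 1 ms one-way delay, link rates in bit/s

    q_high_bits = L_CTRL + 2 * L_DATA + 2 * T_PROP * R_AB        # equation (2)
    t_low_bits = (L_CTRL / R_AB + 2 * T_PROP) * R_BC + L_DATA    # equation (4)
    print(round((q_high_bits + t_low_bits) / 8 / 1000), "KB")    # -> about 3507 KB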
V. CONCLUSION AND FUTURE WORK
The aim of this paper has been to study Priority Flow
Control (PFC) for a converged, FCoE-based Data Center. PFC
has multiple advantages over traditional protocols such as
classical Ethernet or Fibre Channel. The main advantage is
that it enables cheap, lossless communication between the
sender and the receiver without retransmissions. This paper has
introduced a model that can be used to calculate the High and
Low Thresholds that this protocol uses. A simulation
environment was created with OMNeT++ and used to validate
these equations through a series of simulation experiments, by
comparing the analytical results with the experimental results
produced by the simulator. The conclusion is that the proposed
equations are valid and can be used by enterprises and network
administrators to predict the resources needed to meet certain
requirements. It was also found that the PFC protocol can be
used in Metropolitan Area Network (MAN) scenarios to
connect Data Centers in the same city (100 km – 300 km), and
can be extended to cover a wider geographical area when
switches with higher buffer capacity become available.
Regarding how this study may be further improved, future
work may include extending the PFC implementation to test
more complex network architectures with multiple
interconnected nodes, rather than a single path. Further studies
to model the throughput of the protocol given the parameters
of the network may also help enterprises achieve their target
Quality of Service (QoS) with smaller buffers. Finally, testing
the PFC protocol with variable-length frames, rather than only
fixed-length frames, would further help in understanding
whether this protocol can be used in everyday scenarios.
ACKNOWLEDGEMENTS
The author would like to thank Prof. Manuel
Urueña for his continuous guidance and knowledge, as well as
the author’s family for their unending support and patience
throughout this thesis.
VI. REFERENCES
[1] Cisco Systems, “Priority Flow Control: Build Reliable Layer 2
Infrastructure”, white paper, 2009.
[2] M. Ko, D. Eisenhauer and R. Recio, “A case for Convergence
Enhanced Ethernet: Requirements and applications”, IEEE
Communications Society, pp. 5702–5707, 2008.
[3] EMC², “Introduction to Fibre Channel over Ethernet (FCoE)”,
white paper, 2011.
[4] G. De Los Santos, M. Urueña, A. Muñoz and J. Hernandez,
“Buffer Design Under Bursty Traffic with Applications in FCoE
Storage Area Networks”, IEEE Communications Letters, vol. 17,
no. 2, pp. 413–416, 2013.
[5] P. Kale, A. Tumma, H. Kshirsagar, P. Ramrakhyani and T.
Vinode, “Fibre Channel over Ethernet: A beginners perspective”,
IEEE International Conference, pp. 438–443, 2011.
[6] J. Jaffe, “Bottleneck flow control”, IEEE Transactions on
Communications, vol. 29, no. 7, pp. 954–962, 1981.
[7] Official website of OMNeT++, http://www.omnetpp.org/
[8] Official website of SimSANs, http://www.simsans.org/
Figure 7: Storage needed for the buffers for a given distance or delay
More Related Content

What's hot

Networking Related
Networking RelatedNetworking Related
Networking Related
ZunAib Ali
 
Frame relay
Frame relay Frame relay
Frame relay
balub4
 
Packet switching
Packet switchingPacket switching
Packet switching
asimnawaz54
 

What's hot (20)

Frame Relay Chapter 04
Frame Relay Chapter 04Frame Relay Chapter 04
Frame Relay Chapter 04
 
INTRODUCTION TO NETWORK LAYER
INTRODUCTION TO NETWORK LAYER INTRODUCTION TO NETWORK LAYER
INTRODUCTION TO NETWORK LAYER
 
10 high speedla-ns
10 high speedla-ns10 high speedla-ns
10 high speedla-ns
 
AN EXPLICIT LOSS AND HANDOFF NOTIFICATION SCHEME IN TCP FOR CELLULAR MOBILE S...
AN EXPLICIT LOSS AND HANDOFF NOTIFICATION SCHEME IN TCP FOR CELLULAR MOBILE S...AN EXPLICIT LOSS AND HANDOFF NOTIFICATION SCHEME IN TCP FOR CELLULAR MOBILE S...
AN EXPLICIT LOSS AND HANDOFF NOTIFICATION SCHEME IN TCP FOR CELLULAR MOBILE S...
 
Networking Related
Networking RelatedNetworking Related
Networking Related
 
HIGH SPEED NETWORKS
HIGH SPEED NETWORKSHIGH SPEED NETWORKS
HIGH SPEED NETWORKS
 
9 lan
9 lan9 lan
9 lan
 
Frame relay
Frame relay Frame relay
Frame relay
 
A THROUGHPUT ANALYSIS OF TCP IN ADHOC NETWORKS
A THROUGHPUT ANALYSIS OF TCP IN ADHOC NETWORKSA THROUGHPUT ANALYSIS OF TCP IN ADHOC NETWORKS
A THROUGHPUT ANALYSIS OF TCP IN ADHOC NETWORKS
 
IEEE 802.11 Architecture and Services
IEEE 802.11 Architecture and ServicesIEEE 802.11 Architecture and Services
IEEE 802.11 Architecture and Services
 
Frame Relay
Frame RelayFrame Relay
Frame Relay
 
10 Circuit Packet
10 Circuit Packet10 Circuit Packet
10 Circuit Packet
 
TCP Fairness for Uplink and Downlink Flows in WLANs
TCP Fairness for Uplink and Downlink Flows in WLANsTCP Fairness for Uplink and Downlink Flows in WLANs
TCP Fairness for Uplink and Downlink Flows in WLANs
 
Network Layer,Computer Networks
Network Layer,Computer NetworksNetwork Layer,Computer Networks
Network Layer,Computer Networks
 
8 Packet Switching
8 Packet Switching8 Packet Switching
8 Packet Switching
 
VIRTUAL CIRCUIT NETWORKS, atm , frame relay
VIRTUAL CIRCUIT NETWORKS, atm , frame relayVIRTUAL CIRCUIT NETWORKS, atm , frame relay
VIRTUAL CIRCUIT NETWORKS, atm , frame relay
 
Unit 5
Unit 5Unit 5
Unit 5
 
Chapter#3
Chapter#3Chapter#3
Chapter#3
 
Multiplexing and switching(TDM ,FDM, Data gram, circuit switching)
Multiplexing and switching(TDM ,FDM, Data gram, circuit switching)Multiplexing and switching(TDM ,FDM, Data gram, circuit switching)
Multiplexing and switching(TDM ,FDM, Data gram, circuit switching)
 
Packet switching
Packet switchingPacket switching
Packet switching
 

Viewers also liked

TIP OF THE DAY series about DIP
TIP OF THE DAY series about DIPTIP OF THE DAY series about DIP
TIP OF THE DAY series about DIP
Darshana Samanpura
 
Talk on Ramanujan
Talk on RamanujanTalk on Ramanujan
Talk on Ramanujan
S Sridhar
 

Viewers also liked (18)

TIP OF THE DAY series about DIP
TIP OF THE DAY series about DIPTIP OF THE DAY series about DIP
TIP OF THE DAY series about DIP
 
Podcast y su uso en la educación
Podcast y su uso en la educaciónPodcast y su uso en la educación
Podcast y su uso en la educación
 
Class 7 bangladesh & global studies capter 9 class 2
Class 7 bangladesh & global studies capter 9 class 2Class 7 bangladesh & global studies capter 9 class 2
Class 7 bangladesh & global studies capter 9 class 2
 
Class 7 bangladesh & global studies capter 10 class 2
Class 7 bangladesh & global studies capter 10 class 2Class 7 bangladesh & global studies capter 10 class 2
Class 7 bangladesh & global studies capter 10 class 2
 
Cory Hunkele PPP
Cory Hunkele PPPCory Hunkele PPP
Cory Hunkele PPP
 
组合 1
组合 1组合 1
组合 1
 
Triple f health club group 2
Triple f health club group 2Triple f health club group 2
Triple f health club group 2
 
Plamanul postoperator
Plamanul postoperatorPlamanul postoperator
Plamanul postoperator
 
Kajian Pendidikan Menyongsong Bonus Demografi
Kajian Pendidikan Menyongsong Bonus DemografiKajian Pendidikan Menyongsong Bonus Demografi
Kajian Pendidikan Menyongsong Bonus Demografi
 
Derecho inquilinario
Derecho inquilinarioDerecho inquilinario
Derecho inquilinario
 
Class 7 bangladesh & global studies chapter 11 class 1
Class 7 bangladesh & global studies chapter 11 class 1 Class 7 bangladesh & global studies chapter 11 class 1
Class 7 bangladesh & global studies chapter 11 class 1
 
Class 7 bangladesh & global studies capter 11 class 2
Class 7 bangladesh & global studies capter 11 class 2Class 7 bangladesh & global studies capter 11 class 2
Class 7 bangladesh & global studies capter 11 class 2
 
La Cronica Baja california mexico
La Cronica Baja california mexicoLa Cronica Baja california mexico
La Cronica Baja california mexico
 
Talk on Ramanujan
Talk on RamanujanTalk on Ramanujan
Talk on Ramanujan
 
Mapa conceptual intro
Mapa conceptual introMapa conceptual intro
Mapa conceptual intro
 
Active pasive-voice-continuous
Active pasive-voice-continuousActive pasive-voice-continuous
Active pasive-voice-continuous
 
Starbucks
StarbucksStarbucks
Starbucks
 
Educación y derechos humanos.
Educación y derechos humanos.Educación y derechos humanos.
Educación y derechos humanos.
 

Similar to AhmedAymanMastersThesis

A novel pause count backoff algorithm for channel access
A novel pause count backoff algorithm for channel accessA novel pause count backoff algorithm for channel access
A novel pause count backoff algorithm for channel access
ambitlick
 
A20345606_Shah_Bonus_Report
A20345606_Shah_Bonus_ReportA20345606_Shah_Bonus_Report
A20345606_Shah_Bonus_Report
Panth Shah
 
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
IJERA Editor
 
A distributed three hop routing protocol to increase the
A distributed three hop routing protocol to increase theA distributed three hop routing protocol to increase the
A distributed three hop routing protocol to increase the
Kamal Spring
 

Similar to AhmedAymanMastersThesis (20)

Frame relay
Frame relayFrame relay
Frame relay
 
A dynamic performance-based_flow_control
A dynamic performance-based_flow_controlA dynamic performance-based_flow_control
A dynamic performance-based_flow_control
 
Quality of Service for Video Streaming using EDCA in MANET
Quality of Service for Video Streaming using EDCA in MANETQuality of Service for Video Streaming using EDCA in MANET
Quality of Service for Video Streaming using EDCA in MANET
 
High speed Networking
High speed NetworkingHigh speed Networking
High speed Networking
 
Data link control
Data link controlData link control
Data link control
 
A novel pause count backoff algorithm for channel access
A novel pause count backoff algorithm for channel accessA novel pause count backoff algorithm for channel access
A novel pause count backoff algorithm for channel access
 
Traffic Engineering in Metro Ethernet
Traffic Engineering in Metro EthernetTraffic Engineering in Metro Ethernet
Traffic Engineering in Metro Ethernet
 
A20345606_Shah_Bonus_Report
A20345606_Shah_Bonus_ReportA20345606_Shah_Bonus_Report
A20345606_Shah_Bonus_Report
 
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
Comparative Analysis of Different TCP Variants in Mobile Ad-Hoc Network
 
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
Proposition of an Adaptive Retransmission Timeout for TCP in 802.11 Wireless ...
 
Unified Computing In Servers
Unified Computing In ServersUnified Computing In Servers
Unified Computing In Servers
 
Frame
FrameFrame
Frame
 
MyThesis
MyThesisMyThesis
MyThesis
 
ENHANCEMENT OF TCP FAIRNESS IN IEEE 802.11 NETWORKS
ENHANCEMENT OF TCP FAIRNESS IN IEEE 802.11 NETWORKSENHANCEMENT OF TCP FAIRNESS IN IEEE 802.11 NETWORKS
ENHANCEMENT OF TCP FAIRNESS IN IEEE 802.11 NETWORKS
 
802 tutorial
802 tutorial802 tutorial
802 tutorial
 
Media Access and Internetworking
Media Access and InternetworkingMedia Access and Internetworking
Media Access and Internetworking
 
A distributed three hop routing protocol to increase the
A distributed three hop routing protocol to increase theA distributed three hop routing protocol to increase the
A distributed three hop routing protocol to increase the
 
Detailed Simulation of Large-Scale Wireless Networks
Detailed Simulation of Large-Scale Wireless NetworksDetailed Simulation of Large-Scale Wireless Networks
Detailed Simulation of Large-Scale Wireless Networks
 
Networking Articles Overview
Networking Articles OverviewNetworking Articles Overview
Networking Articles Overview
 
CCNA Report
CCNA ReportCCNA Report
CCNA Report
 

AhmedAymanMastersThesis

  • 1. 1 Optimal thresholds for Priority Flow Control in long- range Fibre Channel over Ethernet links Ahmed Ayman Master’s Thesis Universidad Carlos III de Madrid Avda. Universidad 30, Leganes (Madrid), Spain ahmed.ayman11100@gmail.com Abstract—This paper investigates how Fibre Channel over Ethernet can be used to extend Storage Area Network (SAN) to a Metropolitan Area Network (MAN) environment. This paper studies the performance of Priority Flow Control (PFC) and Converged Enhanced Ethernet (CEE) in an attempt to obtain a lossless Ethernet based SAN that costs much less than traditional Fibre Channel (FC) technologies currently used in SANs. In particular this paper provides an analytical model for the optimal thresholds that regulates the PFC flow control mechanism employed by FCoE and evaluates how link distance affects the buffer requirements of FCoE switches in MAN scenarios. The model is then validated by means of simulation. Keywords—Fibre Channel over Ethernet (FCoE), delay, Priority Flow Control (PFC), Converged Enhanced Ethernet (CEE), lossless Ethernet, Metropolitan Area Network (MAN). I. INTRODUCTION Nowadays, there is a clear trend for enterprises to use Storage Area Networks (SAN) to manage and store their data. This has given rise to a high demand of this kind of networks [1]. Usually, Data Centers have used more than one type of technology. Fibre Channel (FC) is the commonly used for the SANs of these datacenters, while Ethernet is used to interconnect the servers with a Local Area Network (LAN) [4]. Although this configuration has been working fine for many years, deploying multiple technologies have many disadvantages, such as operations, administration and management (OAM) might get hard and problematic. Also, there are certain devices like network adapters and switches that need to be replicated because they are different according to the technology used [2]. This is needed because IEEE 802.3 Ethernet provides a link layer service that does not guarantee a lossless transmission, as it relies on retransmission from upper layers (i.e. TCP) for handling lost or dropped frames. All these issues are some of the problems of SANs these days that thus require more time, money and effort [1] [2] [3]. Due to these problems, there is a clear interest in finding a solution that is more cost-effective by merging the data and storage networks of data centers into a single infrastructure that assures both flexibility and low cost, by using some protocol that provides high performance and reliability without the need of retransmission when data is lost, which incurs in too much delay for storage applications. II. BACKGROUND AND STATE OF THE ART This section provides some background about Priority Flow Control (PFC), Converged Enhanced Ethernet (CEE) and Fibre Channel over Ethernet (FCoE), and gives an overview of the related work that has been done in this field. Traditional IEEE 802.3 Ethernet has been used in enterprise networks for quite some time. However, today’s enterprise data centers require lossless connection due to SAN requirements [5]. This is where Converged Enhanced Ethernet (CEE) comes into play. CEE is an enhancement of classical Ethernet technology that has been developed to allow the consolidation of different varieties of applications in data centers into a single common interconnection. This is very appealing to enterprise data centers, as it has many advantages. 
For instance it makes the administration of the data centers a more easier task, as it does not have to deal with multiple technologies and the interconnections between them, but rather a single, common underlying “fabric”. It also helps enterprises to save a lot of money and time as now the physical components required in these data centers belong to the same type of technology and thus can be unified. Priority Flow Control (PFC) belongs to this initiative as it allows to merge different types of data into a single link, without the need of stopping high priority traffic if the link is congested, as it occurs in traditional Ethernet flow control. The main technology that employs (PFC) is Fibre Channel over Ethernet (FCoE), which encapsulates Fibre Channel storage data into Ethernet frames. This gives the possibility of introducing an Ethernet-based SAN into a data center that uses FC, without the need to alter its current data infrastructure, making it ideal to be the upgraded technology for traditional FC. But the most important advantage that PFC provides over the technologies used in today’s data centers is that it ensures a lossless interconnection by performing hop-by-hop flow control. Thus, CEE enables a cost-efficient lossless Ethernet service over an infrastructure which is flexible, reliable, presenting high performance and low latency.
  • 2. 2 The main cause of packet loss in a typical Ethernet network that consists of multiple switching hops between the sender and the receiver is congestion, since there is no feedback that indicates the overload of the links between these hops. Thus, it is normal that transmitters inject frames into the network at a rate that is faster than the rate that the receiver can forward or process those frames, which causes the buffers in the receivers to saturate and eventually drop the packets that are still in flight on the link, which makes it an unreliable transmission. The Data Center Bridging (DCB) task group was formed by the IEEE to address such problems. One of the standards that they develop is Priority Flow Control (PFC). In particular, the PFC IEEE 802.1Qbb standard proposes a mechanism that is based on the basic IEEE 802.3x flow control protocol. More specifically, it uses two distinctive thresholds at which an Ethernet switch at each hop along the path of communication sends a feedback upstream to control the data flow. These are the buffers’ High Threshold (THigh) and the Low Threshold (TLow). Those thresholds are associated with 2 types of control messages, the PAUSE and the RESUME frames. PFC defines how the system processes the arrival of these control frames in order to achieve a lossless communication while maintaining an acceptable QoS. Figure 1 show how the PAUSE frame used by the PFC protocol looks like in comparison with a normal MAC control frame and a IEEE 802.3x PAUSE frame. Figure 1: 64-byte MAC control frame used by PFC compared to MAC control frame and IEEE 802.3x PAUSE frame As shown in Figure 1, the PFC PAUSE frame has up to 8 Class of Service (CoS) fields. This gives a great advantage as now the system has the ability, not only to control the flow of all data in the link, but also to control each specific type of traffic, which allows high priority applications (e.g. SANs) to coexist on the same link with low priority traffic, by pausing the low priority applications that are causing the buffers only to saturate. This significantly enhances the Quality of Service (QoS) of CEE [1]. The High Threshold (THigh) is set in such a way that when the buffer exceeds it, the receiver node sends a PAUSE frame upstream to the previous nodes to alert them to stop sending more frames on the link as the buffer is about to be filled. While calculating the THigh, the latency of the link and data rate play an important role. While the transmitter is in a pause state, the buffer of the receiver switch is getting depleted as frames that were buffered are being processed and forwarded. When the buffer size goes below the Low Threshold (TLow), another control frame called RESUME is generated and sent upstream to tell the sender that now there is enough space in the buffer to start sending data. A more detailed explanation about these thresholds and how to calculate their optimum values will be found in the next section. Figure 2 provides a brief overview of how the PFC protocol, with the different buffer thresholds, work together as a system. At first, the host is generating and sending frames across the link to the switch. The switch stores the incoming frames in its buffer waiting to be forwarded. When the occupied storage of the buffer exceeds the High Threshold, the switch sends back a PAUSE frame on the link to notify the host to stop sending anymore frames. But while the generated PAUSE frame reaches the host, the host keeps generating and sending frames. 
These frames that were generated between the time the buffer size reaches the High Threshold and the PAUSE frame reaches the host must be taken into consideration while setting the High Threshold of the switch buffer in order to be able to store them when they arrive to the switch. While the host is in the pause state, the switch continues processing and forwarding the buffered frames. This continues until the occupied storage in the buffer falls below the Low Threshold. When this happens, the switch sends back a RESUME frame to the host, to notify it that it is now safe for it to start sending frames again. As soon as the RESUME frame reaches the host, it continues generating and forwarding the frames normally till it receives a PAUSE frame again. Figure 1: An overview of how the Priority Flow Control (PFC) protocol uses THigh and TLow to work together as a system
  • 3. 3 III. OPTIMAL BUFFER SIZE FOR CEE PRIORITY FLOW CONTROL In this section, we formulate a model to calculate the optimum High Threshold (THigh) and Low Threshold (TLow), at which PAUSE and RESUME messages should be generated, while guaranteeing a lossless connection on Fiber Channel over Ethernet (FCoE) links using PFC. These equations take into consideration the different parameters on the links used, such as the transmission rate, latency of the link, the size of the data packets generated, etc. A. High Threshold Buffer Size (THigh) The High Threshold is set at the receiver buffer of the intermediate switches along the path of the traffic between the source node, for example a host, and the sink node, like a server or storage cabin. This THigh is an essential limit of the buffer, which is obviously lower than the maximum buffer size of the switches used. When THigh is reached, a PAUSE message is sent back to notify the previous node to stop sending more traffic as the buffer is close to being filled. This approach assures that there will be no packet loss due to dropping at the intermediate switches when the receiver buffers are exhausted. THigh should satisfy two important constraints: • The first constraint is that PAUSE frames have to be sent early enough as to predict that the maximum buffer size will not be exceeded by the frames received by the switch. It should take into consideration that there are more frames being sent by the source and are in flight on the link, because more frames will be generated and sent by the source during the propagation time required for the PAUSE message to be received by the source to stop sending any additional traffic. • The second constraint that THigh should satisfy is that it has to be as high as possible as to not severely decrease the throughput of the system. Since the links are idle during the pause period and not utilized, having a lower than required THigh, it would be reached faster and hence more PAUSE messages are generated, which causes the sender to stop sending traffic more frequently. This would result in a lower throughput for the whole system. It is essential to obtain an optimum value for the THigh at which no packet dropping at the switches are guaranteed, while not sacrificing the overall throughput. For that the following equations are used. THigh = QMax - QHigh (1) Where, QHigh = LPAUSE + 2*LDATA + 2*TProp*RAB (2) By using both (1) and (2), THigh = QMax – (LPAUSE + 2*LDATA + 2*TProp*RAB) (3) Equation (3) is used to calculate the optimum THigh for any link segment. QMax defines the maximum size of the receiver buffer. LPAUSE is the size of the PAUSE frame in bits, which is defined to be 64 Bytes (512 bits). LDATA is the Maximum Transmission Unit (MTU) of the data packets being sent on the segment in bits. TProp is the propagation delay of the link in seconds, which mainly depends its length. RAB is the transmission rate of the link in bits per second. QHigh is a predictor term that defines the actual buffer size needed, given the distance of the link, the transmission rate and the size of the PAUSE message and the MTU of the data packets, to be saved by the receiver for the frames that are in flight and not yet reached it. Hence, for calculating the actual threshold QHigh should be subtracted from QMax, as shown by equation (1). QHigh takes in consideration the round trip time for the actual response of the PAUSE message. 
When THigh is reached by adding the frames into the buffer, the PAUSE message is sent to the previous hop, but at the moment the PAUSE message reaches the source, the source might have just started sending the first bit of the new data packet. As the aim of this paper is to ensure a lossless mechanism for transmission, the source cannot stop transmission once it has started and the transmission must be completed. This scenario was taken into consideration and hence, twice the maximum transmission delay was used in the formula, because the same occurs when sending the PAUSE frame to consider this worst case scenario. B. Low Threshold Buffer Size ( TLow) The Low Threshold (TLow) is another important parameter defined in the PFC buffers of the intermediate nodes along the path of FCoE traffic. It is triggered or reached when the system is in the pause state. After a switch’s buffer reaches THigh, a PAUSE message is sent and the transmission is temporarily paused, giving time for the intermediate nodes to forward the frames they have in their buffers. TLow must have a lower value than THigh, so as the packets in the buffers are being processed, the buffer is gradually being depleted. When the buffer hits TLow, a RESUME message is generated and sent back to the previous node which notifies that the buffer now has free space to accept more packets. When the RESUME message reaches the previous hop, it changes its state from a pause state, and it continues with the transmission as usual. The following equation describes the optimum value for TLow: TLow = ( ୐ୖ୉ୗ୙୑୉ RAB + 2*TPropAB) RBCmax + LDATA (4) While calculating TLow, it is important to take into consideration the data rates of both the previous link (RAB), as well as the maximum rate of the following links (RBCmax), as the rate of the following link directly effects the speed at which the switch is forwarding the buffered frames. The higher the data rate of the following link, the faster the buffered packets will be transmitted, and hence the faster there will be available space in the buffer to accept more packets. Having a higher
  • 4. 4 data rate for the following link will allow the buffer to reach TLow faster. Hence, it makes more sense to start resuming the sending of the packets, and this will increase the efficiency as the links are not in an idle state for extended periods of time. This makes TLow an important parameter when it comes to increasing the overall throughput of the system, as will be shown in the next section. As any buffer size is assumed to be available to be used, having a QMax as big as possible is of course the best decision to take. This assures that the THigh is going to be reached after a longer time, which means that the system will stay in a transmitting state as long as possible, which increases the overall efficiency and throughput. On the other side, there is a minimum value that QMax must be that depends on both QHigh and TLow. The following equation shows the minimum value of QMax : QMax ≥ QHigh + TLow (5) The maximum size of the buffer QMax must be greater than or equal to the sum of QHigh and TLow. This constraint is important as it guarantees that THigh is always greater than TLow and guarantees that the node during the pausing period will eventually reach TLow and send the RESUME message so that the system is not in a permanent pausing state. IV. EVALUATION In this section, the theoretical model explained in the previous section is being validated by means of simulation, and then by comparing its results to the theoretical values from the equations. After some research among the various available simulation tools, OMNeT++ [7] chosen to investigate the performance of PFC, by creating a simple simulation environment consisting of mainly three modules, namely a Host, a Switch and a Storage Server. These modules were connected using FCoE and FC links. The code was created in such a way that it will be generic to the case being studied, which allows investigating multiple scenarios with different parameters. SIMSANs [8] was another simulation tool that was chosen initially, but then it was replaced by OMNeT++ as it did not provide the required configuration capabilities to create a simulation that covers our investigation. In this evaluation section, two main points are being evaluated and their results are introduced. The first is how varying the High Threshold affects the packet loss. The second, investigates if the Low Threshold has any significant effect on the overall throughput of the system. A. Validating the High Threshold formula (THigh) In order to validate equation (2) while the constraint (5) holds, multiple calculations for different data rates and propagation delays were done. In each of these cases the optimum THigh was calculated prior to each simulation where a range of THigh was introduced for each case to provide a graph that shows how the system behaves while varying the THigh. In each case, the percentage of packets that were dropped at the intermediate Switch was accounted. Figure 3 shows how the network that was used in these simulations looks like. The equations proposed in the previous section are aimed to examine the hypothesis whether PFC can be used as a lossless communication protocol so FCoE can be extended to a Metropolitan Area Network (MAN) scenario, as the code created was generic and can be extended to multiple nodes and networks. However it was decided that using a simple network as shown in Figure 3 is enough as a first step for studying the general problem. 
The Host continuously generates traffic and sends it on the link at a rate that can be changed according to the parameters of each experiment. The data frame length was fixed to 2240 Bytes, which is the FC MTU [1], and the control frames were fixed to 64 Bytes. The simulation was executed for multiple scenarios, each with different link data rates and propagation delays, and equation (2) was used to calculate the optimum value of THigh for each of them. For this experiment, which studies the behaviour of varying THigh under different data rates and propagation delays, the QMax of the intermediate switch's buffer was set to 480 KB, as suggested by [1], so that the study matches current switch technology. Table I shows the optimum values of THigh for each of the studied cases. Since the frame size is fixed, it is straightforward to express THigh as a number of frames; for a system that uses variable-length data frames, using the maximum frame length in the calculation suffices.

In Table I, for example in the "10Gbps/1ms" case, the calculated QHigh is greater than QMax, which violates constraint (5), and thus THigh is out of range because it exceeds the actual buffer size. The THigh values shown in the table indicate how many frames a switch with a 480 KB buffer can accept before the PAUSE control frame is sent to the previous node to pause transmission. For example, in the "8Gbps/100us" case, the maximum number of frames that a 480 KB buffer can hold is 480000/2240 = 214 frames, and THigh is calculated to be triggered when the buffer holds 91 frames. This means that memory for 214-91 = 123 frames is allocated and reserved for the frames that are in flight on the link and those that will be generated and sent during the time it takes for the PAUSE frame to reach the previous node.

Figure 3: The simple Storage Area Network used in the simulations to validate the proposed model
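As a quick cross-check, the frame book-keeping of the "8Gbps/100us" example can be reproduced with a few lines of code. This is an illustrative calculation only, not part of the simulator; the THigh value is simply taken from Table I.

```python
# Frame book-keeping for the "8Gbps/100us" case (illustrative, not simulator code).
Q_MAX_BYTES = 480_000      # switch buffer size suggested by [1]
FRAME_BYTES = 2_240        # FC MTU used for all data frames
T_HIGH_FRAMES = 91         # optimum THigh from Table I for 8 Gbps / 100 us

capacity_frames = Q_MAX_BYTES // FRAME_BYTES        # 214 frames fit in the buffer
headroom_frames = capacity_frames - T_HIGH_FRAMES   # 123 frames reserved for in-flight data

print(f"buffer capacity : {capacity_frames} frames")
print(f"PFC headroom    : {headroom_frames} frames reserved after PAUSE is sent")
```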
TABLE I. THE ANALYTICAL OPTIMUM THigh FOR DIFFERENT DATA RATES AND PROPAGATION DELAYS (QMax = 480 KB).

  Link data rate, propagation delay    Optimum THigh (in 2240-Byte frames)
  10 Gbps, 1 ms                        QHigh > QMax (214 frames)
  8 Gbps, 1 ms                         QHigh > QMax (214 frames)
  4 Gbps, 1 ms                         QHigh > QMax (214 frames)
  10 Gbps, 100 us                      113 frames
  8 Gbps, 100 us                       91 frames
  4 Gbps, 100 us                       46 frames
  10 Gbps, 10 us                       13 frames
  8 Gbps, 10 us                        10 frames
  4 Gbps, 10 us                        6 frames
  10 Gbps, 1 us                        3 frames
  8 Gbps, 1 us                         3 frames

Now that the analytical values of THigh are known, multiple simulations were performed with the same parameters to check whether the equations are sound. Figure 4 shows the results of these simulations: the percentage of packets lost at the intermediate Switch when varying THigh for different data rates and propagation delays. The optimum THigh is the threshold below which packets are dropped at the receiving buffer because there is insufficient memory for the frames in flight on the link; from the optimum THigh onwards, the percentage of dropped packets should be zero. For example, in Figure 4, in the "8Gbps, 100us" case, when THigh is set to 0 the percentage of lost packets is 30%, and when THigh is set to 91 frames the percentage of lost frames first reaches 0%, as expected, since this is the value given by equations (2) and (5). These results show that equation (2) is sound, as the values in Table I and the results in Figure 4 are consistent.

B. Validating the Low Threshold formula (TLow)

As stated in the previous section, the main function of the Low Threshold in this protocol is to optimize the throughput of the system, that is, to utilize the links to the maximum extent allowed by the protocol while maintaining its main characteristic of being a lossless communication protocol. While the system is in a PAUSE state, the links in the paused segment of the path are either unused or carry only control frames, which means that no actual data is transmitted. If no Low Threshold is set in the intermediate switches, the pausing state started by a switch remains in effect until all the frames stored in its buffer have been processed and the buffer is completely empty (i.e. TLow = 0). Introducing a higher Low Threshold makes the pausing state much shorter. When the system is in a pause state, the switch processes the frames stored in the buffer and depletes it until it hits the Low Threshold mark. When the Low Threshold is reached, the Switch sends a RESUME control frame upstream on the link to notify the previous hop that the buffer has enough free space, so that transmission can start again. When a paused node receives a RESUME frame, it starts sending the packets stored in its queue, or generates new data packets normally, until it receives another PAUSE frame.

TABLE II. THE ANALYTICAL OPTIMUM TLow FOR DIFFERENT DATA RATES AND PROPAGATION DELAYS.

  Data rates / propagation delays               Optimum TLow (in 2240-Byte frames)
  2 Gbps generating host, 1 Gbps bottleneck     112 frames
  5 Gbps generating host, 4 Gbps bottleneck     447 frames
  10 Gbps generating host, 4 Gbps bottleneck    447 frames

To validate equation (4), formulated in the previous section, the equation was used to obtain the optimum Low Threshold for several cases, with a range of link data rates and delays. The optimum values produced by the equation for each case were then compared with the simulation results. Table II shows the optimum Low Threshold, calculated with equation (4), for a host generating packets at 2 Gbps with a 1 Gbps bottleneck link, and for a host generating at 5 Gbps with a 4 Gbps bottleneck link. In these experiments THigh was set to the optimal values.

Figure 4: Percentage of frames lost at the receiver switch queue while varying THigh
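The Table II values can be approximately reproduced from equation (4). The sketch below uses the 64-Byte control frame length from the experiments, but the one-way propagation delay is not listed in Table II, so the 1 ms value used here is an assumption made for illustration; with it, the formula yields roughly 112-113 and 447 frames, matching Table II up to rounding to whole frames.

```python
# Illustrative evaluation of equation (4); the 1 ms propagation delay is an assumption.
FRAME_BYTES   = 2_240          # fixed data frame length (FC MTU)
L_RESUME_BITS = 64 * 8         # RESUME control frame length (64 Bytes)
T_PROP_AB     = 1e-3           # assumed one-way propagation delay on link A-B (1 ms)

def t_low_frames(r_ab_bps, r_bc_max_bps):
    """Equation (4): bytes drained during the RESUME round trip, plus one data frame."""
    round_trip = L_RESUME_BITS / r_ab_bps + 2 * T_PROP_AB       # seconds
    t_low_bytes = round_trip * r_bc_max_bps / 8 + FRAME_BYTES   # bytes
    return t_low_bytes / FRAME_BYTES                            # in 2240-Byte frames

print(f"2 Gbps host, 1 Gbps bottleneck : ~{t_low_frames(2e9, 1e9):.0f} frames")   # ~113
print(f"5 Gbps host, 4 Gbps bottleneck : ~{t_low_frames(5e9, 4e9):.0f} frames")   # ~447
```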
To validate the analytical results shown in Table II, simulations were run with the same parameters used when calculating the optimum Low Threshold, in order to check whether the values produced by the equation are in fact the optimum ones. Figure 5-a shows the configuration of the network used in the first case, where the host generates packets at a rate of 2 Gbps, while Figure 5-b shows the second case, where the host generates at 5 Gbps. The results of these simulations are plotted in Figure 6 and agree with the calculated values of Table II.

Figure 6 shows how the throughput varies as a range of Low Thresholds is configured in the switches. The figure clearly shows the optimum points after which the throughput settles at the optimum value for the system. Taking for example the case of the "2 Gbps generating host", Table II shows that the optimum Low Threshold is calculated to be 112 frames, and Figure 6 shows that the throughput stabilizes at 1 Gbps at the point corresponding to a Low Threshold of 113 frames, within one frame of the calculated value. A stable throughput of 1 Gbps is expected, as the bottleneck link along the path is a 1 Gbps link [6].

In the case of the "10 Gbps generating host", which uses the network of Figure 5-b, the optimum Low Threshold is the same as in the previous case. This is expected, as the equation takes the link rates into account rather than the generation rate. Figure 6 shows that in the "10 Gbps generating host" case the throughput stabilizes at 4 Gbps, which is the bottleneck rate, at the same optimal Low Threshold as in the "5 Gbps generating host" case, although the throughput improvement is very small and almost insignificant. This shows that at higher packet generation rates the effect of the Low Threshold is less noticeable: with many more packets being generated, the pause state does not affect the throughput as much as it does at lower generation rates, where fewer packets are generated and pausing for a while has a clear effect on the overall throughput.

These results show that the equations presented in the previous section are valid and can be used to predict the resources that should be allocated to support a specific scenario. Hence, enterprises and network administrators can easily determine the minimum buffer size needed to satisfy a certain demand.

Figure 5-a: Network and link parameters used for a 2 Gbps generating host
Figure 5-b: Network and link parameters used for a 5 Gbps generating host
Figure 6: How varying the Low Threshold affects the Throughput
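As a rough illustration of the kind of buffer sizing that constraint (5) implies, the sketch below estimates a lower bound on QMax for a given one-way propagation delay, counting only the bytes in flight during the PAUSE round trip at the incoming link rate plus the TLow drain term of equation (4) at the bottleneck rate; frame-granularity and control-frame terms are ignored, so this is a simplification of the model, not the full formulas of the previous section. For a 1 ms delay with a 10 Gbps inter-DC link and 4 Gbps internal links, this crude bound already comes out at about 3500 KB, the same order of magnitude as the values read from Figure 7 in the next paragraph.

```python
# Rough lower bound on the buffer size QMax for a given one-way propagation delay.
# Only the in-flight terms are kept; this is a sketch, not the full analytical model.
def min_buffer_bytes(r_in_bps, r_bottleneck_bps, t_prop_s):
    rtt = 2 * t_prop_s
    q_high_part = rtt * r_in_bps / 8          # bytes still arriving before the PAUSE takes effect
    t_low_part  = rtt * r_bottleneck_bps / 8  # bytes drained while waiting for the RESUME to act
    return q_high_part + t_low_part           # constraint (5): QMax >= QHigh + TLow

# Example: 10 Gbps inter-DC link, 4 Gbps links inside the Data Centers, 1 ms delay.
print(f"~{min_buffer_bytes(10e9, 4e9, 1e-3) / 1e3:.0f} KB")   # ~3500 KB
```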
Figure 7 was created to give enterprises and network administrators a broader perspective on the buffer storage needed in MAN scenarios, either in terms of the distance to be covered or in terms of delay. For example, if a network administrator wants to connect two Data Centers with a 10 Gbps link while using 4 Gbps links inside the Data Centers, and the inter-DC distance is 150 Km, then the storage buffers should not be smaller than around 9000 KB. For the same configuration, if the inter-DC delay is 1 ms, then the storage buffers should not be smaller than 3500 KB for the PFC protocol to function without problems. Therefore, we think that PFC can be successfully used to extend FCoE to MAN scenarios, although larger buffers will be needed.

V. CONCLUSION AND FUTURE WORK

The aim of this paper is to study Priority Flow Control (PFC) for a converged, FCoE-based Data Center. PFC has several advantages over traditional protocols such as classical Ethernet or Fibre Channel; the main one is that it enables inexpensive lossless communication between sender and receiver without retransmissions. This paper has introduced a model that can be used to calculate the High and Low Thresholds that this protocol uses. A simulation environment was created with OMNeT++ and used to validate these equations through a series of simulation experiments, comparing the analytical results with the experimental results produced by the simulator. The conclusion is that the proposed equations are valid and can be used by enterprises and network administrators to predict the resources needed to meet given requirements. It was also found that the PFC protocol can be deployed in Metropolitan Area Network (MAN) scenarios to connect Data Centers within the same city (100 Km - 300 Km), and can be extended to cover a wider geographical area when switches with higher buffer capacity become available.

Regarding how this study may be further improved, future work may include upgrading the PFC implementation to test more complex network architectures with multiple interconnected nodes, rather than a single path. Further studies providing a model for the throughput of the protocol given the network parameters may also help enterprises achieve the target Quality of Service (QoS) with smaller buffers. Finally, testing the PFC protocol with variable-length frames, rather than only fixed-length frames, will further help to understand whether this protocol can be used in everyday scenarios.

ACKNOWLEDGEMENTS

The author would like to thank Prof. Manuel Urueña for his continuous guidance and knowledge, as well as the author's family for their unending support and patience during this thesis.

VI. REFERENCES

[1] Cisco Systems, "Priority Flow Control: Build Reliable Layer 2 Infrastructure", white paper, 2009.
[2] M. Ko, D. Eisenhauer and R. Recio, "A case for Convergence Enhanced Ethernet: Requirements and applications", IEEE Communications Society, pp. 5702-5707, 2008.
[3] EMC², "Introduction to Fibre Channel over Ethernet (FCoE)", white paper, 2011.
[4] G. De Los Santos, M. Urueña, A. Muñoz and J. Hernandez, "Buffer Design Under Bursty Traffic with Applications in FCoE Storage Area Networks", IEEE Communications Letters, vol. 17, no. 2, pp. 413-416, 2013.
[5] P. Kale, A. Tumma, H. Kshirsagar, P. Ramrakhyani and T. Vinode, "Fibre Channel over Ethernet: A beginners perspective", IEEE International Conference, pp. 438-443, 2011.
[6] J. Jaffe, "Bottleneck flow control", IEEE Transactions on Communications, vol. 29, no. 7, pp. 954-962, 1981.
[7] Official website of OMNeT++, http://www.omnetpp.org/
[8] Official website of SimSANs, http://www.simsans.org/

Figure 7: Storage needed for the buffers for a given distance or delay